Announcing support for extracting data from identity documents using Amazon Textract

Creating efficiencies in your business is at the top of your list. You want your employees to be more productive and focused on high-impact tasks, and you want better processes that improve outcomes for your customers. There are various ways to achieve this, and more companies are turning to artificial intelligence (AI) and machine learning (ML) to help. In the financial services sector, customers create new accounts online; in healthcare, new digital platforms let patients schedule and manage appointments. Both require users to fill out forms, a process that is error prone and time consuming, and that can certainly be improved. Some businesses and organizations have attempted to simplify and automate this process by including identity document uploads, such as a driver’s license or passport. However, the technology available is template-based and doesn’t scale well. You need a solution to help automate the extraction of information from identity documents to enable your customers to open bank accounts with ease, or schedule and manage appointments online using accurate information.

Today, we are excited to announce a new API to Amazon Textract called Analyze ID that will help you automatically extract information from identification documents, such as driver’s licenses and passports. Amazon Textract uses AI and ML technologies to extract information from identity documents, such as U.S. passports and driver’s licenses, without the need for templates or configuration. You can automatically extract specific information, such as date of expiry and date of birth, as well as intelligently identify and extract implied information, such as name and address.

We will cover the following topics in this post:

How Amazon Textract processes identity documents
A walkthrough of the Amazon Textract console
Structure of the Amazon Textract AnalyzeID API response
How to process the response with the Amazon Textract parser library

Identity Document processing using Amazon Textract

Companies have accelerated the adoption of digital platforms, especially in light of the COVID-19 pandemic. Organizations now offer their users the flexibility to use smartphones and other mobile devices for everyday tasks, such as signing up for new accounts, scheduling appointments, and completing employment applications online. When users fill out an online form with personal and demographic information, the process is manual and error-prone, and incorrect submissions can affect the application decision. Some of you have simplified and automated the online application process by asking your users to upload a picture of their ID, and then using market solutions to extract data and prefill the applications automatically. This automation can help you minimize data entry errors and potentially reduce the number of users who abandon an application before completing it. However, even the current market solutions are limited in what they can achieve. They often fail to extract all of the required fields accurately, either because of the rich background image on IDs or because they can’t recognize names and addresses and the fields associated with them. For example, the Washington State driver license lists home addresses with the key “8”. Another major challenge is that IDs have a different template or format depending on the issuing country and state, and even those can change from time to time. Therefore, traditional template-based solutions do not work at scale. Traditional OCR solutions are also expensive and slow, especially when combined with human reviews, and they do little to advance digital automation. These approaches provide poor results, which keeps your organization from scaling and becoming efficient.
You need a solution to help automate the extraction of information from identity documents to enable your customers to open bank accounts with ease, or schedule and manage appointments online with accurate information.

To solve this problem, you can now use Amazon Textract’s newly launched Analyze ID API, powered by ML instead of a traditional template matching solution, to process identity documents at scale. It works with U.S. driver’s licenses and passports to extract relevant data, such as name, address, date of birth, date of expiry, and place of issue. The Analyze ID API returns two categories of data: (A) key-value pairs available on IDs, such as Date of Birth, Date of Issue, ID #, Class, Height, and Restrictions; and (B) implied fields on the document that may not have explicit keys, such as Name, Address, and Issued By. The key-value pairs are also normalized into a common taxonomy (for example, LIC# and Passport No. both map to Document ID number). This lets you easily combine information across many IDs that use different terms for the same concept.

Amazon Textract console walkthrough

Before we get started with the API and code samples, let’s review the Amazon Textract console. The following images show examples of a passport and a driver’s license on the Analyze Document output tab of the Amazon Textract console. From the sample passport image, Amazon Textract automatically extracts key-value elements such as the type, code, passport number, surname, given name, nationality, date of birth, and place of birth.

The following is another example with a sample driver’s license. Analyze ID extracts key-value elements such as class, as well as implied fields such as first name, last name, and address. It also normalizes keys so that they are standardized across IDs: for example, “4d NUMBER” becomes “Document number” (with the value “820BAC729CBAC”), and “DOB” becomes “Date of birth” (with the value “03/18/1978”).

AnalyzeID API request

In this section, we explain how to pass the ID image in the request and how to invoke the Analyze ID API. The input document is passed either as a byte array or as an Amazon Simple Storage Service (Amazon S3) object. You pass image bytes to an Amazon Textract API operation by using the Bytes property. For example, you can use the Bytes property to pass a document loaded from a local file system. Image bytes passed by using the Bytes property must be base64 encoded. Your code might not need to encode document file bytes if you’re using an AWS SDK to call Amazon Textract API operations. Alternatively, you can pass images stored in an S3 bucket to an Amazon Textract API operation by using the S3Object property. Documents stored in an S3 bucket don’t need to be base64 encoded.

The following examples show how to call the Amazon Textract AnalyzeID operation from Python and from the AWS CLI.

Sample Python code:

import boto3

textract = boto3.client('textract')

# Call textract AnalyzeId by passing photo on local disk
documentName = "us-driver-license.jpeg"
with open(documentName, 'rb') as document:
    imageBytes = bytearray(document.read())

response = textract.analyze_id(
    DocumentPages=[{"Bytes":imageBytes}]
)

# Call textract AnalyzeId by passing photo on S3
response = textract.analyze_id(
    DocumentPages=[
        {
            "S3Object":{
                "Bucket":"BUCKET_NAME",
                "Name":"PREFIX_AND_FILE_NAME"
            }
        }
    ]
)

Sample CLI command:

aws textract analyze-id --document-pages '[{"S3Object":{"Bucket":"BUCKET_NAME","Name":"PREFIX_AND_FILE_NAME1"}},{"S3Object":{"Bucket":"BUCKET_NAME","Name":"PREFIX_AND_FILE_NAME2"}}]' --region us-east-1
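
When the page locations are only known at run time, the same multi-page request can be assembled in Python. The following is a minimal sketch: the helper name build_document_pages is hypothetical (it is not part of boto3), and the bucket and object names are the same placeholders used in the CLI command. Check the API reference for the number of pages allowed per request.

```python
def build_document_pages(bucket, keys):
    """Build the DocumentPages list for analyze_id from S3 object keys.

    Hypothetical helper for illustration; not part of boto3.
    """
    return [{"S3Object": {"Bucket": bucket, "Name": key}} for key in keys]


# Placeholder bucket and object names, as in the CLI example
pages = build_document_pages(
    "BUCKET_NAME", ["PREFIX_AND_FILE_NAME1", "PREFIX_AND_FILE_NAME2"]
)

# With AWS credentials configured, the request would then be sent as:
# import boto3
# textract = boto3.client("textract", region_name="us-east-1")
# response = textract.analyze_id(DocumentPages=pages)
```

The resulting list has the same shape as the JSON passed to --document-pages in the CLI command.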

Analyze ID API response

In this section, we explain the Analyze ID response structure using the sample passport image. The following is the sample passport image and the corresponding AnalyzeID response JSON.

Sample abbreviated response

{
  "IdentityDocuments": [
    {
      "DocumentIndex": 1,
      "IdentityDocumentFields": [
        {
          "Type": {
            "Text": "FIRST_NAME"
          },
          "ValueDetection": {
            "Text": "LI",
            "Confidence": 98.9061508178711
          }
        },
        {
          "Type": {
            "Text": "LAST_NAME"
          },
          "ValueDetection": {
            "Text": "JUAN",
            "Confidence": 99.0864486694336
          }
        },
        {
          "Type": {
            "Text": "DATE_OF_ISSUE"
          },
          "ValueDetection": {
            "Text": "09 MAY 2019",
            "NormalizedValue": {
              "Value": "2019-05-09T00:00:00",
              "ValueType": "Date"
            },
            "Confidence": 98.68514251708984
          }
        },
        {
          "Type": {
            "Text": "ID_TYPE"
          },
          "ValueDetection": {
            "Text": "PASSPORT",
            "Confidence": 99.3958740234375
          }
        },
        {
          "Type": {
            "Text": "ADDRESS"
          },
          "ValueDetection": {
            "Text": "",
            "Confidence": 99.62577819824219
          }
        },
        {
          "Type": {
            "Text": "COUNTY"
          },
          "ValueDetection": {
            "Text": "",
            "Confidence": 99.6469955444336
          }
        },
        {
          "Type": {
            "Text": "PLACE_OF_BIRTH"
          },
          "ValueDetection": {
            "Text": "NEW YORK CITY",
            "Confidence": 98.29044342041016
          }
        }
      ]
    }
  ],
  "DocumentMetadata": {
    "Pages": 1
  },
  "AnalyzeIDModelVersion": "1.0"
}

The AnalyzeID JSON output contains AnalyzeIDModelVersion, DocumentMetadata and IdentityDocuments, and each IdentityDocument item contains IdentityDocumentFields.
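
This structure can be walked with a few lines of Python. The following is a minimal sketch, not part of any AWS library: the helper name fields_to_dict and its skip_empty option are hypothetical, and the sample dictionary is trimmed from the passport response. It assumes that an empty Text means the field was not found on the document, as with ADDRESS in the passport sample.

```python
def fields_to_dict(response, skip_empty=True):
    """Return one {field type: detected text} dict per identity document.

    Hypothetical helper for illustration; it relies only on the keys
    shown in the sample responses.
    """
    documents = []
    for doc in response.get("IdentityDocuments", []):
        fields = {}
        for field in doc.get("IdentityDocumentFields", []):
            key = field["Type"]["Text"]
            value = field["ValueDetection"]["Text"]
            # Assumption: empty text means the field is absent from the ID
            if skip_empty and not value:
                continue
            fields[key] = value
        documents.append(fields)
    return documents


# Trimmed copy of the sample passport response
sample = {
    "IdentityDocuments": [
        {
            "DocumentIndex": 1,
            "IdentityDocumentFields": [
                {"Type": {"Text": "FIRST_NAME"},
                 "ValueDetection": {"Text": "LI", "Confidence": 98.9061508178711}},
                {"Type": {"Text": "ADDRESS"},
                 "ValueDetection": {"Text": "", "Confidence": 99.62577819824219}},
                {"Type": {"Text": "PLACE_OF_BIRTH"},
                 "ValueDetection": {"Text": "NEW YORK CITY", "Confidence": 98.29044342041016}},
            ],
        }
    ]
}

passport_fields = fields_to_dict(sample)
print(passport_fields)
```

Passing skip_empty=False keeps the empty entries, which can be useful when you want to know which fields the API checked for.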

The most granular level of data in the IdentityDocumentFields response consists of Type and ValueDetection.

Let’s call this set of data an IdentityDocumentField element. The preceding example illustrates IdentityDocumentField elements, each containing the Type with its Text and Confidence (the Type Confidence is omitted from the abbreviated sample), and the ValueDetection, which includes the Text, the Confidence, and the optional field NormalizedValue.
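
Because NormalizedValue is optional, code that reads it should handle fields that lack it. The following is a minimal sketch: the helper name normalized_date is hypothetical, and it assumes that date values use the ISO 8601 format shown in the sample response.

```python
from datetime import datetime


def normalized_date(field):
    """Return a datetime for a field whose NormalizedValue is a Date, else None.

    Hypothetical helper for illustration.
    """
    normalized = field["ValueDetection"].get("NormalizedValue")
    if not normalized or normalized.get("ValueType") != "Date":
        return None
    # Assumption: the value is ISO 8601, as in "2019-05-09T00:00:00"
    return datetime.fromisoformat(normalized["Value"])


# Two fields copied from the sample passport response
date_field = {
    "Type": {"Text": "DATE_OF_ISSUE"},
    "ValueDetection": {
        "Text": "09 MAY 2019",
        "NormalizedValue": {"Value": "2019-05-09T00:00:00", "ValueType": "Date"},
        "Confidence": 98.68514251708984,
    },
}
name_field = {
    "Type": {"Text": "FIRST_NAME"},
    "ValueDetection": {"Text": "LI", "Confidence": 98.9061508178711},
}

issued = normalized_date(date_field)      # datetime for 9 May 2019
missing = normalized_date(name_field)     # None: no NormalizedValue present
```

Working from the normalized value rather than the raw Text avoids parsing the different date layouts printed on different IDs.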

In the preceding example, Amazon Textract detected 44 key-value pairs, including PLACE_OF_BIRTH: New York City. For the list of fields extracted from identity documents, refer to the Amazon Textract Developer Guide.

In addition to the detected content, the Analyze ID API provides information such as confidence scores for detected elements. It gives you control over how you consume extracted content and integrate it into your applications. For example, you can flag any elements that have a confidence score under a certain threshold for manual review.
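
A manual-review check along these lines might look like the following minimal sketch. The helper name fields_below_threshold is hypothetical, and the 97.0 threshold is an arbitrary value chosen for illustration, not a recommendation; the field values are taken from the sample responses in this post.

```python
def fields_below_threshold(response, threshold):
    """List (document index, field type, text, confidence) for low-confidence fields.

    Hypothetical helper for illustration.
    """
    flagged = []
    for doc in response.get("IdentityDocuments", []):
        for field in doc.get("IdentityDocumentFields", []):
            detection = field["ValueDetection"]
            if detection["Confidence"] < threshold:
                flagged.append(
                    (
                        doc["DocumentIndex"],
                        field["Type"]["Text"],
                        detection["Text"],
                        detection["Confidence"],
                    )
                )
    return flagged


# Trimmed sample response with two fields
sample = {
    "IdentityDocuments": [
        {
            "DocumentIndex": 1,
            "IdentityDocumentFields": [
                {"Type": {"Text": "STATE_NAME"},
                 "ValueDetection": {"Text": "MASSACHUSETTS", "Confidence": 98.30329132080078}},
                {"Type": {"Text": "DOCUMENT_NUMBER"},
                 "ValueDetection": {"Text": "736HDV7874JSB", "Confidence": 95.6583251953125}},
            ],
        }
    ]
}

# 97.0 is an arbitrary illustrative threshold
needs_review = fields_below_threshold(sample, threshold=97.0)
print(needs_review)
```

In practice, the threshold would be tuned per field, because an error in a document number usually costs more than an error in a state name.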

The following is the Analyze ID response structure using the sample driver’s license image:

Sample abbreviated response

{
  "IdentityDocuments": [
    {
      "DocumentIndex": 1,
      "IdentityDocumentFields": [
        {
          "Type": {
            "Text": "FIRST_NAME"
          },
          "ValueDetection": {
            "Text": "GARCIA",
            "Confidence": 99.48689270019531
          }
        },
        {
          "Type": {
            "Text": "LAST_NAME"
          },
          "ValueDetection": {
            "Text": "MARIA",
            "Confidence": 98.49578857421875
          }
        },
        {
          "Type": {
            "Text": "STATE_NAME"
          },
          "ValueDetection": {
            "Text": "MASSACHUSETTS",
            "Confidence": 98.30329132080078
          }
        },
        {
          "Type": {
            "Text": "DOCUMENT_NUMBER"
          },
          "ValueDetection": {
            "Text": "736HDV7874JSB",
            "Confidence": 95.6583251953125
          }
        },
        {
          "Type": {
            "Text": "EXPIRATION_DATE"
          },
          "ValueDetection": {
            "Text": "01/20/2028",
            "NormalizedValue": {
              "Value": "2028-01-20T00:00:00",
              "ValueType": "Date"
            },
            "Confidence": 98.64090728759766
          }
        },
        {
          "Type": {
            "Text": "DATE_OF_ISSUE"
          },
          "ValueDetection": {
            "Text": "03/18/2018",
            "NormalizedValue": {
              "Value": "2018-03-18T00:00:00",
              "ValueType": "Date"
            },
            "Confidence": 98.7216567993164
          }
        },
        {
          "Type": {
            "Text": "ID_TYPE"
          },
          "ValueDetection": {
            "Text": "DRIVER LICENSE FRONT",
            "Confidence": 98.71986389160156
          }
        },
        {
          "Type": {
            "Text": "PLACE_OF_BIRTH"
          },
          "ValueDetection": {
            "Text": "",
            "Confidence": 99.62541198730469
          }
        }
      ]
    }
  ],
  "DocumentMetadata": {
    "Pages": 1
  },
  "AnalyzeIDModelVersion": "1.0"
}

Process Analyze ID response with the Amazon Textract parser library

You can use the Amazon Textract response parser library to easily parse the JSON returned by Amazon Textract AnalyzeID. The library parses the JSON and provides language-specific constructs to work with different parts of the document.

Install the Amazon Textract Response Parser library:

python -m pip install amazon-textract-response-parser

The following example shows how to deserialize Textract AnalyzeID JSON response to an object:

import json
from trp.trp2_analyzeid import TAnalyzeIdDocumentSchema

# j holds the Textract response as a JSON string
t_doc = TAnalyzeIdDocumentSchema().load(json.loads(j))

The following example shows how to serialize a Textract AnalyzeID object to a dictionary:

from trp.trp2_analyzeid import TAnalyzeIdDocumentSchema
t_doc = TAnalyzeIdDocumentSchema().dump(t_doc)

Summary

In this post, we provided an overview of the new Amazon Textract AnalyzeID API to quickly and easily retrieve structured data from U.S. government-issued driver’s licenses and passports. We also described how you can parse the Analyze ID response JSON. For more information, see the Amazon Textract Developer Guide, or check out the developer console and try out the Analyze ID API.

About the Authors

Wrick Talukdar is a Senior Solutions Architect with AWS and is based in Calgary, Canada. Wrick works with enterprise AWS customers to transform their business through innovative use of cloud technologies. Outside of work, he enjoys reading and photography.

Lana Zhang is a Sr. Solutions Architect at AWS with expertise in Machine Learning. She is responsible for helping customers architect scalable, secure, and cost-effective workloads on AWS.

AWS Machine Learning Blog