AWS Machine Learning Blog

“ID + Selfie” – Improving digital identity verification using AWS

The COVID-19 global pandemic has accelerated the need to verify and onboard users online across several industries, such as financial services, insurance, and healthcare. When it comes to user experience, it is crucial to provide a frictionless transaction while maintaining a high standard for identity verification. The question is, how do you verify real people in the digital world?

Amazon Rekognition provides pre-trained facial recognition and analysis capabilities that you can add to your online applications for identity verification, in areas such as banking, benefits, ecommerce, and much more.

In this post, we present the “ID + Selfie” identity verification design pattern and sample code you can use to create your own identity verification REST endpoint. This is a common design pattern that you can incorporate into existing or new solutions that require face-based identity verification. The user presents a form of identification like a driver’s license or passport. The user then captures a real-time selfie with the application. We then compare the face from the document with the real-time selfie taken on their device.

The Amazon Rekognition CompareFaces API

At the core of the “ID + Selfie” design pattern is the comparison of the face in the selfie to the face on the identification document. For this, we use the Amazon Rekognition CompareFaces API. The API compares a face in the source input image with a face or faces detected in the target input image. In the following example, we compare a sample driver’s license (left) with a selfie (right).

[Figure: Source image (sample driver’s license) and Target image (selfie)]

The following is an example of the API code:

response = client.compare_faces(SimilarityThreshold=80,
                              SourceImage={'Bytes': s_bytes},
                              TargetImage={'Bytes': t_bytes})

for faceMatch in response['FaceMatches']:
    position = faceMatch['Face']['BoundingBox']
    similarity = str(faceMatch['Similarity'])

Several values are returned in the CompareFaces API response. We focus on the Similarity value returned in FaceMatches to validate that the selfie matches the ID provided.
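As a minimal sketch of reading that value, the following helper picks the highest Similarity from a CompareFaces-style response. The function name best_similarity and the sample response are illustrative assumptions, not part of the original sample code; the sample dictionary is hand-written and only mimics the FaceMatches shape.

```python
from typing import Any, Dict, Optional


def best_similarity(response: Dict[str, Any]) -> Optional[float]:
    """Return the highest Similarity among FaceMatches, or None if nothing matched."""
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    return max(float(match["Similarity"]) for match in matches)


# Hand-written sample shaped like a CompareFaces response (illustrative values).
sample_response = {
    "FaceMatches": [
        {"Similarity": 97.5, "Face": {"Confidence": 99.9}},
        {"Similarity": 85.0, "Face": {"Confidence": 99.1}},
    ],
    "UnmatchedFaces": [],
}

print(best_similarity(sample_response))
print(best_similarity({"FaceMatches": []}))
```

Returning None when FaceMatches is empty keeps "no match at or above the threshold" distinct from a low score, which the calling code can then handle explicitly.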

Understanding key tuning parameters

SimilarityThreshold defaults to 80%, so CompareFaces returns only matches with a similarity score greater than or equal to 80%. To change this, specify a different value for the SimilarityThreshold parameter.

QualityFilter is an input parameter to filter out detected faces that don’t meet a required quality bar. The quality bar is based on a variety of common use cases. Use QualityFilter to set the quality bar by specifying LOW, MEDIUM, or HIGH. If you don’t want to filter poor quality faces, specify NONE. The default value is NONE.
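One way to keep both tuning parameters explicit is to build the compare_faces arguments in a small helper. This is a sketch under assumptions: build_compare_faces_args is a hypothetical name, and the validation only covers the QualityFilter values named in this post.

```python
from typing import Any, Dict

# Only the QualityFilter values discussed in this post.
QUALITY_FILTERS = {"NONE", "LOW", "MEDIUM", "HIGH"}


def build_compare_faces_args(source_bytes: bytes,
                             target_bytes: bytes,
                             similarity_threshold: float = 80.0,
                             quality_filter: str = "NONE") -> Dict[str, Any]:
    """Build keyword arguments for the Rekognition compare_faces call."""
    if not 0 <= similarity_threshold <= 100:
        raise ValueError("similarity_threshold must be between 0 and 100")
    if quality_filter not in QUALITY_FILTERS:
        raise ValueError(f"quality_filter must be one of {sorted(QUALITY_FILTERS)}")
    return {
        "SimilarityThreshold": similarity_threshold,
        "QualityFilter": quality_filter,
        "SourceImage": {"Bytes": source_bytes},
        "TargetImage": {"Bytes": target_bytes},
    }


# Usage (requires AWS credentials, so it is left as a comment):
# client = boto3.client('rekognition')
# response = client.compare_faces(**build_compare_faces_args(s_bytes, t_bytes, 90, "HIGH"))
```

A stricter threshold and quality filter reduce false matches but may reject more genuine users, so the right values depend on your use case.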

Solution overview

You can create an “ID + Selfie” API for digital identity verification by deploying the following components:

  • A REST API with a POST method that accepts the selfie and identification payload and returns a response, in this case the similarity score
  • A function that receives the payload, converts the images to the proper format, and calls the Amazon Rekognition compare_faces API

We implement Amazon API Gateway for the REST API functionality and AWS Lambda for the function.

The following diagram illustrates the solution architecture and workflow.

The workflow contains the following steps:

  1. The user uploads the required identification document and a selfie.
  2. The client submits the identification document and selfie to the REST endpoint.
  3. The REST endpoint returns a similarity score to the client.
  4. An evaluation is done through business logic in your application. For example, if the similarity score is below 80%, it fails the digital identity check; otherwise it passes the digital identity check.
  5. The client sends the status to the user.
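The business logic in step 4 can be sketched as a small function on the client side. The name passes_identity_check and the choice to treat a non-numeric body as a failure are assumptions made for illustration; the 80% cutoff comes from the example in step 4.

```python
def passes_identity_check(body: str, threshold: float = 80.0) -> bool:
    """Return True when the similarity score in the response body meets the threshold."""
    try:
        similarity = float(body)
    except (TypeError, ValueError):
        # Assumption: a body that is not a number counts as a failed check.
        return False
    return similarity >= threshold


for body in ("97.5", "42.0"):
    status = "passed" if passes_identity_check(body) else "failed"
    print(f"similarity={body}: identity check {status}")
```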

Lambda code

The Lambda function converts each image in the incoming payload from base64 to bytes, sends the source (identification) and target (selfie) to the Amazon Rekognition compare_faces API, and returns the similarity score in the body of the API response. See the following code:

import base64
import json

import boto3


def lambda_handler(event, context):

  client = boto3.client('rekognition')

  # the sample client JSON-encodes the payload twice, so decode it twice
  payload_dict = json.loads(json.loads(event['body']))
  selfie = payload_dict['selfie']
  dl = payload_dict['dl']

  # convert base64 text to raw image bytes:
  # source = identification document, target = selfie
  s_bytes = base64.b64decode(dl)
  t_bytes = base64.b64decode(selfie)
  response = client.compare_faces(SimilarityThreshold=80,
                                SourceImage={'Bytes': s_bytes},
                                TargetImage={'Bytes': t_bytes})

  # default returned when no face in the selfie meets the threshold
  similarity = '0'
  for faceMatch in response['FaceMatches']:
    similarity = str(faceMatch['Similarity'])

  return {
    'statusCode': response['ResponseMetadata']['HTTPStatusCode'],
    'body': similarity
  }
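If you want the handler to reject malformed input early, the payload decoding could be factored out along the following lines. This is a hedged sketch, not the repository's code: decode_images is a hypothetical helper, and accepting both singly and doubly JSON-encoded bodies is an assumption added for robustness.

```python
import base64
import binascii
import json


def decode_images(event):
    """Return (id_bytes, selfie_bytes) decoded from the request body."""
    payload = json.loads(event["body"])
    # If the client JSON-encoded the payload twice, the first pass yields a string.
    if isinstance(payload, str):
        payload = json.loads(payload)
    try:
        id_bytes = base64.b64decode(payload["dl"], validate=True)
        selfie_bytes = base64.b64decode(payload["selfie"], validate=True)
    except (KeyError, binascii.Error) as exc:
        raise ValueError(f"invalid payload: {exc}") from exc
    return id_bytes, selfie_bytes


# Example with placeholder bytes standing in for real images.
sample = {
    "dl": base64.b64encode(b"id-image").decode("utf-8"),
    "selfie": base64.b64encode(b"selfie-image").decode("utf-8"),
}
print(decode_images({"body": json.dumps(sample)}))
```

A handler using this helper could catch ValueError and return a 400 status code instead of letting the invocation fail.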

Deploy the project

This project is available to deploy through AWS Samples with the AWS Cloud Development Kit (AWS CDK). You can clone the repository and use the following AWS CDK process to deploy to your AWS account.

  1. Set up a user who has permissions to programmatically deploy the solution resources through the AWS CDK.
  2. Set up the AWS Command Line Interface (AWS CLI). For instructions, refer to Configuring the AWS CLI.
  3. If this is your first time using the AWS CDK, complete the prerequisites listed in Working with the AWS CDK in Python.
  4. Clone the GitHub repository.
  5. Activate the virtual environment (if the .venv directory doesn’t exist yet, create it first with python -m venv .venv). The command you use depends on your OS:
    1. If using Windows, run the following command in your terminal window from the root of the cloned repository:
      .\.venv\Scripts\activate
    2. If using Mac or Linux, run the following command in your terminal window from the root of the cloned repository:
      source .venv/bin/activate
  6. After activating the virtual environment, install the app’s standard dependencies:
    python -m pip install -r requirements.txt
  7. Now that the environment is set up and the requirements are met, we can issue the AWS CDK deployment command to deploy this project to AWS:
    cdk deploy

Make API calls

We need to send the payload in base64 format to the REST endpoint. We use a Python file to make the API call, which allows us to open the source and target files, convert them to base64, and send the payload to API Gateway. This code is available in the repository.

Note that the SOURCE and TARGET file locations will be on your local file system, and the URL is the API Gateway URL generated during the creation of the project.

import requests
from base64 import b64encode
from json import dumps

TARGET = '<Selfie>.png'
SOURCE = '<ID Image>.png'
URL = "https://<your api gateway>.execute-api.<region>.amazonaws.com/<deployment slot>/ips"
ENCODING = 'utf-8'
JSON_NAME = 'output.json'

# first: reading the binary stuff
with open(SOURCE, 'rb') as source_file:
    s_byte_content = source_file.read()
with open(TARGET, 'rb') as target_file:
    t_byte_content = target_file.read()

# second: base64 encode read data
s_base64_bytes = b64encode(s_byte_content)
t_base64_bytes = b64encode(t_byte_content)

# third: decode these bytes to text
s_base64_string = s_base64_bytes.decode(ENCODING)
t_base64_string = t_base64_bytes.decode(ENCODING)

# make raw data for json
raw_data = {
    "dl": s_base64_string,
    "selfie": t_base64_string
}

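# now: encoding the data to json (requests encodes this string again via json=,
# which is why the Lambda function calls json.loads twice)
json_data = dumps(raw_data, indent=2)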

response = requests.post(url=URL, json=json_data)
response.raise_for_status()

print("Status Code", response.status_code)
print("Body ", response.json())

Clean up

We used the AWS CDK to build this project, so we can open our project locally and issue the following AWS CDK command to clean up the resources:

cdk destroy

Conclusion

There you have it, the “ID + Selfie” design pattern with a simple API that you can integrate with your application to perform digital identity verification. In the next post in our series, we expand upon this pattern further by extracting text from the identification document and searching a collection of faces to prevent duplication.

To learn more, check out the Amazon Rekognition Developer Guide on detecting and analyzing faces.


About the Authors

Mike Ames is a Principal Applied AI/ML Solutions Architect with AWS. He helps companies use machine learning and AI services to combat fraud, waste, and abuse. In his spare time, you can find him mountain biking, kickboxing, or playing guitar in a 90s metal band.

Noah Donaldson is a Solutions Architect at AWS supporting federal financial organizations. He is excited about AI/ML technology that can reduce manual processes, improve customer experiences, and help solve interesting problems. Outside of work, he enjoys spending time on the ice with his son playing hockey, hunting with his oldest daughter, and shooting hoops with his youngest daughter.