A similarity score is a statistical measure of how likely it is that two faces analyzed by Amazon Rekognition belong to the same person. For instance, a face returned with a similarity score of 95% indicates that, among all the faces Rekognition analyzed, this one had a 95% similarity with the face being searched for. The higher the similarity score, the more likely it is that the two images show the same person. That said, even a 99% similarity does not guarantee a positive match.
That is because Rekognition is a probabilistic system: its determinations cannot be made with absolute accuracy and are instead predictions.
This is where the similarity threshold comes into play. A similarity threshold is the lowest similarity score that the application using Rekognition is willing to accept as a possible match. The choice of threshold has a fundamental impact on the search results that are returned. The number of misidentifications (sometimes called ‘false positives’) in those results is a direct consequence of the threshold setting, so customers select a setting based on how many misidentifications they can afford given their needs and the use case of the application.
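As a minimal sketch of how a threshold filters results, the Python below keeps only the candidates whose score meets the threshold. The candidate list, the face IDs, and the helper `filter_matches` are hypothetical and invented for illustration. The `Similarity` key merely mirrors the field Rekognition reports for each face match, and no call to the service is made.

```python
from typing import Dict, List


def filter_matches(candidates: List[Dict], threshold: float) -> List[Dict]:
    """Return candidates scoring at or above threshold, best match first."""
    if not 0.0 <= threshold <= 100.0:
        raise ValueError("threshold must be between 0 and 100")
    kept = [c for c in candidates if c["Similarity"] >= threshold]
    return sorted(kept, key=lambda c: c["Similarity"], reverse=True)


# Hypothetical scores, invented for illustration only.
candidates = [
    {"FaceId": "face-a", "Similarity": 99.4},
    {"FaceId": "face-b", "Similarity": 95.0},
    {"FaceId": "face-c", "Similarity": 82.5},
    {"FaceId": "face-d", "Similarity": 61.0},
]

strict = filter_matches(candidates, 99.0)   # keeps face-a only
lenient = filter_matches(candidates, 80.0)  # keeps face-a, face-b, face-c
```

Raising the threshold from 80 to 99 shrinks the result set from three candidates to one. That means fewer possible false positives, at the cost of possibly discarding a true match. In the service itself the threshold is normally supplied with the request (for example, the `FaceMatchThreshold` parameter of `SearchFacesByImage` or the `SimilarityThreshold` parameter of `CompareFaces`) rather than applied afterwards as it is here.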
We recommend a 99% threshold setting for use cases where highly accurate face similarity matches are important. In public safety and law enforcement scenarios, for example, this is often a key first step that helps narrow the field and allows humans to expeditiously review and consider options using their judgment.
On the other hand, many scenarios don’t require human review of Amazon Rekognition responses. One example is second-factor authentication, which combines an employee badge with a face recognized by Amazon Rekognition at a high (99%) similarity. Another is a personal photo collection application, where a few incorrect matches can be tolerated and a lower threshold of 80% may be acceptable. Customers can tune the similarity threshold to the specifics of their use case and needs.
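One way an application might encode these per-use-case choices is a small policy table. The sketch below is an assumption about application design, not part of Rekognition: the `MatchPolicy` class, the policy names, and the `decide` helper are all hypothetical, and only the threshold values (99% and 80%) and the human-review distinction come from the scenarios described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchPolicy:
    threshold: float    # lowest similarity score accepted as a possible match
    human_review: bool  # whether a person reviews candidates before any action


# Threshold values follow the scenarios in the text; the names are hypothetical.
POLICIES = {
    "public_safety": MatchPolicy(threshold=99.0, human_review=True),
    "badge_second_factor": MatchPolicy(threshold=99.0, human_review=False),
    "photo_collection": MatchPolicy(threshold=80.0, human_review=False),
}


def decide(use_case: str, similarity: float) -> str:
    """Map a similarity score to an outcome under the use case's policy."""
    policy = POLICIES[use_case]
    if similarity < policy.threshold:
        return "no_match"
    return "needs_review" if policy.human_review else "accept"
```

Under these assumed policies, a score of 85 would be accepted by the photo application but rejected for badge authentication, and even a score above 99 in the public safety case only produces a candidate for a person to review.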