This Guidance helps you deploy search functionality powered by Amazon Kendra. For many organizations, critical business information is scattered across multiple content repositories, making it challenging for employees to find and securely share the right information. Amazon Kendra helps manage access to documents through token-based user access control, and supports search filtering based on user access tokens and document access control lists (ACLs). Search results return a short description of each matching document and a link to it in its original repository. Access to the full documents remains enforced by the access policies of the original repository.
Architecture Diagram
Step 1
Amazon Kendra crawls and indexes documents from an Amazon Simple Storage Service (Amazon S3) bucket and collects the ACLs and document attributes from the metadata files.
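As a rough illustration of Step 1, the sketch below builds the kind of metadata file that the Amazon S3 data source reads alongside a document. The top-level field names follow the Amazon Kendra S3 document metadata format; the document title, group name, attribute value, and file name are made-up placeholders, not values from this Guidance.

```python
import json


def build_metadata(title, allowed_groups, attributes=None):
    """Return the metadata for one document as a plain dict."""
    return {
        "Title": title,
        "Attributes": dict(attributes or {}),
        # One ACL entry per group that may see the document in search results.
        "AccessControlList": [
            {"Name": group, "Type": "GROUP", "Access": "ALLOW"}
            for group in allowed_groups
        ],
    }


# Hypothetical document: only members of the "hr" group should see it.
metadata = build_metadata(
    title="Leave policy",
    allowed_groups=["hr"],
    attributes={"_category": "policies"},
)

# The S3 data source expects a file named "<document-name>.metadata.json",
# for example "leave-policy.pdf.metadata.json" (placeholder name).
metadata_json = json.dumps(metadata, indent=2)
```

Group names in the ACL need to match the group values carried in the user's token for filtering to take effect.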
Step 2
The Amazon Cognito user pool authenticates registered users. The Amazon Cognito identity pool authorizes the application to use Amazon Kendra and Amazon S3.
Step 3
Configure user access control on the Amazon Kendra index to use the Amazon Cognito user pool as an OpenID provider.
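One way to picture this step is as a set of parameters for the Amazon Kendra UpdateIndex API. The sketch below only builds those parameters; the Region and user pool ID are placeholders, and the Guidance's own templates may configure the index differently.

```python
def build_token_config(region, user_pool_id):
    """Build UpdateIndex keyword arguments for JWT-based user access control."""
    # Amazon Cognito publishes each user pool's signing keys at this well-known URL.
    jwks_url = (
        f"https://cognito-idp.{region}.amazonaws.com/"
        f"{user_pool_id}/.well-known/jwks.json"
    )
    return {
        "UserContextPolicy": "USER_TOKEN",
        "UserTokenConfigurations": [
            {
                "JwtTokenTypeConfiguration": {
                    "KeyLocation": "URL",
                    "URL": jwks_url,
                    "UserNameAttributeField": "cognito:username",
                    "GroupAttributeField": "cognito:groups",
                }
            }
        ],
    }


# Placeholder identifiers, for illustration only.
params = build_token_config("us-east-1", "us-east-1_EXAMPLE")

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("kendra").update_index(Id="<index-id>", **params)
```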
Step 4
AWS Amplify builds and deploys the application code that will be used for web application hosting.
Step 5
Your user authenticates and logs in to the application to perform a query.
Step 6
The application sends the user’s access token (provided by the Amazon Cognito user pool) to the Amazon Kendra index. The Amazon Kendra index validates and decodes the access token using the signing key URL of the Amazon Cognito user pool, and reads claims such as cognito:username and cognito:groups associated with the user.
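To see what the index reads from the token, it can help to inspect a token's payload locally. The sketch below decodes the claims of a JWT without verifying its signature, so it is suitable for debugging only, not for authorization decisions. The sample token is fabricated for the example and the helper names are hypothetical.

```python
import base64
import json


def decode_claims(jwt_token):
    """Return the payload of a JWT as a dict WITHOUT verifying its signature."""
    payload_b64 = jwt_token.split(".")[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Fabricated, unsigned token used only to exercise decode_claims.
sample_token = ".".join([
    _b64url({"alg": "RS256", "typ": "JWT"}),
    _b64url({"cognito:username": "alice", "cognito:groups": ["hr"]}),
    "signature-placeholder",
])

claims = decode_claims(sample_token)
username = claims["cognito:username"]
groups = claims["cognito:groups"]
```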
Step 7
The Amazon Kendra index filters the search results based on the stored ACLs and the information received in the user access token. These filtered results are returned in response to the query API call that the application makes.
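The query call in Step 7 can be sketched as follows. The request shape (IndexId, QueryText, and a UserContext carrying the token) follows the Amazon Kendra Query API; the index ID, token, and sample response are placeholders written by hand for this example, not real output.

```python
def build_query_request(index_id, query_text, access_token):
    """Build Query keyword arguments that pass the user's token to the index."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        "UserContext": {"Token": access_token},
    }


def summarize_results(response):
    """Extract the title, link, and excerpt of each result item."""
    return [
        {
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
        }
        for item in response.get("ResultItems", [])
    ]


request = build_query_request("<index-id>", "leave policy", "<access-token>")

# Hand-written stand-in for a Query response (illustrative values only).
sample_response = {
    "ResultItems": [
        {
            "DocumentTitle": {"Text": "Leave policy"},
            "DocumentURI": "https://example.com/leave-policy.pdf",
            "DocumentExcerpt": {"Text": "Employees accrue leave monthly."},
        }
    ]
}
results = summarize_results(sample_response)

# With boto3 installed and credentials configured, the real call would be:
#   import boto3
#   response = boto3.client("kendra").query(**request)
```

Because filtering happens inside the index, the application does not need to post-filter results; it only passes the token along with each query.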
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon Kendra uses Amazon CloudWatch Logs to give insight into the operation of data sources. As documents are indexed, Amazon Kendra logs processing details for each document, along with any errors from data sources that occur during indexing. You can use CloudWatch Logs to monitor, store, and access these log files. CloudWatch Logs Insights and anomaly detection can continuously analyze system and application metrics, determine normal baselines, and surface anomalies with minimal user intervention.
Security
Amazon Cognito helps manage, authenticate, and authorize web application end-users. This architecture uses Amazon Cognito’s identity pool to authorize the web application to use only Amazon Kendra and Amazon S3. The web application is not allowed to access any other services. Additionally, this architecture uses AWS CloudFormation templates to deploy resources to the AWS Cloud. These templates reduce the risk of human error associated with manual configuration or management.
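The least-privilege idea described above could look roughly like the IAM policy built below, attached to the role that the identity pool grants to authenticated users. This is a sketch under assumptions: the statement IDs, account ID, index ID, and bucket name are placeholders, and the permissions in the Guidance's CloudFormation templates may differ.

```python
import json


def build_app_policy(region, account_id, index_id, bucket_name):
    """Build an IAM policy that allows only Kendra queries and S3 object reads."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QueryKendraIndex",
                "Effect": "Allow",
                "Action": ["kendra:Query"],
                "Resource": f"arn:aws:kendra:{region}:{account_id}:index/{index_id}",
            },
            {
                "Sid": "ReadSourceDocuments",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Placeholder identifiers, for illustration only.
policy = build_app_policy("us-east-1", "111122223333", "example-index-id", "example-docs-bucket")
policy_json = json.dumps(policy, indent=2)
```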
Reliability
By default, an Amazon Kendra Enterprise Edition index is highly available within a Region. Enterprise Edition starts with a base capacity of 100,000 searchable documents and up to 8,000 queries per day.
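For planning purposes, the base capacity figures above can be turned into a quick check. The sketch below uses only the two numbers stated in this section; the function name and the example workload are hypothetical.

```python
BASE_DOCUMENTS = 100_000      # searchable documents in the base capacity
BASE_QUERIES_PER_DAY = 8_000  # queries per day in the base capacity
SECONDS_PER_DAY = 24 * 60 * 60

# Average sustained query rate implied by the daily limit (about 0.09 per second).
base_queries_per_second = BASE_QUERIES_PER_DAY / SECONDS_PER_DAY


def fits_base_capacity(documents, queries_per_day):
    """Return True if an expected workload fits within the base capacity."""
    return documents <= BASE_DOCUMENTS and queries_per_day <= BASE_QUERIES_PER_DAY


# Hypothetical workload: 50,000 documents and 5,000 queries per day.
within_base = fits_base_capacity(50_000, 5_000)
```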
Performance Efficiency
Amazon CloudFront reduces latency when delivering web applications. Before deploying this architecture, confirm that all the services it requires (for example, Amazon Kendra, Amazon Cognito, and CloudFront) are available in your chosen Region.
Cost Optimization
This architecture uses Amplify, which applies serverless technologies to host front-end and back-end services for web applications. Amplify scales up with high volumes of users and then scales back down as user volumes decrease, helping to manage costs.
Sustainability
Web applications hosted by Amplify scale with user demand, helping ensure the most efficient use of energy resources.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of building the Guidance, including deployment, usage, and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
Building a secure search application with access controls using Amazon Kendra
Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). Amazon Kendra supports search filtering based on user access tokens that are provided by your search application, as well as document access control lists (ACLs) collected by the Amazon Kendra connectors.
This post demonstrates token-based user access control in Amazon Kendra with OpenID.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.