AWS Machine Learning Blog

Category: Amazon Comprehend

Build well-architected IDP solutions with a custom lens – Part 6: Sustainability

An intelligent document processing (IDP) project typically combines optical character recognition (OCR) and natural language processing (NLP) to automatically read and understand documents. Customers across all industries run IDP workloads on AWS to deliver business value by automating use cases such as KYC forms, tax documents, invoices, insurance claims, delivery reports, inventory reports, and more. […]

Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights

An established financial services firm with over 140 years in business, Principal is a global investment management leader and serves more than 62 million customers around the world. Principal is conducting enterprise-scale near-real-time analytics to deliver a seamless and hyper-personalized omnichannel customer experience on their mission to make financial security accessible for all. They are […]

Flag harmful content using Amazon Comprehend toxicity detection

Online communities are driving user engagement across industries like gaming, social media, ecommerce, dating, and e-learning. Members of these online communities trust platform owners to provide a safe and inclusive environment where they can freely consume content and contribute. Content moderators are often employed to review user-generated content and check that it’s safe and compliant […]

Build trust and safety for generative AI applications with Amazon Comprehend and LangChain

We are witnessing a rapid increase in the adoption of large language models (LLM) that power generative AI applications across industries. LLMs are capable of a variety of tasks, such as generating creative content, answering inquiries via chatbots, generating code, and more. Organizations looking to use LLMs to power their applications are increasingly wary about data privacy to ensure trust and safety is maintained within their generative AI applications. This includes handling customers’ personally identifiable information (PII) data properly. It also includes preventing abusive and unsafe content from being propagated to LLMs and checking that data generated by LLMs follows the same principles. In this post, we discuss new features powered by Amazon Comprehend that enable seamless integration to ensure data privacy, content safety, and prompt safety in new and existing generative AI applications.

How Reveal’s Logikcull used Amazon Comprehend to detect and redact PII from legal documents at scale

Today, personally identifiable information (PII) is everywhere. PII is in emails, slack messages, videos, PDFs, and so on. It refers to any data or information that can be used to identify a specific individual. PII is sensitive in nature and includes various types of personal data, such as name, contact information, identification numbers, financial information, […]

Automatically redact PII for machine learning using Amazon SageMaker Data Wrangler

Customers increasingly want to use deep learning approaches such as large language models (LLMs) to automate the extraction of data and insights. For many industries, data that is useful for machine learning (ML) may contain personally identifiable information (PII). To ensure customer privacy and maintain regulatory compliance while training, fine-tuning, and using deep learning models, […]

Improve prediction quality in custom classification models with Amazon Comprehend

In this post, we explain how to build and optimize a custom classification model using Amazon Comprehend. We demonstrate this using an Amazon Comprehend custom classification to build a multi-label custom classification model, and provide guidelines on how to prepare the training dataset and tune the model to meet performance metrics such as accuracy, precision, recall, and F1 score.

Generative AI and multi-modal agents in AWS: The key to unlocking new value in financial markets

Multi-modal data is a valuable component of the financial industry, encompassing market, economic, customer, news and social media, and risk data. Financial organizations generate, collect, and use this data to gain insights into financial operations, make better decisions, and improve performance. However, there are challenges associated with multi-modal data due to the complexity and lack […]

Build a classification pipeline with Amazon Comprehend custom classification (Part I)

In first part of this multi-series blog post, you will learn how to create a scalable training pipeline and prepare training data for Comprehend Custom Classification models. We will introduce a custom classifier training pipeline that can be deployed in your AWS account with few clicks.

Visualize an Amazon Comprehend analysis with a word cloud in Amazon QuickSight

Searching for insights in a repository of free-form text documents can be like finding a needle in a haystack. A traditional approach might be to use word counting or other basic analysis to parse documents, but with the power of Amazon AI and machine learning (ML) tools, we can gather deeper understanding of the content. […]