Amazon Transcribe Features

Why Amazon Transcribe?

Amazon Transcribe is an automatic speech recognition service that makes it easy to add speech-to-text capabilities to any application. Transcribe’s features enable you to ingest audio input, produce transcripts that are easy to read and review, improve accuracy with customization, and filter content to protect customer privacy.

Audio inputs

Transcribe is designed to process live and recorded audio or video input to provide high quality transcriptions for search and analysis. We also offer separate APIs that uniquely understand customer calls (Amazon Transcribe Call Analytics) and medical conversations (Amazon Transcribe Medical).

Easy to read transcripts

Amazon Transcribe enables you to produce accurate transcripts that are easy to read, review, and integrate into your specific applications. We work to make the output ready for downstream activities such as call transcript analysis, subtitling, and content search.

Customize your output

Accuracy is critical, and we provide many options to customize transcripts to your specific business needs and vernacular. Transcribe also provides up to 10 alternative transcriptions for each sentence, so you can quickly choose the option that best fits your content and domain. This is useful for human-in-the-loop subtitling workflows.

User safety & privacy features

Ensuring customer privacy and safety is critical. When needed, Transcribe enables you to mask or remove words that are sensitive or unsuitable for your audience from transcription results.

Improve contact center productivity with generative call summarization

Automatically create generative AI-powered call summaries to help agents focus on providing excellent customer experiences and increase productivity by eliminating after-call work. Managers can quickly review these summaries, without reading the entire transcript, to understand the context of an interaction and investigate any customer issues.

Audio Inputs

Streaming & batch transcription

You can process your existing audio recordings or stream the audio for real-time transcription. Using a secure connection, you can send a live audio stream to the service, and receive a stream of text in response.
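As a minimal sketch of the batch path, the request for a recorded file could be assembled as below. It assumes the AWS SDK for Python (boto3) and its `start_transcription_job` operation. The job name and S3 location are hypothetical placeholders, and the helper names are illustrative only. The builder is kept separate from the call so the request can be inspected without AWS credentials.

```python
def build_batch_request(job_name, media_uri, language_code="en-US"):
    """Build keyword arguments for boto3's start_transcription_job."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
    }


def start_batch_job(request):
    """Submit the job; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


# Hypothetical job name and S3 location, for illustration only.
request = build_batch_request("example-job", "s3://example-bucket/call.wav")
print(request["Media"]["MediaFileUri"])
```

Real-time transcription uses a separate streaming interface and is not covered by this sketch.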

Domain specific models

Select a model that is tuned to telephone calls or multimedia video content. For example, Transcribe adapts to low-fidelity phone audio common in contact centers.

Automatic language identification

With Amazon Transcribe, you can automatically identify the languages spoken in an audio file or streaming media without having to specify a language code. Amazon Transcribe identifies the dominant language spoken or, if the audio contains multiple languages, identifies all languages spoken and transcribes the speech accordingly. This is useful when your customers might switch between languages or your media library contains audio files in different languages. You can also use this feature for media content classification and to verify that the main spoken language in your videos and podcasts is correctly labeled.
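For a batch job, language identification might be requested as in the sketch below. It assumes the boto3 `start_transcription_job` parameters `IdentifyLanguage`, `IdentifyMultipleLanguages`, and `LanguageOptions`. The helper name, job names, and S3 locations are hypothetical.

```python
def build_language_id_request(job_name, media_uri, multiple=False, candidates=None):
    """Build a job request that asks the service to detect the language.

    No LanguageCode is set; one of the two Identify* flags replaces it.
    """
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
    }
    if multiple:
        # Audio where speakers switch between languages.
        request["IdentifyMultipleLanguages"] = True
    else:
        # Audio with one dominant language.
        request["IdentifyLanguage"] = True
    if candidates:
        # Optional hint: restrict detection to the languages you expect.
        request["LanguageOptions"] = list(candidates)
    return request


# Hypothetical values, for illustration only.
print(build_language_id_request(
    "example-multilingual-job",
    "s3://example-bucket/podcast.mp3",
    multiple=True,
    candidates=["en-US", "es-US"],
))
```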

Easy to read transcripts

Punctuation & number normalization

Amazon Transcribe automatically adds punctuation and number formatting, so that the output closely matches the quality of manual transcription at a fraction of the time and expense. Numbers are also transcribed into digits or “normal form” instead of words.

Timestamp generation

Amazon Transcribe returns a timestamp for each word, so that you can easily find a word or phrase in the original recording or add subtitles to video.
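To illustrate how word-level timestamps can be used, the sketch below searches a transcript for a phrase and returns where it occurs in the recording. It assumes the batch transcript JSON layout, in which `results.items` holds one entry per word or punctuation mark with string-valued `start_time` and `end_time`. The sample data is made up for the example.

```python
def find_phrase(transcript, phrase):
    """Return a (start, end) pair in seconds for each occurrence of phrase."""
    words = [
        (
            item["alternatives"][0]["content"].lower(),
            float(item["start_time"]),
            float(item["end_time"]),
        )
        for item in transcript["results"]["items"]
        if item["type"] == "pronunciation"  # punctuation items carry no timestamps
    ]
    target = phrase.lower().split()
    if not target:
        return []
    hits = []
    for i in range(len(words) - len(target) + 1):
        window = words[i:i + len(target)]
        if [w[0] for w in window] == target:
            hits.append((window[0][1], window[-1][2]))
    return hits


def _word(text, start, end):
    """Build one made-up word item in the assumed transcript layout."""
    return {
        "type": "pronunciation",
        "start_time": start,
        "end_time": end,
        "alternatives": [{"confidence": "0.99", "content": text}],
    }


# Made-up sample transcript: "Hello world. Hello again"
sample = {"results": {"items": [
    _word("Hello", "0.0", "0.4"),
    _word("world", "0.4", "0.9"),
    {"type": "punctuation", "alternatives": [{"confidence": "0.0", "content": "."}]},
    _word("Hello", "1.1", "1.5"),
    _word("again", "1.5", "2.0"),
]}}

print(find_phrase(sample, "hello"))        # [(0.0, 0.4), (1.1, 1.5)]
print(find_phrase(sample, "hello world"))  # [(0.0, 0.9)]
```

The same start and end times can drive subtitle cues or a seek position in a media player.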

Recognize multiple speakers

Speaker changes are automatically recognized and attributed in the text to accurately capture scenarios like telephone calls, meetings, and television shows.
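As a rough sketch, speaker-attributed output could be turned into readable turns as shown below. It assumes the job was started with speaker labeling enabled (in boto3, `Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": n}`) and that the transcript JSON contains `results.speaker_labels.segments`, where each segment lists the start times of its words. The sample data is made up.

```python
def speaker_turns(transcript):
    """Return a list of (speaker_label, text) pairs, one per speaker segment."""
    results = transcript["results"]
    # Index spoken words by start time so segments can look them up.
    words = {
        item["start_time"]: item["alternatives"][0]["content"]
        for item in results["items"]
        if item["type"] == "pronunciation"
    }
    turns = []
    for segment in results["speaker_labels"]["segments"]:
        text = " ".join(
            words[ref["start_time"]]
            for ref in segment["items"]
            if ref["start_time"] in words
        )
        turns.append((segment["speaker_label"], text))
    return turns


def _word(text, start, end):
    """Build one made-up word item in the assumed transcript layout."""
    return {
        "type": "pronunciation",
        "start_time": start,
        "end_time": end,
        "alternatives": [{"confidence": "0.99", "content": text}],
    }


# Made-up two-speaker sample.
sample = {"results": {
    "items": [
        _word("Hi", "0.0", "0.3"),
        _word("there", "0.3", "0.7"),
        _word("Hello", "1.0", "1.4"),
    ],
    "speaker_labels": {"segments": [
        {"speaker_label": "spk_0", "start_time": "0.0", "end_time": "0.7",
         "items": [{"start_time": "0.0"}, {"start_time": "0.3"}]},
        {"speaker_label": "spk_1", "start_time": "1.0", "end_time": "1.4",
         "items": [{"start_time": "1.0"}]},
    ]},
}}

for speaker, text in speaker_turns(sample):
    print(f"{speaker}: {text}")
```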

Channel identification

Contact centers can submit a single audio file to Amazon Transcribe, and the service will automatically produce a single transcript annotated with channel labels.
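A minimal sketch of reading channel-labeled output follows. It assumes the job was started with `Settings={"ChannelIdentification": True}` in boto3 and that the transcript JSON exposes `results.channel_labels.channels`, each with a `channel_label` and its own `items`. The sample data, and the idea of which channel carries the agent or the customer, are made up.

```python
def channel_texts(transcript):
    """Return a dict mapping each channel label to the words spoken on it."""
    texts = {}
    for channel in transcript["results"]["channel_labels"]["channels"]:
        spoken = [
            item["alternatives"][0]["content"]
            for item in channel["items"]
            if item["type"] == "pronunciation"  # skip punctuation marks
        ]
        texts[channel["channel_label"]] = " ".join(spoken)
    return texts


def _word(text):
    """Build one made-up word item in the assumed transcript layout."""
    return {"type": "pronunciation",
            "alternatives": [{"confidence": "0.99", "content": text}]}


# Made-up two-channel sample, for example one channel per call participant.
sample = {"results": {"channel_labels": {"channels": [
    {"channel_label": "ch_0", "items": [_word("How"), _word("can"), _word("I"), _word("help")]},
    {"channel_label": "ch_1", "items": [_word("My"), _word("order"), _word("is"), _word("late")]},
]}}}

print(channel_texts(sample))
```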

Customization

Custom vocabulary

With custom vocabulary, you can add new words to the base vocabulary to generate more accurate transcriptions for domain-specific words and phrases like product names, technical terminology, or names of individuals.
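The sketch below shows one way a custom vocabulary might be defined and then attached to a transcription job. It assumes the boto3 `create_vocabulary` parameters `VocabularyName`, `LanguageCode`, and `Phrases`, and the `Settings.VocabularyName` field of `start_transcription_job`. The vocabulary name, phrases, and helper names are hypothetical.

```python
def build_vocabulary_request(name, phrases, language_code="en-US"):
    """Build keyword arguments for boto3's create_vocabulary."""
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "Phrases": list(phrases),
    }


def with_vocabulary(job_request, vocabulary_name):
    """Return a copy of a job request that references a custom vocabulary."""
    updated = dict(job_request)
    settings = dict(updated.get("Settings", {}))
    settings["VocabularyName"] = vocabulary_name
    updated["Settings"] = settings
    return updated


# Hypothetical product names, for illustration only.
vocabulary = build_vocabulary_request("example-products", ["AnyCompany", "ExampleWidget"])
job = with_vocabulary(
    {
        "TranscriptionJobName": "example-job",
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": "s3://example-bucket/call.wav"},
    },
    vocabulary["VocabularyName"],
)
print(job["Settings"])
```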

Custom language models

When needed, you can build and train your own custom language model (CLM) for your use case and domain by submitting a corpus of text data to Amazon Transcribe. A CLM is well suited to improving speech recognition accuracy with your own data.

Privacy and security

Vocabulary filtering

You can specify a list of words to remove from transcripts with vocabulary filtering. For example, you can specify a list of profane or offensive words and Amazon Transcribe removes them from transcripts automatically.
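As a sketch, a previously created vocabulary filter could be applied to a job through its settings. This assumes the `VocabularyFilterName` and `VocabularyFilterMethod` fields of the boto3 `Settings` parameter, with `remove`, `mask`, and `tag` as the method values. The filter name and helper name are hypothetical.

```python
VALID_FILTER_METHODS = {"remove", "mask", "tag"}


def build_filter_settings(filter_name, method="remove"):
    """Build the Settings fields that apply a vocabulary filter to a job."""
    if method not in VALID_FILTER_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_FILTER_METHODS)}")
    return {
        "VocabularyFilterName": filter_name,
        "VocabularyFilterMethod": method,
    }


# Hypothetical filter name; pass the result as Settings=... on the job request.
print(build_filter_settings("example-profanity-filter", "mask"))
```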

Automatic content redaction / PII redaction

When instructed, Amazon Transcribe can help customers identify and redact sensitive personally identifiable information (PII) from transcripts in supported languages. This allows contact centers to easily review and share the transcripts for customer experience insight and agent training.
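A redaction request might look like the sketch below. It assumes the boto3 `ContentRedaction` parameter of `start_transcription_job`, with its `RedactionType`, `RedactionOutput`, and `PiiEntityTypes` fields. The helper name and the chosen entity types are illustrative.

```python
def build_redaction_config(entity_types=("ALL",), keep_unredacted=False):
    """Build the ContentRedaction argument for a transcription job."""
    return {
        "RedactionType": "PII",
        # Optionally keep an unredacted copy alongside the redacted transcript.
        "RedactionOutput": "redacted_and_unredacted" if keep_unredacted else "redacted",
        "PiiEntityTypes": list(entity_types),
    }


# Redact only names and phone numbers; pass the result as ContentRedaction=...
print(build_redaction_config(["NAME", "PHONE"]))
```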

Data Protection

Secure data at rest using Amazon S3-managed keys (SSE-S3) or specify your own AWS Key Management Service (AWS KMS) key. To encrypt data in transit, Amazon Transcribe uses TLS (Transport Layer Security) 1.2 with AWS certificates. TLS is a cryptographic protocol that enables authenticated connections and secure data transport over the internet via HTTP. This includes streaming transcriptions.
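To use your own KMS key for transcript output, a job request could be extended as in this sketch. It assumes the boto3 `start_transcription_job` parameters `OutputBucketName` and `OutputEncryptionKMSKeyId`. The bucket name, key alias, and helper name are hypothetical.

```python
def with_encrypted_output(job_request, bucket, kms_key_id):
    """Return a copy of a job request that writes KMS-encrypted output to a bucket."""
    updated = dict(job_request)
    updated["OutputBucketName"] = bucket
    updated["OutputEncryptionKMSKeyId"] = kms_key_id
    return updated


# Hypothetical bucket and key alias, for illustration only.
job = with_encrypted_output(
    {
        "TranscriptionJobName": "example-job",
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": "s3://example-bucket/call.wav"},
    },
    bucket="example-output-bucket",
    kms_key_id="alias/example-key",
)
print(job["OutputEncryptionKMSKeyId"])
```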

Toxic audio content detection

Amazon Transcribe Toxicity Detection uses machine learning to help keep audio conversations civil and constructive and to encourage a safe and inclusive online environment. Toxic audio content is flagged into one of several categories so that human moderators can easily identify it and take appropriate action.

Transcribe Call Analytics

Improve contact center productivity with call summarization

Generate call summaries to help agents focus on providing excellent customer experiences and increase productivity post-call by automatically capturing key parts of the customer conversation (e.g. issue, outcomes, or action items). Managers can quickly review these summaries without reviewing the entire transcript to understand the context of an interaction and investigate any customer issues.

Extract detailed call analytics & conversation insights

Using the power of machine learning, you can quickly apply speech-to-text and natural language processing capabilities to uncover valuable conversation insights. You can then integrate insights such as customer and agent sentiment, detected issues, and speech characteristics like non-talk time, interruptions, and talk speed into your inbound and outbound call analytics applications. This can help your supervisors more readily identify potential customer issues, agent coaching opportunities, and call trends.

Improve compliance & monitoring with automated call categorization

Monitor your calls at scale to track compliance with company policies or regulatory requirements. Build and train your own custom categories based on your specified criteria (e.g. words/phrases or conversation characteristics). For example, you can set up category labels to see what percentage of calls are upsells or account cancellations.

Produce rich call transcripts

Give your agents access to the conversation details from past interactions. The turn-by-turn transcripts provide insights such as customer sentiment, detected issues, and interruptions.

Transcribe Medical

Dictation mode

Accurately transcribe single-speaker audio commonly found in medical dictation use cases.

Conversational mode

Accurately transcribe multi-speaker conversational audio between clinicians and patients.

Medical specialties

Transcribe speech to text across a diverse range of medical specialties.

Batch API

Transcribe recorded medical audio files at scale with high concurrency.

Custom vocabulary

Boost transcription accuracy by using custom vocabulary for potentially out-of-lexicon terminology.

Speaker diarization

Separate speech from different speakers within any mono-channel audio.

Getting started with Transcribe

Pricing

Learn more about product pricing

Console

Try it out in the console
