Amazon Polly - AI Voice Generator
Deploy high-quality, natural-sounding human voices in dozens of languages
What is Amazon Polly?
Amazon Polly is a fully managed service that generates voice on demand, converting any text to an audio stream. It uses deep learning technologies to convert articles, web pages, PDF documents, and other text to speech (TTS). Polly provides dozens of lifelike voices across a broad set of languages, so you can build speech-enabled applications that engage and convert. Meet the diverse linguistic, accessibility, and learning needs of users across geographies and markets. Powerful neural networks and generative voice engines work in the background, synthesizing speech for you. Integrate the Amazon Polly API into your existing applications to become voice-ready quickly.
Capabilities
Amazon Polly offers a variety of capabilities, including those listed below.
Lifelike voices
Deliver conversational user experiences with consistently fast response times
When requesting Amazon Polly output, you can choose from dozens of lifelike voices and various languages. Each voice is created using native speakers, with voice-to-voice variations even within the same language. Most languages include one or more male and female voices, so you can choose the best fit for your use case.
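As a rough illustration of choosing a voice, the sketch below lists voice IDs for one language and optionally filters by gender. It assumes a Polly client object created elsewhere (for example with boto3's `boto3.client("polly")`) and relies on the `DescribeVoices` operation; the helper name `find_voices` is hypothetical and not part of any SDK.

```python
from typing import Any, Dict, List, Optional


def find_voices(client: Any, language_code: str, gender: Optional[str] = None) -> List[str]:
    """Return voice IDs for one language, optionally filtered by gender.

    `client` is assumed to expose describe_voices(), as a boto3 Polly client does.
    """
    voice_ids: List[str] = []
    kwargs: Dict[str, str] = {"LanguageCode": language_code}
    while True:
        page = client.describe_voices(**kwargs)
        for voice in page.get("Voices", []):
            # Each voice entry carries an "Id" and a "Gender" field.
            if gender is None or voice.get("Gender") == gender:
                voice_ids.append(voice["Id"])
        token = page.get("NextToken")
        if not token:
            return voice_ids
        # Follow pagination until the service stops returning a token.
        kwargs["NextToken"] = token
```

A call such as `find_voices(client, "en-US", gender="Female")` would then return the matching voice IDs, which can be passed as `VoiceId` when requesting speech.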
Customizable output
Customize and control speech output as needed
Amazon Polly allows you to create custom text-to-speech output that attracts and holds your audience's attention. Use custom lexicons to modify the pronunciation of acronyms, company names, internal terminology, or any other words you choose. Amazon Polly’s Speech Synthesis Markup Language (SSML) tags also allow you to adjust emphasis, intonation, phrasing, and style. Generate voice AI output that best suits your business.
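To make the SSML idea concrete, here is a minimal sketch that wraps a sentence in `<speak>`, sets a speaking rate with `<prosody>`, and stresses one phrase with `<emphasis>`. The helper name `to_ssml` and its parameters are hypothetical, and which tags a given voice honors depends on the engine, so treat this as an illustration rather than a guaranteed recipe.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(sentence: str, emphasize: str = "", rate: str = "medium") -> str:
    """Wrap a sentence in SSML, stressing the first occurrence of `emphasize`."""
    body = escape(sentence)
    if emphasize:
        before, found, after = sentence.partition(emphasize)
        if found:
            # Escape each piece separately so the SSML tags themselves stay intact.
            body = (
                escape(before)
                + '<emphasis level="strong">' + escape(found) + "</emphasis>"
                + escape(after)
            )
    return "<speak><prosody rate=" + quoteattr(rate) + ">" + body + "</prosody></speak>"
```

The resulting string would be sent as the `Text` of a synthesis request with `TextType` set to `"ssml"`. Escaping matters because characters such as `&` and `<` in the source text would otherwise produce invalid markup.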
Gen AI power
Access built-in gen AI capabilities at a fraction of the cost
Amazon Polly supports multiple voice engines that you can choose from to convert text to speech. The generative engine deploys a billion-parameter transformer to generate voices in an incremental, streamable manner. This AI voice generator creates synthetic speech that is assertive, emotionally engaged, and highly colloquial, similar to a real human voice.
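Because not every voice is available on every engine, one possible approach is to pick the most capable engine that a given voice supports. The sketch below assumes a voice record shaped like an entry returned by `DescribeVoices` (with a `SupportedEngines` list); the helper `pick_engine` and the preference order are illustrative assumptions, not a recommendation from the service.

```python
from typing import Any, Dict, Sequence

# Assumed preference order for illustration only; adjust for cost and latency needs.
ENGINE_PREFERENCE = ("generative", "long-form", "neural", "standard")


def pick_engine(voice: Dict[str, Any], preference: Sequence[str] = ENGINE_PREFERENCE) -> str:
    """Return the first preferred engine that the voice lists as supported."""
    supported = voice.get("SupportedEngines", [])
    for engine in preference:
        if engine in supported:
            return engine
    raise ValueError(f"Voice {voice.get('Id')!r} supports none of {list(preference)}")
```

The returned string can be passed as the `Engine` parameter of a synthesis request. Raising an error when nothing matches keeps a misconfigured voice from silently falling back to an unexpected engine.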
Control and security
Securely store and redistribute speech in standard formats
Store your text-to-speech output in standard audio files like MP3 and OGG for redistribution, analysis, archiving, or any other use case at no extra cost. Cache your files for faster retrieval if needed. Your content's security, trust, and privacy are AWS’s highest priorities. Amazon Polly does not retain the content of your text submissions.
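The storing and caching described above might look roughly like the following. This sketch assumes a Polly client created elsewhere (for example with boto3) and uses the `SynthesizeSpeech` operation, whose response exposes the audio as a readable `AudioStream`. The helper `synthesize_to_file`, the hash-based file naming, and the local cache directory are all hypothetical choices made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Any

# Map Polly output formats to file extensions (only the two formats named above).
EXTENSIONS = {"mp3": "mp3", "ogg_vorbis": "ogg"}


def synthesize_to_file(client: Any, text: str, voice_id: str,
                       cache_dir: str, output_format: str = "mp3") -> Path:
    """Synthesize text to an audio file, reusing a cached file when one exists."""
    key = hashlib.sha256(f"{voice_id}|{output_format}|{text}".encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.{EXTENSIONS[output_format]}"
    if path.exists():
        # Same text, voice, and format were requested before: skip the API call.
        return path
    response = client.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat=output_format
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(response["AudioStream"].read())
    return path
```

Keying the cache on the voice, format, and text means a repeated request is served from disk, while any change to one of the three produces a new file.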