Front-End Web & Mobile
Text to Speech on Android Using AWS Amplify
AWS Amplify offers many categories that focus on making specific use cases easier to implement using a variety of AWS services under the hood. The Amplify Predictions category enables you to integrate machine learning into your application without any prior machine learning experience.
In this blog post, you will learn how to use the Predictions category to implement text to speech in an Android app.
Creating the App
Start by creating a new Android phone project and select Empty Compose Activity:
Name the project and set the Minimum SDK to API 24 or higher:
Initializing Amplify
In Android Studio, open the Terminal and create a new Amplify project by running the following command:
amplify init
Select the default value for each of the prompts or make adjustments as you see fit. The values I entered are listed in the snippet below:
? Enter a name for the project TextToSpeechBlog
The following configuration will be applied:
Project information
| Name: TextToSpeechBlog
| Environment: dev
| Default editor: Visual Studio Code
| App type: android
| Res directory: app/src/main/res
? Initialize the project with the above configuration? Yes
Using default provider awscloudformation
? Select the authentication method you want to use: AWS profile
? Please choose the profile you want to use default
You will see the following output in the Terminal if you created the initial project successfully:
✅ Initialized your environment successfully.
Next, add the Predictions category by running the following command:
amplify add predictions
The Predictions category requires the Auth category to manage which users are able to access the Predictions resources. Enter the following values when prompted:
? Please select from one of the categories below Convert
? You need to add auth (Amazon Cognito) to your project in order to add storage for user files. Do you want to add auth now? Yes
? Do you want to use the default authentication and security configuration? Default configuration
? How do you want users to be able to sign in? Username
? Do you want to configure advanced settings? No, I am done.
? What would you like to convert? Generate speech audio from text
? Provide a friendly name for your resource speechGeneratorce5ed73c
? What is the source language? US English
? Select a speaker Joanna - Female
? Who should have access? Auth and Guest users
If you have successfully configured the Auth and Predictions categories, you will see the following output:
✅ Successfully updated auth resource locally.
Successfully added resource speechGeneratorce5ed73c locally
Push the Auth and Predictions configurations up to the cloud by running the following command:
amplify push -y
The -y flag allows you to push your configuration without needing to confirm the changes that will be applied to your Amplify project.
You will see the following output when your resources have successfully been configured:
✔ All resources are updated in the cloud
Installing Dependencies
Now that the Amplify backend is configured, it’s time to add Amplify as a dependency for the Android project. Add the following code to the app build.gradle file:
// 1
implementation 'com.amplifyframework:aws-auth-cognito:2.0.0'
implementation 'com.amplifyframework:aws-predictions:2.0.0'
// 2
implementation "androidx.compose.material:material-icons-extended:$compose_ui_version"
1. Both the aws-auth-cognito and aws-predictions packages are needed to configure the respective plugins with Amplify.
2. material-icons-extended provides additional Material icons, which will be used as part of the UI.
Then click Sync Now to install the dependencies. You should see the following output in the Build section:
BUILD SUCCESSFUL in 1s
Building the UI
The Android project is ready to work with the Amplify resources. Open MainActivity.kt and replace its contents with the following:
class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContent {
            AmplifyTextToSpeechTheme {
                Surface(
                    modifier = Modifier.fillMaxSize(),
                    color = MaterialTheme.colors.background
                ) {
                    TextToSpeechScreen {}
                }
            }
        }
    }
}

@Composable
fun TextToSpeechScreen(message: (String) -> Unit) {
    val messageState = remember { mutableStateOf("") }

    Column(
        verticalArrangement = Arrangement.Center,
        horizontalAlignment = Alignment.CenterHorizontally,
        modifier = Modifier.fillMaxSize()
    ) {
        TextField(value = messageState.value, onValueChange = { messageState.value = it })
        IconButton(onClick = { message(messageState.value) }, modifier = Modifier.size(100.dp)) {
            Icon(
                Icons.Filled.VolumeUp,
                contentDescription = "Volume",
                modifier = Modifier.fillMaxSize(0.75f)
            )
        }
    }
}
The snippet above creates a simple Jetpack Compose UI that consists of a TextField used to capture user input and an IconButton that will be used to trigger the text to speech functionality.
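If you want to verify the layout without running the app, you can add a Compose preview. The sketch below is not part of the original walkthrough; it assumes the ui-tooling-preview dependency that new Compose projects include by default:
import androidx.compose.runtime.Composable
import androidx.compose.ui.tooling.preview.Preview

// Hypothetical preview (not required for the tutorial): renders
// TextToSpeechScreen in Android Studio's preview pane with a no-op callback.
@Preview(showBackground = true)
@Composable
fun TextToSpeechScreenPreview() {
    AmplifyTextToSpeechTheme {
        TextToSpeechScreen {}
    }
}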
Adding Plugins
Before the app can use any Amplify categories, they must be configured. Create the following function in MainActivity:
private fun configureAmplify() {
    try {
        Amplify.addPlugin(AWSCognitoAuthPlugin())
        Amplify.addPlugin(AWSPredictionsPlugin())
        Amplify.configure(applicationContext)
        Log.i("AmplifyProject", "Amplify Configured")
    } catch (error: Exception) {
        Log.e("AmplifyProject", "Failed Configure", error)
    }
}
The configureAmplify method attempts to add the AWSCognitoAuthPlugin and AWSPredictionsPlugin to Amplify, which enables their respective APIs. If there is an issue with the configuration, the error will be logged in Logcat.
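Note that Amplify.configure should only be called once per process; calling it again (for example, after an Activity is recreated on rotation) throws an exception that lands in the catch block. Calling it from onCreate works fine for this single-Activity app, but an alternative pattern is to configure Amplify in an Application subclass. A sketch, using a hypothetical class name (remember to register it in AndroidManifest.xml via android:name):
import android.app.Application
import android.util.Log
import com.amplifyframework.AmplifyException
import com.amplifyframework.auth.cognito.AWSCognitoAuthPlugin
import com.amplifyframework.core.Amplify
import com.amplifyframework.predictions.aws.AWSPredictionsPlugin

// Hypothetical Application subclass: onCreate here runs once per process,
// so Activity recreation won't attempt to configure Amplify twice.
class TextToSpeechApplication : Application() {
    override fun onCreate() {
        super.onCreate()
        try {
            Amplify.addPlugin(AWSCognitoAuthPlugin())
            Amplify.addPlugin(AWSPredictionsPlugin())
            Amplify.configure(applicationContext)
            Log.i("AmplifyProject", "Amplify Configured")
        } catch (error: AmplifyException) {
            Log.e("AmplifyProject", "Failed Configure", error)
        }
    }
}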
Next, call configureAmplify() in the onCreate method before any Amplify resources are used:
... // super.onCreate(savedInstanceState)
configureAmplify()
... // setContent {
Build and run, and you will see the following output:
I/AmplifyProject: Amplify Configured
Handling Audio
Now you can add the logic for playing audio from an InputStream. Add the following method to MainActivity:
private val mp = MediaPlayer()

private fun playAudio(data: InputStream) {
    // Write the incoming audio stream to a temporary MP3 file in the cache directory
    val mp3File = File(cacheDir, "audio.mp3")

    try {
        FileOutputStream(mp3File).use { out ->
            val buffer = ByteArray(8 * 1024)
            var bytesRead: Int

            // Copy the stream to the file in 8 KB chunks
            while (data.read(buffer).also { bytesRead = it } != -1) {
                out.write(buffer, 0, bytesRead)
            }

            // Reset the player, point it at the new file, and start playback
            // once the audio has been prepared asynchronously
            mp.reset()
            mp.setOnPreparedListener { obj: MediaPlayer -> obj.start() }
            mp.setDataSource(FileInputStream(mp3File).fd)
            mp.prepareAsync()
        }
    } catch (error: IOException) {
        Log.e("AmplifyProject", "Error writing audio file.", error)
    }
}
When playAudio is passed an InputStream, the audio data is written to an MP3 file in the cache directory, which is then read back and played by the MediaPlayer.
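One small addition worth considering (not in the original code): release the MediaPlayer when the Activity is destroyed so its resources are freed:
// Free the MediaPlayer's resources when the Activity goes away
override fun onDestroy() {
    mp.release()
    super.onDestroy()
}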
Next, create a function that will use the Amplify Predictions API to convert a String into an InputStream:
private fun readMessage(message: String) {
    Amplify.Predictions.convertTextToSpeech(
        message,
        { playAudio(it.audioData) },
        { Log.e("AmplifyProject", "Error", it) }
    )
}
The Amplify APIs follow a consistent pattern: you select the category for the required use case, and it offers the different methods relevant to that category. In this case, convertTextToSpeech is passed a String, which is processed by machine learning resources under the hood to generate the speech audio. The success callback then passes the audioData to the playAudio function to have the message read aloud by the device.
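If you prefer coroutines over callbacks, Amplify also ships an optional Kotlin facade (the com.amplifyframework:core-kotlin dependency) that exposes the same call as a suspend function. A sketch, assuming that dependency plus lifecycle-runtime-ktx for lifecycleScope:
import androidx.lifecycle.lifecycleScope
import com.amplifyframework.kotlin.core.Amplify
import com.amplifyframework.predictions.PredictionsException
import kotlinx.coroutines.launch

// Coroutine-based variant of readMessage using the Kotlin facade
private fun readMessageSuspending(message: String) {
    lifecycleScope.launch {
        try {
            val result = Amplify.Predictions.convertTextToSpeech(message)
            playAudio(result.audioData)
        } catch (error: PredictionsException) {
            Log.e("AmplifyProject", "Error", error)
        }
    }
}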
Lastly, update the TextToSpeechScreen block to call readMessage when the user taps the speaker button:
TextToSpeechScreen {
    readMessage(it)
}
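For reference, after this change the onCreate method from earlier in the post looks like this:
override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    configureAmplify()
    setContent {
        AmplifyTextToSpeechTheme {
            Surface(
                modifier = Modifier.fillMaxSize(),
                color = MaterialTheme.colors.background
            ) {
                TextToSpeechScreen { readMessage(it) }
            }
        }
    }
}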
Build and run. You will now be able to enter a message into the text field and press the button to hear your message read aloud. 🎉
Conclusion
Just like all Amplify categories, the Amplify Predictions category makes it easy to integrate AWS resources into your Android projects. As you use Amplify to build your next project, be sure to reach out on the GitHub repository, or through the Amplify Discord server under the #android-help channel, to help us prioritize features and enhancements.
Clean Up
Now that you’ve finished this walkthrough, you can delete the backend resources to avoid incurring unexpected costs using the command amplify delete.