Give Your Android App a Voice with Speech Recognition

Voice technology has progressed by leaps and bounds in recent years. From voice-controlled assistants such as Google Assistant and Amazon Alexa to voice-operated apps, the demand for voice interaction is ever-growing. Android, being a front-runner in mobile technology, offers a powerful Speech Recognition API that developers can use to build voice-controlled interfaces. In this post, we’ll delve into Android’s Speech Recognition capabilities, providing examples and insights into how you can integrate this feature into your apps.

1. What is Speech Recognition?

Speech Recognition, also referred to as Automatic Speech Recognition (ASR), involves the conversion of spoken language into text. It is the technology behind voice typing on a smartphone or voice commands on smart devices.

2. Android Speech Recognition API

Android’s Speech Recognition API is part of the `android.speech` package. It allows developers to incorporate speech-to-text capabilities into their applications without building a recognition engine of their own; recognition is handled by the device’s speech service, which typically needs an internet connection unless offline language packs are installed. The platform offers the `RecognizerIntent` class, which communicates with the speech recognition service.

2.1 Setting up the Basics

  1. Permissions: First, declare the required audio permission in your `AndroidManifest.xml`.
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```
  2. Intent for Speech Recognition: Use `RecognizerIntent` to launch the speech recognition activity.
```java
// REQUEST_CODE is an arbitrary app-defined constant,
// e.g. private static final int REQUEST_CODE = 100;
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Speak now!");
startActivityForResult(intent, REQUEST_CODE);
```
  3. Retrieve Results: Handle the speech recognition results in `onActivityResult()`.
```java
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);

    if (requestCode == REQUEST_CODE && resultCode == RESULT_OK) {
        ArrayList<String> matches = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
        if (matches != null && matches.size() > 0) {
            String result = matches.get(0);
            // Use the result in your application
        }
    }
}
```

3. Advanced Features and Tips

  1. Continuous Speech Recognition: Android doesn’t natively support continuous speech recognition in background services, but third-party libraries such as CMU Sphinx can help. This is useful for applications that require ongoing listening.
  2. Multiple Language Support: `RecognizerIntent` supports multiple languages. Specify the desired language using the `EXTRA_LANGUAGE` extra.
```java
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "es-ES");  // For Spanish
```
  3. Offline Recognition: Android does support offline recognition, but the user needs to download the necessary language packs from the device’s Settings; a sketch of requesting on-device recognition follows this list.
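Offline behavior can also be hinted at programmatically. A minimal sketch, assuming API level 23 or higher (where this extra was introduced) and applied to the same intent used earlier:

```java
// Ask the recognition service to use an on-device (offline) engine;
// RecognizerIntent.EXTRA_PREFER_OFFLINE is available from API level 23 (Android 6.0) onward.
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
    intent.putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true);
}
```

If the matching language pack is not installed, recognition may simply fail, so combine this with the Settings download mentioned above.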

4. Example: Building a Voice-controlled Note-taking App

Here’s a simple demonstration of how to build a voice-controlled note-taking app using the Android Speech Recognition API.

4.1. Setup Layout

Create a layout with a Button to start listening and a TextView to display the transcribed text.

```xml
<Button
    android:id="@+id/btn_listen"
    android:text="Start Listening"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content" />

<TextView
    android:id="@+id/tv_transcription"
    android:layout_width="match_parent"
    android:layout_height="wrap_content" />
```

4.2. Handle Button Click

```java
Button listenButton = findViewById(R.id.btn_listen);
listenButton.setOnClickListener(v -> {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Start speaking...");
    startActivityForResult(intent, REQUEST_CODE);
});
```
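Not every device ships with a speech recognition activity, so it’s worth guarding this call. A minimal sketch, assuming the same `intent` and `REQUEST_CODE` as above and that the code lives in your Activity:

```java
// startActivityForResult throws ActivityNotFoundException when no app
// can handle ACTION_RECOGNIZE_SPEECH; catch it instead of crashing.
try {
    startActivityForResult(intent, REQUEST_CODE);
} catch (ActivityNotFoundException e) {
    Toast.makeText(this, "Speech recognition is not available on this device",
            Toast.LENGTH_SHORT).show();
}
```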

4.3. Handle Results

```java
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);

    if (requestCode == REQUEST_CODE && resultCode == RESULT_OK) {
        ArrayList<String> matches = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
        if (matches != null && matches.size() > 0) {
            String transcription = matches.get(0);
            TextView transcriptionView = findViewById(R.id.tv_transcription);
            transcriptionView.setText(transcription);
        }
    }
}
```

With these simple steps, you’ve created a basic voice-controlled note-taking app! From here, you can extend the functionality: saving notes, categorizing them, or even sharing them with other apps, as sketched below.
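For example, sharing a note can be as simple as handing the transcribed text to a standard share sheet. A minimal sketch, assuming the `transcription` string from the handler above:

```java
// Offer the transcribed note to any app that accepts plain text (email, messaging, notes, etc.).
Intent shareIntent = new Intent(Intent.ACTION_SEND);
shareIntent.setType("text/plain");
shareIntent.putExtra(Intent.EXTRA_TEXT, transcription);
startActivity(Intent.createChooser(shareIntent, "Share note via"));
```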

Conclusion

The Android Speech Recognition API provides developers with a powerful tool to create voice-controlled interfaces, enriching the user experience. The opportunities are endless—from simple transcription services to advanced voice-controlled applications. With a touch of creativity, the sky is the limit when it comes to integrating voice commands into Android applications. Embrace the power of voice and craft exceptional user experiences!
