This post shows how to implement a speech-to-text feature in an Android application using SpeechRecognizer. Speech to text means that whatever the user says is converted into text.
SpeechRecognizer
The SpeechRecognizer class provides access to the platform's speech recognition service. Do not instantiate it directly; obtain an instance by calling SpeechRecognizer.createSpeechRecognizer(Context).
This class’s methods must be invoked only from the main application thread.
Creating a New Project
1. Create a new project in Android Studio by going to File ⇒ New ⇒ New Project, fill in the required details, and click Finish.
2. Open the AndroidManifest.xml file and add the RECORD_AUDIO permission as shown below:
AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.c1ctech.speechtotextdemo">
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <application
        android:allowBackup="true"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/Theme.SpeechToTextDemo">
        <activity
            android:name=".MainActivity"
            android:exported="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>
</manifest>
The Layout File
3. The layout file below consists of a TextView for the title, an EditText that shows the text converted from speech, and an ImageButton for the mic icon.
activity_main.xml
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">
    <TextView
        android:id="@+id/tv_speech_to_text"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_alignParentTop="true"
        android:layout_marginTop="160dp"
        android:gravity="center"
        android:text="Speech to Text"
        android:textColor="@color/purple_700"
        android:textSize="30sp"
        android:textStyle="bold" />
    <RelativeLayout
        android:id="@+id/rl"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@+id/tv_speech_to_text"
        android:layout_marginLeft="10dp"
        android:layout_marginTop="30dp"
        android:layout_marginRight="10dp">
        <!-- imgBtnMic is declared further down, so it is referenced with @+id here. -->
        <EditText
            android:id="@+id/edtSpeechText"
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:layout_centerInParent="true"
            android:layout_marginRight="15dp"
            android:layout_toLeftOf="@+id/imgBtnMic"
            android:hint="Text output of recorded audio"
            android:padding="10dp" />
        <ImageButton
            android:id="@+id/imgBtnMic"
            android:layout_width="40dp"
            android:layout_height="40dp"
            android:layout_alignParentRight="true"
            android:backgroundTint="#F9F8FA"
            android:paddingRight="10dp"
            android:src="@drawable/ic_mic_off" />
    </RelativeLayout>
</RelativeLayout>
Requesting Permission at Runtime
4. Starting with Android 6.0 (Marshmallow, API level 23), RECORD_AUDIO is a dangerous permission, so we must request it from the user at runtime.
//request the permission if the user has not granted it yet.
if (!isPermissionGranted()) {
    requestPermission();
}
private void requestPermission() {
    //runtime permissions exist only on Android 6.0 (API 23) and above.
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
        ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.RECORD_AUDIO}, PERMISSION_RECORD_AUDIO_REQUEST);
    }
}
//returns true when RECORD_AUDIO has already been granted.
private boolean isPermissionGranted() {
    return ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO) == PackageManager.PERMISSION_GRANTED;
}
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    //grantResults is empty when the request was cancelled.
    if (requestCode == PERMISSION_RECORD_AUDIO_REQUEST && grantResults.length > 0) {
        if (grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            //the user allowed the permission.
            Toast.makeText(this, "Permission Granted", Toast.LENGTH_SHORT).show();
        } else {
            //the user denied the permission; speech recognition will not work.
            Toast.makeText(this, "Permission Denied", Toast.LENGTH_SHORT).show();
        }
    }
}
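The grant check inside onRequestPermissionsResult() can also be factored into a small pure function, which makes the denied and cancelled cases explicit and easy to test. The sketch below is only an illustration: PermissionResults and isGranted() are hypothetical names, not part of the Android SDK, and the two constants mirror the values of PackageManager.PERMISSION_GRANTED (0) and PackageManager.PERMISSION_DENIED (-1) so that the code runs outside Android.

```java
/** Hypothetical helper for illustration; not part of the Android SDK. */
final class PermissionResults {
    // Local copies of PackageManager.PERMISSION_GRANTED / PERMISSION_DENIED.
    static final int PERMISSION_GRANTED = 0;
    static final int PERMISSION_DENIED = -1;

    private PermissionResults() {}

    /**
     * Returns true only when the callback belongs to our request and the
     * first (and here only) requested permission was granted.
     */
    static boolean isGranted(int requestCode, int expectedCode, int[] grantResults) {
        if (requestCode != expectedCode) {
            return false; // the result of some other request
        }
        if (grantResults == null || grantResults.length == 0) {
            return false; // an empty array means the request was cancelled
        }
        return grantResults[0] == PERMISSION_GRANTED;
    }
}
```

In the activity, the callback could then call PermissionResults.isGranted(requestCode, PERMISSION_RECORD_AUDIO_REQUEST, grantResults) and, for example, disable the mic button when it returns false.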
Creating a SpeechRecognizer
5. To implement speech to text, we create a SpeechRecognizer instance and an Intent that describes the recognition request.
private SpeechRecognizer speechRecognizer;
//creating SpeechRecognizer instance
speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
final Intent speechRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
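//EXTRA_LANGUAGE expects an IETF BCP 47 language tag string such as "en-US" (toLanguageTag() needs API 21+).
speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault().toLanguageTag());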
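RecognizerIntent.EXTRA_LANGUAGE is documented as taking an IETF BCP 47 language tag string (for example "en-US"), so it helps to see what tag a given Locale produces. The following is a minimal plain-Java sketch; LanguageTags and tagFor() are hypothetical names used only for illustration.

```java
import java.util.Locale;

/** Hypothetical helper for illustration. */
final class LanguageTags {
    private LanguageTags() {}

    /** Returns the BCP 47 language tag for the given locale. */
    static String tagFor(Locale locale) {
        return locale.toLanguageTag();
    }
}
```

For instance, LanguageTags.tagFor(Locale.US) returns "en-US" and LanguageTags.tagFor(Locale.FRANCE) returns "fr-FR". A string in this form is what the recognizer intent's language extra is meant to carry.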
Setting RecognitionListener to SpeechRecognizer
6. The RecognitionListener passed to setRecognitionListener() receives all the callbacks from the created SpeechRecognizer.
speechRecognizer.setRecognitionListener(new RecognitionListener() {
    @Override
    public void onReadyForSpeech(Bundle bundle) {
    }
    //the user has started to speak.
    @Override
    public void onBeginningOfSpeech() {
        editText.setText("");
        editText.setHint("Listening...");
    }
    @Override
    public void onRmsChanged(float v) {
    }
    @Override
    public void onBufferReceived(byte[] bytes) {
    }
    @Override
    public void onEndOfSpeech() {
    }
    //recognition failed: reset the mic icon so the UI does not look stuck.
    @Override
    public void onError(int i) {
        micButton.setImageResource(R.drawable.ic_mic_off);
    }
    //called when recognition results are ready.
    @Override
    public void onResults(Bundle bundle) {
        micButton.setImageResource(R.drawable.ic_mic_off);
        ArrayList<String> data = bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        //the list can be null or empty when nothing was recognized.
        if (data != null && !data.isEmpty()) {
            editText.setText(data.get(0));
        }
    }
    @Override
    public void onPartialResults(Bundle bundle) {
    }
    @Override
    public void onEvent(int i, Bundle bundle) {
    }
});
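Two pieces of the listener logic are easy to isolate: picking the best candidate from the results list, and turning the integer passed to onError() into a readable message. The sketch below shows one possible way to do both in plain Java. RecognitionHelpers and its methods are hypothetical names; the constants are local copies of the SpeechRecognizer.ERROR_* values (1 to 9) so that the code runs without the Android SDK, and the message strings are only examples.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of the Android SDK. */
final class RecognitionHelpers {
    // Local copies of the android.speech.SpeechRecognizer error constants.
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_SERVER = 4;
    static final int ERROR_CLIENT = 5;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    private RecognitionHelpers() {}

    /** Returns the first candidate, or "" when the list is null or empty. */
    static String firstResultOrEmpty(List<String> results) {
        if (results == null || results.isEmpty()) {
            return "";
        }
        String best = results.get(0);
        return best == null ? "" : best;
    }

    /** Maps an onError() code to a short message (example wording). */
    static String describeError(int errorCode) {
        switch (errorCode) {
            case ERROR_NETWORK_TIMEOUT: return "Network timeout";
            case ERROR_NETWORK: return "Network error";
            case ERROR_AUDIO: return "Audio recording error";
            case ERROR_SERVER: return "Server error";
            case ERROR_CLIENT: return "Client-side error";
            case ERROR_SPEECH_TIMEOUT: return "No speech input";
            case ERROR_NO_MATCH: return "No recognition match";
            case ERROR_RECOGNIZER_BUSY: return "Recognizer busy";
            case ERROR_INSUFFICIENT_PERMISSIONS: return "Insufficient permissions";
            default: return "Unknown error (" + errorCode + ")";
        }
    }
}
```

In the app, onResults() could pass the list from the bundle to firstResultOrEmpty(), and onError() could show describeError(i) in a Toast or as the EditText hint.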
Adding Touch Listener to Button
The mic button works as a push-to-talk control driven by two touch events:
- ACTION_DOWN (a press gesture has started): starts listening for speech.
- ACTION_UP (a press gesture has finished): stops listening for speech.
micButton.setOnTouchListener(new View.OnTouchListener() {
    @Override
    public boolean onTouch(View view, MotionEvent motionEvent) {
        //ACTION_UP: the press gesture has finished.
        if (motionEvent.getAction() == MotionEvent.ACTION_UP) {
            speechRecognizer.stopListening();
        }
        //ACTION_DOWN: the press gesture has started.
        if (motionEvent.getAction() == MotionEvent.ACTION_DOWN) {
            micButton.setImageResource(R.drawable.ic_mic);
            speechRecognizer.startListening(speechRecognizerIntent);
        }
        return false;
    }
});
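The push-to-talk behaviour described above can be modelled as a tiny state machine, which also makes it easy to cover a case the listener does not handle: a gesture that ends with ACTION_CANCEL instead of ACTION_UP. This is a plain-Java sketch under stated assumptions: PushToTalk and its Recognizer interface are hypothetical stand-ins (the real SpeechRecognizer.startListening() takes an Intent), and the constants mirror the MotionEvent values ACTION_DOWN (0), ACTION_UP (1) and ACTION_CANCEL (3).

```java
/** Hypothetical push-to-talk model for illustration. */
final class PushToTalk {
    // Local copies of the android.view.MotionEvent action constants.
    static final int ACTION_DOWN = 0;
    static final int ACTION_UP = 1;
    static final int ACTION_CANCEL = 3;

    /** Stand-in for the two SpeechRecognizer calls used by the button. */
    interface Recognizer {
        void startListening();
        void stopListening();
    }

    private final Recognizer recognizer;
    private boolean listening;

    PushToTalk(Recognizer recognizer) {
        this.recognizer = recognizer;
    }

    boolean isListening() {
        return listening;
    }

    /** Feeds one touch action into the state machine. */
    void onAction(int action) {
        if (action == ACTION_DOWN && !listening) {
            // Press started: begin listening exactly once.
            listening = true;
            recognizer.startListening();
        } else if ((action == ACTION_UP || action == ACTION_CANCEL) && listening) {
            // Press finished or was cancelled: stop listening exactly once.
            listening = false;
            recognizer.stopListening();
        }
        // Any other action (for example a move) leaves the state unchanged.
    }
}
```

Tracking the listening flag means a repeated ACTION_DOWN does not start recognition twice, and a stray ACTION_UP without a preceding press does nothing.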
Complete MainActivity Code
7. Given below is the complete MainActivity code.
package com.c1ctech.speechtotextdemo;
import androidx.appcompat.app.AppCompatActivity;
import android.os.Bundle;
import androidx.annotation.NonNull;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;
import android.Manifest;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.os.Build;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.view.MotionEvent;
import android.view.View;
import android.widget.EditText;
import android.widget.ImageButton;
import android.widget.Toast;
import java.util.ArrayList;
import java.util.Locale;
public class MainActivity extends AppCompatActivity {
    public static final int PERMISSION_RECORD_AUDIO_REQUEST = 1;
    private SpeechRecognizer speechRecognizer;
    private EditText editText;
    private ImageButton micButton;
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        //request the permission if the user has not granted it yet.
        if (!isPermissionGranted()) {
            requestPermission();
        }
        editText = findViewById(R.id.edtSpeechText);
        micButton = findViewById(R.id.imgBtnMic);
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
        final Intent speechRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        //EXTRA_LANGUAGE expects an IETF BCP 47 language tag string such as "en-US" (toLanguageTag() needs API 21+).
        speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault().toLanguageTag());
        speechRecognizer.setRecognitionListener(new RecognitionListener() {
            @Override
            public void onReadyForSpeech(Bundle bundle) {
            }
            //the user has started to speak.
            @Override
            public void onBeginningOfSpeech() {
                editText.setText("");
                editText.setHint("Listening...");
            }
            @Override
            public void onRmsChanged(float v) {
            }
            @Override
            public void onBufferReceived(byte[] bytes) {
            }
            @Override
            public void onEndOfSpeech() {
            }
            //recognition failed: reset the mic icon so the UI does not look stuck.
            @Override
            public void onError(int i) {
                micButton.setImageResource(R.drawable.ic_mic_off);
            }
            //called when recognition results are ready.
            @Override
            public void onResults(Bundle bundle) {
                micButton.setImageResource(R.drawable.ic_mic_off);
                ArrayList<String> data = bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                //the list can be null or empty when nothing was recognized.
                if (data != null && !data.isEmpty()) {
                    editText.setText(data.get(0));
                }
            }
            @Override
            public void onPartialResults(Bundle bundle) {
            }
            @Override
            public void onEvent(int i, Bundle bundle) {
            }
        });
        micButton.setOnTouchListener(new View.OnTouchListener() {
            @Override
            public boolean onTouch(View view, MotionEvent motionEvent) {
                //ACTION_UP: the press gesture has finished.
                if (motionEvent.getAction() == MotionEvent.ACTION_UP) {
                    speechRecognizer.stopListening();
                }
                //ACTION_DOWN: the press gesture has started.
                if (motionEvent.getAction() == MotionEvent.ACTION_DOWN) {
                    micButton.setImageResource(R.drawable.ic_mic);
                    speechRecognizer.startListening(speechRecognizerIntent);
                }
                return false;
            }
        });
    }
    @Override
    protected void onDestroy() {
        super.onDestroy();
        //release the recognizer when the activity goes away.
        if (speechRecognizer != null) {
            speechRecognizer.destroy();
        }
    }
    private void requestPermission() {
        //runtime permissions exist only on Android 6.0 (API 23) and above.
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.RECORD_AUDIO}, PERMISSION_RECORD_AUDIO_REQUEST);
        }
    }
    //returns true when RECORD_AUDIO has already been granted.
    private boolean isPermissionGranted() {
        return ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO) == PackageManager.PERMISSION_GRANTED;
    }
    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        //grantResults is empty when the request was cancelled.
        if (requestCode == PERMISSION_RECORD_AUDIO_REQUEST && grantResults.length > 0) {
            if (grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                Toast.makeText(this, "Permission Granted", Toast.LENGTH_SHORT).show();
            }
        }
    }
}
When you run the app, it will look as shown below: