File Exchange



Automatic speech-to-text conversion


Updated 04 Oct 2019


Automate labeling and tagging of speech recordings, assess the performance of DSP pipelines for voice and speech enhancement, run text analytics on voice recordings, and more.

This entry enables you to convert sampled speech recordings available as MATLAB vectors into strings using a single function call. Starting from MATLAB release R2019b, it also enables you to perform speech transcription interactively using the Audio Labeler app.

You will need a license for Audio Toolbox, an internet connection, and an active subscription to a speech-to-text service of your choice: Google™ Cloud Speech-to-Text API, IBM™ Watson Speech to Text API, or Microsoft™ Azure Speech Services API.

Please check out the Examples tab for detailed instructions on how to get started.

Cite As

MathWorks Audio Toolbox Team (2019). speech2text, MATLAB Central File Exchange.

Comments and Ratings (45)

Hi Dan, boost works fine for me with the v1p1beta1 version. Please try a different audio file and/or compare your results with those from Google's web app.
Also, please remember that all 3 cloud service providers limit the length of the audio file for synchronous requests. Google's limit is about 1 minute.
If you still cannot resolve this, MathWorks technical support might be able to help you.
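
One way to stay under a per-request length limit is to cut a long recording into shorter pieces before transcription. The following is a minimal sketch, not part of speech2text: the 55-second margin and the synthetic signal are assumptions, and the speech2text call is left as a comment because it needs the add-on and valid credentials.

```matlab
% Synthetic 150-second mono signal standing in for a long recording.
fs = 8000;                          % sample rate in Hz
y  = 0.1 * randn(150 * fs, 1);      % column vector of samples

maxSeconds = 55;                    % assumed safety margin under the ~1 minute limit
chunkLen   = maxSeconds * fs;       % number of samples per chunk
numChunks  = ceil(numel(y) / chunkLen);

chunks = cell(numChunks, 1);
for k = 1:numChunks
    firstIdx  = (k - 1) * chunkLen + 1;
    lastIdx   = min(k * chunkLen, numel(y));
    chunks{k} = y(firstIdx:lastIdx);
    % Each chunk could then be transcribed on its own, for example:
    % tableOut = speech2text(speechObject, chunks{k}, fs);
end
```

Fixed-length cuts can split a word in two, so cutting at pauses is likely to give better transcriptions. Each chunk is also transcribed in isolation by the service.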

Dan Waisel

Hi Raja, thanks, I'm now working with v1p1beta1 version.
Still 'boost' does not give me the wanted results, I'm not sure why yet.

Hi Dan, good question. You can set the sub-fields using MATLAB structure array, and it would look something like this:

speechObject = speechClient('Google', 'speechContexts', struct('phrases', 'Weather is hot', 'boost', 2))
speechObject = speechClient('Google', 'speechContexts', struct('phrases', ["Weather is hot","celsius","rain"], 'boost', 5))
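
To see how these structures relate to the JSON layout in Google's documentation, they can be passed through jsonencode. This is only an illustration of MATLAB's own struct-to-JSON conversion; whether speech2text encodes the option in exactly this way internally is an assumption.

```matlab
% Same structures as in the speechClient calls above.
singlePhrase = struct('phrases', 'Weather is hot', 'boost', 2);
multiPhrase  = struct('phrases', ["Weather is hot", "celsius", "rain"], 'boost', 5);

% jsonencode returns the JSON text that MATLAB produces for each struct.
singleJson = jsonencode(singlePhrase);
multiJson  = jsonencode(multiPhrase);

disp(singleJson);
disp(multiJson);
```

Note that a single character vector is encoded as a plain JSON string, while a string array with several elements is encoded as a JSON array.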

By the way, I just made an update providing a way to customize the recognize URL in Google client. You can now modify the default recognize url using your JSON file.

Follow these steps to try the v1p1beta1 version of Google Cloud Speech-to-Text API:
1. Download the latest version of speech2text.
2. In your Google_Credentials_Speech2text.json file, set the "recognizeUrl" attribute to "" (Google might change this URL when they release a new version).
3. Try using the function with speechContexts and boost as per above syntax.

speech2text does not guarantee all functionalities in the v1p1beta1 version or any future beta version of Google Cloud Speech-to-Text API. This update just provides a way to customize the default recognition URL.

Hope this helps. Thanks for using this function.

Dan Waisel

Hi Raja, thank you for this information!
When boost is supported, what would be the proper way to set its value? Unlike the other properties, it's a sub-field of another property, 'speechContexts'. Would providing 'speechContexts' with a JSON string, as in speechClient('Google','speechContexts',[{"phrases": ["Weather is hot"],"boost": 2}]), work in MATLAB (only after you add support for this, of course)?
Looking forward to your reply and getting better understanding of this powerful function and API.

Hi Dan, interesting question! Thanks for posting your question here.
"boost" is a new property in v1p1beta1 version of Google Cloud Speech-to-Text API and is not supported by speech2text yet. speech2text currently uses the v1 version of the API. That's probably why the service ignores the "boost" option. We are likely to add support for the new API only after the beta version is released.

See the difference between the two APIs here:

Apologies if this doesn't solve your current use case.

Dan Waisel


I'm using Google's speech client and trying to set options in the request:
This line isn't working as I expected. The 'languageCode','he-IL','enableWordTimeOffsets',true options work fine, but it seems to ignore the 'boost' option, which is a sub-attribute of 'speechContexts', as explained in Google's API:
"config": {
    "sampleRateHertz": 8000,
    "speechContexts": [{
        "phrases": ["Weather is hot"],
        "boost": 2
    }]
}

I would appreciate it if someone could explain how these options should be set.
Dan Waisel
Tel-Aviv University

Hi John, good question. I haven't reviewed this in detail yet, so apologies if this isn't 100% useful. I seem to remember that, unless you specify a different location, the file is saved by default in C:\Users\<username>\AppData\Roaming\gcloud. If the client is Google, the file name should be Google_Credentials_Speech2text.json. You could also try the following in MATLAB:
>> which Google_Credentials_Speech2text.json


How do you update the JSON file? I used the wrong file on my initial run when the program prompted me, and now it will not allow me to change the location of the JSON file.

Thanks a lot Gabriele!
I guess I will go for a C# interface to Google streaming and will most likely import some of the MATLAB skills into that application.

Hi Danni, unfortunately we are unable to make the source code of speech2text available at this point. In any case, the modifications needed to support the streaming speech-to-text interfaces wouldn't be trivial. If you want to develop your own MATLAB wrapper for a particular web-based service, first closely review the published web API of the service you are interested in. To script and automate the requests using MATLAB, the key building blocks would be the MATLAB HTTP and JSON interfaces. Good luck!
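
As a rough sketch of those building blocks, the code below assembles (but does not send) a JSON request with MATLAB's HTTP classes. It is not how speech2text is implemented. The payload layout is an assumption modeled on Google's documented v1 recognize request, the audio is a synthetic tone, and serviceUrl is a hypothetical placeholder.

```matlab
% Synthetic one-second tone standing in for recorded speech.
fs = 16000;
y  = 0.5 * sin(2 * pi * 440 * (0:fs-1).' / fs);

% Convert to 16-bit PCM bytes (byte order follows the machine, little-endian
% on typical desktop hardware) and base64-encode them for a JSON body.
pcm16      = int16(y * 32767);
audioBytes = typecast(pcm16, 'uint8');
audioText  = matlab.net.base64encode(audioBytes);

% Assumed request layout; check the service's published API before use.
payload = struct( ...
    'config', struct('encoding', 'LINEAR16', ...
                     'sampleRateHertz', fs, ...
                     'languageCode', 'en-US'), ...
    'audio',  struct('content', audioText));

% With a JSON content type, MATLAB encodes the struct as JSON when sending.
header  = matlab.net.http.HeaderField('Content-Type', 'application/json');
body    = matlab.net.http.MessageBody(payload);
request = matlab.net.http.RequestMessage('POST', header, body);

% Sending needs a real endpoint and credentials, so it is left commented out:
% response = send(request, matlab.net.URI(serviceUrl));
```

A streaming interface would need a different transport than a single request like this one, which is part of why the change is not trivial.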

Thanks a lot Gabriele, that's clear. Is there any chance of getting an idea of what is under the hood of speech2text, so that I can modify it to work in a streaming mode? Thanks a lot, Dani

Hi Danni, speech2text itself was not designed to support streaming use and it doesn't leverage any purpose-built streaming interface from the cloud services provided. In some cases (e.g. breaking up sentences on the fly using some kind of VAD) passing the segments to speech2text may still yield acceptable results. However, besides the added latency, the cloud services will transcribe each segment in isolation.

Hi Gabriele, is it possible to use the speech2text environment to run the Google API in a streaming mode, meaning transmitting samples from the microphone to Google and receiving the results back in real time?

Hi Piyush, please review the "Examples" tab of this page. "Perform Speech-to-Text using 3rd party Speech APIs" should already include all the detailed steps and code samples. If you feel anything is missing, we would be grateful if you could tell us in detail what that is. Thanks in advance!

Can anyone explain, step by step, how to create the required objects and JSON files to use the Google voice API,
so that I can use speech2text?

Feifan Jia

Hi Adnan,
Please take a look at main example provided. Beyond the instructions on how to get things installed, under "Perform Speech-to-Text Transcription" you will find code that shows how to load a pre-recorded speech segment from a file and how to use speech2text to get a transcription.

How do I use the speech2text algorithm in MATLAB for the first time? Could someone please answer urgently?
I have an audio file and I want to convert it into text. Any ideas, please?

Never mind, I have realised my mistake and corrected it. I needed to add the path to the downloaded speech2text folder, not the compiled object.
Many thanks. :)

Hello Gabriele, thank you for your quick reply.
I have tried adding the file manually following Raja's post; however, either I have not understood or it has not worked, as the problem remains. I tried both:
" addpath('C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\') " and
" addpath(genpath('C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\')) ". Have I misunderstood something?

Hello Oliver, thank you for reaching out.
We have identified an issue with the add-on installation, which prevents the submission folders from being added to the MATLAB search path.
We aim to fix the issue in the upcoming update. In the meantime, please manually add all submission folders to the MATLAB path (add the top-level folder and include all subfolders). You may refer to Raja's post below for more information on this topic.
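
Adding a top-level folder together with all of its subfolders can be done with addpath and genpath. The sketch below uses a demonstration folder created under tempdir so that it runs anywhere; the folder name is hypothetical and should be replaced with the location of your downloaded speech2text folder.

```matlab
% Hypothetical stand-in for the downloaded speech2text folder. Replace this
% with the folder where the submission was actually unzipped or installed.
submissionFolder = fullfile(tempdir, 'speech2text_demo');

% Create a small demo folder tree so the example runs on any machine.
demoSubfolder = fullfile(submissionFolder, 'html');
if exist(demoSubfolder, 'dir') ~= 7
    mkdir(demoSubfolder);
end

% genpath lists the folder and all of its subfolders; addpath adds them all.
addpath(genpath(submissionFolder));

% Confirm that the folder is now on the search path.
onPath = contains(path, submissionFolder);
disp(onPath);
```

The change lasts for the current MATLAB session only. Calling savepath afterwards keeps it for future sessions.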

I am trying to use the speech2text but keep getting the error below.
"Error using speechClient
Unable to access speech2text. Make sure the file is
installed. Go to File Exchange to download. For more
information, click here."
I have confirmed the speech2text add on is installed and "which speech2text" returns the sensible answer "C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\speech2text.p". Does anyone have any ideas why this isn't working?

Hello, I keep getting an error. I'm using the Google Speech Recognition API, but every time I try to run it I get this error message:
Error using coder.internal.error (line 14)
Unable to access speech2text. Make sure the file is
installed. Go to File Exchange to download. For more
information, click here.

Error in speechClient

Error in speechtest (line 1)
speechObject =

Hello Grayson,

The downloaded speech2text files may not be on your MATLAB search path.

Please addpath your downloaded speech2text folder, or cd to it, before running the speech2text commands.

Hope this helps.

Hi there,
I am trying to use Google's Speech recognition API, but every time I try to make a speechClient I get this error:
Error using coder.internal.error (line 14)
Unable to access speech2text. Make sure the file is installed. Go to File Exchange to download. For more information, click here.

Error in speechClient

I've checked multiple times and speech2text is definitely installed. I also do have the audio system toolbox installed. Any idea what I'm doing wrong?
Thanks in advance.

Hello Oliver,

It looks like you are using the ‘languageCode’ to pass the model name for IBM, but you would need to pass it using ‘model’, something like:

transcriber = speechClient('IBM','model','es-ES_NarrowbandModel');

This is the expected name-value pair, as described in the IBM documentation.

Hope this helps!
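
A small sketch of choosing the model name from the sample rate is shown below. The 16 kHz threshold is an assumed rule of thumb, and 'en-US_BroadbandModel' is not named in this thread; it is taken from IBM's published model list. The speechClient call is commented out because it needs the add-on and IBM credentials.

```matlab
fs = 8000;   % sample rate of the recording, e.g. as returned by audioread

% Assumed rule of thumb: recordings sampled below 16 kHz use a narrowband model.
if fs < 16000
    modelName = 'en-US_NarrowbandModel';
else
    modelName = 'en-US_BroadbandModel';   % assumption, see the note above
end
disp(modelName);

% With speech2text installed, the name would then be passed as:
% transcriber = speechClient('IBM', 'model', modelName);
```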

Oliver Sell


I am using the IBM Watson Speech API. When I run this function I get the following error:

'Bad Request' 400 'This 8000hz audio input requires a narrow band model. See https://<STT_API_ENDPOINT>/v1/models for a list of available models.'

I tried 'en-US_NarrowbandModel' as 'languageCode' but it still does not work. How can I pass the variable 'model' 'en-US_NarrowbandModel' or change the model?

Thank you in advance.

Priyal Goel

Hello, I get the following error when I run:
[y,fs] = audioread('youre-on-the-right-track.wav');
speechObject = speechClient('Microsoft','recognition','interactive','language','en-US');
tableOut = speech2text(speechObject,y,fs)

Output argument "tableOut" (and maybe others) not assigned during call to "speechClient/speechTotext".

Error in speech2text

Error in sampleTesting (line 4)
tableOut = speech2text(speechObject,y,fs)

I'm using the Microsoft Azure Bing API
Can someone please help me with this?

@Sumit Mondal - Thank you for reporting this. It looks like this error was triggered by the absence of a license for Audio System Toolbox, which is required by speech2text. The lack of clarity of the actual error message will be fixed in an upcoming update.

The errors I get when trying to run the speech2text() function are the following :
Unable to find message key 'noAudio' in catalog 'signal:sigtools'.

Error in speechClient.checkoutASTLicense

Error in speechClient/speechTotext

Error in speech2text

Does anyone know what the problem could be ?

Is there any way to enable word time onsets and offsets in the Google API? See:

On the frequently encountered error "Expected input to be a vector" - Please note that the second input argument y of the speech2text function needs to be either a column or a row vector, i.e. an array having one of its dimensions equal to 1. It is very common for audio recordings to be stored in stereo format, so you may want to check the size of your audio array before using speech2text, for example by looking at your MATLAB workspace. If your audio array has multiple channels (typically resulting in a number of columns greater than 1), you need to select only one of them. Good options for stereo signals include either the left channel, i.e. y = readAudio(:,1), the right channel, i.e. y = readAudio(:,2), or their average across channels, i.e. y = mean(readAudio,2)
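
The check described above can be sketched as follows. A synthetic two-channel signal stands in for the output of audioread, and the variable names are illustrative.

```matlab
% Synthetic stereo recording: one second, two channels with different tones.
fs = 8000;
t  = (0:fs-1).' / fs;
readAudio = [sin(2*pi*200*t), 0.5*sin(2*pi*300*t)];

% Keep mono input as is; average the channels of multichannel input.
if size(readAudio, 2) > 1
    y = mean(readAudio, 2);
else
    y = readAudio;
end

% y is now a column vector, the shape that speech2text expects.
disp(size(y));
```

Averaging keeps content from both channels. Selecting a single channel with readAudio(:,1) or readAudio(:,2) is the simpler choice when one channel is clearly cleaner.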

thank you I have emailed you @gabriele

@Sunaina Aytan - Thank you for getting in touch. Please send more information on the error you are getting, including full reproduction steps.

hi i keep getting this error please help

Error using speechClient/speechTotext
Expected input to be a vector.

Error in speech2text

sneha madre

Sam, you are saving the JSON file incorrectly. The JSON file for IBM should contain only the "username" and "password" obtained from your IBM Speech API account. Also, don't forget to include the curly braces: "{" at the beginning and "}" at the end of your JSON file.

In the downloaded folder, you should see 'writing_IBM_JSON.png' inside the HTML sub-folder. This image will help you with writing the JSON file for the IBM API.

Hope this helps!
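
For reference, a JSON file can be written as plain text from MATLAB with jsonencode, as in the sketch below. The file name, location, and placeholder values are hypothetical, and the field names follow the description above; later versions of the IBM API may expect different fields.

```matlab
% Placeholder credentials; substitute the values from your IBM account.
credentials = struct('username', 'YOUR_USERNAME', 'password', 'YOUR_PASSWORD');

% Hypothetical file name and location, used only for this demonstration.
jsonFile = fullfile(tempdir, 'IBM_credentials_demo.json');

% Write the JSON as plain text. Using save() instead would create a binary
% MAT-file, which jsondecode cannot read.
fid = fopen(jsonFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', jsonFile);
fprintf(fid, '%s', jsonencode(credentials));
fclose(fid);

% Read the file back to confirm that it parses as JSON.
jsonText = fileread(jsonFile);
readBack = jsondecode(jsonText);
disp(readBack);
```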

Sam Cocks

When creating the .json file using IBM, I get an error on the jsondecode function where it is reading the save information of the file first:

Error using jsondecode
JSON syntax error at line 1, column 1 (character 1): expected value but found 'MATLAB'.

Opening my json file in matlab gives this as my first row:

MATLAB 5.0 MAT-file, Platform: MACI64, Created on: Wed

Am I saving my json incorrectly?


Another issue: when I try to process larger files, MATLAB returns "The connection to URL '' timed out after 10 seconds. Set HTTPOptions.Timeout to a higher value." However, I don't have the option to set the timeout value, as the .m files are protected.


Ah, I figured it out. This happens when the Google API can't detect any speech.


Hiya, I've got this error when I run:

[samples,fs] = audioread('handel.wav');
speechObject = speechClient('Google','languageCode','en-US');
tableOut = speech2text(speechObject,samples,fs)

Reference to non-existent field 'results'.

Error in speechClient/showOutput

Error in speechClient/googleAPI

Error in speechClient/speechTotext

Error in speech2text

Error in speech2textconvert (line 8)
tableOut = speech2text(speechObject,samples,fs)

Excellent One.



Allow specifying a custom recognize URL for Google client. This provides a way to use beta versions of Google Cloud Speech-to-Text API.


Prevent adding the setup script to MATLAB path


Typo fix


Added support for interactive speech to text transcription using Audio Labeler in MATLAB release R2019b

Addressed compatibility issues in older MATLAB releases (R2017a and R2017b)

Added support for new authentication schemes for IBM and Microsoft APIs.

Corrected path update on install

Improved handling of errors and lack of data in responses when using Microsoft API.

Updates for changes to IBM API

Added files under Files/en to enable cmd line help for p-coded files.

Added HTTPTimeOut option to allow using longer speech recordings.
Added error message to better handle a scenario where an HTTP request is successful but the API does not return any transcription data

MATLAB Release Compatibility
Created with R2019b
Compatible with R2017a to any release
Platform Compatibility
Windows macOS Linux