extract
Extract audio features
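Syntax
features = extract(aFE,audioIn)
features = extract(aFE,ds)
features = extract(aFE,ds,Name=Value)
Description
features = extract(aFE,audioIn) extracts features from the audio input audioIn.
features = extract(aFE,ds) extracts features from all of the audio files in the audioDatastore object ds.
features = extract(aFE,ds,Name=Value) specifies options using one or more name-value arguments. For example, extract(aFE,ds,UseParallel=true) reads the data and extracts features in parallel.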
Examples
Extract and Normalize Audio Features
Read in an audio signal.
[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
Create an audioFeatureExtractor to extract the centroid of the Bark spectrum, the kurtosis of the Bark spectrum, and the pitch of an audio signal.
aFE = audioFeatureExtractor("SampleRate",fs, ...
    "SpectralDescriptorInput","barkSpectrum", ...
    "spectralCentroid",true, ...
    "spectralKurtosis",true, ...
    "pitch",true)
aFE = 
  audioFeatureExtractor with properties:

   Properties
                     Window: [1024x1 double]
              OverlapLength: 512
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'barkSpectrum'
        FeatureVectorLength: 3

   Enabled Features
     spectralCentroid, spectralKurtosis, pitch

   Disabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     mfccDeltaDelta, gtcc, gtccDelta, gtccDeltaDelta, spectralCrest, spectralDecrease
     spectralEntropy, spectralFlatness, spectralFlux, spectralRolloffPoint, spectralSkewness, spectralSlope
     spectralSpread, harmonicRatio, zerocrossrate, shortTimeEnergy

   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.
Call extract to extract the features from the audio signal. Normalize the features by their mean and standard deviation.
features = extract(aFE,audioIn);
features = (features - mean(features,1))./std(features,[],1);
Plot the normalized features over time.
idx = info(aFE);
duration = size(audioIn,1)/fs;

subplot(2,1,1)
t = linspace(0,duration,size(audioIn,1));
plot(t,audioIn)

subplot(2,1,2)
t = linspace(0,duration,size(features,1));
plot(t,features(:,idx.spectralCentroid), ...
    t,features(:,idx.spectralKurtosis), ...
    t,features(:,idx.pitch));
legend("Spectral Centroid","Spectral Kurtosis","Pitch")
xlabel("Time (s)")
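The normalization in this example is a per-feature z-score: each column (feature) has its own mean subtracted and is divided by its own standard deviation. The following is a minimal sketch of that step on a small hypothetical matrix X, which stands in for the output of extract and does not require Audio Toolbox. Note that a feature that is constant over time has a standard deviation of zero, so this formula would return NaN for that column.

```matlab
% Hypothetical 4-by-2 feature matrix: rows are hops, columns are features.
X = [1 10; 2 20; 3 30; 4 40];

% Subtract each column's mean and divide by each column's standard deviation.
Xn = (X - mean(X,1))./std(X,[],1);

% After normalization, each column has (approximately) zero mean
% and unit standard deviation.
colMean = mean(Xn,1);
colStd = std(Xn,[],1);
```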
Extract Features from Data Set
Create an audio datastore that points to audio samples included with Audio Toolbox®.
folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);
Create an audioFeatureExtractor object to extract the mel spectrum, Bark spectrum, ERB spectrum, and linear spectrum from each audio file. Use the default analysis window and overlap length for the spectrum extraction.
aFE = audioFeatureExtractor(SampleRate=44.1e3, ...
    melSpectrum=true, ...
    barkSpectrum=true, ...
    erbSpectrum=true, ...
    linearSpectrum=true);
Call extract to extract the features from each audio file in the datastore. Specify SampleRateMismatchRule as "resample" to resample the audio files in the datastore if they do not match 44.1 kHz, the sample rate of the audioFeatureExtractor object. If you have Parallel Computing Toolbox™, specify UseParallel as true to read the files and extract the features in parallel.
specs = extract(aFE,ads,SampleRateMismatchRule="resample",UseParallel=true);
The specs variable is a numFiles-by-1 cell array, where numFiles is the number of files in the datastore. Each element of the cell array is a numHops-by-numFeatures-by-numChannels array, where the number of hops and the number of channels depend on the length and number of channels of the audio file, and the number of features is the number of features requested from the audio data.
numFiles = numel(specs)
numFiles = 37
[numHops1,numFeaturesFile1,numChannelsFile1] = size(specs{1})
numHops1 = 1053
numFeaturesFile1 = 620
numChannelsFile1 = 1
[numHops2,numFeaturesFile2,numChannelsFile2] = size(specs{2})
numHops2 = 1724
numFeaturesFile2 = 620
numChannelsFile2 = 4
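Because the files differ in length and channel count, the per-file arrays generally cannot be concatenated directly, and cellfun is one convenient way to inspect or summarize them. The following is a minimal sketch that uses a small hypothetical cell array in place of the real output of extract, so it runs without Audio Toolbox. The variable names are illustrative only.

```matlab
% Hypothetical stand-in for the output of extract(aFE,ads): two "files"
% with different hop and channel counts but the same number of features.
specs = {zeros(5,3,1); ones(7,3,2)};

% Per-file dimensions (hops, features, channels).
numHops = cellfun(@(x) size(x,1),specs);
numFeatures = cellfun(@(x) size(x,2),specs);
numChannels = cellfun(@(x) size(x,3),specs);

% Average each feature over hops, keeping one
% 1-by-numFeatures-by-numChannels array per file.
meanFeatures = cellfun(@(x) mean(x,1),specs,"UniformOutput",false);
```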
Input Arguments
aFE — Input object
audioFeatureExtractor object
Input object, specified as an audioFeatureExtractor object.
audioIn — Input audio
column vector | matrix
Input audio, specified as a column vector or matrix of independent channels (columns).
Data Types: single | double
ds — Audio datastore
audioDatastore object
Audio datastore to extract features from, specified as an audioDatastore object.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: extract(aFE,ds,SampleRateMismatchRule="resample")
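The two name-value styles described above are interchangeable in recent releases. As an illustration, the sketch below builds the same audioFeatureExtractor object both ways, mirroring the two styles already used in the examples on this page. The sample rate and the choice of the pitch feature are arbitrary values picked for the illustration.

```matlab
fs = 44.1e3;

% R2021a and later: Name=Value syntax.
aFE1 = audioFeatureExtractor(SampleRate=fs,pitch=true);

% Comma-separated syntax with quoted names, which also works
% in releases before R2021a.
aFE2 = audioFeatureExtractor("SampleRate",fs,"pitch",true);

% Both objects are configured identically.
sameRate = isequal(aFE1.SampleRate,aFE2.SampleRate);
```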
UseParallel — Read data and extract features in parallel
false (default) | true
Read data and extract features from the audioDatastore in parallel, specified as true or false. If you specify true, extract reads the data and extracts features using a pool of parallel workers. For more information on parallel pools, see parpool (Parallel Computing Toolbox).
This functionality requires Parallel Computing Toolbox™.
Data Types: logical
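Because UseParallel=true requires Parallel Computing Toolbox, code that must run on machines with and without the toolbox can choose the value at run time. The following is a sketch of one possible approach, assuming the MATLAB function canUseParallelPool (available in R2020b and later) is an acceptable availability check for your setup. The datastore and extractor settings are taken from the example on this page.

```matlab
% Point a datastore at the audio samples shipped with Audio Toolbox.
folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

aFE = audioFeatureExtractor(SampleRate=44.1e3,melSpectrum=true);

% canUseParallelPool returns true only when parallel functions can use
% a pool (toolbox installed and licensed, pool open or able to start).
useParallel = canUseParallelPool;

% Fall back to serial extraction when no pool can be used.
specs = extract(aFE,ads, ...
    SampleRateMismatchRule="resample", ...
    UseParallel=useParallel);
```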
SampleRateMismatchRule — Behavior when sample rate does not match
"error" (default) | "warn" | "resample"
Behavior of the extract function when the sample rate of an audio file in the audioDatastore does not match the sample rate set on the audioFeatureExtractor object, specified as "error", "warn", or "resample".
- "error" — Error immediately if there is a sample rate mismatch.
- "warn" — Use the sample rate of the audioFeatureExtractor object and display a warning if the sample rate of any file does not match.
- "resample" — If there is a mismatch, resample the audio data to match the sample rate of the audioFeatureExtractor object.
Data Types: char | string
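Before choosing a rule, it can help to know how many files would trigger a mismatch. The sketch below is one way to check, assuming the files in the datastore are readable by audioinfo. The 44.1 kHz target and the variable names are illustrative, not part of the extract interface.

```matlab
% Point a datastore at the audio samples shipped with Audio Toolbox.
folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

% Sample rate the audioFeatureExtractor object would be configured with.
targetFs = 44.1e3;

% Read the sample rate of every file from its metadata (no audio is decoded).
fileFs = cellfun(@(f) getfield(audioinfo(f),"SampleRate"),ads.Files);

% List the files that extract would error on, warn about, or resample.
mismatched = ads.Files(fileFs ~= targetFs);
fprintf("%d of %d files differ from %g Hz.\n", ...
    numel(mismatched),numel(ads.Files),targetFs);
```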
Output Arguments
features — Extracted audio features
vector | matrix | 3-D array | cell array
Extracted audio features, returned as an L-by-M-by-N array, where:
L –– Number of feature vectors (hops)
M –– Number of features extracted per analysis window
N –– Number of channels
If the input is an audioDatastore object, extract returns a cell array where each cell corresponds to an audio file and contains the extracted features from that file.
Data Types: single | double
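To make the L-by-M-by-N layout concrete, the sketch below extracts two features from a hypothetical two-channel noise signal and then uses info, as in the first example on this page, to pick one feature for one channel. The signal, sample rate, and feature selection are assumptions made for the illustration.

```matlab
fs = 16e3;
audioIn = randn(fs,2);   % one second of hypothetical two-channel noise

aFE = audioFeatureExtractor(SampleRate=fs, ...
    spectralCentroid=true, ...
    pitch=true);

features = extract(aFE,audioIn);

% L hops, M features per analysis window, N channels.
[L,M,N] = size(features);

% info maps each enabled feature to its column index (the second dimension).
idx = info(aFE);

% Spectral centroid over time for the second channel only.
centroidCh2 = features(:,idx.spectralCentroid,2);
```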
Extended Capabilities
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Version History
Introduced in R2019b
R2023b: Extract features from audio signals stored in audioDatastore
Pass an audioDatastore to extract to extract features from all audio files in the datastore.