# melSpectrogram

Mel spectrogram

## Syntax

## Description

`S = melSpectrogram(audioIn,fs,Name=Value)` specifies options using one or more name-value arguments.

`melSpectrogram(___)` plots the mel spectrogram on a surface in the current figure.

## Examples

### Calculate Mel Spectrogram

Use the default settings to calculate the mel spectrogram for an entire audio file. Print the number of bandpass filters in the filter bank and the number of frames in the mel spectrogram.

    [audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');
    S = melSpectrogram(audioIn,fs);
    [numBands,numFrames] = size(S);
    fprintf("Number of bandpass filters in filterbank: %d\n",numBands)

Number of bandpass filters in filterbank: 32

    fprintf("Number of frames in spectrogram: %d\n",numFrames)

Number of frames in spectrogram: 1551

Plot the mel spectrogram.

    melSpectrogram(audioIn,fs)

### Calculate Mel Spectrums of 2048-Point Windows

Calculate the mel spectrums of 2048-point periodic Hann windows with 1024-point overlap. Convert to the frequency domain using a 4096-point FFT. Pass the frequency-domain representation through 64 half-overlapped triangular bandpass filters that span the range 62.5 Hz to 8 kHz.

    [audioIn,fs] = audioread('FunkyDrums-44p1-stereo-25secs.mp3');
    S = melSpectrogram(audioIn,fs, ...
        'Window',hann(2048,'periodic'), ...
        'OverlapLength',1024, ...
        'FFTLength',4096, ...
        'NumBands',64, ...
        'FrequencyRange',[62.5,8e3]);

Call `melSpectrogram` again, this time with no output arguments so that you can visualize the mel spectrogram. The input audio is a multichannel signal. If you call `melSpectrogram` with a multichannel input and with no output arguments, only the first channel is plotted.

    melSpectrogram(audioIn,fs, ...
        'Window',hann(2048,'periodic'), ...
        'OverlapLength',1024, ...
        'FFTLength',4096, ...
        'NumBands',64, ...
        'FrequencyRange',[62.5,8e3])

### Get Filter Bank Center Frequencies and Analysis Window Time Instants

`melSpectrogram` applies a frequency-domain filter bank to audio signals that are windowed in time. You can get the center frequencies of the filters and the time instants corresponding to the analysis windows as the second and third output arguments from `melSpectrogram`.

Get the mel spectrogram, filter bank center frequencies, and analysis window time instants of a multichannel audio signal. Use the center frequencies and time instants to plot the mel spectrogram for each channel.

    [audioIn,fs] = audioread('AudioArray-16-16-4channels-20secs.wav');
    [S,cF,t] = melSpectrogram(audioIn,fs);
    S = 10*log10(S+eps); % Convert to dB for plotting
    for i = 1:size(S,3)
        figure(i)
        surf(t,cF,S(:,:,i),'EdgeColor','none');
        xlabel('Time (s)')
        ylabel('Frequency (Hz)')
        view([0,90])
        title(sprintf('Channel %d',i))
        axis([t(1) t(end) cF(1) cF(end)])
    end

## Input Arguments

`audioIn`

— Audio input

column vector | matrix

Audio input, specified as a column vector or matrix. If specified as a matrix, the function treats columns as independent audio channels.

**Data Types:** `single` | `double`

`fs`

— Input sample rate (Hz)

positive scalar

Input sample rate in Hz, specified as a positive scalar.

**Data Types:** `single` | `double`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

*Before R2021a, use commas to separate each name and value, and enclose* `Name` *in quotes.*

**Example:** `FFTLength=1024`

`Window`

— Window applied in time domain

`hamming(round(fs*0.03),'periodic')` (default) | vector

Window applied in time domain, specified as a real vector. The number of elements in the vector must be in the range [1, `size(audioIn,1)`]. The number of elements in the vector must also be greater than `OverlapLength`.

**Data Types:** `single` | `double`

`OverlapLength`

— Analysis window overlap length (samples)

`round(0.02*fs)` (default) | integer in the range `[0, (numel(Window) - 1)]`

Analysis window overlap length in samples, specified as an integer in the range `[0, (numel(Window) - 1)]`.

**Data Types:** `single` | `double`
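The overlap fixes how far the analysis window advances between frames. The following is a minimal sketch, assuming the usual short-time analysis convention that the hop equals `numel(Window) - OverlapLength`. The variable names (`winLength`, `hopLength`, and so on) are illustrative and are not `melSpectrogram` arguments.

```matlab
fs = 44100;                              % sample rate in Hz, chosen for illustration
winLength = round(0.03*fs);              % default window length (30 ms): 1323 samples
overlapLength = round(0.02*fs);          % default overlap (20 ms): 882 samples
hopLength = winLength - overlapLength;   % assumed hop between frames: 441 samples
fprintf("Hop: %d samples (%.1f ms)\n",hopLength,1000*hopLength/fs)
```

Under this assumption, the default settings advance the window by about 10 ms per frame regardless of the sample rate.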

`FFTLength`

— Number of DFT points

`numel(Window)` (default) | positive integer

Number of points used to calculate the DFT, specified as a positive integer greater than or equal to the length of `Window`. If unspecified, `FFTLength` defaults to the length of `Window`.

**Data Types:** `single` | `double`
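When `FFTLength` exceeds the window length, each frame is effectively zero-padded before the transform, which gives a finer frequency grid without changing the time resolution. The sketch below illustrates this with the base MATLAB `fft` function and the values from the 2048-point example; it is not taken from the `melSpectrogram` implementation, and the variable names are illustrative.

```matlab
fs = 44100;                          % sample rate in Hz, chosen for illustration
winLength = 2048;                    % samples per analysis window
fftLength = 4096;                    % DFT points, at least winLength
binSpacing = fs/fftLength;           % spacing of DFT bins in Hz (about 10.77 Hz)
numOneSidedBins = fftLength/2 + 1;   % bins from 0 Hz to fs/2 for an even fftLength

frame = randn(winLength,1);          % stand-in for one windowed frame
X = fft(frame,fftLength);            % fft zero-pads the frame to fftLength points
fprintf("Bin spacing: %.2f Hz, one-sided bins: %d\n",binSpacing,numOneSidedBins)
```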

`NumBands`

— Number of mel bandpass filters

`32` (default) | positive integer

Number of mel bandpass filters, specified as a positive integer.

**Data Types:** `single` | `double`

`FrequencyRange`

— Frequency range over which to compute mel spectrogram (Hz)

`[0 fs/2]` (default) | two-element row vector

Frequency range over which to compute the mel spectrogram in Hz, specified as a two-element row vector of monotonically increasing values in the range `[0, fs/2]`.

**Data Types:** `single` | `double`

`SpectrumType`

— Type of mel spectrogram

`"power"` (default) | `"magnitude"`

Type of mel spectrogram, specified as `"power"` or `"magnitude"`.

**Data Types:** `char` | `string`

`WindowNormalization`

— Apply window normalization

`true` (default) | `false`

Apply window normalization, specified as `true` or `false`. When `WindowNormalization` is set to `true`, the power (or magnitude) in the mel spectrogram is normalized to remove the power (or magnitude) of the time-domain `Window`.

**Data Types:** `logical`

`FilterBankNormalization`

— Type of filter bank normalization

`"bandwidth"` (default) | `"area"` | `"none"`

Type of filter bank normalization, specified as `"bandwidth"`, `"area"`, or `"none"`.

**Data Types:** `char` | `string`

`MelStyle`

— Mel style

`"oshaughnessy"` (default) | `"slaney"`

Mel style, specified as `"oshaughnessy"` or `"slaney"`.

**Data Types:** `char` | `string`

`ApplyLog`

— Apply logarithm

`false` (default) | `true`

Apply base 10 logarithm to the returned mel spectrogram, specified as `true` or `false`.

**Data Types:** `logical`

## Output Arguments

`S`

— Mel spectrogram

column vector | matrix | 3-D array

Mel spectrogram, returned as a column vector, matrix, or 3-D array. The dimensions of `S` are *L*-by-*M*-by-*N*, where:

- *L* is the number of frequency bins in each mel spectrum. `NumBands` and `fs` determine *L*.
- *M* is the number of frames the audio signal is partitioned into. `size(audioIn,1)`, the length of `Window`, and `OverlapLength` determine *M*.
- *N* is the number of channels such that *N* = `size(audioIn,2)`.

Trailing singleton dimensions are removed from the output `S`.

**Data Types:** `single` | `double`
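The expected output size can be estimated before calling the function. The sketch below assumes the common framing rule in which only complete frames are kept (no padding of the final partial frame), so that *M* = `floor((size(audioIn,1) - numel(Window))/hop) + 1` with `hop = numel(Window) - OverlapLength`. This rule is an assumption rather than a statement from this page, and the 15-second mono signal is hypothetical.

```matlab
fs = 44100;                          % sample rate in Hz, chosen for illustration
numSamples = 15*fs;                  % hypothetical 15 s mono signal
numChannels = 1;
winLength = round(0.03*fs);          % default window length
overlapLength = round(0.02*fs);      % default overlap length
numBands = 32;                       % default NumBands

hopLength = winLength - overlapLength;
numFrames = floor((numSamples - winLength)/hopLength) + 1;   % assumed: no padding
expectedSize = [numBands,numFrames,numChannels];             % L-by-M-by-N
disp(expectedSize)
```

Comparing `expectedSize` with `size(S)` from an actual call is a quick way to check whether the assumed framing rule matches the release you are using.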

`F`

— Center frequencies of mel bandpass filters (Hz)

row vector

Center frequencies of mel bandpass filters in Hz, returned as a row vector with length `size(S,1)`.

**Data Types:** `single` | `double`

`T`

— Location of each window of audio (s)

row vector

Location of each analysis window of audio in seconds, returned as a row vector of length `size(S,2)`. The location corresponds to the center of each window.

**Data Types:** `single` | `double`
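As a rough illustration of window-centered time stamps, the sketch below computes the center of each frame from its zero-based starting sample. The exact half-sample convention used by `melSpectrogram` is not stated here, so treat the formula as an assumption; all variable names are illustrative.

```matlab
fs = 16000;                                  % sample rate in Hz, chosen for illustration
winLength = 480;                             % 30 ms window
overlapLength = 320;                         % 20 ms overlap
numFrames = 5;                               % number of frames to illustrate

hopLength = winLength - overlapLength;       % assumed hop between frames
frameStart = (0:numFrames-1)*hopLength;      % zero-based first sample of each frame
t = (frameStart + winLength/2)/fs;           % assumed window-center convention, in seconds
disp(t)
```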

## Algorithms

The `melSpectrogram` function follows the general algorithm to compute a mel spectrogram as described in [1].

In this algorithm, the audio input is first buffered into frames of `numel(Window)` number of samples. The frames are overlapped by `OverlapLength` number of samples. The specified `Window` is applied to each frame, and then the frame is converted to frequency-domain representation with `FFTLength` number of points. The frequency-domain representation can be either magnitude or power, specified by `SpectrumType`. If `WindowNormalization` is set to `true`, the spectrum is normalized by the window. Each frame of the frequency-domain representation passes through a mel filter bank. The spectral values output from the mel filter bank are summed, and then the channels are concatenated so that each frame is transformed to a `NumBands`-element column vector.
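The steps above can be sketched in base MATLAB for a single-channel signal. This is an illustrative approximation, not the `melSpectrogram` implementation: it omits window and filter bank normalization, assumes that incomplete trailing frames are dropped, builds the periodic Hamming window by hand, and uses the mel mapping 2595·log10(1 + f/700) commonly attributed to O'Shaughnessy. All variable names are hypothetical.

```matlab
fs = 16000;                           % sample rate in Hz, chosen for illustration
x = randn(fs,1);                      % 1 s of noise as a stand-in signal
winLength = round(0.03*fs);           % 480 samples
overlapLength = round(0.02*fs);       % 320 samples
fftLength = winLength;
numBands = 32;
hopLength = winLength - overlapLength;

% Periodic Hamming window built by hand
n = (0:winLength-1).';
win = 0.54 - 0.46*cos(2*pi*n/winLength);

% Steps 1 and 2: buffer into overlapping frames and apply the window
numFrames = floor((numel(x) - winLength)/hopLength) + 1;   % assumed: no padding
idx = n + 1 + (0:numFrames-1)*hopLength;   % winLength-by-numFrames sample indices
frames = x(idx).*win;

% Step 3: one-sided power spectrum of each frame
X = fft(frames,fftLength);
P = abs(X(1:floor(fftLength/2)+1,:)).^2;

% Step 4: half-overlapped triangular filters equally spaced on the mel scale
hz2melFcn = @(f) 2595*log10(1 + f/700);
mel2hzFcn = @(m) 700*(10.^(m/2595) - 1);
edges = mel2hzFcn(linspace(hz2melFcn(0),hz2melFcn(fs/2),numBands + 2));
binFreqs = (0:floor(fftLength/2))*fs/fftLength;
filterBank = zeros(numBands,numel(binFreqs));
for k = 1:numBands
    rising = (binFreqs - edges(k))/(edges(k+1) - edges(k));
    falling = (edges(k+2) - binFreqs)/(edges(k+2) - edges(k+1));
    filterBank(k,:) = max(0,min(rising,falling));   % triangle peaking at edges(k+1)
end

% Step 5: weight and sum the spectral values in each band
S = filterBank*P;                     % numBands-by-numFrames
disp(size(S))
```

Writing the band summation as a matrix product keeps the per-frame work vectorized; each column of `S` is the `numBands`-element vector for one frame.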

### Filter Bank Design

The mel filter bank is designed as half-overlapped triangular filters equally spaced on the mel scale. `NumBands` controls the number of mel bandpass filters. `FrequencyRange` controls the band edges of the first and last filters in the mel filter bank. `FilterBankNormalization` specifies the type of normalization applied to the individual bands.

The mel scale can be in the O'Shaughnessy style, which follows [2], or the Slaney style, which follows [3].
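To make the spacing concrete, the sketch below derives band edges and center frequencies for 64 bands over 62.5 Hz to 8 kHz. It assumes the O'Shaughnessy-style mapping mel = 2595·log10(1 + f/700) and places `NumBands + 2` equally spaced points on the mel scale, so that each filter spans two adjacent intervals. The Slaney-style scale is not shown, and the helper names are illustrative rather than toolbox functions.

```matlab
numBands = 64;
frequencyRange = [62.5 8000];            % Hz

% Assumed O'Shaughnessy-style conversions between Hz and mel
hz2melFcn = @(f) 2595*log10(1 + f/700);
mel2hzFcn = @(m) 700*(10.^(m/2595) - 1);

% NumBands + 2 equally spaced points on the mel scale, mapped back to Hz
melEdges = linspace(hz2melFcn(frequencyRange(1)),hz2melFcn(frequencyRange(2)),numBands + 2);
hzEdges = mel2hzFcn(melEdges);

centerFrequencies = hzEdges(2:end-1);            % peak of each triangular filter
bandwidths = hzEdges(3:end) - hzEdges(1:end-2);  % full width of each triangle in Hz
fprintf("First center: %.1f Hz, last center: %.1f Hz\n", ...
    centerFrequencies(1),centerFrequencies(end))
```

Because the points are uniform in mel, the bandwidths in Hz grow with frequency, which is why normalizing the individual bands can matter when comparing energy across bands.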

## References

[1] Rabiner, Lawrence R., and Ronald
W. Schafer. *Theory and Applications of Digital Speech Processing*. Upper
Saddle River, NJ: Pearson, 2010.

[2] O'Shaughnessy, Douglas.
*Speech Communication: Human and Machine*. Reading, MA: Addison-Wesley
Publishing Company, 1987.

[3] Slaney, Malcolm. "Auditory Toolbox: A MATLAB Toolbox for Auditory Modeling Work." Technical Report, Version 2, Interval Research Corporation, 1998.

## Extended Capabilities

### C/C++ Code Generation

Generate C and C++ code using MATLAB® Coder™.

The `melSpectrogram` function supports optimized code generation using single instruction, multiple data (SIMD) instructions. For more information about SIMD code generation, see Generate SIMD Code from MATLAB Functions for Intel Platforms (MATLAB Coder).

### GPU Code Generation

Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

### GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

## Version History

**Introduced in R2019a**

### R2024b: `WindowLength` has been removed

The `WindowLength` parameter has been removed from the `melSpectrogram` function. Use the `Window` parameter instead.

In releases prior to R2020b, you could only specify the length of a time-domain window. The window was always designed as a periodic Hamming window. You can replace instances of the code

    S = melSpectrogram(audioIn,fs,WindowLength=1024);

with this code:

    S = melSpectrogram(audioIn,fs,Window=hamming(1024,"periodic"));

### R2024a: Apply logarithm to mel spectrogram

Set the `ApplyLog` name-value argument to `true` to apply a base 10 logarithm to the spectrogram.

### R2023b: Support for Slaney-style mel scale

Set the `MelStyle` name-value argument to `"slaney"` to use the Slaney-style mel scale.

### R2023a: Generate optimized C/C++ code for computing mel spectrogram

`melSpectrogram` supports optimized C/C++ code generation using single instruction, multiple data (SIMD) instructions.

### R2020b: `WindowLength` will be removed in a future release

The `WindowLength` parameter will be removed from the `melSpectrogram` function in a future release.

## See Also

`spectrogram` | `mfcc` | `gtcc` | `mdct` | `audioFeatureExtractor`
