Extract VGGish features
embeddings = vggishFeatures(audioIn,fs) returns VGGish feature embeddings over time for the audio input audioIn with sample rate fs. Columns of the input are treated as individual channels.
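The basic syntax can be tried on a synthetic signal, as in the minimal sketch below. The 16 kHz sample rate, the 5-second duration, and the two sine tones are arbitrary values chosen for illustration, not requirements of the function. The sketch assumes the pretrained VGGish model is already installed and available on the MATLAB path.

```matlab
% Illustrative values only: sample rate, duration, and tones are assumptions.
fs = 16e3;                          % sample rate in Hz
t = (0:5*fs-1)'/fs;                 % 5 seconds of time stamps (column vector)

% Each column is treated as a separate channel, so this is 2-channel audio.
audioIn = [0.5*sin(2*pi*440*t), 0.5*sin(2*pi*880*t)];

% Extract VGGish embeddings over time for both channels.
embeddings = vggishFeatures(audioIn,fs);

% Inspect how the embeddings are laid out over time and across channels.
disp(size(embeddings))
```

Building the input as a matrix with one column per channel lets a single call process every channel.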
embeddings = vggishFeatures(audioIn,fs,Name,Value) specifies options using one or more Name,Value pair arguments. For example, embeddings = vggishFeatures(audioIn,fs,'ApplyPCA',true) applies a principal component analysis (PCA) transformation to the audio embeddings.
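The following sketch compares the output with and without the 'ApplyPCA' option on the same input. The noise signal, its amplitude, and its 3-second length are placeholder assumptions used only so that the example is self-contained. As before, the pretrained VGGish model is assumed to be installed.

```matlab
% Placeholder input: 3 seconds of low-level noise at an assumed 16 kHz rate.
fs = 16e3;
audioIn = 0.1*randn(3*fs,1);        % single-channel column vector

% Embeddings without the PCA transformation.
rawEmbeddings = vggishFeatures(audioIn,fs);

% Embeddings with the PCA transformation applied.
pcaEmbeddings = vggishFeatures(audioIn,fs,'ApplyPCA',true);

% Compare the sizes of the two outputs.
disp(size(rawEmbeddings))
disp(size(pcaEmbeddings))
```

Running both variants on the same signal makes it easy to check how the option changes the values before feeding the embeddings to a downstream model.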
This function requires both Audio Toolbox™ and Deep Learning Toolbox™.
See also: audioFeatureExtractor | classifySound | vggish | yamnet | yamnetGraph