I want to use an LSTM-based audio network with live audio

Question

Arslan Munim on 27 Jul 2022


Direct link to this question

https://au.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio

Commented: Arslan Munim on 28 Sep 2022

Hello Matlab team,

I am working with my own audio data set, using this example as a starting point: https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab

The network is trained, but I now want to run it live on a PC. For example, I have a microphone and want to build an application that uses my trained model to predict the output.

Can you guide me or help me with that?

Regards,

Arslan Munaim


Answer 1

jibrahim on 27 Jul 2022


Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio#answer_1016040


Hi Arslan,

There is a function in that repo (streamingClassifier) that should get the job done in conjunction with an audio device reader:

% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
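The placeholders M = 0 and S = 1 in the loop above stand for the per-feature mean and standard deviation used to normalize the features during training. A minimal sketch of how such statistics could be computed is shown below. It assumes the training features are stored as a cell array of numFeatures-by-numHops matrices; the variable names and the random data are hypothetical stand-ins, not taken from the repository.

```matlab
% Hypothetical stand-in for the training features: each cell holds one
% sequence, sized numFeatures-by-numHops (11 features per hop here)
rng(0)
trainingFeatures = {randn(11,40), randn(11,55), randn(11,32)};

% Concatenate all hops and compute one mean and one standard deviation
% per feature (along the hop dimension)
allFeatures = cat(2,trainingFeatures{:});   % 11-by-127
M = mean(allFeatures,2);                    % 11-by-1
S = std(allFeatures,0,2);                   % 11-by-1

% Normalize a new 1-by-11 feature vector with the training statistics
featureVector = randn(1,11);
normalized = (featureVector - M.')./S.';
```

The same M and S would then be passed to streamingClassifier in place of the placeholder values.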


Arslan Munim on 28 Jul 2022

Edited: Arslan Munim on 28 Jul 2022

Hi jibrahim,

Thanks for your reply. I tried using streamingClassifier. However, because of dependency issues I am using the extract function instead of extractFeatures, and with extract I can only use one feature at a time, whereas I trained the network with 11 features.

Can you please show me how I can use the extract function in streamingClassifier? I am attaching my code for your reference:

windowLength = 512;
overlapLength = 0;
aFE = audioFeatureExtractor('SampleRate',44100, ...
    'Window',hamming(windowLength,'periodic'),...
    'OverlapLength',overlapLength,...
    'spectralCentroid',true, ...
    'spectralCrest',true,...
    'spectralDecrease',true, ...
    'spectralEntropy',true,...
    'spectralFlatness',true,...
    'spectralFlux',true,...
    'spectralKurtosis',true,...
    'spectralRolloffPoint',true,...
    'spectralSkewness',true,...
    'spectralSlope',true,...
    'spectralSpread',true);
features = extract(aFE , audioIn)
%%%%%%%%%features = extractFeatures(audioIn);
% Normalize
features = ((features - M')./S');
[net, scores] = predictAndUpdateState(net,features);

jibrahim on 28 Jul 2022


Hi Arslan,

The extract function should also return 11 features. For example, if you replace the existing function extractFeatures with this modified function, things should work the same:

function featureVector = extractFeatures2(x)
%#codegen
persistent afe
if isempty(afe)
    windowLength = 512;
    overlapLength = 0;
    afe = audioFeatureExtractor('SampleRate',44100, ...
        'Window',hamming(windowLength,'periodic'),...
        'OverlapLength',overlapLength,...
        'spectralCentroid',true, ...
        'spectralCrest',true,...
        'spectralDecrease',true, ...
        'spectralEntropy',true,...
        'spectralFlatness',true,...
        'spectralFlux',true,...
        'spectralKurtosis',true,...
        'spectralRolloffPoint',true,...
        'spectralSkewness',true,...
        'spectralSlope',true,...
        'spectralSpread',true);
end
featureVector = extract(afe,x);
end

For a single 512-sample input frame, the size of featureVector will be 1-by-11, each element in the vector representing one of your features.

Notice I declared afe as persistent. This ensures the audio feature extractor is not recreated every time you call this function in your loop. The extractor goes through some one-time setup computations when you first call it, so there is no need to waste time repeating those.

jibrahim on 2 Aug 2022


Hi Arslan,

Since you trained the network with a sample rate of 16e3, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. This code is a possible implementation, where you essentially feed the network frames of length 512 sampled at 16 kHz, just like the original code:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,...
                              Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame); 
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
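The frame-size arithmetic behind the code above can be checked by hand. The sketch below assumes the converter uses the exact rational ratio between the two rates (as the "multiple of 441" comment implies); it computes the factors with gcd rather than querying dsp.SampleRateConverter, and the variable names are illustrative only.

```matlab
fsIn  = 44100;   % microphone sample rate (Hz)
fsOut = 16000;   % sample rate the network was trained on (Hz)

% Reduce fsOut/fsIn to lowest terms: 16000/44100 = 160/441
g = gcd(fsIn,fsOut);
interpFactor = fsOut/g;   % 160
decimFactor  = fsIn/g;    % 441

% Largest frame length not exceeding 22000 that is a multiple of 441
desiredFrame = 22000;
frameLength = floor(desiredFrame/decimFactor)*decimFactor;   % 49*441 = 21609

% Samples produced at 16 kHz per microphone frame, and how many
% 512-sample network frames that yields
outPerFrame  = frameLength/decimFactor*interpFactor;   % 49*160 = 7840
numNetFrames = floor(outPerFrame/512);                 % 15
leftover     = outPerFrame - 512*numNetFrames;         % 160
```

The 160 leftover samples per microphone frame are why the converted audio goes through a buffer before being cut into 512-sample frames.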

Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame); 
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:

% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    write(buffSRC,frame);
    % Drain the input buffer in chunks that are a multiple of the
    % decimation factor, so leftover samples do not pile up over time
    while buffSRC.NumUnreadSamples >= frameLength
        frame = read(buffSRC,frameLength);
        % Convert to 16 kHz
        frame = src(frame);
        % Save to buffer
        write(buff,frame)
    end
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end

Arslan Munim on 9 Aug 2022

Hi jibrahim,

Thank you for your support, it was very helpful.

Now I want to use multiple mics for prediction. Can you please give me some idea of how I can use the streaming classifier with 3 or 4 mics for the prediction?

Thanks and have a nice day.

Regards,

Arslan


Answer 2

jibrahim on 9 Aug 2022


Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1768630-i-want-to-use-lstm-based-audio-network-to-work-with-live-audio#answer_1023635

Hi Arslan,

audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.

This network was trained on mono data, so, to adapt it to multi-channel data, you either have to retrain your network for multi-channel data, or somehow combine your input channels into one channel (by a weighted sum, or selecting a particular channel, etc) and proceed like above.
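As a rough illustration of the second option, the sketch below combines a multi-channel frame into a single channel, either by a weighted sum or by selecting one channel. The frame here is synthetic random data standing in for the samples-by-channels matrix a multi-channel device would return, and the weights are arbitrary assumptions, not recommended values.

```matlab
% Synthetic stand-in for one multi-channel frame: 512 samples, 3 channels
rng(0)
frame = randn(512,3);

% Option 1: weighted sum of the channels (weights are an assumption and
% are chosen here to sum to 1)
w = [0.5; 0.3; 0.2];
monoWeighted = frame*w;        % 512-by-1

% Option 2: plain average of all channels
monoMean = mean(frame,2);      % 512-by-1

% Option 3: select one particular channel
monoSelected = frame(:,1);     % 512-by-1
```

Any of the resulting 512-by-1 vectors could then be passed to streamingClassifier in the same way as a frame from a single microphone.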


Arslan Munim on 17 Aug 2022

Edited: Walter Roberson on 19 Aug 2022


Hi jibrahim,

I am trying to read data from multiple mics: I read a frame from each microphone and send that data to the streaming classifier to predict the output. However, every time I use more than one mic I get the error below, always on the line frame1 = adr1():

Error using audioDeviceReader/setup

A given audio device may only be opened once.

Error in audioDeviceReader/setupImpl

Error in multipleMic (line 10)

frame1 = adr1()

adr1 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (4- USB PnP Sound Device)",BitDepth="16-bit integer");
adr2 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, Device="Microphone (USB PnP Sound Device)",BitDepth="16-bit integer");
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from each microphone
    frame1 = adr1();
    frame2 = adr2();
    % Pass to network
    class = streamingClassifier2(frame1,frame2,M,S);
    % Use the predicted classes any way you want
end
function [class] = streamingClassifier2(frame1,frame2,M,S)
% This is a streaming classifier function 
persistent net; 
if isempty(net)
    net = coder.loadDeepLearningNetwork('net.mat');
end
% Extract features using function
%features = extract(aFE , audioIn)
features1 = extractFeatures2(frame1);
features2 = extractFeatures2(frame2);
% Normalize 
features1 = ((features1 - M)./S).';
features2 = ((features2 - M)./S).';
% Classify
[class] = classify(net,{features1,features2});
%[net, scores] = classify(net,feature)
end

jibrahim on 20 Aug 2022

OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.

Arslan Munim on 28 Sep 2022

Hi again,

I am trying to train my network after lowering BitsPerSample from 16 to 8. Every time I start training the model, it throws the warning given below and terminates.

I tried different sample rates, but it gives the same warning every time. I also tried changing my layer structure and setting 'InitialLearnRate' to 0.001, but I still get the same warning.

Warning: Training stopped at iteration 1 because training loss is NaN. Predictions using the output network might contain NaN values.

Model:

layers = [ ...
    sequenceInputLayer(size(trainingFeatures{1},1))
    lstmLayer(100,"OutputMode","sequence")
    dropoutLayer(0.1)
    lstmLayer(100,"OutputMode","last")
    fullyConnectedLayer(5)
    softmaxLayer
    classificationLayer];

miniBatchSize = 30;
validationFrequency = floor(numel(trainingFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MaxEpochs",100, ...
    "MiniBatchSize",miniBatchSize, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",20,...
    'InitialLearnRate',0.001,...
    'ValidationData',{validationFeatures,adsValidation.Labels}, ...
    'ValidationFrequency',validationFrequency);

Regards,

Arslan
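A NaN loss at the very first iteration can occur when the training features already contain NaN or Inf values. This is only one possible cause and is not confirmed for this data set; it could happen, for example, if some frames become all zeros after requantization and a spectral descriptor is then undefined. The sketch below is a quick check under the assumption that trainingFeatures is a cell array of numFeatures-by-numHops matrices. The data here is a synthetic stand-in with one NaN planted on purpose.

```matlab
% Synthetic stand-in for trainingFeatures (hypothetical data): each cell is
% numFeatures-by-numHops; a NaN is planted in the second sequence
rng(0)
trainingFeatures = {rand(11,40), rand(11,55), rand(11,32)};
trainingFeatures{2}(3,7) = NaN;

% Flag every sequence that contains at least one NaN or Inf value
hasBadValue = cellfun(@(x) any(~isfinite(x(:))), trainingFeatures);
badIdx = find(hasBadValue);

fprintf('%d of %d sequences contain non-finite values\n', ...
    numel(badIdx), numel(trainingFeatures));
```

If the check flags any sequences, it is worth inspecting the corresponding audio files and the normalization statistics (a standard deviation of zero would also produce non-finite values) before changing the network or the training options.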
