Why are 8 STFT vectors used for the predictor input in the "Denoise Speech Using Deep Learning Networks" example?

Question

Daniel Graham on 23 Aug 2021


https://au.mathworks.com/matlabcentral/answers/1439159-why-are-8-stft-vectors-used-for-the-predictor-input-in-the-denoise-speech-using-deep-learning-netw

Answered: Sahil Jain on 1 Sep 2021

In the MATLAB example "Denoise Speech Using Deep Learning Networks", I am having a hard time understanding why 8 STFT segments are used for the predictor input.

This is stated and underlined in the relevant section of the example.

Can anyone explain the reasoning?


Daniel Graham on 25 Aug 2021

Does anyone have an idea? I have been searching desperately for an answer to this and have found none.


Answer 1

Sahil Jain on 1 Sep 2021


https://au.mathworks.com/matlabcentral/answers/1439159-why-are-8-stft-vectors-used-for-the-predictor-input-in-the-denoise-speech-using-deep-learning-netw#answer_778684

Hi Daniel. The example states: "The predictor input consists of 8 consecutive noisy STFT vectors, so that each STFT output estimate is computed based on the current noisy STFT and the 7 previous noisy STFT vectors". This was probably done because the authors of the approach expect that using the noisy STFT vector of the current segment together with those of the previous 7 segments leads to better performance than using the current segment alone. I suggest going through the research articles listed in the references at the end of the example to understand the motivation in more detail. You can also train the network using only the current segment as input and compare its performance against the 8-segment version.
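To make the quoted sentence concrete, here is a minimal sketch of how each frame can be paired with the 7 frames before it. It is written in plain Python rather than MATLAB so it runs without any toolbox, and it is not the example's actual code: the function name `stack_context`, the toy data, and the choice to pad the start by repeating the first frame are all assumptions made for illustration.

```python
def stack_context(frames, num_segments=8):
    """Pair each frame with the num_segments - 1 frames that precede it."""
    if not frames:
        return []
    # Assumption: pad the start by repeating the first frame so that the
    # earliest frames also get a full window of num_segments vectors.
    padded = [frames[0]] * (num_segments - 1) + list(frames)
    # Predictor t holds frames t-7 .. t (oldest first, current frame last).
    return [padded[t:t + num_segments] for t in range(len(frames))]


# Toy "noisy STFT": 10 frames with 3 frequency bins each (hypothetical data).
frames = [[float(t)] * 3 for t in range(10)]
predictors = stack_context(frames, num_segments=8)

# 10 predictors, each made of 8 vectors with 3 bins.
print(len(predictors), len(predictors[0]), len(predictors[0][0]))
# Frame indices seen by the predictor for the last frame.
print([vec[0] for vec in predictors[9]])
```

Calling `stack_context(frames, num_segments=1)` gives the single-segment input mentioned above, so the same helper can feed both variants when comparing them.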
