Remove echo (convolution) from an .mp3 audio file

I have degraded voice signals. Degradations include additive noise, echo (convolution), loss of spectral resolution, among others. I've already improved it with a non-ideal low-pass filter, and it eliminated the white noise, but it still has echo. I've tried several techniques, but they worsen the audio, and I don't see any improvement. Can someone help me?
  3 Comments
Valeria de los Angeles on 14 May 2025
Edited: Walter Roberson on 14 May 2025
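I haven't found a technique that works as such. I used an IIR filter based on the echo model w(n)+α⋅w(n−Δ)=y(n), where w is the clean signal, y is the recorded signal, α is the echo amplitude, and Δ is the echo delay in samples, and did something like this: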
% === Estimate echo delay (Δ) ===
% Normalized autocorrelation of the mono signal y; peaks at positive lags are candidate delays
[Rmm, lags] = xcorr(y, 'coeff');
pos = lags > 0; % Skip the zero-lag peak and the mirrored negative lags
Rpos = Rmm(pos);
lagsPos = lags(pos);
[~, pk] = findpeaks(Rpos, 'MinPeakHeight', 0.22); % Candidate delays
if isempty(pk)
    warning('Could not estimate echo delay.');
    dl = round(0.2 * fs); % Default: 200 ms
else
    dl = lagsPos(pk(1)); % First significant delay, in samples
end
% === Echo cancellation with IIR filter ===
alpha = 0.8; % Adjust according to echo strength
y_no_echo = filter(1, [1 zeros(1, dl-1) alpha], y); % Solves w(n) + alpha*w(n-dl) = y(n) for w
% Update the signal to continue filtered processing
y = y_no_echo;
but I'm really lost.
Mathieu NOE on 15 May 2025
Could you please share the audio file and the code?


Answers (1)

William Rose on 16 May 2025
I recommend using the autocorrelation function to estimate the echo effects, as @Walter Roberson said.
To demonstrate, I made a short sound file (daisybell.wav, attached). The attached script addEcho.m adds reverberant echoes to the recording, saves the result, and plays the original and the version with reverb.
Script removeEcho.m removes the reverb from a file, but you have to know the delay time ('delay') and the echo amplitude ('a'). In this case we do know a and delay, and the result sounds a lot like the original.
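For readers without the attachments, here is a minimal sketch of one way a reverberant echo can be added and then removed. It is not the attached addEcho.m or removeEcho.m. It assumes the reverb is a recursive comb filter, y(n) = x(n) + a*y(n-D), where D is the delay in samples, and it uses a synthetic decaying tone in place of a recording. The values of Fs, a, and delay are illustrative assumptions.

```matlab
% Hypothetical stand-in for addEcho.m / removeEcho.m (not the attached scripts).
Fs    = 8000;                 % sample rate in Hz (assumed)
a     = 0.4;                  % echo amplitude (assumed)
delay = 0.2;                  % echo delay in seconds (assumed)
D     = round(delay*Fs);      % echo delay in samples

t = (0:2*Fs-1)'/Fs;                 % 2 s time axis
x = sin(2*pi*220*t).*exp(-3*t);     % decaying tone standing in for a recording

% Add reverberant echo: y(n) = x(n) + a*y(n-D)  (recursive comb filter)
den = [1, zeros(1,D-1), -a];
y = filter(1, den, x);

% Remove it with the inverse FIR filter: xhat(n) = y(n) - a*y(n-D)
xhat = filter(den, 1, y);

maxErr = max(abs(xhat - x));
fprintf('max reconstruction error: %.3g\n', maxErr);
```

Under this assumed model the removal step is an FIR filter, so it is exact (up to rounding) when a and D are known. If the echo is instead a single reflection, y(n) = x(n) + a*x(n-D), the roles swap: adding is filter([1, zeros(1,D-1), a], 1, x) and removing is filter(1, [1, zeros(1,D-1), a], y).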
In real life, you will have to estimate a and delay. The autocorrelation of this file is of little help: the spikes due to echoes are hidden in a complicated signal, and many of the other local maxima may be due to the formant frequencies of the singer's voice.
If you can record a very short loud sound, you can estimate a and delay from its autocorrelation. Therefore I recorded a single hand clap (clap.wav), to which reverberation has been artificially added with addEcho.m (clapEcho.wav). Clapping two wood blocks together several times might make closer-to-ideal impulses, from which to estimate the impulse response. We use clapEcho.wav to estimate the echo delay and amplitude:
[y,Fs]=audioread('clapEcho.wav');
acfy=autocorr(y(:,1),NumLags=Fs); % estimate 1-second-long autocorrelation function
delaytime=(0:Fs)/Fs; % autocorrelation lag (s)
plot(delaytime,acfy,'-r.')
xlabel('Lag (s)'); title('ACF(clapEcho.wav)'); grid on
The commands above produce a plot of the autocorrelation function versus lag (title: ACF(clapEcho.wav); x-axis: Lag (s)).
The autocorrelation function has a spike with amplitude 0.4 at lag=0.200 s. We ignore the later spikes because we assume they are reverberant copies of the initial echo. (Their timing and their amplitudes indicate that they are.) Therefore we estimate a=0.4 and delay=0.2 s. We put those values into removeEcho.m in order to deconvolve recordings made in the same reverberant environment, such as daisybellEcho.wav.
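Reading a and delay off the plot can also be done in code. The sketch below is an illustration under assumptions, not part of the attached scripts: it replaces the clap with an ideal unit impulse, adds a recursive echo y(n) = x(n) + a*y(n-D), computes the normalized autocorrelation with a plain loop (so no toolbox is needed), and takes the largest value beyond a short guard lag. The names minLag, aEst, and delayEst are made up for this example. For a recursive echo driven by an impulse, the normalized autocorrelation at lag D is very nearly a, which is why the spike height can be used directly. For a single non-recursive reflection the spike height would be a/(1+a^2) instead, so the model matters.

```matlab
% Synthetic "clap with reverb" under assumed values (not clapEcho.wav).
Fs = 8000;                 % sample rate in Hz (assumed)
a  = 0.4;                  % true echo amplitude (assumed)
D  = round(0.2*Fs);        % true echo delay in samples (assumed 0.2 s)

N = 2*Fs;                  % 2 s, long enough for the echoes to die away
x = zeros(N,1); x(1) = 1;  % ideal impulse standing in for a clap
y = filter(1, [1, zeros(1,D-1), -a], x);   % y(n) = x(n) + a*y(n-D)

% Normalized autocorrelation for lags 0..maxLag (1 s); acf(k+1) is lag k
maxLag = Fs;
acf = zeros(maxLag+1,1);
for k = 0:maxLag
    acf(k+1) = sum(y(1:end-k).*y(1+k:end));
end
acf = acf/acf(1);

% Largest value beyond a guard lag, so the zero-lag peak is not picked
minLag = round(0.02*Fs);                 % ignore lags shorter than 20 ms
[aEst, idx] = max(acf(minLag+1:end));    % first element is lag minLag
delayEst = (minLag + idx - 1)/Fs;        % convert lag index to seconds

fprintf('estimated a = %.4f, estimated delay = %.4f s\n', aEst, delayEst);
```

With a real clap the impulse is not ideal, so the spike is broader and the estimate of a is only approximate; the guard lag may also need adjusting to skip the width of the clap itself.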
removeEcho.m takes about 15 seconds to deconvolve a 5 second recording, on my machine. Then it plays the original and deconvolved signals.
soundFiles.zip contains the recordings clap.wav and daisybell.wav, the recordings with echoes added by addEcho.m (clapEcho.wav,...), and the recordings with echoes removed by deconvolution with removeEcho.m (clapEchoDeconv.wav,...).

Release

R2024b
