# How to identify datasegments that correlate to a given frequency

6 views (last 30 days)
Katy Weihrich on 12 Apr 2021
Answered: Mathieu NOE on 13 Apr 2021
I have a dataset collected over time (50 data points per min) and extracted a frequency (0.8 Hz) at which the data pattern repeats. However, sometimes the dataset includes both periodic and non-periodic behavior. Below, I created some data that looks about the way my data does, just with a lot less noise.
data_01 = sin(2*pi*1.25*(0:1/50:15)');
data_02 = sin(2*pi*3*(0:1/50:15)');
data_03 = [mean([data_01(1:250) data_02(1:250)]'), data_01(250:end)']
findpeaks(data_03,'MinPeakDistance',0.79*50); hold on; plot(data_03(1:250))
I would like to use the frequency I identified for my data of interest (0.8 Hz) to identify which parts of my data are best represented by the frequency (blue) and which do not fit the frequency as well (red). In the end, I would like a value for how well the data around each peak correlates to the 0.8 Hz frequency. If I could have a value for each data point that would be better. I need a way to distinguish between the first peaks on the red plot and the last peaks on the blue plot.
Thank you!

Mathieu NOE on 13 Apr 2021
hello Katy
I built this on your code, so first option is to see if the time interval between consecutive peaks are similar or not , you can see that we can discriminate the non periodic vs the periodic segments of data
another approach is to use the short term fourier transform (stft) to do a refined frequency analysis , so here again we know when we have many frequencies vs the case of one single dominant frequency
FYI you speak about a signal at 0.8 Hz but your code shows 1.25 Hz; I guess you wanted to say the period is 0.8 s;
all the best
clearvars
close all
fs = 50;
dt = 1/fs;
data_01 = sin(2*pi*1.25*(0:1/50:15)');
data_02 = sin(2*pi*3*(0:1/50:15)');
data_03 = [mean([data_01(1:250) data_02(1:250)]'), data_01(250:end)'];
samples = length(data_03);
time = (0:samples-1)*dt;
% findpeaks(data_03,'MinPeakDistance',0.79*50); hold on; plot(data_03(1:250))
figure(1);
[pks,locs] = findpeaks(data_03);
hold on;
plot(data_03,'b')
plot(locs,pks,'dk')
plot(data_03(1:250),'r')
hold off;
time_between_peaks = diff(locs)/fs;
freq_between_peaks = 1./time_between_peaks;
time_between_peaks = [0, time_between_peaks];
freq_between_peaks = [0, freq_between_peaks];
figure(2);
subplot(211),plot(time(locs),time_between_peaks);
title('time between peaks');
ylabel('s')
subplot(212),plot(time(locs),freq_between_peaks);
title('freq between peaks');
ylabel('Hz')
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
figure(3);
nfft = 150;
[S,F,T] = myspecgram(data_03, fs, nfft, 0.975); % overlap is 97.5% of nfft here
imagesc(T,F,(abs(S)))
colorbar('vert');
set(gca,'YDir','Normal')
xlabel('Time (secs)')
ylabel('Freq (Hz)')
title('Short-time Fourier Transform spectrum')
colormap('jet');
function [fft_specgram,freq_vector,time] = myspecgram(signal, Fs, nfft, Overlap)
% FFT peak spectrogram of signal (example sinus amplitude 1 = 0 dB after fft).
% signal - input signal,
% Fs - Sampling frequency (Hz).
% nfft - FFT window size
% Overlap - buffer overlap % (between 0 and 0.95)
signal = signal(:);
samples = length(signal);
% fill signal with zeros if its length is lower than nfft
if samples<nfft
s_tmp = zeros(nfft,1);
s_tmp((1:samples),:) = signal;
signal = s_tmp;
samples = nfft;
end
% window : hanning
window = hanning(nfft);
window = window(:);
% compute fft with overlap
offset = fix((1-Overlap)*nfft);
spectnum = 1+ fix((samples-nfft)/offset); % Number of windows
% % for info is equivalent to :
% noverlap = Overlap*nfft;
% spectnum = fix((samples-noverlap)/(nfft-noverlap)); % Number of windows
% main loop
fft_specgram = [];
for ci=1:spectnum
start = (ci-1)*offset;
sw = signal((1+start):(start+nfft)).*window;
fft_specgram = [fft_specgram abs(fft(sw))*4/nfft]; % X=fft(x.*hanning(N))*4/N; % hanning only
end
% one sidded fft spectrum % Select first half
if rem(nfft,2) % nfft odd
select = (1:(nfft+1)/2)';
else
select = (1:nfft/2+1)';
end
fft_specgram = fft_specgram(select,:);
freq_vector = (select - 1)*Fs/nfft;
% time vector
% time stamps are defined in the middle of the buffer
time = ((0:spectnum-1)*offset + round(offset/2))/Fs;
end