Fundamental question about extracting a portion of repeating data over a large data set.

I will do my best to explain this. I have a sensor that pulses on and off at a fixed time interval, Pt. After each pulse, the signal decays for a period of time. I would like to extract and average the final 5 seconds of each pulse. The data consists of two vectors: one is the signal and the other is time. I am an electrochemist and don't have much formal training in programming, so most of the last week has been spent reading these forums trying to get a basic understanding. A brief example of the data format is below:
Signal Time
100 1
50 2
25 3
12.5 4
6.25 5
3.125 6
etc.
Each decay goes on for a few hundred seconds, and I would like to extract only the average of its final 5 seconds. This then repeats for a few hundred cycles.
  10 Comments
David Probst on 6 Oct 2022
I don't need to see the drift. I would like the average of the last 5 seconds of each individual spike or decay, and then to export all of those points into a new array against time. Here is an example text file; sorry for the lack of description.
I attached a text file, and the hope (ideally) would be to have a new text file containing the last-5-second average for each decay curve.
David Probst on 6 Oct 2022
I guess another way to describe this would be the "drift", although it is not really a drift but a change in analyte concentration over time for a sensor. We want to quantify this by averaging the "end point" of each decay. I normally do this manually in Excel, but we want to run this for extended periods, and once the files get too large it obviously becomes very challenging to avoid crashing Excel, haha.

Accepted Answer

William Rose on 6 Oct 2022
This version reads data from the file you posted.
The columns are time, sensor reading.
The first data point is at time t=3600 (seconds?) (i.e. 1 hour).
dt=0.100000 for segment 1, so I assume dt=0.100000 for all segments. But there is a large time gap between segments.
The first eight segments have 601 samples, so I assume every segment has 601 samples.
data=importdata('OCPSen1.txt');
N=size(data.data,1); %number of data points (rows)
t=data.data(:,1); %time values (unevenly sampled)
t=t-t(1); %remove the initial time offset
y=data.data(:,2); %sensor readings
t1=t(t<61); %times in segment 1 (times<61)
dt=(t1(end)-t1(1))/(length(t1)-1); %seg.1 sampling interval
%fprintf('Segment 1: dt=%.6f sec',dt); %display seg.1 sampling interval
Tdur=max(t); %total duration
Npc=601; %number of point per cycle
Nc=floor(N/Npc); %number of complete cycles
tr=reshape(t(1:Npc*Nc),[Npc,Nc]); %reshape t into columns for segments
yr=reshape(y(1:Npc*Nc),[Npc,Nc]); %reshape y into columns for segments
tend=tr(end-round(5/dt)+1:end,:); %times: last 5 s of each segment
yend=yr(end-round(5/dt)+1:end,:); %y values: last 5 s of each segment
tendmean=mean(tend); %mean time of ends
yendmean=mean(yend); %mean y value of ends
%% Plot results
subplot(211), plot(t,y,'-b') %plot all the data
hold on; plot(tendmean,yendmean,'r*') %plot means of ends of segments
grid on; xlabel('Time (s)'); ylabel('Y')
subplot(234), plot(tr(:,1),yr(:,1),'b.') %plot data segment 1
hold on; plot(tendmean(1),yendmean(1),'r*') %plot mean of end of seg.1
grid on; xlabel('T'); title('Segment 1')
subplot(235), plot(tr(:,2),yr(:,2),'b.') %plot data segment 2
hold on; plot(tendmean(2),yendmean(2),'r*') %plot mean of end of seg.2
grid on; xlabel('T'); title('Segment 2')
subplot(236), plot(tr(:,Nc),yr(:,Nc),'b.') %plot last data segment
hold on; plot(tendmean(Nc),yendmean(Nc),'r*') %plot mean of end of last seg.
grid on; xlabel('T'); title('Last Segment')
Try it.
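The code above assumes that every segment has exactly 601 samples. If the segment lengths turn out to vary, one possible alternative is to locate the segment boundaries from the large time gaps instead. The following is only a sketch on synthetic data, not part of the original answer: the gap threshold `gapThresh`, the segment layout, and the decay shape are all assumptions made for illustration.

```matlab
% Synthetic stand-in for the file data: 3 segments of unequal length,
% sampled at dt = 0.1 s and separated by large time gaps (assumed layout).
dt = 0.1;
t = [0:dt:60, 100:dt:150, 200:dt:260]';   % time column
y = exp(-mod(t,100)/20);                  % a decay that restarts in each segment

gapThresh = 10*dt;                               % hypothetical threshold for a "large" gap
iEnd   = [find(diff(t) > gapThresh); numel(t)];  % last index of each segment
iStart = [1; iEnd(1:end-1)+1];                   % first index of each segment

Nseg = numel(iEnd);
tendmean = zeros(Nseg,1);
yendmean = zeros(Nseg,1);
for k = 1:Nseg
    idx = (iStart(k):iEnd(k))';
    % Samples in the final 5 s of segment k. The half-sample offset keeps
    % the comparison away from floating-point ties at the window edge.
    last = idx(t(idx) > t(iEnd(k)) - 5 + dt/2);
    tendmean(k) = mean(t(last));
    yendmean(k) = mean(y(last));
end
```

Because the boundaries come from the time vector itself, no fixed number of points per cycle is needed, at the cost of a loop over segments.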
  4 Comments
David Probst on 6 Oct 2022
Thank you for this help; I think I understand the general process. The key piece I was missing was the "floor" function. Whenever I tried using reshape, I was told that the arrays were uneven, which I think was because the data did not end on an exact whole cycle. The floor call restricts the reshape to complete cycles only, which solves the problem of the last array being uneven.
William Rose on 6 Oct 2022
@David Probst, you're welcome. You may email me with follow-up questions by clicking on my WR icon in one of these messages. You will notice that the pop-up which appears has a clickable envelope in the top right. One can choose in the profile whether to accept secure email. The envelope is visible for users who say yes.
You are right that floor() is useful for rounding down. Once I had Nc from floor(), I used it to reshape only the first Nc*Npc elements of y(). When I tried to reshape all of y(), reshape() had an error, due to the leftover bits.
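A toy illustration of the floor() and reshape() step described above. The numbers here (23 samples, 5 per cycle) are made up for the example and are not from the posted file.

```matlab
y   = 1:23;                  % stand-in data: 23 samples
Npc = 5;                     % points per cycle (assumed known)
Nc  = floor(numel(y)/Npc);   % 4 complete cycles, with 3 samples left over
% reshape(y,[Npc,Nc]) would fail here, because 23 is not equal to 5*4.
yr  = reshape(y(1:Npc*Nc),[Npc,Nc]);   % use only the first 20 samples
disp(size(yr))               % 5 rows (samples) by 4 columns (cycles)
```

Each column of yr then holds one complete cycle, so column-wise functions such as mean() operate cycle by cycle.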

More Answers (2)

William Rose on 6 Oct 2022
Edited: William Rose on 6 Oct 2022
[Edit: fix typo in my comments: change "No" to "Now". Code is OK, I think.]
I will make some data, assuming Pt is constant, and assuming the cycles start at time 0:
dt=0.5; Tmax=200; %adjust as needed
Pt=20.9; %adjust as needed
t=0:dt:Tmax; %elapsed time
tcycle=mod(t,Pt); %time after each reset
tau=4; %decay time
y=exp(-tcycle/tau)+randn(size(t))/5; %decay, with noise
Let's plot the data, versus time and versus cycle time.
subplot(211); plot(t,y,'-b.')
grid on; xlabel('Elapsed Time')
subplot(212); plot(tcycle,y,'b.')
grid on; xlabel('Cycle Time')
Now let's average all the values that occur in the last 5 seconds of each cycle.
y5mean=mean(y(tcycle>=Pt-5))
y5mean = -0.0027
The line above uses the indexing ability of Matlab. It finds the mean of all the y values whose tcycle value is >= Pt-5.
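To see what the logical index selects, here is a small deterministic sketch. The values (1 s steps, a cycle length of 4 s, a 2 s window) are toy assumptions chosen so that every number can be traced by hand.

```matlab
t      = 0:1:11;            % elapsed time in 1 s steps (toy values)
Pt     = 4;                 % hypothetical cycle length, s
y      = 10*t;              % stand-in signal, easy to trace
tcycle = mod(t,Pt);         % 0 1 2 3 0 1 2 3 0 1 2 3
mask   = tcycle >= Pt-2;    % true where tcycle is 2 or 3
ysel   = y(mask);           % 20 30 60 70 100 110
ymean  = mean(ysel);        % one pooled mean over all cycles: 65
```

Note that this gives a single mean pooled over the end windows of all cycles, not a separate mean for each cycle.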
  1 Comment
David Probst on 6 Oct 2022
Okay, so I'm trying to make sure I understand this correctly (again, I apologize if I'm way off; I'm a grad student in electrochemistry and only started getting into this this week for some work).
The number of cycles is the total time, t, divided by the pulse time (Pt). You are then using y5mean to find the mean of the function y (the decay curve) over the final part of each pulse, from Pt-5 (20.9-5, so 15.9) through the end of the pulse?
Is this the correct way to interpret it?
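One way to check this reading numerically: with Pt=20.9, the mask keeps the samples whose time since the last reset runs from 15.9 up to just under 20.9. The sketch below also shows a possible way to get a separate mean for each cycle rather than one pooled value. It is an illustration only: the names `cyc`, `keep`, and `ycycmean` are made up here, and the noise is left out so the result is repeatable.

```matlab
dt = 0.5; Tmax = 200; Pt = 20.9;   % same values as in the answer above
t      = 0:dt:Tmax;                % elapsed time
tcycle = mod(t,Pt);                % time since the most recent reset
y      = exp(-tcycle/4);           % noise-free decay, for a repeatable check
cyc    = floor(t/Pt) + 1;          % cycle number of every sample (1, 2, 3, ...)
mask   = tcycle >= Pt-5;           % last 5 s of each cycle: 15.9 <= tcycle < 20.9
Nfull  = floor(Tmax/Pt);           % number of complete cycles (9 here)
keep   = mask & (cyc <= Nfull);    % drop the partial cycle at the end
% One mean per complete cycle, grouped by cycle number.
ycycmean = accumarray(cyc(keep)', y(keep)', [Nfull 1], @mean);
```

Here tcycle, not t, is what is compared with Pt-5, and the cycle number comes from floor(t/Pt), so the number of cycles is a count rather than the time vector itself.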


William Rose on 6 Oct 2022
Adjusting my answer in light of your comments and in light of the figures you posted.
In the second figure you posted, the spikes at the end of the recording have slightly higher tops and lower baseline than at the start. The simulated data below reproduces those features.
dt=0.1; Tmax=1000; %sampling interval, total duration
Pt=60; %cycle time
t=0:dt:Tmax; %elapsed time
tcycle=mod(t,Pt); %time after each reset
tau=20; %decay time
y=(0.28+t/2e4).*exp(-tcycle/tau)-t/3e4-.2; %decay with growing peaks and falling baseline (no noise)
Plot the data, versus time.
plot(t,y,'-b')
grid on; xlabel('Time (s)')
Make an array with a separate column for each cycle. Discard any partial cycle at the end.
Nc=floor(Tmax/Pt); %number of complete cycles
Npc=round(Pt/dt); %number of points per cycle (round guards against floating-point error)
yr=reshape(y(1:Nc*Npc),[Npc,Nc]); %reshape the data into columns for cycles
yend=yr(end-round(5/dt)+1:end,:); %array with last 5 seconds of each column
Now we can find the mean of the last 5 seconds of data, for every cycle.
yendmn=mean(yend); %vector: mean of each column
tendmn=(1:Nc)*Pt-2.5; %vector: mid-time of each mean
Plot the mean values.
hold on;
plot(tendmn,yendmn,'r+');
In the code above, I find the mean of the last 5 seconds separately for every cycle, since I assume you want to understand and perhaps compensate for the baseline drift which is evident in the second figure you posted.
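Since the question asked for the end-point averages in a new text file, here is one possible way to write them out. This is a sketch under assumptions: the file name is hypothetical, the numbers are stand-ins for the tendmn and yendmn vectors computed above, and writematrix() and readmatrix() require R2019a or later.

```matlab
% Stand-in results; in practice these come from the code above.
tendmn = [57.5 117.5 177.5];       % mid-times of the averaging windows, s
yendmn = [-0.19 -0.20 -0.21];      % hypothetical per-cycle means
outfile = fullfile(tempdir,'endpoint_means.txt');   % hypothetical file name
% Two tab-separated columns: time, mean of the last 5 s.
writematrix([tendmn(:) yendmn(:)], outfile, 'Delimiter','tab');
check = readmatrix(outfile);       % read back to confirm the layout
```

The (:) indexing forces both vectors into columns, so the file has one row per cycle regardless of whether the inputs are row or column vectors.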

Release

R2019a
