How do I find the corresponding date/time value of the maximum of my data in a table?

83 views (last 30 days)
I have a table of data. I want to find out the max values of Melatonin and Cortisol values (two columns in the table) but at the same time I want the corresponding date & time for this max value from the Date/Time column in the table. This is what I am doing but this only gives me the max value of Mel and Cort. How can I find the corresponding date/time as well?
T1_Smooth_Mel=groupsummary(T,{'Participant_ID','Sample_ID'},'max',{'Smooth_Melatonin_concentration'});
T1_Smooth_Cort=groupsummary(T,{'Participant_ID','Sample_ID'},'max',{'Smooth_Cortisol_concentration'});
Any help would be hugely appreciated. Thanks.

Accepted Answer

Peter Perkins
Peter Perkins on 30 Jul 2021
This is a pretty common question, so I'm gonna show a bunch of stuff in hopes others can find it useful.
Let's start with a simple timetable. A timetable, because you have timestamps.
>> Time = datetime(2021,7,1:10)';
>> X = rand(10,1); Y = rand(10,1); G = randi(2,10,1);
>> tt = timetable(Time,X,Y,G)
tt =
10×3 timetable
Time X Y G
___________ _______ _______ _
01-Jul-2021 0.81472 0.15761 2
02-Jul-2021 0.90579 0.97059 1
03-Jul-2021 0.12699 0.95717 2
04-Jul-2021 0.91338 0.48538 2
05-Jul-2021 0.63236 0.80028 2
06-Jul-2021 0.09754 0.14189 2
07-Jul-2021 0.2785 0.42176 2
08-Jul-2021 0.54688 0.91574 1
09-Jul-2021 0.95751 0.79221 2
10-Jul-2021 0.96489 0.95949 1
In its simplest form, groupsummary applies a function to variables in the timetable one at a time. That function must return one value per group. You can also use a function that is applied to more than one variable at a time, but the function must still return one value. Here's a function that takes two inputs, finds the max in one, and returns the corresponding value from the other:
function tmax = myfun1(t,x)
[~,imax] = max(x);
tmax = t(imax);
end
This can't be an anonymous function defined on the fly, because it requires two statements. One more wrinkle is that groupsummary does not currently allow the row times as an input to your function, so to work around that, we can do an on-the-fly conversion to a table with a time variable (this sounds expensive, but actually is not).
>> tMax = groupsummary(timetable2table(tt),"G",@myfun1,{"Time" "X"});
>> tMax.Properties.VariableNames(end) = "TMax"
tMax =
2×3 table
G GroupCount TMax
_ __________ ___________
1 3 10-Jul-2021
2 7 09-Jul-2021
The one value returned by this function is the time at which the maximum occurs, so we can use that as in index into the original data to get the maximum value itself.
>> tMax.XMax = tt.X(tMax.TMax) % index into timetable by time
tMax =
2×4 table
G GroupCount TMax XMax
_ __________ ___________ _______
1 3 10-Jul-2021 0.96489
2 7 09-Jul-2021 0.95751
You have two variables, and the idea extends. Apply myfun1 to Time and X, and then to Time and Y:
>> tMax = groupsummary(timetable2table(tt),"G",@myfun1,{["Time" "Time"] ["X" "Y"]});
>> tMax.Properties.VariableNames(3:end) = ["TMax_X" "TMax_Y"];
>> tMax.XMax = tt.X(tMax.TMax_X);
>> tMax.YMax = tt.Y(tMax.TMax_Y)
tMax =
2×6 table
G GroupCount TMax_X TMax_Y XMax YMax
_ __________ ___________ ___________ _______ _______
1 3 10-Jul-2021 02-Jul-2021 0.96489 0.97059
2 7 09-Jul-2021 03-Jul-2021 0.95751 0.95717
You have a time variable in your data, but suppose you did not. In that case, you can temporarily add a variable that identifies rows in the original data and use it like the time variable.
>> t = table(X,Y,G);
>> t.TempRowInds = (1:height(t))'
t =
10×4 table
X Y G TempRowInds
_______ _______ _ ___________
0.81472 0.15761 2 1
0.90579 0.97059 1 2
0.12699 0.95717 2 3
0.91338 0.48538 2 4
0.63236 0.80028 2 5
0.09754 0.14189 2 6
0.2785 0.42176 2 7
0.54688 0.91574 1 8
0.95751 0.79221 2 9
0.96489 0.95949 1 10
>> tMax = groupsummary(t,"G",@myfun2,{["TempRowInds" "TempRowInds"] ["X" "Y"]})
>> tMax.Properties.VariableNames(3:end) = ["RowMax_X" "RowMax_Y"];
>> tMax.XMax = t.X(tMax.RowMax_X);
>> tMax.YMax = t.Y(tMax.RowMax_Y)
tMax =
2×6 table
G GroupCount RowMax_X RowMax_Y XMax YMax
_ __________ ________ ________ _______ _______
1 3 10 2 0.96489 0.97059
2 7 9 3 0.95751 0.95717
>> t.TempRowInds = []; % don't forget to remove the temp var
where myfun2 is
function rowmax = myfun2(row,x)
[~,imax] = max(x);
rowmax = row(imax);
end
Getting further into advanced manoeuvers, you can use two functions, one to compute the max value, one to find the index at which it occurs.
tMax = groupsummary(timetable2table(tt),"G",{@(t,x)max(x) @myfun2},{["Time" "Time"] ["X" "Y"]});
tMax.Properties.VariableNames(3:end) = ["TMax_X" "XMax" "TMax_Y" "YMax"]
tMax =
2×6 table
G GroupCount TMax_X XMax TMax_Y YMax
_ __________ _______ _______ _______ _______
1 3 0.96489 0.96489 0.97059 0.95949
2 7 0.95751 0.95751 0.95717 0.79221
This is fewer lines of code, but notice that it has to go through each group twice for each variable. Can we avoid that? Also, can we avoid have to regroup for each variable? Yes we can: rowfun.
tMax = rowfun(@myfun3,timetable2table(tt),"InputVariables",["Time" "X" "Y"],"GroupingVariables","G","OutputVariableNames",["TMax_X" "XMax" "TMax_Y" "YMax"])
tMax =
2×6 table
G GroupCount TMax_X XMax TMax_Y YMax
_ __________ ___________ _______ ___________ _______
1 3 10-Jul-2021 0.96489 02-Jul-2021 0.97059
2 7 09-Jul-2021 0.95751 03-Jul-2021 0.95717
where
function [tmax_x,xmax,tmax_y,ymax] = myfun3(t,x,y)
[xmax,imax] = max(x);
tmax_x = t(imax);
[ymax,imax] = max(y);
tmax_y = t(imax);
end
And of course the TempRowInds strategy works here too.

More Answers (1)

dpb
dpb on 30 Jul 2021
That's what the optional return index value of max is for...
[mx_Smooth_Mel,ixMel]=max(T.Smooth_Melatonin_concentration);
[mx_Smooth_Cort,ixCort]=max(T.Smooth_Cortisol_concentration);
Then
Tmaxes=[T([ixMel;ixCort],:);
will give you a 2-row table of all the data corresponding to those two locations

Categories

Find more on Tables in Help Center and File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!