Duplicate x,y in table; get min/max of variables to new table
Show older comments
I have a large dataset, small subset attached. There are duplicate x,y values. I need a cleaned table with "var1_max, var1_min, var2_max, var2_min" etc. for every duplicate "x,y"
I'm just not sure where to start setting up the problem. Thanks for any pointers to get going. Using Matlab R2018a, no fancy add-ons/packages

file_in = 'example.csv';
data_in = readtable(file_in);
%%table_out.Properties.VariableNames = {'id', 'x', 'y', 'var1_max', 'var1_min', ...
%% 'var2_max', 'var2_min', 'var3_max', 'var3_min', 'var4_max', 'var4_min', ...
%% 'var5_max', 'var5_min', 'var6_max', 'var6_min', 'var7_max', 'var7_min'};
3 Comments
dpb
on 2 Nov 2019
Are groups by x and y together or by each unique x, y?
Cris LaPierre
on 2 Nov 2019
Not sure I understand completely. I have the same questions a dpb. It looks like you've highlighted groups, but rows 12-14 only have duplicate x values, not x and y.
For the min/max values you want, do you want the overall min/max or just the min/max within the group (those that have the same x,y values)?
For example, rows 2 and 3 above do not have values for var1-3. What values should they have?
Rows 6-10 do have values there. What should the min/max be?
And finally, 12-14 have same x but different y. How should this be handled?
Accepted Answer
More Answers (0)
Categories
Find more on Data Preprocessing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!