Is there an efficient way to sort specific ranges/subsections of rows independently in a table by a particular column?

11 views (last 30 days)
I want to sort sets of rows independently within a table e.g. a table such as
A 5
C 3
D 1
B 4
E 2
B 3
C 1
A 2
such that the first 5 rows and the last 3 rows are sorted separately in ascending order by column 2 so that the resulting table produced would be
D 1
E 2
C 3
B 4
A 5
C 1
A 2
B 3
Using nested loops is highly inefficient and I've been unsuccessful using sortrows(). Thanks for any suggestions.
  2 Comments
Cris LaPierre
Cris LaPierre on 1 Feb 2019
Is there any sort of logic to where the splits occur? Based on what you've stated, it seems like the dividing was done arbitrarily. If you can articulate the criteria, we might be able to help come up with a solution.
Mark Bodner
Mark Bodner on 1 Feb 2019
Edited: Walter Roberson on 1 Feb 2019
Sorry that I left out the critical point on the splits. They are identified by another column that uniquely identifies all the rows that are grouped together. So for example from the previous example I gave, there is another column with a unique identifier (e.g in column 3) that determines all the rows of a subgroup. There is no overlap between subgroups, but beyond that the number of rows in any subgroup is random.
C 3 X
D 1 X
B 4 X
E 2 X
B 3 Y
C 1 Y
A 2 Y
and the desired output is then for this example is of the form
D 1 X
E 2 X
C 3 X
B 4 X
A 5 X
C 1 Y
A 2 Y
B 3 Y

Sign in to comment.

Accepted Answer

Walter Roberson
Walter Roberson on 1 Feb 2019
splitapply() provided that you do not mind that disconnected regions with the same identifier code will be brought together
  10 Comments
Walter Roberson
Walter Roberson on 2 Mar 2020
splitapply will not sort the entries. Use a function that returns a cell array around the inputs that you are given.
Oh I see, you are confused by the line of code that I had given other person,
temp = splitapply(@(varargin) {sortrows(table(varargin{:}),2)}, T, G);
Their particular request was to sort values, so there had to be a sorting call for their purposes. For your purposes just leave out the sorting step
temp = splitapply(@(varargin) {table(varargin{:})}, T, G);
PS
PS on 2 Mar 2020
Thank you so much.
It seems I am doing something wrong then, because I am using the same code, I will look into my data again.
But thank you so much for the help.

Sign in to comment.

More Answers (1)

Cris LaPierre
Cris LaPierre on 1 Feb 2019
Edited: Cris LaPierre on 1 Feb 2019
Took a stab. Here are my assumptions
  1. Groups do not overlap
  2. There are the same number of elements in each group with the exception of the final group
These allow me to create a new variable to represent the groups. I can then use sortrows to use the new group variable for the primary sort and your numeric variable for the secondary.
% Create your table
var1 = categorical(["A","C","D","B","E","B","C","A"])';
var2 = [5 3 1 4 2 3 1 2]';
T = table(var1,var2)
% Determine number of elements in a group and the number of groups in the table
cnt = length(unique(T.var1))
grps = ceil(height(T)/cnt)
% Create the grouping variable.
% Start by making an array with a row for each element in a group, and a column for each group
var3 = ones(cnt,1)*[1:grps];
% Use linear indexing to convert the array to a column vector
var3 = var3(:);
% Before adding to the table, make column same height as table
var3 = var3(1:height(T))
% Add new grouping variable to the table
T=addvars(T,var3)
% Sort using grouping variable as primary sort (3), and 2nd column for secondary sort(2)
T = sortrows(T,[3 2])
T =
'D' 1 1
'E' 2 1
'C' 3 1
'B' 4 1
'A' 5 1
'C' 1 2
'A' 2 2
'B' 3 2
  3 Comments
Mark Bodner
Mark Bodner on 1 Feb 2019
Attached is some actual sample data. The first row identifies all of the data from a single subgroup (all rows with the sam PIN number which are contiguous). I'm trying to order every row in each subgroup in ascending order according to column 3. I've tried using splitapply() but haven't been able to make that work (I couldn't even get the function to recognize the columns of data!). Any suggestions?
Cris LaPierre
Cris LaPierre on 1 Feb 2019
Assuming the same number of members in each group was an assumption I made to help create the grouping variable. If your data has a grouping variable then there can be a different number in each group.
With the dataset you shared, I'd do the following
opts = spreadsheetImportOptions('NumVariables',3);
opts = setvaropts(opts,3,'TreatAsMissing','<missing>'); % C232
opts = setvartype(opts,[1 2],'categorical');
opts = setvartype(opts,3,'double');
data = readtable('sample.xlsx',opts,'ReadVariableNames',false);
dataSorted = sortrows(data,[1,3]);
Note that the groups are placed in alphabetical order. Not sure if one of your design criteria was that they don't change order. If so, you can add the following after the readtable command but before the sortrows to preserve the order of the PIN numbers.
[~,ic,~] = unique(data.Var1);
data.Var1 = categorical(data.Var1,data.Var1(sort(ic)),'Ordinal',true);

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!