Separating data into matrices based on names

Hi,
I have a matrix of raw data stored in data.rawdata. Each row corresponds to a measurement, and each measurement has a label stored in data.labels. Labels can be repeated, so I want to separate the data into groups based on the labels. I have been doing this with the following code:
data.labels = labels;
data.rawdata = rawdata;
groups = unique(data.labels);
for i = 1:length(groups)
    % keep the rows whose label matches the i-th group
    measurements{i} = data.rawdata(data.labels == groups(i), :);
end
I want to run pca on each separated group and save the tt, pp, and pr results into separate cells, but the following loop fails with an error saying that brace indexing is not supported for variables of this type:
for i=1:length(groups)
[tt{i,:},pp{i,:},pr{i,:}]=pca(measurements{i,:},10);
end
How can I do the pca calculations on the different groups separately? And how can I store tt, pp, and pr separately for each measurement group?
I could do all this manually, but the point is that I want to speed up the calculations by automating them.
Another question, which is less important:
Is it possible to save the data under different names with code like the following? (I know this doesn't work in its current form.)
measurements.groups(i)=data.rawdata(data.labels==groups(i),:);
I am looking forward to answers.
Thanks!

1 Comment

For the first part, you can use splitapply() instead of a for loop to split your matrix; however, your code is also fine for this purpose.
The error is actually caused by the left-hand side of
[tt{i,:},pp{i,:},pr{i,:}]=pca(measurements{i,:},10);
Can you make sure that the variables tt, pp, and pr are not already defined with some other data type?
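To illustrate the point above, here is a minimal sketch that clears any earlier definitions of tt, pp, and pr and preallocates them as cell arrays before the loop. The example data and the local function pcaSketch are hypothetical; pcaSketch is an SVD-based stand-in for the three-output pca(X,10) call in the question, whose exact implementation the thread does not show.

```matlab
% Hypothetical example data: 6 measurements, 4 variables, numeric labels.
rawdata = [1 2 3 4; 2 1 4 3; 3 4 1 2; 4 3 2 1; 1 3 2 4; 4 2 3 1];
labels  = [1; 2; 1; 2; 1; 2];

groups  = unique(labels);
nGroups = numel(groups);

% Remove any earlier (possibly non-cell) definitions, then preallocate.
clear tt pp pr
tt = cell(nGroups, 1);
pp = cell(nGroups, 1);
pr = cell(nGroups, 1);

for i = 1:nGroups
    X = rawdata(labels == groups(i), :);        % rows of the i-th group
    [tt{i}, pp{i}, pr{i}] = pcaSketch(X, 2);    % one cell per output
end

function [scores, loadings, explained] = pcaSketch(X, k)
% Stand-in (assumption) for the question's three-output pca function.
Xc = X - mean(X, 1);                 % center each column
[U, S, V] = svd(Xc, 'econ');
scores    = U(:, 1:k) * S(1:k, 1:k); % projections onto first k components
loadings  = V(:, 1:k);               % component directions
sv        = diag(S) .^ 2;
explained = 100 * sv(1:k) / sum(sv); % percent variance per component
end
```

Indexing with a single subscript, tt{i}, addresses exactly one cell per output, which sidesteps any ambiguity from the colon in tt{i,:}.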


 Accepted Answer

I'm not sure what type you use for data. Is it a table or a struct? I prefer using a table. If data is a table, you can call findgroups first and then splitapply:
[G, res] = findgroups(data(:, "labels"));
s = splitapply(@(x) {pca(x)}, data.rawdata, G);
The result s will be a cell array whose length equals the number of unique labels.
If your data is a struct, you can use struct2table to convert it to a table. Prerequisite: data.labels must be a column vector whose length equals the number of rows of data.rawdata.

8 Comments

This splitapply() call will only capture the first output of pca(). The OP's code shows that they want three outputs from pca(). A wrapper function can collect all three:
s = splitapply(@(x){myPca(x)}, data.rawdata, G);
function y = myPca(x)
[a, b, c] = pca(x);
y{1} = a;
y{2} = b;
y{3} = c;
end
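As an alternative to packing everything into one nested cell, splitapply can also return several outputs directly, as long as each output is a scalar per group (here, a 1-by-1 cell). The sketch below is an assumption-laden illustration, not the method used in the thread: the data and the helper pcaCells are hypothetical, and it calls the pca function from the Statistics and Machine Learning Toolbox, whose outputs are coefficients, scores, and latent values in that order, which may differ from the tt, pp, pr order of the pca function in the question.

```matlab
% Hypothetical example data; labels do not need to be numeric.
rawdata = [1 2 3 4; 2 1 4 3; 3 4 1 2; 4 3 2 1; 1 3 2 4; 4 2 3 1];
labels  = {'a'; 'b'; 'a'; 'b'; 'a'; 'b'};

G = findgroups(labels);   % group index for each row

% One cell array per pca output, one cell per group.
[out1, out2, out3] = splitapply(@pcaCells, rawdata, G);

function [a, b, c] = pcaCells(x)
% Wrap each output in a cell so splitapply can stack one entry per group.
% pca here is the Statistics and Machine Learning Toolbox function.
[a, b, c] = pca(x);
a = {a};
b = {b};
c = {c};
end
```

After this runs, out1{k}, out2{k}, and out3{k} hold the three results for the k-th group, so no second unpacking step is needed.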
Hi, thank you both for your comments.
Peng Li, I use a struct. After converting it to a table and running this
[G, res] = findgroups(data(:, "labels"));
I got an error saying:
% Table variable subscripts must be real positive ...
% integers, logicals, character vectors, or cell arrays of character vectors.
I have worked around the issue with the following code, which is not as elegant as your version and could be shorter:
for i=1:length(groups)
[tt,pp,pr]=pca(measurements{i},10);  % semicolon suppresses printed output
pcscores{i}=tt;
pcloadings{i}=pp;
prexplained{i}=pr;
end
Ameer Hamza and Peng Li, the labels are not integers. Would I still be able to use splitapply?
ardeshir, you can use splitapply() without converting to a table. Try:
[G, ~] = findgroups(data.labels);
s = splitapply(@(x) {myPca(x)}, data.rawdata, G);
function y = myPca(x)
[a, b, c] = pca(x);
y{1} = a;
y{2} = b;
y{3} = c;
end
It doesn't matter whether the labels are integers or not. Note that splitapply must call myPca, not pca directly, so that all three outputs are captured.
Regarding the error you posted for [G, res] = findgroups(data(:, "labels"));
After you convert from struct to table, what table variable names do you have? labels should be one of them. You can also try [G, res] = findgroups(data.labels);
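A quick way to check the variable names after the conversion is sketched below, using hypothetical example data. It also shows the grouping call with a single-quoted character vector as the table subscript. The error message quoted above lists character vectors but not strings as valid subscripts, so the double-quoted "labels" may be the cause on that MATLAB release; this is a guess, since the thread does not confirm it.

```matlab
% Hypothetical struct in the layout described in the question.
data.labels  = {'a'; 'b'; 'a'; 'b'};    % column, one label per row
data.rawdata = [1 2; 3 4; 5 6; 7 8];

T = struct2table(data);                 % scalar struct -> 4-row table
disp(T.Properties.VariableNames)        % check that labels is listed

G1 = findgroups(T.labels);              % grouping from the variable itself
G2 = findgroups(T(:, 'labels'));        % same, with a char-vector subscript
```

Both calls should produce the same group numbers, so either form can feed splitapply.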
Thank you, Peng Li and Ameer Hamza! Your answers worked and helped me a lot!
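The secondary question, storing each group under its own name, is not answered in the thread. One possible approach is a struct with dynamic field names, sketched below with hypothetical data and text labels. matlab.lang.makeValidName is used on the assumption that some labels might not be valid MATLAB identifiers.

```matlab
% Hypothetical example data with text labels.
rawdata = [1 2; 3 4; 5 6; 7 8];
labels  = {'control'; 'treated'; 'control'; 'treated'};

groups = unique(labels);
measurements = struct();

for i = 1:numel(groups)
    % Turn the label into a valid field name (e.g. spaces are removed).
    fieldName = matlab.lang.makeValidName(groups{i});
    % Dynamic field name: the parentheses evaluate fieldName at run time.
    measurements.(fieldName) = rawdata(strcmp(labels, groups{i}), :);
end
```

The group data is then available as measurements.control and measurements.treated. For numeric labels, a prefix such as sprintf('g%d', groups(i)) could serve as the field name instead.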
