Efficient Way To Split Dataset Into Subsets
3 views (last 30 days)
Show older comments
Hello,
I need to split a large dataset (DxN numeric array) into multiple subsets. I can use the code below (where groupIDs is an Nx1 matrix of integer IDs - the group to which each datapoint belongs).
groups = unique(groupIDs);
for i = 1:numel(groups)
tempData = data(:,groupIDs==groups(i));
%do work on tempData
end
However, 90% of the run time of the above code is spent just creating tempData! That amounts to over a minute every time I want to do this. Is there a more efficient way to split data by groupIDs? I tried splitapply() but it doesn't seem to be any faster.
Are there any matlab gurus out there that know a trick? Thanks!
5 Comments
Jos (10584)
on 24 Nov 2017
12Gb? That is quite a lot. If this doesn't fit in memory, swapping to disk is the likely bottleneck ...
Answers (0)
See Also
Categories
Find more on Scope Variables and Generate Names in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!