Group means - Large Data

Bus141 on 29 Sep 2015
Edited: David J. Mack on 4 Dec 2015
I am trying to compute grouped means, as in the following example. I only know of the two methods below. I have a large data set and have to repeat this command a large number of times, so I am concerned about speed. Are there any quicker methods? Thanks for any help!
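% Test data: column 1 holds the group ID (1 to 10); columns 2 to 100 hold the values.
x=repmat(1:10,1,100)';
x(:,2:100)=rand(1000,99);
%Method 1: grpstats (Statistics Toolbox)
tic
meantest=grpstats(x(:,2:100),x(:,1));
toc
%Method 2: Logical indexing
meantest2=zeros(10,99);
tic
for i=1:10
    g=x(:,1)==i;                       % rows belonging to group i
    meantest2(i,:)=mean(x(g,2:end),1); % dim 1 keeps column means even for a single-row group
end
toc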
  1 Comment
John D'Errico on 29 Sep 2015
This is not even remotely a "large" data set.


Answers (1)

John D'Errico on 29 Sep 2015
Edited: John D'Errico on 29 Sep 2015
On my CPU, these were the times reported for your two solutions:
Elapsed time is 0.005852 seconds.
Elapsed time is 0.004096 seconds.
So I tried consolidator (from the File Exchange).
tic
[~,meantest3] = consolidator(x(:,1),x(:,2:100),@mean);
toc
Elapsed time is 0.002943 seconds.
It has been around for a while, but it is still pretty fast.
  3 Comments
David J. Mack on 4 Dec 2015
Hi John & Bus141!
Since you seem to be stuck in an argument, I recommend this article on Stack Overflow concerning a similar problem:
The accumarray solution is similar to John's CONSOLIDATOR solution but uses a built-in function. It is much faster than GRPSTATS, at least for remotely "large" arrays such as mine (~1000000000 x 10).
Hope that helps, Greetings, David
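For readers who want to try the accumarray idea on the example data from the question, here is a minimal sketch. It is not taken from the linked article; it assumes the group labels are positive integers, and the names vals, subs and meantest4 are made up for illustration. It sums each group per column with accumarray and then divides by the group sizes, which avoids calling @mean once per group and column.

```matlab
% Same test data as in the question: column 1 is the group ID (1 to 10),
% columns 2 to 100 are the values to average.
x = repmat(1:10, 1, 100)';
x(:, 2:100) = rand(1000, 99);

g    = x(:, 1);        % group labels (assumed to be positive integers)
vals = x(:, 2:end);    % values, one column per variable
p       = size(vals, 2);
nGroups = max(g);

% One (group, column) subscript pair for every element of vals,
% in the same column-major order as vals(:).
[gg, cc] = ndgrid(g, 1:p);
subs = [gg(:), cc(:)];

% Per-group column sums and per-group row counts.
sums   = accumarray(subs, vals(:), [nGroups, p]);
counts = accumarray(g, 1, [nGroups, 1]);

% Group means; bsxfun keeps this working on releases without implicit expansion.
meantest4 = bsxfun(@rdivide, sums, counts);
```

If the labels are not consecutive positive integers, one option is to map them first with [~,~,g] = unique(labels). A label value that never occurs between 1 and max(g) gives a row of NaN, since its count is zero.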
