How to find the median of nonzero elements in each row of sparse matrix

Hi, everyone
I have been struggling with this for a long time, and any help will be greatly appreciated. I am looking for a computationally cheap way to find the median of the nonzero elements in each row of a sparse matrix S=sparse([],[],[],32768,240000,50*240000); Nonzero elements are assigned to S afterwards, which is not shown here.
Here is how I currently find the median, but I am not satisfied with its efficiency:
[row,col,v] = find(S);
M=sum(S~=0,2);
temp1=[row,col,v];
temp2=find(M);
temp3=zeros(size(temp2));
for j=1:length(temp2)
temp3(j)=median(temp1(temp1(:,1)==temp2(j),3));
end
Also, it's much worse to replace the zeros with NaN and then use nanmedian or median(___,nanflag). That is not practical at all, because it means assigning a huge number of NaNs to a large sparse matrix.
Is there a more efficient way to implement this? I am thinking about a way that avoids the loop.
Thank you very much for any of your time and energy.
  1 Comment
James Tursa on 29 Nov 2016
How much time does it take for you? How much improvement were you hoping for? It might make sense to transpose the matrix first, to turn all of the row data into column data so each original row data set is contiguous in memory, and then work with the columns. E.g., maybe employ a mex routine to get the median of the columns.
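A rough sketch of the transpose idea, without the mex routine, might look like the following. The tiny matrix is a hypothetical stand-in for the real 32768-by-240000 S, and the variable names (St, rowMedian) are illustrative only. MATLAB stores sparse matrices column by column, so after transposing, each original row can be pulled out as one contiguous column.

```matlab
% Small stand-in for S (hypothetical data, not the asker's matrix).
S = sparse([1 1 1 3 3], [2 5 7 1 4], [4 1 9 2 6], 4, 8);

St = S.';                        % rows of S become columns of St
nRows = size(S, 1);
rowMedian = zeros(nRows, 1);     % rows with no nonzeros are left at 0

for r = 1:nRows
    vals = nonzeros(St(:, r));   % stored nonzero values of row r of S
    if ~isempty(vals)
        rowMedian(r) = median(vals);
    end
end
% rowMedian is [4; 0; 4; 0]
```

This still loops, but each iteration only touches one column of St rather than scanning the full list of row indices. Whether it is fast enough at the real problem size would need to be timed.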

Accepted Answer

Guillaume on 29 Nov 2016
Note: I have no experience with sparse matrices.
That said, just looking at the beginning of your code, it's trivial to go from the first line to the median in just one more line, using accumarray. I've not tried to understand what the rest of your code is doing, but it looks convoluted.
[row, ~, v] = find(S); %find values and corresponding row number of sparse matrix
rowmedian = accumarray(row, v, [], @median);
%if you want rowmedian the same height as S, replace [] by [size(S, 1), 1]
Note that giving meaningful names to your variables would greatly help others understand your code ... and yourself in 6 months' time when you go back over it. Numbered temp variables are really unhelpful.
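As a small worked example of the accumarray approach, here is a sketch on a hypothetical 4-by-8 sparse matrix (the data and the name rowMedianNaN are made up for illustration). It also shows the optional fill-value argument, which is one way to tell rows with no nonzeros apart from rows whose median really is 0.

```matlab
% Hypothetical small sparse matrix; rows 2 and 4 have no nonzeros.
S = sparse([1 1 1 3 3], [2 5 7 1 4], [4 1 9 2 6], 4, 8);

[row, ~, v] = find(S);           % row index and value of every nonzero

% One median per row; the output is padded to the height of S.
rowMedian = accumarray(row, v, [size(S, 1), 1], @median);
% rowMedian is [4; 0; 4; 0]

% Same call, but rows without nonzeros are filled with NaN instead of 0.
rowMedianNaN = accumarray(row, v, [size(S, 1), 1], @median, NaN);
% rowMedianNaN is [4; NaN; 4; NaN]
```

By default accumarray fills groups that receive no values with 0, so the NaN variant is only needed when that distinction matters.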
  2 Comments
Albert on 29 Nov 2016
Dear Guillaume
Thank you very much. 'accumarray' is the MATLAB built-in function which I had been looking for for a long time. It solved my problem. 'accumarray' is very useful!
I think you must be very good at MATLAB. Would you mind helping me with another problem? Let's say we have two 3D arrays, A(m,n,240000) and B(l,q,240000).
Could you get rid of the loop in the following code?
for i=1:240000
C(:,:,i)=A(:,:,i)\B(:,:,i);
end
The reason I am trying to do so is to parallelize these computations, which could save me a lot of time. I am building a 3D image reconstruction code in MATLAB; currently it takes about 3 hours to run 100 iterations with 100,000 particles. My goal is to reduce the running time to several tens of minutes.
Thank you again.
James Tursa on 30 Nov 2016
Edited: James Tursa on 30 Nov 2016
How large are m and n? (P.S. It would be best if you opened up a new Question for this)
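For reference, one loop-free option in newer releases is pagemldivide, which solves A(:,:,i)\B(:,:,i) for every page in a single call. The sketch below assumes MATLAB R2022a or later (which postdates this thread) and uses small, hypothetical square pages; the sizes and variable names are illustrative, not taken from the question.

```matlab
% Hypothetical small problem: 5 pages of 3-by-3 systems with 2 right-hand sides.
m = 3;
nPages = 5;
rng(0);
A = rand(m, m, nPages) + m*eye(m);   % adding m*eye(m) to each page keeps the pages well conditioned
B = rand(m, 2, nPages);

% Loop version, one backslash solve per page.
Cloop = zeros(m, 2, nPages);
for i = 1:nPages
    Cloop(:, :, i) = A(:, :, i) \ B(:, :, i);
end

% Page-wise version (assumes R2022a or later).
Cpage = pagemldivide(A, B);

% The two results should agree up to round-off.
maxDiff = max(abs(Cloop(:) - Cpage(:)));
```

How much this helps is likely to depend on the page sizes m and n, which is why they matter here.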
