average and reshape a sparse array

Suresh on 6 Dec 2017
Edited: Matt J on 8 Dec 2017
Hi, I am looking for a way to average a 2-D sparse array along the 2nd dimension and reshape it. For example:
Iqt = ones(1e6,1e4);
I want to average 10 successive elements along the 2nd dimension, so the result would be Iqt_new = ones(1e6,1e3). The code snippet below fails with an error saying that reshape of sparse ND arrays is not supported. Any ideas? Maybe it is simple, but I am not getting it.
bin_size=10;
A = reshape(mean(reshape(Iqt,size(Iqt,1),bin_size,size(Iqt,2)/bin_size),2),size(Iqt,1),size(Iqt,2)/bin_size);
Thanks. Suresh

Answers (5)

Walter Roberson on 7 Dec 2017
Sparse arrays cannot be reshaped into 3-D. Instead, you can sum the strided column slices directly:
result = (Iqt(:,1:10:end) + Iqt(:,2:10:end) + Iqt(:,3:10:end) + Iqt(:,4:10:end) + Iqt(:,5:10:end) + Iqt(:,6:10:end) + Iqt(:,7:10:end) + Iqt(:,8:10:end) + Iqt(:,9:10:end) + Iqt(:,10:10:end))/10
Or you could try:
[I, J, s] = find(Iqt);
result = sparse(I, ceil(J/10), s./10 )
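The strided sum above can also be written as a loop so that the bin size is not hard-coded. This is only a sketch: the test matrix, the variable name bin_size, and the divisibility check are assumptions added for illustration, not part of the original answer.

```matlab
% Sketch: strided-sum binning for an arbitrary bin size.
% The test matrix and variable names are illustrative assumptions.
Iqt = sprand(1000, 40, 0.05);     % small sparse test matrix
bin_size = 10;                    % must divide size(Iqt,2) exactly
assert(mod(size(Iqt,2), bin_size) == 0, 'bin_size must divide the column count');

result = sparse(size(Iqt,1), size(Iqt,2)/bin_size);   % all-zero sparse accumulator
for k = 1:bin_size
    result = result + Iqt(:, k:bin_size:end);         % k-th column of every bin
end
result = result / bin_size;       % turn the sums into means
```

Each pass adds one column from every bin, so the loop runs bin_size times regardless of how many bins there are.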

Suresh on 7 Dec 2017
Hi Walter, Thank you for your response. I tried the find and sparse code and that seems to work. I am trying to figure out whether it gives me what I want, but I am having trouble understanding the code.
To be sure, what I want is this: Iqt is a sparse array of size [M,1000] and I define a bin size of 10. Each group of 10 consecutive elements along the 2nd dimension is averaged into one column of length M, and likewise for the remaining groups of 10, so the result has size [M,100].
Could you please clarify? Thanks. Suresh
  2 Comments
Walter Roberson on 7 Dec 2017
When you have sparse(I, J, V) then sparse totals all of the entries that have the same (I, J) pair. To create a mean of those values, you could either divide the resulting sparse array by 10 after it is created, or you could divide the V values by 10 before doing the totaling.
All of the items in columns 1 to 10 are to be averaged together in the same bin; all of the items in column 11 to 20 are to be averaged together in the second bin, and 21 to 30 into the third bin, and so on. So for a given column number ending in 0, the bin is the column number divided by 10, and for the 9 columns before that it should be the same bin. The easiest way to calculate this is to take ceil(column number / 10) . For example, ceil(24/10) is ceil(2.4) is 3, so original column 24 should go in bin #3.
So we calculate the appropriate bin number for each (I,J,S) by ceil(J/10). And as described before, we deal with the mean() part by dividing S by 10 before we do the summing.
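The explanation above can be checked on a tiny matrix. The values and the bin size of 2 (instead of 10) are chosen only for illustration; the first row has two nonzeros in the same bin, so it shows sparse summing entries that share an (I, bin) pair.

```matlab
% Tiny worked example (illustrative values, bin size 2 instead of 10).
X = sparse([1 3 0 0;
            0 2 0 4]);            % 2-by-4, to be binned into 2-by-2
[I, J, s] = find(X);              % row indices, column indices, nonzero values
bins = ceil(J/2);                 % columns 1-2 -> bin 1, columns 3-4 -> bin 2
Y = sparse(I, bins, s/2);         % entries with the same (I,bin) pair are summed
disp(full(Y))                     % rows: [2 0] and [1 2]
```

Row 1, bin 1 receives 1/2 + 3/2 = 2, which is the mean of columns 1 and 2 of that row.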
Matt J on 7 Dec 2017
Suresh responded:
Thanks a lot Walter for the detailed explanation. I assume this will work for any arbitrary value in place of 10 as long as the 2nd dim is an exact multiple of that number.
Suresh



Matt J on 7 Dec 2017
Edited: Matt J on 7 Dec 2017
The approach you have taken could be made to work with my ndSparse class:
Iqt=sprand(1e6,1e4,.0001) ; %Example
bin_size=10; %number of adjacent columns to average
Iqt=ndSparse(Iqt); %convert to ndSparse
A = reshape(mean(reshape(Iqt,size(Iqt,1),bin_size,size(Iqt,2)/bin_size),2),size(Iqt,1),size(Iqt,2)/bin_size);
A=sparse2d(A); %convert back to regular 2D sparse matrix
However, because of the peculiarities of how sparse matrices work, it is probably going to be much more efficient to do what you are attempting via matrix multiplication.
A=Iqt*kron(speye(1e3), ones(10,1)/10);
  3 Comments
Matt J on 8 Dec 2017
Edited: Matt J on 8 Dec 2017
kron(speye(1e3), ones(10,1)/10) creates a matrix whose columns have the value 1/10 in the right places so as to average together groups of 10 adjacent columns when multiplied with Iqt. You can see this by viewing the entries of a smaller example, with bin_size=5 instead of 10:
>> kron(eye(3), ones(5,1)/5)
I found this to be about as fast as Walter's approach, but maybe a little more compact code-wise.
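As a hedged illustration of how the averaging matrix acts, the sketch below builds it for 3 bins of 5 columns and applies it to a small random sparse matrix. The names P, X, and nbins and the sizes are assumptions made for the example.

```matlab
% Small illustration with 3 bins of 5 columns (names and sizes are illustrative).
bin_size = 5;
nbins = 3;
P = kron(speye(nbins), ones(bin_size,1)/bin_size);   % 15-by-3 averaging matrix
X = sprand(8, bin_size*nbins, 0.3);                   % 8-by-15 sparse test matrix
A = X*P;                                              % 8-by-3 matrix of bin means

% Column k of P holds 1/bin_size in rows (k-1)*bin_size+1 : k*bin_size,
% so A(:,k) is the mean of the k-th group of bin_size columns of X.
err = norm(full(A(:,2)) - full(mean(X(:,6:10), 2)));  % should be close to 0
```

Because P is sparse with bin_size nonzeros per column, the product stays sparse and never forms a dense intermediate.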
Matt J on 8 Dec 2017
Edited: Matt J on 8 Dec 2017
Suresh wrote: "I am still trying to figure out the syntax of kron in this case. If you can suggest for the above dimensions, that would clear the confusion in my mind I think." For an array with 2400 columns and bin_size=5, it would be:
A=Iqt*kron(speye(2400/5), ones(5,1)/5);



Suresh on 8 Dec 2017
Hi Matt, Walter, I tried the ndSparse way and it seems to work too. I did not do performance testing between the different methods:
1. Walter's find and sparse code
2. converting to ndSparse and then reshaping normally
3. the kron-based approach, which I did not understand
Any thoughts from either of you? Suresh

Suresh on 8 Dec 2017
Edited: Matt J on 8 Dec 2017
It turns out that the code that uses "find" and "sparse" is not fail safe. I lose the dimensionality in the process. For example, my exact case was:
Iqt = ones(802896,2400);
I wanted to average my 2nd dim in groups of bin_size=5
[I, J, s] = find(Iqt);
A = sparse(I, ceil(J/bin_size), s./bin_size);
When I did this, I ended up with a matrix A of size 802889 x 480 (that is, 2400/5), so I lost 7 rows in the first dimension.
I switched to using ndSparse to do a regular reshape and then I am going back to native sparse after that.
I am still trying to figure out the syntax of kron in this case. If you can suggest for the above dimensions, that would clear the confusion in my mind I think. Thanks. Suresh
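One likely cause of the lost rows: when sparse(I, J, V) is called without size arguments, the output size is taken from max(I) and max(J), so trailing rows or columns that contain no nonzeros are dropped. A minimal sketch of a fix, assuming that is what happened here, is to pass the intended size explicitly using the five-argument form sparse(i, j, v, m, n). The small test matrix below is illustrative only.

```matlab
% Sketch: pass the output size to sparse() so trailing all-zero rows survive.
Iqt = sparse([1 2], [1 7], [10 20], 6, 10);   % illustrative: rows 3..6 are empty
bin_size = 5;
[I, J, s] = find(Iqt);
m = size(Iqt,1);                              % keep the original row count
n = size(Iqt,2)/bin_size;                     % number of bins
A_short = sparse(I, ceil(J/bin_size), s/bin_size);        % size inferred: 2-by-2
A_full  = sparse(I, ceil(J/bin_size), s/bin_size, m, n);  % size forced: 6-by-2
```

The kron-based product does not have this problem, since the size of Iqt*P is fixed by the sizes of the two factors.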
