Building sparse matrix inside parfor

I'm building a large sparse matrix in smaller pieces. Unfortunately, the pieces overlap a bit. At the moment I build each piece at the final matrix size and sum the pieces together after the loop. I have tried the following approaches:
1) Summing the sparse matrices inside parfor. Bad idea: it produces a full matrix.
2) Building index vectors for each piece of the matrix and combining them inside parfor, then using a single sparse command after the loop to build the final matrix. This, unfortunately, is rather slow. The reason might be the repeated entries that the sparse command needs to sum up.
3) Building a sparse matrix for each piece and storing them in a cell array inside parfor, then summing the sparse matrices in a regular for loop. This is the best so far: fast and reliable. (See the pseudocode below.)
4) This is the problematic case: building a sparse vector out of each piece and storing them in a cell array, then summing the sparse vectors in a regular for loop and reshaping the result to a matrix. Unfortunately, for larger systems it crashes with
Error using parallel_function (line 598)
Error during serialization
Error stack:
remoteParallelFunction.m at 31
As a for loop it runs just fine.
Below is some pseudocode to shed light on what I'm doing:
First, option 3, which always works:
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
    % Do lots of stuff to get iind, jind, Aval
    Aset{S} = sparse( iind, jind, Aval, Ndof, Ndof );
end
A = Aset{1};
for S=2:Nsets
    A = A + Aset{S} ;
end
Option 4, which gives the error:
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
    % Do lots of stuff to get iind, jind, Aval
    matind = iind +Ndof*( jind-1 );
    Aset{S} = sparse( matind, ones(size(matind)), Aval, Ndof*Ndof, 1 ) ;
end
A = Aset{1};
for S=2:Nsets
    A = A + Aset{S} ;
end
A = reshape(A,Ndof,Ndof) ;
Any ideas why option 4 crashes? How should I do this to gain speed?
The size of the final matrix, i.e. Ndof, is a few million. The number of matrix pieces, i.e. Nsets, is 10 to 30. For option 3 it takes roughly 30 seconds to sum the matrices of size Ndof=4000000.
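For readers following the index arithmetic in option 4, here is a minimal sketch on a tiny system. The values of Ndof, iind, jind and Aval are made up for illustration. It shows that sparse sums entries with repeated subscripts, and that the linear index matind = iind + Ndof*(jind-1) followed by reshape gives the same matrix as calling sparse with the row and column subscripts directly.

```matlab
% Tiny illustrative system (all values here are assumptions, not from the question)
Ndof = 4;
iind = [1; 2; 2; 4];      % row subscripts; entry (2,3) appears twice
jind = [1; 3; 3; 2];      % column subscripts
Aval = [10; 1; 2; 5];     % values; the two (2,3) entries get summed

% Direct construction from row/column subscripts
A1 = sparse(iind, jind, Aval, Ndof, Ndof);

% Construction via a long sparse column vector and column-major linear indices
matind = iind + Ndof*(jind - 1);
v = sparse(matind, ones(size(matind)), Aval, Ndof*Ndof, 1);
A2 = reshape(v, Ndof, Ndof);

sameMatrix = isequal(A1, A2);   % both routes build the same sparse matrix
summedEntry = full(A1(2,3));    % duplicates are accumulated: 1 + 2
```

Because both routes accumulate duplicates the same way, the overlap between pieces is handled by sparse itself in either formulation.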

Accepted Answer

Sean de Wolski on 29 Dec 2011
NEW:
I think I have a possible workaround. Instead of calling sparse on each iteration, build up the matind and Aval vectors in the parfor loop and call sparse once at the end:
Nsets = 12;
Ndof = 1e6;
% One cell per set, so each parfor iteration writes to its own slice
matind = cell(Nsets,1);
Aval = cell(Nsets,1);
parfor S=1:Nsets
    % Do lots of stuff to get iind, jind, Aval (random data stands in here)
    iind = ceil(Ndof*rand(Ndof,1)) ;
    jind = ceil(Ndof*rand(Ndof,1)) ;
    Aval{S} = 100*randn(Ndof,1);
    % Linear index of entry (iind,jind) in an Ndof-by-Ndof matrix
    matind{S} = iind +Ndof*( jind-1 );
end
% Concatenate the pieces and let a single sparse call sum the duplicates
matind = vertcat(matind{:});
Aval = vertcat(Aval{:});
A = sparse(matind,ones(numel(matind),1),Aval,Ndof*Ndof,1);
A = reshape(A,Ndof,Ndof);
spy(A)
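A variant of the same idea, sketched here under assumptions rather than taken from the thread, keeps the row and column subscripts separate and passes them straight to sparse, which avoids the Ndof*Ndof-long intermediate vector and the reshape. The names I, J, V and nnzPerSet are hypothetical, and the sizes are kept small so the sketch runs quickly. No speed claim is implied: this is essentially option 2 from the question.

```matlab
Nsets = 4;
Ndof = 1000;
nnzPerSet = 50;           % hypothetical number of entries per piece

% One cell per set, so each parfor iteration writes only its own slice
I = cell(Nsets,1);
J = cell(Nsets,1);
V = cell(Nsets,1);
parfor S = 1:Nsets
    % Random data stands in for the real assembly of each piece
    I{S} = randi(Ndof, nnzPerSet, 1);
    J{S} = randi(Ndof, nnzPerSet, 1);
    V{S} = randn(nnzPerSet, 1);
end

% Concatenate the triplets; one sparse call sums any overlapping entries
I = vertcat(I{:});
J = vertcat(J{:});
V = vertcat(V{:});
A = sparse(I, J, V, Ndof, Ndof);
```

If no parallel pool is available, parfor simply runs the iterations serially, so the sketch can be tried without changes.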
Mika on 30 Dec 2011
Thank you Sean. This does the trick for me. It is still slower than I had hoped for. With parfor I'm able to build the pieces of the matrix very fast. Unfortunately, knowing the values and indices of the matrix is still far from having the actual matrix: the final sparse call takes a while to execute. I guess I need to think of something completely different to make it considerably faster.
Also, as you said, this is a workaround. The original problem still exists. It might be worth investigating further, since someone else might hit it too.

More Answers (2)

fvff on 25 Nov 2014
Edited: fvff on 25 Nov 2014
Three years too late! There is a workaround to parfor expanding the sparse matrix by using a function handle. See code below for an example.
m = 1e5;
n = 1e5;
A = sparse(m,n);
fcn = @plus;
parfor k = 1:100
    i = randi(m,10);
    j = randi(n,10);
    s = randn(10);
    A = fcn(A, sparse(i,j,s,m,n));
end
Hope it helps!
SE on 3 Aug 2018
Just logged in to tell you that you're a lifesaver! What strange behaviour... I suppose the symbolic addition is what wants inputs to be full rather than sparse? Thanks again!
Fintan Healy on 12 Jan 2025
m = 1e5;
n = 1e5;
A = sparse(m,n);
parfor k = 1:100
    i = randi(m,10);
    j = randi(n,10);
    s = randn(10);
    A = A + sparse(i,j,s,m,n);
end
As of R2024b the "fcn" wrapper is no longer required, and this was the fastest option for me.


Mika on 29 Dec 2011
Sean,
Here's a sample that crashes on my machines (an iMac and Ubuntu Linux). I tried both R2011a and R2011b.
matlabpool local 4
Nsets = 12;
Ndof = 1e6 ;
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
    % Do lots of stuff to get iind, jind, Aval
    iind = ceil(Ndof*rand(Ndof,1)) ;
    jind = ceil(Ndof*rand(Ndof,1)) ;
    Aval = 100*randn(Ndof,1) ;
    matind = iind +Ndof*( jind-1 );
    Aset{S} = sparse( matind, ones(size(matind)), Aval, Ndof*Ndof, 1 ) ;
end
A = Aset{1};
for S=2:Nsets
    A = A + Aset{S} ;
end
A = reshape(A,Ndof,Ndof);
spy(A)
matlabpool close
Friedrich on 29 Dec 2011
This seems like a bug to me. There isn't any reason why it shouldn't work. There is a limit of 2 GB on 64-bit and 600 MB on 32-bit on the amount of data that can be transferred from MATLAB to the workers and back. You are far away from that limit. Since it works with small values, it should work with bigger ones too.
Mika on 29 Dec 2011
A memory-related problem seems likely. The funny thing is that it runs fine in serial. I mean, if you change 'parfor' to 'for', the results that I get seem right.
