Sparse Matrix build efficiency
5 views (last 30 days)
Show older comments
Hello everyone.
I want to build a nxn matrix A as sparse for efficiency( time of the inversion(A\)). i)I would like to ask is it ok if I define it with the command A=sparse([..],[..],[..],n,n) with one or a few non zero elements at the beginning and then add the rest of the elements with commands like the following: A(..:..,..:..)=..; or will it cause a problem with time efficiency like when we change the size of a regular matrix or vector inside a loop and we get the warning "consider preallocating for speed"?
I can't directly compare it in terms of time efficiency with the case of building the three necessary vectors [..] from the beginning and use only the command sparse, because I have to change many things in my code to do that.
ii) Is it nessesary or possible in MATLAB to store a matrix in skyline format along with the corresponding vector of the positions of the diagonal elements?
Thank you in advance
0 Comments
Accepted Answer
John D'Errico
on 7 Jun 2017
Edited: John D'Errico
on 7 Jun 2017
In general, DON'T DO IT! That is, don't build a sparse matrix one element at a time.
Even if you use spalloc, the matrix will be inefficient to build.
For example, I'll build a sparse matrix one element at a time below.
N = 100000;
M = spalloc(1000,1000,N);
tic
k = N + 1;
for i = 1:N
k = k - 1;
M(k) = i;
end
toc
Elapsed time is 6.069033 seconds.
So I stuffed the matrix to its allocated capacity, at 10% non-zeros. I did it by inserting elements from the back end first, so every time I put a new element in, I believe the sparse insert forces the elements to be re-shuffled in memory.
Next, I'll stuff them in the same order they will actually be stored in memory. So no reshuffling is necessary.
N = 100000;
M = spalloc(1000,1000,N);
tic
k = 0;
for i = 1:N
k = k + 1;
M(k) = i;
end
toc
Elapsed time is 0.196214 seconds.
This time it was MUCH faster. The problem is, if you do the stuffing in random order, it will almost always be true that SOME amount of reshuffling must be done. So most of the time, the time will be like the first call.
N = 100000;
ind = randperm(1e6,N);
M = spalloc(1000,1000,N);
tic
for i = 1:N
k = ind(i);
M(k) = i;
end
toc
Elapsed time is 2.980270 seconds.
So it took 6 seconds for the worst case. and half that for the random sequence of elements. But if I built it in the best possible order, it took 0.2 seconds.
Again, the point is, DON'T DO IT!
Instead, build the array using sparse, in ONE single call. This will force you to create a list of non-zero elements upfront. Preallocate your matrices. Never grow them dynamically.
tic
[rind,cind] = ind2sub([1000,1000],ind);
M = sparse(rind,cind,1:N,1000,1000);
toc
Elapsed time is 0.036842 seconds.
You should notice that this call took significantly less time than the best possible case to allocate that sparse matrix before (using a loop).
If you are going to use sparse matrices, then learn to use the sparse matrix tools. These tools are of huge benefit, but if you use them poorly, you will still get poor performance.
Ok, yes, you THINK you cannot use sparse. In fact, you are simply wrong in that. You can build the elements in advance, storing them as a list of row and column indices, plus a list of values. If you know in advance how many elements you will have, just preallocate those vectors as FULL vectors, stuffing them one at a time as you compute the elements. When the last element is created, call sparse, ONCE. ONE TIME.
If you don't know how many elements you will have at the end, I provide code to build arrays incrementally here, found on the file exchange for download.
More Answers (2)
Steven Lord
on 7 Jun 2017
Inverting a matrix is generally not a good idea, and it's especially not recommended with a sparse matrix.
>> rng default
>> A = sprand(100, 100, 0.1);
>> invA = inv(A);
>> nnz(A)
ans =
951
>> nnz(invA)
ans =
10000
While invA is stored in the sparse storage format, it's not actually sparse; all its elements are nonzero.
If you're trying to invert the matrix to solve a system of equations A*x = b use the backslash operator x = A\b or, if your matrix is REALLY large, try one of the functions in the "Iterative Methods and Preconditioners" section of this documentation page.
Getting back to your question about efficiently creating a sparse matrix, if you must create it with a few elements to start then add in additional elements iteratively I would use the spalloc function first.
Christine Tobler
on 7 Jun 2017
i) You should expect constructing a sparse matrix in a loop by adding elements using indexing A(i, j) = s to be slower than constructing the three vectors i, j, s for the whole matrix, and then calling sparse once. But before making a large change to your code, you may want to profile your code, to check whether the construction of the matrix is the bottleneck.
ii) It's not possible or necessary in MATLAB to represent a matrix in the skyline format. If you are using backslash (x = A \ b), this will detect if a sparse matrix A has a banded structure.
Also, as Steve mentioned, you should definitely not invert a sparse matrix - the result will be nearly dense. To solve a linear system, use backslash instead.
0 Comments
See Also
Categories
Find more on Resizing and Reshaping Matrices in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!