Why is the sparse function slow?

Lantao Yu on 21 Sep 2020
Commented: Lantao Yu on 21 Sep 2020
I recently generated a sparse matrix using the function sparse. When I profiled my code, I found that the vast majority of the runtime is spent in the call to sparse, which surprised me.
To find out whether generating a sparse matrix is slow in other programming languages as well, I used scipy.sparse.coo_matrix in Python to perform the same task. To my surprise, scipy.sparse.coo_matrix is about 10x faster than MATLAB's sparse function.
MATLAB demo code:
RowInd = repmat(randperm(262144),81,1);
RowInd = RowInd(1:260100*81) ;
ColInd = repmat(randperm(262144),81,1);
ColInd = ColInd(1:260100*81);
Val = randn(260100*81,1);
tStart = tic;
L=sparse(RowInd,ColInd,Val, 262144, 262144 ,260100*81);
tEnd = toc(tStart);
disp(['Runtime of generating a sparse matrix in Matlab:', num2str(tEnd), ' second.']);
Python demo code:
import numpy as np
import scipy.sparse
import scipy.sparse.linalg
from time import time

if __name__ == "__main__":
    # Build row and column indices by tiling a random permutation 81 times
    nz_indsRow = np.tile(np.random.permutation(262144), 81)
    nz_indsRow = nz_indsRow[:260100*81]
    nz_indsCol = np.tile(np.random.permutation(262144), 81)
    nz_indsCol = nz_indsCol[:260100*81]
    nz_indsVal = np.random.rand(260100*81)
    print(nz_indsRow.shape, nz_indsCol.shape, nz_indsVal.shape)
    # Time only the construction of the COO matrix
    t0 = time()
    L = scipy.sparse.coo_matrix(
        (nz_indsVal, (nz_indsRow, nz_indsCol)), shape=(262144, 262144))
    t1 = time()
    print('Runtime of generating a sparse matrix via SciPy:', t1-t0, 'second.')
On my desktop, the runtime is 1.2399 s (MATLAB) vs 0.12721 s (SciPy).
Can someone explain why the sparse function in MATLAB is so slow? Is there a more efficient way to generate a sparse matrix in MATLAB?
  15 Comments
Bruno Luong on 21 Sep 2020
Good point, cyclist. For a fair comparison, one must build a CSC matrix, which is the storage format MATLAB uses.
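To illustrate the point about storage formats, here is a minimal sketch on a tiny hand-made matrix (the values and variable names are illustrative and not from this thread). It suggests why the two timings are not directly comparable: scipy.sparse.coo_matrix keeps the triplets as supplied, whereas tocsc() builds compressed-column storage and sums duplicate entries, which is work that MATLAB's sparse also has to do.

```python
import numpy as np
import scipy.sparse

# Tiny illustrative triplets; the entry (0, 1) appears twice on purpose.
rows = np.array([2, 0, 1, 0])
cols = np.array([2, 1, 0, 1])
vals = np.array([5.0, 1.0, 3.0, 2.0])

coo = scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(3, 3))
# COO keeps the triplets as supplied: four stored entries, in input order.
print(coo.nnz, coo.row, coo.col)

csc = coo.tocsc()
# CSC stores a column-pointer array of length ncols + 1, and the duplicate
# (0, 1) entries are summed into a single stored value.
print(csc.nnz, csc.indptr, csc.indices, csc.data)
```

The conversion step is where the reorganizing and merging of entries happens, so a benchmark that stops at the COO stage measures less work than MATLAB's sparse performs.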
Lantao Yu on 21 Sep 2020
Thank you for pointing that out, cyclist.
I ran the following code, which also converts the COO matrix to a CSC/CSR matrix. The output is:
Runtime of generating a sparse CSC matrix via SciPy: 1.3742189407348633 second.
Runtime of generating a sparse CSR matrix via SciPy: 1.3034861087799072 second.
Now the runtime is close to that of MATLAB. I apologize for not conducting a fair comparison.
import numpy as np
import scipy.sparse
import scipy.sparse.linalg
from time import time

if __name__ == "__main__":
    # Build row and column indices by tiling a random permutation 81 times
    nz_indsRow = np.tile(np.random.permutation(262144), 81)
    nz_indsRow = nz_indsRow[:260100*81]
    nz_indsCol = np.tile(np.random.permutation(262144), 81)
    nz_indsCol = nz_indsCol[:260100*81]
    nz_indsVal = np.random.rand(260100*81)
    print(nz_indsRow.shape, nz_indsCol.shape, nz_indsVal.shape)
    # Time COO construction plus conversion to CSC (MATLAB's format)
    t0 = time()
    L = scipy.sparse.coo_matrix(
        (nz_indsVal, (nz_indsRow, nz_indsCol)), shape=(262144, 262144))
    LL = L.tocsc()
    t1 = time()
    print('Runtime of generating a sparse CSC matrix via SciPy:', t1-t0, 'second.')
    # Time COO construction plus conversion to CSR
    t0 = time()
    L = scipy.sparse.coo_matrix(
        (nz_indsVal, (nz_indsRow, nz_indsCol)), shape=(262144, 262144))
    LL = L.tocsr()
    t1 = time()
    print('Runtime of generating a sparse CSR matrix via SciPy:', t1-t0, 'second.')
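One rough way to see where the time goes is to time the COO construction and the CSC conversion separately. The sketch below is only an illustration under assumptions: it uses a much smaller problem size than the one in this thread, random indices instead of tiled permutations, and a hypothetical helper named time_call, so the absolute numbers will differ from those quoted above.

```python
import numpy as np
import scipy.sparse
from time import perf_counter


def time_call(fn):
    """Hypothetical helper: call fn once, return (result, elapsed seconds)."""
    t0 = perf_counter()
    result = fn()
    return result, perf_counter() - t0


n = 4096          # matrix dimension, reduced from 262144 for a quick run
nnz = 50 * n      # number of triplets, chosen arbitrarily for illustration
rng = np.random.default_rng(0)
rows = rng.integers(0, n, size=nnz)
cols = rng.integers(0, n, size=nnz)
vals = rng.random(nnz)

# Step 1: wrap the triplets in a COO matrix.
coo, t_coo = time_call(
    lambda: scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)))
# Step 2: convert to compressed sparse column storage.
csc, t_csc = time_call(coo.tocsc)

print('COO construction:', t_coo, 's')
print('COO -> CSC conversion:', t_csc, 's')
```

perf_counter is used here instead of time because it is intended for measuring short intervals; repeating each measurement several times would give steadier numbers.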
