Efficient matrix multiplication for large sparse matrices stored as sparse column vectors

I'm working on a project where I need to manipulate very large sparse matrices. To keep the memory cost down, I store each such tensor as a sparse column vector, along with its actual dimensions. For example, a random (M x N) sparse matrix matrixA would instead be stored as a struct A as follows:
matrixA = sprand(M,N,0.01);
[rows, cols, val] = find(matrixA);
ind = sub2ind(size(matrixA), rows, cols);
A.size = size(matrixA);
A.var = sparse(ind, ones(size(ind)), val, prod(A.size),1); % create (M*N,1) sparse column vector
Note that this is just an example to show how the struct A relates to matrixA; in practice I would never actually construct matrixA in the first place.
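For completeness, the struct can be built straight from triplet data without ever forming matrixA. A minimal sketch, assuming the triplets rows, cols, val and the dimensions M, N are already given:

```matlab
% Build the struct directly from (rows, cols, val) triplets;
% the full M-by-N matrix is never materialized.
A.size = [M, N];
ind = sub2ind(A.size, rows, cols);                    % linear indices into the M-by-N matrix
A.var = sparse(ind, ones(size(ind)), val, prod(A.size), 1);
```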
I need to multiply two such matrices, A and B, with sizes (M,N) and (N,P) and store the result in the same way as above. One way of doing this would be as follows:
A = reshape(A.var,A.size); % redefine A as an actual sparse matrix with size A.size
B = reshape(B.var,B.size); % redefine B as an actual sparse matrix with size B.size
result = A * B;
C.size = size(result);
C.var = reshape(result, [prod(C.size),1]);
The problem with this method, however, is that I lose the advantage of the lower memory consumption that I obtained by storing my matrices as sparse column vectors instead of as normal sparse matrices (which have a very large number of columns in my case). Alternatively, I could attempt to use some for-loops to do the multiplication element-wise, but that will certainly be very slow.
What is a better way of doing this? I'm looking for a solution that is more memory-efficient than the first method, while still being reasonably fast.
Note that I have little flexibility in the way these matrices are stored, since other parts of my code rely heavily on this.

5 Comments

Hi Alexis
I ran your example for a few moderately large cases of M and N, with A as you defined it (a structure). The memory taken up by matrixA and A are virtually the same. This is also true for a matrix identical to A, only defined as a sparse matrix and not as a structure. Given how Matlab stores sparse matrices, that's how the results should have turned out. Could you provide an example where this is not true?
The difference is more noticeable in extreme cases where there are a very large number of columns and only a small fill fraction:
>> A = sprand(1,1e9,0.005);
>> B = A';
>> whos A B
  Name      Size                     Bytes  Class     Attributes

  A         1x1000000000        8079801016  double    sparse
  B         1000000000x1          79801024  double    sparse
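The discrepancy follows from MATLAB's compressed sparse column (CSC) layout: each nonzero costs 8 bytes for the value plus 8 bytes for its row index, and each column costs an additional 8-byte column pointer regardless of whether it holds any nonzeros. A sketch of this memory model, using the nonzero count implied by the whos output above:

```matlab
% Rough memory model for a real sparse double matrix on 64-bit MATLAB (CSC):
% 8 bytes per value + 8 bytes per row index + 8 bytes per column pointer
sparseBytes = @(nz, ncols) 16*nz + 8*(ncols + 1);
sparseBytes(4987563, 1e9)  % 8079801016 bytes, matching the 1-by-1e9 matrix A
sparseBytes(4987563, 1)    %   79801024 bytes, matching its transpose B
```

So with 1e9 columns, the column pointers alone account for ~8 GB, which is why the transposed (column-vector) orientation is two orders of magnitude smaller.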
What are all of the operations you intend to do with this sparse matrix? Because you would need to write your own functions from scratch for every single operation (plus, minus, mtimes, transpose, etc.). Mex routines might help with this, but I'm not sure it is going to be worth it.
Well, the full story is that I'm using a custom class to deal with sparse tensors, which enables me to define and manipulate sparse arrays of any size. It is quite similar to, but not as elaborate as, the Tensor Toolbox. However, this class was designed with complex tensors in mind, which is something the Tensor Toolbox has some issues with.
The main manipulations I need are matrix products of this type (which I need in order to define tensor contractions).
You could also try the ndSparse class. It shouldn't have any issue with complex arrays. A minor issue that you might have is that ndSparse uses pre-R2016b syntax, so it doesn't support implicit expansion (introduced in R2016b) or reduction operations along more than one dimension at a time, like sum(X,[1,2]) (introduced in R2018b).
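If the multi-dimension reduction is the only blocker, it can be emulated by chaining single-dimension sums, which works with the older syntax:

```matlab
% Reduce over dimensions 1 and 2 one at a time (pre-R2018b style)
X = rand(3,4,2);
total = sum(sum(X,1),2);   % same result as sum(X,[1,2]) in newer releases
```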


 Accepted Answer

Note that I have little flexibility in the way these matrices are stored, since other parts of my code rely heavily on this.
That's not true. You can create a class that represents the matrix with any kind of internal storage scheme that you want, but define dependent class properties and methods to access it with the same interface as currently.
One alternative storage scheme I would consider is to store only the non-zero columns of the sparse matrix, but with index vectors to tabulate where their true column index locations would be in the full-sized matrix. It is straightforward to write an mtimes() method for such a class.
classdef compactMatrix
    properties
        size
        nzcols    % the non-zero column indices in the actual matrix representation
        submatrix % sparse submatrix containing only the nonzero columns of the actual matrix
    end
    properties(Dependent)
        var
    end
    methods
        function obj = compactMatrix(rows, cols, vals, M, N)
            obj.size = [M, N];
            obj.nzcols = unique(cols);
            obj.submatrix = sparse(rows, findgroups(cols), vals, ...
                                   M, numel(obj.nzcols)); % pin the row count at M
        end
        function C = mtimes(A, B)
            submatrix = A.submatrix * B.submatrix(A.nzcols, :);
            keep = any(submatrix, 1);   % drop columns that became all-zero
            [r, c, v] = find(submatrix(:, keep));
            nzc = B.nzcols(keep);       % map kept columns back to true column indices
            C = compactMatrix(r, nzc(c), v, A.size(1), B.size(2));
        end
        function var = get.var(obj) % for back-compatibility with existing code
            [rows, cols, vals] = find(obj.submatrix);
            var = sparse(rows, obj.nzcols(cols), vals, obj.size(1), obj.size(2));
            var = var(:); % linearize to a prod(obj.size)-by-1 sparse column vector
        end
    end
end
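A quick sanity check on small sizes, comparing against the plain sparse product (assuming mtimes is written to return a compactMatrix object; all names below are illustrative):

```matlab
% Compare compactMatrix multiplication against the full sparse product
M = 50; N = 40; P = 30;
Afull = sprand(M, N, 0.05);
Bfull = sprand(N, P, 0.05);
[ra, ca, va] = find(Afull);  A = compactMatrix(ra, ca, va, M, N);
[rb, cb, vb] = find(Bfull);  B = compactMatrix(rb, cb, vb, N, P);
C = A * B;
err = norm(C.var - reshape(Afull*Bfull, [], 1), 1);  % expect ~0 up to round-off
```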


Asked: AS on 10 Sep 2020
Edited: on 15 Sep 2020
