# Compute only a few entries of a big matrix product

1 view (last 30 days)
Yi Wang on 20 Nov 2023
Commented: Bruno Luong on 22 Nov 2023
Suppose I have a tall matrix x (say 10000 by 10), and a collection of indices {N1, N2, ..... Nk} where each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4), and {Ni} can overlap. Suppose I want to compute X:=xx', but am only interested in X(Ni, Ni) for all i, and I can set other entries of X to 0. Is there a way to quickly form such an X without doing the entire matrix product xx' and without using forloop?
Matt J on 20 Nov 2023
Edited: Matt J on 20 Nov 2023
each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4)
Is |Ni| fixed, i.e., independent of i?
Yi Wang on 20 Nov 2023
Yes.

Bruno Luong on 20 Nov 2023
Edited: Bruno Luong on 20 Nov 2023
It will not store the result in 2D matrix but 3D array. I hope you can switch to this format of storage.
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.325735 seconds.
tic
[p,q] = size(x);
x(end+1,:)=NaN;
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
toc
Elapsed time is 0.126183 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
Y(1:l(1),1:l(1),1)
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
X(N{k},N{k})
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
Y(1:l(k),1:l(k),k)
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
Bruno Luong on 21 Nov 2023
Edited: Bruno Luong on 21 Nov 2023
This version putback the 3D array result to (sparse) matrix
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.307213 seconds.
tic
[p,q] = size(x);
x(end+1,:)=0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I=repmat(NA,m,1);
J=repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
keep = all(IJ<=p,2);
XX=sparse(IJ(keep,1),IJ(keep,2),Y(loc(keep)),p,p);
toc
Elapsed time is 0.162196 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
full(XX(N{1},N{1}))
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
X(N{k},N{k})
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
full(XX(N{k},N{k}))
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
Bruno Luong on 22 Nov 2023
A variation, avoid filter with subindexing, since the elements of the last row and column are 0
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
[p,q] = size(x);
x(end+1,:) = 0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I = repmat(NA,m,1);
J = repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
IJ(IJ>p) = 1;
XX = sparse(IJ(:,1),IJ(:,2),Y(loc), p, p);
toc
Elapsed time is 0.108971 seconds.