Efficient algorithm for a duplication matrix

Can anybody help me to design a Matlab code function that creates a duplication matrix D?
Thanks in advnace.
My codes is very slow...
Any ideas to speed it up?
n=1000;
% Duplication matrix: vec(P)=Dvech(P)
tic
m=1/2*n*(n+1);
nsq=n^2;
DT=sparse(m,nsq);
for j=1:n
for i=j:n
ijth=(j-1)*n+i;
jith=(i-1)*n+j;
vecTij=sparse(ijth,1,1,nsq,1);
vecTij(jith,1)=1;
k=(j-1)*n+i-1/2*j*(j-1);
uij=sparse(k,1,1,m,1);
DT=DT+uij*vecTij';
end
end
D=DT';
toc
% test duplication matrix
C=rand(n,n);
P=1/2*(C+C');
vechP=nonzeros(tril(P));
vecP=P(:);
err_D=vecP-D*vechP;
max(err_D(:))
min(err_D(:))

2 Comments

What are vec and vech in this context?
The question Text is complete copied from Wikipedia- we can assume it is meant: https://en.m.wikipedia.org/wiki/Vectorization_%28mathematics%29?wprov=sfla1

Sign in to comment.

 Accepted Answer

Jan
Jan on 28 Jul 2019
Edited: Jan on 4 Aug 2021
For n=300 this needs 1.3 sec instead of 27.5 sec:
tic
m = n * (n + 1) / 2;
nsq = n^2;
D = spalloc(nsq, m, nsq);
row = 1;
a = 1;
for i = 1:n
b = i;
for j = 0:i-2
D(row + j, b) = 1;
b = b + n - j - 1;
end
row = row + i - 1;
for j = 0:n-i
D(row + j, a + j) = 1;
end
row = row + n - i + 1;
a = a + n - i + 1;
end
toc
But it is much faster to create the index vector at first instead of accessing the sparse matrix repeatedly:
tic
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
for i = 1:n
b = i;
for j = 0:i-2
v(r) = b;
b = b + n - j - 1;
r = r + 1;
end
for j = 0:n-i
v(r) = a + j;
r = r + 1;
end
% BUGFIX: Omit "r = r + n - i + 1;" Thanks Trisha Phippard
a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc
Now I get 0.013 sec for n=300. Finally vectorize the 2 inner loops:
tic
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
cn = cumsum(n:-1:2); % [EDITED, 2021-08-04], 10% faster
for i = 1:n
% v(r:r + i - 2) = i - n + cumsum(n - (0:i-2));
v(r:r + i - 2) = i - n + cn(1:i - 1); % [EDITED, 2021-08-04]
r = r + i - 1;
v(r:r + n - i) = a:a + n - i;
r = r + n - i + 1;
a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc
0.011 sec. A speedup of factor 2500 for n=300. And 0.12 sec for n=1000. Nice! :-)

7 Comments

@Youngkyu Kim: The runtime behavior of the original code is quadratic and the extrapolated run-time for n=1000 is about 14'000 seconds. My last suggestions needs 0.12 seconds.
I took me some time to figure this out. Don't you think, that the problem is solved? A short reaction would be fine, or at least accepting the answer as a working solution. I do not need any reputation points anymore, but I think, such a success is worth to be honored.
Thanks for your suggesteion.
I modified my code and it worked well enough.
Your suggestion is slightly faster than mine.
function D = duplication(n)
m=1/2*n*(n+1);
nsq=n^2;
Lind=tril(true(n));
Lind=find(Lind(:));
Lind=Lind(:);
Uind=rem(Lind-1,n)*n+ceil(Lind/n);
i=(1:m)';
a=(i-1)*nsq+Lind;
b=(i-1)*nsq+Uind;
c=union(a,b);
[I,J]=ind2sub([nsq,m],c);
D=sparse(I,J,1,nsq,m);
end
A very useful algorithm!
In your second version, it would seem that the following line is incorrect:
r = r + n - i + 1;
The whole series of increments in r work as they should, there is no need to jump to a new location. In the vectorized version it is needed, but in the scalar version it causes problems.
I wonder if there is a way to simply "store" the duplication matrices for n=1 until n=1000 in matlab and then just accessing them when needed instead of recreating them every time. If yes, would that speed up things even more?
@Michael Stollenwerk: Yes, of course. Simply store the calcuated matrices in a persistent cell array:
function D = duplication(n)
persistent C
if isempty(C)
nCache = 1000; % Set according to your needs
C = cell(1, nCache);
end
if n <= numel(C) && ~isempty(C{n})
D = C{n};
else
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
cn = cumsum(n:-1:2);
for i = 1:n
v(r:r + i - 2) = i - n + cn(1:i - 1);
r = r + i - 1;
v(r:r + n - i) = a:a + n - i;
r = r + n - i + 1;
a = a + n - i + 1;
end
D = sparse(1:nsq, v, 1, nsq, m);
if n <= numel(C)
C{n} = D;
end
end
end
When this function is called the first time for a specific n, D is calculated. On further calls D is taken from the persistently stored list C.
% Timings for i7, Win 10 64, Matlab R2018b:
tic; for k = 1:1000; D = duplication(k); end; toc
tic; for k = 1:1000; D = duplication(k); end; toc
% First run with calculations:
% Elapsed time is 35.022836 seconds.
% Second run with copies from the list:
% Elapsed time is 0.006062 seconds.
The list C of my code needs 6GB of RAM, if all 1000 matrices are created. Writing it as -v7.3 MAT file creates an 1GB file. Reading it takes 15 seconds on a HDD instead of 35 seconds of creating all matrices dynamically.

Sign in to comment.

More Answers (0)

Categories

Asked:

on 27 Jul 2019

Commented:

Jan
on 5 Aug 2021

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!