# Why is pagemtimes slower than just coding up the matrix multiplication?

Nathan Zechar on 29 May 2021
Commented: Rik on 31 May 2021
I noticed that pagemtimes is slower than expanding the equation and coding it by hand, on both CPU and GPU. On the GPU it is exceptionally slow.
Here is an example coded two different ways. Note the performance of pagemtimes on the CPU alone.
clear all
Nx = 100;
Ny = 100;
Nz = 100;
[A1,A2,A3,B1,B2,B3,C1,C2,C3,...
E11,E12,E13,E21,E22,E23,E31,E32,E33,...
F11,F12,F13,F21,F22,F23,F31,F32,F33] = deal(rand(Nx,Ny,Nz));
tic
for i = 1:20
%% Electric Field Update
C1 = F11.*(A1.*E11+B1)+F12.*(A2.*E12+B2)+F13.*(A3.*E13+B3);
C2 = F21.*(A2.*E21+B2)+F22.*(A2.*E22+B2)+F23.*(A3.*E23+B3);
C3 = F31.*(A3.*E31+B3)+F32.*(A3.*E32+B3)+F33.*(A3.*E33+B3);
end
toc
[A,B,C] = deal(rand(3,1,Nx,Ny,Nz));
[E,F] = deal(rand(3,3,Nx,Ny,Nz));
tic
for i = 1:20
C = pagemtimes(F,(B+pagemtimes(E,A)));
end
toc
Without pagemtimes - "Elapsed time is 0.032141 seconds."
With pagemtimes - "Elapsed time is 0.325006 seconds."
Wrapping the inputs to the deal() calls in gpuArray() makes the gap even larger:
Without pagemtimes - "Elapsed time is 0.012688 seconds."
With pagemtimes - "Elapsed time is 5.357220 seconds."
Why is this function so slow?
Rik on 31 May 2021
You're using different data and inputs for both strategies. Can you fix the bugs in my code below? You can only compare the times if you use the same size inputs. Otherwise you could do most pre-processing before you start your timer.
You should also be aware that tic/toc is only valid for longer times and should only be used for a first-order estimate. If you want to truly compare performance, you need the timeit function.
Nx = 100;
Ny = 100;
Nz = 100;
[A,B,C] = deal(rand(3,1,Nx,Ny,Nz));
[E,F] = deal(rand(3,3,Nx,Ny,Nz));
x1=direct(A,B,E,F);
x2=pagemtimes_version(A,B,E,F);
x=abs(x1-x2);
max(x(:)) %This should be very close to 0
ans = 6.3158
timeit(@()direct(A,B,E,F))
ans = 0.3207
timeit(@()pagemtimes_version(A,B,E,F))
ans = 0.0252
function C=direct(A,B,E,F)
%% Electric Field Update
%This version is probably incorrect
C(1,1,:,:,:) = ...
F(1,1,:,:,:).*(A(1,1,:,:,:).*E(1,1,:,:,:)+B(1,1,:,:,:)) +...
F(1,2,:,:,:).*(A(2,1,:,:,:).*E(1,2,:,:,:)+B(2,1,:,:,:)) +...
F(1,3,:,:,:).*(A(3,1,:,:,:).*E(1,3,:,:,:)+B(3,1,:,:,:));
C(2,1,:,:,:) = ...
F(2,1,:,:,:).*(A(2,1,:,:,:).*E(2,1,:,:,:)+B(2,1,:,:,:)) +...
F(2,2,:,:,:).*(A(2,1,:,:,:).*E(2,2,:,:,:)+B(2,1,:,:,:)) +...
F(2,3,:,:,:).*(A(3,1,:,:,:).*E(2,3,:,:,:)+B(3,1,:,:,:));
C(3,1,:,:,:) = ...
F(3,1,:,:,:).*(A(3,1,:,:,:).*E(3,1,:,:,:)+B(3,1,:,:,:)) +...
F(3,2,:,:,:).*(A(3,1,:,:,:).*E(3,2,:,:,:)+B(3,1,:,:,:)) +...
F(3,3,:,:,:).*(A(3,1,:,:,:).*E(3,3,:,:,:)+B(3,1,:,:,:));
end
function C=pagemtimes_version(A,B,E,F)
C = pagemtimes(F,(B+pagemtimes(E,A)));
end
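One possible way to make the expanded version agree with pagemtimes is sketched here. The direct function above sums terms of the form F(i,j).*(A(j).*E(i,j)+B(j)), whereas pagemtimes(F,B+pagemtimes(E,A)) computes C(i) = sum over j of F(i,j).*(B(j) + sum over k of E(j,k).*A(k)) on every page, which is why the two results differ. The name direct_fixed and the small grid sizes are assumptions made for illustration; this is not necessarily the equation the original poster intended.

```matlab
% Sketch only: direct_fixed is a hypothetical name, not from the thread.
% Small sizes keep the check fast; the thread uses Nx = Ny = Nz = 100.
Nx = 4; Ny = 5; Nz = 6;
A = rand(3,1,Nx,Ny,Nz);
B = rand(3,1,Nx,Ny,Nz);
E = rand(3,3,Nx,Ny,Nz);
F = rand(3,3,Nx,Ny,Nz);

x1 = direct_fixed(A,B,E,F);
x2 = pagemtimes(F, B + pagemtimes(E,A));
maxErr = max(abs(x1(:) - x2(:)));   % expected to be at round-off level

function C = direct_fixed(A,B,E,F)
% Expand C = F*(B + E*A) page by page with element-wise operations.
% Step 1: D = B + E*A, i.e. D(j) = B(j) + sum_k E(j,k)*A(k).
% Each E(:,k,...) is 3x1 per page and A(k,1,...) is a scalar per page,
% so implicit expansion scales the k-th column of E by A(k).
D = B + E(:,1,:,:,:).*A(1,1,:,:,:) ...
      + E(:,2,:,:,:).*A(2,1,:,:,:) ...
      + E(:,3,:,:,:).*A(3,1,:,:,:);
% Step 2: C = F*D, i.e. C(i) = sum_j F(i,j)*D(j).
C = F(:,1,:,:,:).*D(1,1,:,:,:) ...
  + F(:,2,:,:,:).*D(2,1,:,:,:) ...
  + F(:,3,:,:,:).*D(3,1,:,:,:);
end
```

With both versions computing the same quantity on the same inputs, timeit can then be applied to each for a like-for-like comparison.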

Sulaymon Eshkabilov on 29 May 2021
Hi,
There is no need to run these calculations in a loop, since that just repeats them:
tic
%for i = 1:20
%% Electric Field Update
C1 = F11.*(A1.*E11+B1)+F12.*(A2.*E12+B2)+F13.*(A3.*E13+B3);
C2 = F21.*(A2.*E21+B2)+F22.*(A2.*E22+B2)+F23.*(A3.*E23+B3);
C3 = F31.*(A3.*E31+B3)+F32.*(A3.*E32+B3)+F33.*(A3.*E33+B3);
% end
toc
Good luck.
Nathan Zechar on 31 May 2021
This is correct, DGM.
The loop is needed to understand the execution time.
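As a possible alternative to repeating the work in a loop under tic/toc, the following sketch times the pagemtimes expression with timeit on the CPU and, when a GPU is available, with gputimeit. The grid sizes and the handles f and g are illustrative assumptions, not values from the thread, and the GPU branch assumes Parallel Computing Toolbox is installed.

```matlab
% Sketch only: sizes and the handles f and g are illustrative assumptions.
Nx = 20; Ny = 20; Nz = 20;
A = rand(3,1,Nx,Ny,Nz);
B = rand(3,1,Nx,Ny,Nz);
E = rand(3,3,Nx,Ny,Nz);
F = rand(3,3,Nx,Ny,Nz);

% timeit calls the function several times and returns a typical run time in seconds.
f = @() pagemtimes(F, B + pagemtimes(E,A));
tCPU = timeit(f);

% canUseGPU is false when no supported GPU (or no toolbox) is present,
% so the GPU branch is simply skipped on such machines.
if canUseGPU
    Ag = gpuArray(A); Bg = gpuArray(B);
    Eg = gpuArray(E); Fg = gpuArray(F);
    g = @() pagemtimes(Fg, Bg + pagemtimes(Eg,Ag));
    % gputimeit waits for pending GPU work to finish before recording the time.
    tGPU = gputimeit(g);
end
```

Because both functions repeat the call internally, no explicit loop is needed to obtain a stable measurement.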