# For loops are too slow

Joseph on 5 Aug 2013
I have some MATLAB code that was given to me by a coworker. This code contains a three layered for loop. So the code has the basic structure:
for
for
for
The code works, but it takes an incredibly long time to execute because the three-layered for loop operates on a 40000x4 matrix. I need a way to either bypass the for loops or make them run faster. Someone mentioned that I could call Perl from MATLAB to do this, but I am unsure how that works. Does anyone have any ideas? Thank you.
Matt Kindig on 7 Aug 2013
I think pre-allocating xmsd would help streamline the code while still retaining the for loops. Something like this:
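% Pre-allocate the accumulator. It must start at zero: with NaN(nmax, 3)
% every sum would stay NaN, because NaN + x is NaN.
xmsd = zeros(nmax, 3);
for i = 1:1:N
    for j = 1:1:norigin
        jstart = (j-1)*N + i;
        for k = nmin:1:nmax
            kend = jstart + k*N;
            xmsd(k,1:3) = xmsd(k,1:3) + (md_msd(kend,1:3) - md_msd(jstart,1:3) ).^2;
        end
    end
end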

dpb on 7 Aug 2013
Edited: dpb on 8 Aug 2013
for i = 1:1:N
    for j = 1:1:norigin
        jstart = (j-1)*N + i;
        for k = nmin:1:nmax
            kend = jstart + k*N;
            xmsd(k,1:3) = xmsd(k,1:3) + (md_msd(kend,1:3)-md_msd(jstart,1:3)).^2;
        end
    end
end
Start in the inner loop...first off
md_msd(jstart,1:3)
is invariant w/ k so replace the indexed expression w/ the equivalent
...
for j = 1:1:norigin
    jstart = (j-1)*N + i;
    mdj=md_msd(jstart,1:3);
    for k = nmin:nmax
        kend = jstart + k*N;
        xmsd(k,1:3) = xmsd(k,1:3) + (md_msd(kend,1:3)-mdj).^2;
    end
end
Then, kend takes on values of jstart+N, jstart+2N, jstart+3N, ... so define an index vector as
kdx=jstart+N:N:jstart+nmax*N;
and then the loop on k can be written as
xmsd(nmin:nmax,1:3) = xmsd(nmin:nmax,1:3) + (md_msd(kdx,1:3)-mdj).^2;
So, after the first loop reduction you're left with
for i = 1:1:N
    for j = 1:1:norigin
        jstart = (j-1)*N + i;
        kdx=jstart+nmin*N:N:jstart+nmax*N;
        mdj=md_msd(jstart,1:3);
        xmsd(nmin:nmax,1:3) = xmsd(nmin:nmax,1:3) + (md_msd(kdx,1:3)-mdj).^2;
    end
end
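One caveat on the vectorized line: md_msd(kdx,1:3) has one row per k, while mdj is a single row. Subtracting them directly relies on implicit expansion, which MATLAB only supports from R2016b on. On older releases the same step can be written with bsxfun. The sketch below uses made-up sizes and random data (N, norigin, nmin, nmax and md_msd here are stand-ins, not the real values from the question):

```matlab
% Assumed toy sizes and stand-in data, for illustration only
N = 4; norigin = 3; nmin = 1; nmax = 2;
md_msd = rand((norigin + nmax)*N, 3);

xmsd = zeros(nmax, 3);                 % accumulator starts at zero
for i = 1:N
    for j = 1:norigin
        jstart = (j-1)*N + i;
        kdx = jstart + nmin*N : N : jstart + nmax*N;   % all kend values at once
        mdj = md_msd(jstart, 1:3);                     % invariant with k
        % bsxfun expands the single row mdj across the rows of md_msd(kdx,:)
        d = bsxfun(@minus, md_msd(kdx, 1:3), mdj);
        xmsd(nmin:nmax, 1:3) = xmsd(nmin:nmax, 1:3) + d.^2;
    end
end
```

On R2016b or later, bsxfun(@minus, A, b) and A - b give the same result here, so either form works.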
Now, see what you can do from here... :)
ERRATUM: "Then, kend takes on values of jstart+N, jstart+2N, jstart+3N, ..."
Actually, the first value is jstart+nmin*N, not jstart+N. The increment is still N, so only the lower bound needs to be corrected, i.e. kdx=jstart+nmin*N:N:jstart+nmax*N, as used in the final loop above.
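One possible next step, offered as a sketch rather than the only way to finish: as i runs over 1:N and j over 1:norigin, jstart = (j-1)*N + i takes every value in 1:norigin*N exactly once. So the two outer loops can be replaced by a single index vector, leaving only the loop over k. The sizes and data below are assumed toy values, and the name starts is introduced here for illustration. The triple loop is kept only to check the result.

```matlab
% Assumed toy sizes and stand-in data, for illustration only
N = 4; norigin = 3; nmin = 1; nmax = 2;
md_msd = rand((norigin + nmax)*N, 3);

% Reference: the original triple loop
xmsd_ref = zeros(nmax, 3);
for i = 1:N
    for j = 1:norigin
        jstart = (j-1)*N + i;
        for k = nmin:nmax
            kend = jstart + k*N;
            xmsd_ref(k,1:3) = xmsd_ref(k,1:3) + (md_msd(kend,1:3) - md_msd(jstart,1:3)).^2;
        end
    end
end

% Reduced version: one loop over k, all (i,j) pairs handled at once
xmsd = zeros(nmax, 3);
starts = 1:(norigin*N);                % every jstart value, each exactly once
for k = nmin:nmax
    d = md_msd(starts + k*N, 1:3) - md_msd(starts, 1:3);   % same-size operands
    xmsd(k,1:3) = xmsd(k,1:3) + sum(d.^2, 1);              % sum over all origins
end

% The two results agree up to floating-point rounding
assert(max(abs(xmsd(:) - xmsd_ref(:))) < 1e-10);
```

Because both operands of the subtraction have the same size, this form needs neither bsxfun nor implicit expansion. The summation order differs from the triple loop, so compare with a tolerance rather than isequal.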
Joseph on 7 Aug 2013
Thank you very much this helped!

