Vectorization of For loop

Dear MATLAB community,
Is it possible to vectorize the following for loop:
a = rand(100,100);
b = rand(500,100,100);
for i = 1:500
c = reshape(b(i, :, :),100,100);
d(i) = sum(sum(a.*c));
end

14 Comments

@Walter and @Bruno: both of your answers are slower than the for loop, especially when I use larger matrices. Any idea why? I need a solution that is faster than the for loop. Please see the timings below:
tic
a = rand(1000,1000);
b = rand(500,1000,1000);
for i = 1:500
c = reshape(b(i, :, :),1000,1000);
d(i) = sum(sum(a.*c));
end
toc
Elapsed time is 27.988668 seconds.
%%%%%%%%
% Walter answer:
tic
d = sum(b .* reshape(a, 1, 1000, 1000), [2 3]);
toc
Elapsed time is 68.882801 seconds.
%%%%%%%%
% Bruno answer:
tic
d = b(:,:)*a(:);
toc
Elapsed time is 46.952318 seconds.
I get different tic/toc results. Check whether you have enough RAM: this code requires more than 6 GB (my laptop has 16 GB).
function benchtest
a = rand(1000,1000);
b = rand(500,1000,1000);
%%
tic
d = zeros(1,500);
for i = 1:500
c = reshape(b(i, :, :),1000,1000);
d(i) = sum(sum(a.*c));
end
toc % Elapsed time is 12.940297 seconds.
%% Walter answer:
tic
d = sum(b .* reshape(a, 1, 1000, 1000), [2 3]);
toc % Elapsed time is 1.453960 seconds.
%%%%%%%%
%% Bruno answer:
tic
d = b(:,:)*a(:);
toc % Elapsed time is 2.399184 seconds.
@Bruno, my laptop has 8 GB of RAM. With that much RAM, is it possible to speed up the following simple code? My laptop finishes this for loop in about 20 seconds.
If you want, I can post this as a separate question.
a = rand(1000,1000);
b = rand(500,1000,1000);
for i = 1:500
d = sum(sum(a.*b))/sqrt(sum(sum(a.*a))*sum(sum(b.*b)));
end
MATLAB Online test using Bruno's code:
Elapsed time is 11.686848 seconds.
Elapsed time is 0.881116 seconds.
Elapsed time is 2.748569 seconds.
d = sum(sum(a.*b))/sqrt(sum(sum(a.*a))*sum(sum(b.*b)));
You need to index b there, and probably squeeze() the result.
Parts of the expression can be pulled out of the loop.
Thanks Walter, I apologize for missing the indexing. I already pulled some parts out of the loop, but it is still time-consuming. Your answer and Bruno's are both much faster. Can the following code be written without a for loop? The code I'm currently running takes almost 6 hours to get through the data!
a = rand(1000,1000);
b = rand(500,1000,1000);
for i = 1:500
d(i) = sum(sum(a.*b(i,:,:)))/sqrt(sum(sum(a.*a))*sum(sum(b(i,:,:).*b(i,:,:))));
end
a = rand(1000,1000);
b = rand(500,1000,1000);
a2sumsqrt = sqrt(sum(a(:).*a(:)));
for i = 1:500
B = squeeze(b(i,:,:));
b2sumsqrt = sqrt(sum(B(:).*B(:)));
d(i) = sum(a(:).*B(:))/(a2sumsqrt*b2sumsqrt); % denominator is the product of the two norms
end
And from there you can go to
a = rand(1000,1000);
b = rand(500,1000,1000);
a2sumsqrt = sqrt(sum(a.*a, [1 2]));
b2sumsqrt = sqrt(sum(b.*b, [2 3]));
for i = 1:500
B = squeeze(b(i,:,:));
d(i) = sum(a(:).*B(:))/(a2sumsqrt*b2sumsqrt(i)); % denominator is the product of the two norms
end
and you can do better than that too.
"@Bruno, my laptop has 8 GB of RAM. With that much RAM, is it possible to speed up the following simple code? My laptop finishes this for loop in about 20 seconds."
It is not a matter of RAM speed. I think my code and Walter's code each require a large temporary copy of the array, and your computer, which has barely enough memory, starts swapping RAM to the hard drive. That is what slows down the run.
Your loop, however, requires only a small amount of extra memory, so it can run entirely without swapping to the hard drive.
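To put rough numbers on the memory argument above, here is a back-of-the-envelope estimate for the array sizes used in this thread. It assumes double precision (8 bytes per element) and that the elementwise product `b .* reshape(a, ...)` materializes a full-size temporary; whether `b(:,:)` also makes a copy is not checked here. The variable names are illustrative only.

```matlab
% Rough memory estimate, assuming double precision (8 bytes per element).
bytesPerDouble = 8;
nB = 500 * 1000 * 1000;                     % number of elements in b

bBytes     = nB * bytesPerDouble;           % b itself
tempBytes  = nB * bytesPerDouble;           % assumed full-size temporary for b .* reshape(a, ...)
sliceBytes = 1000 * 1000 * bytesPerDouble;  % one 1000-by-1000 slice used inside the loop

fprintf('b:              %.2f GB\n', bBytes / 1e9);
fprintf('b + temporary:  %.2f GB\n', (bBytes + tempBytes) / 1e9);
fprintf('one loop slice: %.3f GB\n', sliceBytes / 1e9);
```

Under these assumptions the vectorized product needs roughly twice the memory of b, while the loop adds only one slice at a time, which would be consistent with the loop winning on an 8 GB machine.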
Your code below doesn't run:
a = rand(1000,1000);
b = rand(500,1000,1000);
for i = 1:500
d(i) = sum(sum(a.*b(i,:,:)))/sqrt(sum(sum(a.*a))*sum(sum(b(i,:,:).*b(i,:,:))));
end
I guess you want to do this
function benchtest
a = rand(1000,1000);
b = rand(500,1000,1000);
%% Walter's answer, with the error fixed: '/' -> '*'
tic
a2sumsqrt = sqrt(sum(a(:).*a(:)));
for i = 1:size(b,1)
B = squeeze(b(i,:,:));
b2sumsqrt = sqrt(sum(B(:).*B(:)));
d(i) = sum(a(:).*B(:))/(a2sumsqrt*b2sumsqrt);
end
toc % Elapsed time is 14.231275 seconds.
%%%%%%%%
%% Bruno answer:
tic
b2 = b(:,:);
a2 = a(:);
d = ((b2*a2) ./ sqrt(sum(b2.^2,2))) / norm(a2);
toc % Elapsed time is 3.586152 seconds.
end
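As a quick sanity check, the loop form and the matrix form above can be compared on small arrays. This is only a sketch: the sizes (6 slices of 4-by-3) and the names `dLoop`, `dVec`, and `maxDiff` are assumptions chosen so it runs instantly, not part of the original benchmark.

```matlab
% Compare the loop form and the matrix form on small, illustrative sizes.
rng(2);                          % reproducible random data
n = 6; p = 4; q = 3;
a = rand(p, q);
b = rand(n, p, q);

% Loop form: one slice at a time, with d preallocated.
dLoop = zeros(n, 1);
aNorm = sqrt(sum(a(:).^2));
for i = 1:n
    B = squeeze(b(i, :, :));     % p-by-q slice
    dLoop(i) = sum(a(:) .* B(:)) / (aNorm * sqrt(sum(B(:).^2)));
end

% Matrix form: each row of b2 is one slice, flattened in column-major order.
b2 = b(:, :);
a2 = a(:);
dVec = ((b2 * a2) ./ sqrt(sum(b2.^2, 2))) / norm(a2);

maxDiff = max(abs(dLoop - dVec));   % should be at round-off level
```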
@Walter, thanks for the explanation. It is clear that my RAM is too limited, so the for loop is the best option.
You seem to be confusing me with Walter; I gave the explanation of the RAM issue.
@Bruno, I appreciate your clarification. Your answer is very fast, but as you mentioned, my RAM cannot handle such big matrices. I guess sticking with the for loop is the best choice I have for now.
You could use a hybrid method: a for loop in which each iteration computes a chunk of 50 elements of d.
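One possible reading of the hybrid idea is sketched below: loop over blocks of slices and apply the matrix form to each block. The sizes are reduced for illustration (the thread uses 500, 1000, 1000), and `chunk`, `idx`, and `bc` are hypothetical names. The temporary memory per iteration grows with the chunk size, so the value would need tuning for the available RAM.

```matlab
% Chunked ("hybrid") sketch: vectorize within a block, loop over blocks.
rng(3);
n = 20; p = 4; q = 3;        % small illustrative sizes
chunk = 6;                   % hypothetical chunk size; the comment above suggests 50
a = rand(p, q);
b = rand(n, p, q);

a2 = a(:);
aNorm = norm(a2);
d = zeros(n, 1);
for first = 1:chunk:n
    idx = first:min(first + chunk - 1, n);         % rows handled in this block
    bc = reshape(b(idx, :, :), numel(idx), []);    % numel(idx)-by-(p*q) copy of the block
    d(idx) = ((bc * a2) ./ sqrt(sum(bc.^2, 2))) / aNorm;
end
```

With chunk = 1 this reduces to the plain loop, and with chunk = n it reduces to the fully vectorized form, so the chunk size trades memory for speed between those two extremes.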
@Bruno, thanks for suggesting the hybrid idea; I like it. Also, I am aware that you explained the RAM issue; I was telling Walter that the RAM limitation makes the for loop my best bet.

 Accepted Answer

d = sum(b .* reshape(a, 1, 100, 100), [2 3]);
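A small check of why this works, under assumed toy sizes: reshaping a to 1-by-p-by-q lets it expand along the first dimension of b, and summing over dimensions 2 and 3 then gives one value per slice. The vector-dimension form of `sum` needs R2018b or later and implicit expansion needs R2016b or later. The names `dLoop`, `dVec`, and `maxDiff` are illustrative.

```matlab
% Small-scale check that the vectorized expression matches the original loop.
rng(0);                          % reproducible random data
n = 5; p = 4; q = 3;             % illustrative sizes, not those from the question
a = rand(p, q);
b = rand(n, p, q);

% Reference: the loop from the question, with d preallocated.
dLoop = zeros(n, 1);
for i = 1:n
    c = reshape(b(i, :, :), p, q);
    dLoop(i) = sum(sum(a .* c));
end

% Vectorized: a becomes 1-by-p-by-q and expands along dimension 1 of b.
dVec = sum(b .* reshape(a, 1, p, q), [2 3]);

maxDiff = max(abs(dLoop - dVec));   % should be at round-off level
```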

More Answers (1)

d = b(:,:)*a(:)
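For readers unfamiliar with the `b(:,:)` idiom, the sketch below illustrates what it does on a small example: indexing a 3-D array with two colons merges dimensions 2 and 3, so each row holds one slice in the same column-major order as `a(:)`. The sizes and the names `b2`, `slice`, `sameAsReshape`, and `rowMatches` are assumptions for illustration.

```matlab
% What b(:,:) does to a 3-D array, on small illustrative sizes.
rng(1);
n = 5; p = 4; q = 3;
a = rand(p, q);
b = rand(n, p, q);

b2 = b(:, :);                                  % n-by-(p*q): dimensions 2 and 3 merged
sameAsReshape = isequal(b2, reshape(b, n, p*q));

% Row i of b2 is slice i flattened column by column, matching a(:).
i = 2;
slice = reshape(b(i, :, :), p, q);
rowMatches = isequal(b2(i, :), slice(:).');

% The matrix-vector product therefore gives one dot product per slice.
d = b2 * a(:);                                 % n-by-1
```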

1 Comment

Thanks Bruno for your smart answer.

Asked: 14 Aug 2020

Commented: 16 Aug 2020
