Why is the vectorized calculation slower?

Question

Mr M. on 10 May 2017

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/339730-why-is-the-vectorized-calculation-is-slower

Commented: Jan on 11 May 2017

The first part of the script is the vectorized form of the second part, but the second part is much faster (2 or 3 orders of magnitude). Why?

% Part 1: vectorized version
N = 10000000;
block = [0 1 1 0 0 0 1 0 0 1];
X = repmat(block,1,N);
N = length(X);
tic; 
IND0 = (X==0);
IND1 = not(IND0);
Y = nan(1,N);
Y(IND0) = cos(X(IND0));
Y(IND1) = sin(X(IND1));
result = prod(Y);
toc;
clear all
% Part 2: loop version
N = 10000;
block = [0 1 1 0 0 0 1 0 0 1];
X = repmat(block,1,N);
N = length(X);
tic;
result = 1;
for i = 1:N
  if X(i) == 0
      result = result*cos(X(i));
  else
      result = result*sin(X(i));
  end
end
toc;


Answer 1

Matt J on 10 May 2017

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/339730-why-is-the-vectorized-calculation-is-slower#answer_266489

Edited: Matt J on 10 May 2017

You are using a different N in each test: 10000000 in the first and 10000 in the second.

Also, you allocate a lot more memory in version 1, and this takes a lot of time. Version 1 creates at least 5 large arrays: IND0, X(IND0), X(IND1), cos(X(IND0)), and sin(X(IND1)), whereas the 2nd version allocates no memory at all.
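To make the memory point concrete, here is a minimal sketch that gives each temporary from version 1 its own name and asks whos how many bytes it occupies. The names X0, X1, C and S are hypothetical labels for the unnamed temporaries, and the input is kept small on purpose. In the real script the temporaries are not all alive at the same time, so this is an upper bound on what gets created, not a measurement of peak usage.

```matlab
block = [0 1 1 0 0 0 1 0 0 1];
X = repmat(block, 1, 1000);      % small input, for illustration only

IND0 = (X == 0);                 % logical mask, 1 byte per element
IND1 = ~IND0;                    % second logical mask
X0 = X(IND0);                    % temporary behind X(IND0)
X1 = X(IND1);                    % temporary behind X(IND1)
C  = cos(X0);                    % temporary behind cos(X(IND0))
S  = sin(X1);                    % temporary behind sin(X(IND1))
Y  = nan(1, numel(X));           % preallocated output
Y(IND0) = C;
Y(IND1) = S;

% Sum the sizes of everything version 1 creates besides X itself
info = whos('IND0', 'IND1', 'X0', 'X1', 'C', 'S', 'Y');
extraBytes = sum([info.bytes]);
inputInfo = whos('X');
fprintf('input: %d bytes, arrays created by version 1: %d bytes\n', ...
        inputInfo.bytes, extraBytes);
```

For this input the arrays created along the way add up to roughly three times the size of X, while the loop only needs a scalar accumulator.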


Matt J on 11 May 2017

Edited: Matt J on 11 May 2017

"but this is always a problem during vectorization of for loops"

It is a problem, which is why it is hard to know sometimes whether vectorized code or a loop will be better.

Usually, though, the thing that slows down loops is that in each iteration, you are allocating significant memory or calling non-optimized, non-builtin functions. In your case, none of that is true.
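As an illustration of the first cause, the following sketch times a loop that grows its output array in every iteration against the same loop with the output preallocated. The variable names and the size n are assumptions chosen for the example, and the measured gap will depend on the MATLAB version and machine.

```matlab
n = 1e4;

% Loop that allocates in every iteration: the array grows on each pass
tic;
grown = [];
for k = 1:n
    grown(end+1) = sin(k); %#ok<SAGROW>
end
tGrow = toc;

% Same computation with the output allocated once, before the loop
tic;
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = sin(k);
end
tPre = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPre);
```

The loop in the question resembles the second form even more closely, since it only updates a scalar.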

"Is it possible to vectorize better?"

It's hard to know from your example, because it is obviously artificial. I believe the most efficient method for this example is,

   n=sum(X);          % number of ones; each zero contributes cos(0)=1
   result=sin(1)^n;
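As a sanity check on the closed form: X contains only zeros and ones, cos(0) = 1, so the product reduces to sin(1) raised to the number of ones. The sketch below compares that against the loop from the question on a deliberately short X (5 repetitions is an arbitrary choice for the example).

```matlab
% Small input so the product stays well inside double range
block = [0 1 1 0 0 0 1 0 0 1];
X = repmat(block, 1, 5);

% Loop version from the question
loopResult = 1;
for i = 1:numel(X)
    if X(i) == 0
        loopResult = loopResult * cos(X(i));
    else
        loopResult = loopResult * sin(X(i));
    end
end

% Closed form: only the ones contribute a factor of sin(1)
closedResult = sin(1)^sum(X);

relErr = abs(loopResult - closedResult) / abs(closedResult);
fprintf('loop = %.6g, closed form = %.6g, relative error = %.2g\n', ...
        loopResult, closedResult, relErr);
```

A short X is used because, with the array sizes in the question, the product underflows to 0 in double precision and any comparison becomes uninformative.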

Jan on 11 May 2017

If N is the same in both cases, I get with N=1e6: 0.566 sec for the vectorized code and 0.762 sec for the loop. For larger arrays the vectorized code can exceed the available RAM, and swapping to disk will slow down the processing massively.
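A like-for-like comparison along these lines can be sketched as follows. Both versions run on the same X; the repeat count of 1e5 is an assumption chosen to keep memory use modest, and the timings will differ from the figures above depending on machine and MATLAB version.

```matlab
N = 1e5;                          % same repeat count for both versions
block = [0 1 1 0 0 0 1 0 0 1];
X = repmat(block, 1, N);
L = numel(X);

% Vectorized version
tic;
IND0 = (X == 0);
IND1 = ~IND0;
Y = nan(1, L);
Y(IND0) = cos(X(IND0));
Y(IND1) = sin(X(IND1));
resultVec = prod(Y);
tVec = toc;

% Loop version
tic;
resultLoop = 1;
for i = 1:L
    if X(i) == 0
        resultLoop = resultLoop * cos(X(i));
    else
        resultLoop = resultLoop * sin(X(i));
    end
end
tLoop = toc;

fprintf('vectorized: %.3f s, loop: %.3f s\n', tVec, tLoop);
```

For steadier numbers, each version could be wrapped in a function handle and measured with timeit, which runs the code several times.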
