is x' * x faster than x.^2 (x is a vector)

Question

Yuji Zhang on 3 Oct 2014

1
Link

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/157314-is-x-x-faster-than-x-2-x-is-a-vector

Edited: Matheus Rocha on 9 Aug 2021

Accepted Answer: John D'Errico

Hi~ Somebody recommended me to use x' * x.

Is it faster than sum(x.^2) ? Any thoughts are appreciated. Thanks!

[EDITED, sum() was forgotton at first]

1 Comment
Show -1 older commentsHide -1 older comments

Bruce Elliott on 9 Jan 2019

I had the same question. Years ago we were taught that even for scalars x*x would be faster than x^2, since the "power" function actually evaluates x^n as exp(n*ln(x)), in order to handle non-integer exponents.

Is that true for Matlab? I just ran some simple timing tests, and it does not seem to be true. I'm guessing that the power function is smart enough to detect integer exponents.

Sign in to comment.

Sign in to answer this question.

Answer 1

John D'Errico on 3 Oct 2014

9
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/157314-is-x-x-faster-than-x-2-x-is-a-vector#answer_153849

Edited: John D'Errico on 3 Oct 2014

Open in MATLAB Online

Um, the two do completely different things. Or, at least significantly different things.

Assuming that x is a column vector, x'*x will yield the SUM of squares of the elements, returning a SCALAR value. This is a dot product.

x.^2 returns a vector of squares of the elements.

And, of course, if x is a row vector, then x'*x yields an n by n array of all products of elements.

So whoever told you the two do the same thing was simply wrong. Had they told you to compare x'*x (with x a column vector) with sum(x.^2), then they would have been correct that the two do the same thing, as long as x contains only real values. If x has complex numbers in it, then again, they would have been wrong, even if you included that sum.

x = [1 + i; 2 - i]
x =
            1 +          1i
            2 -          1i
x'*x
ans =
     7
sum(x.^2)
ans =
            3 -          2i

So you need to think carefully about what you need to do. Do you need the sum of the squares? Then you need to compare the dot product with the sum of squares. Do you want to form a vector of squares of the elements? Then compare x.*x with x.^2, with no transpose. Only you know what you are looking for.

IF x is a real column vector AND you use a sum in the second version, then and only then can you compare things, but even then one must be careful. The length of the vectors will be crucial, since MATLAB probably makes decisions on how to do those computations based on the length. Well, actually it is probably the BLAS that makes those decisions. And for small vectors, the difference is nothing to worry about anyway.

But why not test it yourself, rather than asking a question? You will ALWAYS learn something by playing around. Using Steve Eddins timeit from the FEX, I find that...

x = rand(1e6,1);
timeit(@() x'*x)
ans =
    0.0003879
timeit(@() sum(x.^2))
ans =
    0.0057262

For larger vectors, the difference is similar. This time I'll try a couple of other options, just for kicks.

x = rand(1e7,1);
timeit(@() x'*x)
ans =
    0.0075326
timeit(@() sum(x.^2))
ans =
     0.056066
timeit(@() norm(x).^2)
ans =
     0.068013
timeit(@() dot(x,x))
ans =
     0.048728
timeit(@() sum(x.*x))
ans =
     0.048614
timeit(@() sum(bsxfun(@times,x,x)))
ans =
     0.033474

The reason it is a bit faster is probably because the vector dot product is implemented in the BLAS using pretty highly optimized code, but it looks like the dot version costs due to some function overhead. But unless your vectors are seriously large, and you are doing the computation a lot, don't bother. And in any case, make sure you understand when the two will yield different results. (I was surprised by the way that the bsxfun version was second fastest.)

Similarly, if you really wanted to compute a vector of squares of elements, then again, a simple test is best.

x = rand(1e7,1);
timeit(@() x.*x)
ans =
     0.047043
timeit(@() x.^2)
ans =
     0.047777

Note that the two results are VERY close here. By the way, see that my use of timeit is important, as tic and toc are a dangerous way to compare time. (A simple use of tic and toc does not factor in the startup overhead, so the first time a function is called, it takes longer to run. As well, a single call is not a good estimator of the time required, since your own CPU often does things behind the scene that may cost time. You must make replicate calls, and average the time after deleting the first few. timeit does all this for you and more.)

Next, it is important to recognize that there are differences if the comparison is done as a script, versus inside a function. If your computations are done inside functions, then MATLAB does additional accelerations that are not done in scripts. So be careful when comparing things at the command line, and assuming that your timed results will be valid in a function!

And, yes, please accept the answer that does give a clear AND complete answer to your question.

5 Comments
Show 3 older commentsHide 3 older comments

David Young on 5 Oct 2014

Open in MATLAB Online

A small point: if you have a recent version of MATLAB, you don't need to get timeit from the FEX - it's in the system:

>> which timeit
C:\Program Files\MATLAB\R2013b\toolbox\matlab\timefun\timeit.m

John D'Errico on 5 Oct 2014

Neat. I've long wondered why Steve's great FEX submission did not get into MATLAB. It finally did.

Sign in to comment.

Answer 2

José-Luis on 3 Oct 2014

4
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/157314-is-x-x-faster-than-x-2-x-is-a-vector#answer_153847

Edited: José-Luis on 3 Oct 2014

Open in MATLAB Online

Why don't you test it?

x = rand(10^6,1);
tic
res1 = x.*x; %The transpose is unnecessary;
toc
res2 = x.^2;
toc

Elapsed time is 0.005957 seconds.

Elapsed time is 0.016413 seconds.

Please accept the answer that best solves your problem.

2 Comments
Show NoneHide None

Yuji Zhang on 4 Oct 2014

Hi Jose thanks for your help! I voted for you!

Matheus Rocha on 9 Aug 2021

Edited: Matheus Rocha on 9 Aug 2021

A tic is necessary after the first toc in order to reset the time counter. Without this, the second toc will return the time spent in computing both res1 and res2.

Sign in to comment.

is x' * x faster than x.^2 (x is a vector)

1 Comment
Show -1 older commentsHide -1 older comments

Accepted Answer

5 Comments
Show 3 older commentsHide 3 older comments

More Answers (1)

2 Comments
Show NoneHide None

See Also

Categories

Tags

Community Treasure Hunt

is x' * x faster than x.^2 (x is a vector)

1 Comment Show -1 older commentsHide -1 older comments

Accepted Answer

5 Comments Show 3 older commentsHide 3 older comments

More Answers (1)

2 Comments Show NoneHide None

See Also

Categories

Tags

Community Treasure Hunt

1 Comment
Show -1 older commentsHide -1 older comments

5 Comments
Show 3 older commentsHide 3 older comments

2 Comments
Show NoneHide None