Why is x(:) so much slower than reshape(x,N,1) with complex arrays?

Question

Matt J on 27 Jul 2021

https://au.mathworks.com/matlabcentral/answers/887219-why-is-x-so-much-slower-than-reshape-x-n-1-with-complex-arrays

Edited: Matt J on 26 May 2022

The two for loops below differ only in the flattening operation used to obtain A_1D. Why is the run time so much worse with A_3D(:) than with a call to reshape()?

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(N,1);
tic
for k = 1:20
    B = reshape( A0, [Nz,Ny,Nx] ) ;
    A_3D = fftn(B);
    A_1D = reshape( A_3D, N,1); %<--- Version 1
end
toc
Elapsed time is 3.770859 seconds.
tic
for k = 1:20    
    B = reshape( A0, [Nz,Ny,Nx] ) ;
    A_3D = fftn(B);
    A_1D = A_3D(:); %<--- Version 2
end
toc
Elapsed time is 5.056827 seconds.
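
Both loops above also time reshape and fftn, so the flattening step is only part of the measured time. A minimal sketch of how it could be isolated with timeit, assuming the same array sizes as in the question (the names t_reshape and t_colon are illustrative, not from the original post):

```matlab
Nx = 256; Ny = 256; Nz = 128;
A_3D = fftn(rand(Nz, Ny, Nx));                  % complex array, computed once outside the timers

t_reshape = timeit(@() reshape(A_3D, [], 1));   % flatten via reshape
t_colon   = timeit(@() A_3D(:));                % flatten via colon indexing

fprintf('reshape: %.3g s, colon: %.3g s\n', t_reshape, t_colon);
```

timeit runs each handle several times and reports a typical time, so the result should be less sensitive to a single slow first call than one tic/toc pair.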

7 Comments

Stephen23 on 28 Jul 2021

Edited: Stephen23 on 28 Jul 2021

@Bruno Luong: does RESHAPE also copy the data?

If not, then does this mean that one array in memory can be linked to two or more meta-headers (with different array sizes)?

Bruno Luong on 28 Jul 2021

I must admit that understanding why and when MATLAB makes a data copy has become obscure to me over the last few years. I have not reached a full understanding of how it works.

Answer 1

Matt J on 28 Jul 2021

The following simple test seems to support @Bruno Luong's conjecture that (:) results in data copying. The data of B1 resulting from reshape() has the same data pointer location as A, but B2 generated with (:) points to different data.

format debug
A=complex(rand(2),rand(2))
A = 
Structure address = 7f3f47f4e0e0
m = 2
n = 2
pr = 7f3fcb0112e0

   0.5114 + 0.6181i   0.5881 + 0.4450i
   0.5713 + 0.9018i   0.3682 + 0.8103i
B1=reshape(A,4,1),
B1 = 
Structure address = 7f3fcf1f4be0
m = 4
n = 1
pr = 7f3fcb0112e0

   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
B2=A(:)
B2 = 
Structure address = 7f3f47e45a20
m = 4
n = 1
pr = 7f3faff0b980

   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
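
As far as I know, format debug is an undocumented display mode, so a timing-based cross-check may be useful alongside it. The idea: if an operation copies the data, its run time should grow roughly with the number of elements, whereas an operation that only creates a new header over shared data should stay roughly flat. A rough sketch, with sizes and variable names chosen arbitrarily for illustration:

```matlab
sizes = [1e5, 1e6, 1e7];          % element counts to test (arbitrary choices)
t = zeros(numel(sizes), 2);       % column 1: reshape, column 2: colon

for k = 1:numel(sizes)
    n = sizes(k);
    A = complex(rand(n/100, 100), rand(n/100, 100));   % complex matrix with n elements
    t(k,1) = timeit(@() reshape(A, [], 1));
    t(k,2) = timeit(@() A(:));
    fprintf('n = %8d: reshape %.3g s, colon %.3g s\n', n, t(k,1), t(k,2));
end
```

For the smallest sizes timeit may warn that the function is too fast to measure accurately, so only the trend across rows is meaningful.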

8 Comments

Matt J on 28 Jul 2021

Edited: Matt J on 28 Jul 2021

MathWorks tech support got back to me. As @Bruno Luong predicted, they claim this to be a feature since R2015b. Apparently, because subsref indexing operations generally result in data copying (paraphrasing), it was decided this would be true for A0(:) as a special case as well. Why this is only true for complex A0 and not real A0, I did not get a clear answer on. Their reply:

I understand that you are observing the differences in performances between reshape and colon operation.

Since MATLAB R2015b, the colon operator, A0(:) is an indexing operation. For the provided code, MATLAB is going through every row and column, which is not computationally fast.

On the other hand, the ‘reshape’ command will only change the property of the created array, which is a rather fast process.

For your interests, I have also timed the code across different releases of MATLAB. The result is documented below:

MATLAB release                             Colon operator    Reshape
8.3.0.85671 (R2014a)                       7.1884e-07        1.0690e-06
8.5.0.204617 (R2015a)                      6.4574e-07        1.0706e-06
8.6.0.267246 (R2015b)                      0.0487            5.1078e-07
9.0.0.341360 (R2016a)                      0.0493            5.6105e-07
9.6.0.1072779 (R2019a)                     0.041046          1.0141e-06
9.9.0.1467703 (R2020b)                     0.040691          7.0104e-07
9.10.0.1684407 (R2021a) Update 3           0.040806          5.7803e-07

You can see the changes happened since MATLAB R2015b. If you would like further details on what has been altered under the hood, please feel free to reach out. Otherwise, I will close the case for now. Please do not hesitate to let me know if you have further questions on the matter.
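
To get the same two numbers for another release, something like the following could be run there. This is only a guess at the shape of the benchmark, since the support reply does not show the code it timed. The array size is taken from the question, and version supplies a release string in the same format as those listed above.

```matlab
A0 = complex(rand(256, 256, 128), rand(256, 256, 128));   % assumed test array

t_colon   = timeit(@() A0(:));
t_reshape = timeit(@() reshape(A0, [], 1));

fprintf('MATLAB %s\n', version);           % version string of the running release
fprintf('Colon operator: %g\n', t_colon);
fprintf('Reshape: %g\n', t_reshape);
```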

G A on 14 Aug 2021

Walter, I am discussing complex valued arrays, it can be

max(A,[],'all')

but anyway for a complex number max(A) = max(abs(A))
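
One caveat on the last line: as I understand the max documentation, for complex input max compares magnitudes but returns the complex element itself, so the two expressions agree only after taking abs of the result. A small check on made-up data (the 'all' option needs R2018b or later):

```matlab
A = complex(randn(4, 4, 4), randn(4, 4, 4));   % arbitrary complex test array

m_complex = max(A, [], 'all');        % complex element with the largest magnitude
m_abs     = max(abs(A), [], 'all');   % largest magnitude, a real number

fprintf('max(A):      %g%+gi\n', real(m_complex), imag(m_complex));
fprintf('max(abs(A)): %g\n', m_abs);
fprintf('difference of magnitudes: %g\n', abs(m_complex) - m_abs);
```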

Walter Roberson on 14 Aug 2021

The (:) options are the slowest. reshape(abs(A),N,1) might possibly be the fastest -- there is notable variation in different runs.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = complex(randn(Nx, Ny, Nz), randn(Nx, Ny, Nz));
t(1) = timeit(@() use_abs_all(A0, N), 0)
t = 0.0937
t(2) = timeit(@() use_abs_colon(A0, N), 0)
t = 1×2
    0.0937    0.1727
t(3) = timeit(@() use_abs_reshape_null(A0, N), 0)
t = 1×3
    0.0937    0.1727    0.0994
t(4) = timeit(@() use_abs_reshape_N(A0, N), 0)
t = 1×4
    0.0937    0.1727    0.0994    0.0935
t(5) = timeit(@() use_all(A0, N), 0)
t = 1×5
    0.0937    0.1727    0.0994    0.0935    0.1012
t(6) = timeit(@() use_colon(A0, N), 0)
t = 1×6
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802
t(7) = timeit(@() use_reshape_null(A0, N), 0)
t = 1×7
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802    0.1013
t(8) = timeit(@() use_reshape_N(A0, N), 0)
t = 1×8
    0.0937    0.1727    0.0994    0.0935    0.1012    0.1802    0.1013    0.1018
cats = categorical({'abs(all)', 'abs(:)', 'reshape(abs,[])','reshape(abs,N)', 'all', '(:)', 'reshape([])', 'reshape(N)'});
bar(cats, t)
function B = use_abs_all(A, N)
    B = max(abs(A), [], 'all');
end
function B = use_abs_colon(A, N)
    B = max(abs(A(:)));
end
function B = use_abs_reshape_null(A, N)
    B = max(reshape(abs(A), [], 1));
end
function B = use_abs_reshape_N(A, N)
    B = max(reshape(abs(A), N, 1));
end
function B = use_all(A, N)
    B = max(A, [], 'all');
end
function B = use_colon(A, N)
    B = max(A(:));
end
function B = use_reshape_null(A, N)
    B = max(reshape(A, [], 1));
end
function B = use_reshape_N(A, N)
    B = max(reshape(A, N, 1));
end

Answer 2

Walter Roberson on 28 Jul 2021

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
timeit(@() use_colon(A0, N), 0)
ans = 8.3490e-06
timeit(@() use_reshape_null(A0, N), 0)
ans = 6.5490e-06
timeit(@() use_reshape_N(A0, N), 0)
ans = 6.0925e-06
function use_colon(A, N)
   B = A(:);
end
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
   B = reshape(A, N, 1);
end

In this particular test, the timing is close enough that we can speculate some reasons:

Using an explicit size to reshape to is faster than reshape([]) because reshape([]) has to spend time calculating the size based upon dividing numel() by the size of the known parameters.

Using (:) versus reshape() is not immediately as clear. The model for (:) is that it invokes subsref() with struct('type', '()', 'subs', {{':'}}) and then subsref() has to invoke reshape(). I say "model" because the Execution Engine could potentially optimize all of this, and one would tend to think that optimization of (:) should be especially good.
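
The two speculations above can be made concrete with a small sketch. It only illustrates the size arithmetic and the subsref model described here; it says nothing about what the Execution Engine actually does internally. The variable names are my own.

```matlab
A = rand(4, 6, 5);

% reshape(A, [], 1): the placeholder dimension is numel(A) divided by the
% product of the dimensions that were given explicitly (here just 1).
inferredRows = numel(A) / 1;
B1 = reshape(A, [], 1);
B2 = reshape(A, inferredRows, 1);

% A(:) written as the explicit subsref call it is modelled on.
S  = substruct('()', {':'});
B3 = subsref(A, S);

disp(isequal(B1, B2, B3, A(:)));   % all four flattenings hold the same values
```

substruct builds the same type/subs structure that subsref expects, which avoids getting the nested cell braces wrong by hand.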

10 Comments

Adam Danz on 10 Aug 2021

Edited: Adam Danz on 10 Aug 2021

When I run your example (modified to store and plot values) using the run feature (first plot) and using Matlab online (second plot) I get conflicting results.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
T = nan(1,3);
T(1) = timeit(@() use_colon(A0, N), 0);
T(2) = timeit(@() use_reshape_null(A0, N), 0);
T(3) = timeit(@() use_reshape_N(A0, N), 0);
bar(categorical({'colon','reshapeNull','reshape'}),T)
title('Run feature')
function use_colon(A, N)
    B = A(:);
end
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
    B = reshape(A, N, 1);
end

Results of the exact same code using Matlab Online (same platform and Matlab release)

When I run it on my local copy of Matlab (same release, Windows 10 Pro), the colon method was slower the first time, but on subsequent runs it was faster than the reshape methods. There were also some warnings that the measured time may be inaccurate due to fast execution. Using the tic/toc method with repeated measures to capture variability, on my system the colon method with real numbers is fastest.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
n = numel(A0); 
nIterations = 500;  % number of iterations to include within the timer
nReps = 100;        % number of times to repeat the process to measure variability
durations = nan(nReps,3);
for i = 1:nReps
    
    T = tic; 
    for j = 1:nIterations
        y = A0(:); 
    end
    durations(i,1) = toc(T); 
    
     T = tic; 
    for j = 1:nIterations
        y = reshape(A0,[],1); 
    end
    durations(i,2) = toc(T); 
    
     T = tic; 
    for j = 1:nIterations
        y = reshape(A0,n,1); 
    end
    durations(i,3) = toc(T);
end
figure
boxplot(durations, 'labels',{'colon','reshapeNull','reshape'})
grid on
ylabel(sprintf('Duration of %d iterations (sec)',nIterations))
xlabel('Method')
title(sprintf('Summary of tic/toc timing repeated %d times (real numbers).',nReps))
subtitle('Win 10 Pro; R2021A update 4')

Walter Roberson on 10 Aug 2021

Edited: Walter Roberson on 11 Aug 2021

I took your earlier plot version and ran it on my desktop, on the Run feature here, and in LiveScript on my desktop. I modified it to scale the plot relative to the slowest, to make it easier to compare relative rates. I also modified it to return values from the functions, to avoid the possibility that the Execution Engine might optimize away the work because the variable is not returned.

On the desktop, for both .m and .mlx, colon was fastest in all tests.

The time requirements did not vary much for the .m version. reshape([]) was typically about 2.5 times slower than colon.

The time requirements for the colon test for the .mlx varied quite a lot, sometimes taking twice as long. The reshape() timings did not vary nearly as much. Because of that, the relative ratios between colon and reshape([]) varied quite a bit, from about 1.5 to 4.

Bringing the code over to the Run feature here, colon was almost always slowest. Furthermore, the minimum timings (for reshape(N)) were pretty much 10 times slower than what I was seeing on my desktop -- where that reshape would take about 6e-7 on desktop, it takes about 6e-6 here in the Run feature.

Walter Roberson on 11 Aug 2021

@Adam Danz, I could use another pair of eyes in looking at this.

I noticed when I was running your code on my desktop, that every time I had a large timing outlier on colon. My tests showed that it was always the very first run. When I poked around, I realized that there had to be some kind of internal optimization going on. To reduce the effects of "premature optimization", I moved the operative code into functions, and I added recreation of A0 for each repetition.

Please run the code below with seperate set to true and then to false, and notice the substantial difference in rates between the runs.

To try to deal with the initial spike in timings for colon, I decided that I would call the work functions once to "prime the pump". That was not enough, so now I loop calling them several times to warm up the system and get all the Execution Engine optimization of the functions out of the way. But... with seperate = false, I am still seeing the spike on durations(1,1)!

The only thing I have been able to think of at the moment is that when I prime the pump, I am not saving the output of the calls to a variable, and that might be affecting the timing ??

By the way, have a look at the recorded d2 values -- the timing of the priming cycles. They are notably different than the other timings... and I see unexpected spikes early on, optimized times mixed with unoptimized times.

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
nIterations = 500;  % number of iterations to include within the timer
nReps = 100;        % number of times to repeat the process to measure variability
durations = nan(nReps,3);
d2 = nan(nReps,3);
seperate = false;
if ~seperate; A0 = rand(Nx, Ny, Nz); end
for i = 1:nReps
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_colon(A0,N); end; d2(i,1) = toc; %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_colon(A0,N); 
    end
    durations(i,1) = toc(T); 
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_reshape_null(A0,N); end; d2(i,2) = toc; %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_reshape_null(A0,N); 
    end
    durations(i,2) = toc(T); 
    
    if seperate; A0 = rand(Nx, Ny, Nz); end
    tic; for j = 1 : 5; A0_reshape_N(A0, N); end; d2(i,3) = toc;   %prime the pump
    T = tic; 
    for j = 1:nIterations
        y = A0_reshape_N(A0,N); 
    end
    durations(i,3) = toc(T);
end
figure
boxplot(durations, 'labels',{'colon','reshapeNull','reshape'})
grid on
ylabel(sprintf('Duration of %d iterations (sec)',nIterations))
xlabel('Method')
title(sprintf('Summary of tic/toc timing repeated %d times (real numbers).',nReps))
function y = A0_colon(A0,~)
    y = A0(:);
end
function y = A0_reshape_null(A0,~)
    y = reshape(A0, [], 1);
end
function y = A0_reshape_N(A0,N)
    y = reshape(A0, N, 1);
end
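
The warm-up idea discussed above can be reduced to a small pattern: run the function a few times and discard those calls, then time individual calls and summarize with the median, which a single first-call spike barely moves. This is a sketch of the general technique, not a fix for the spike itself; the array size and the counts nWarm and nReps are arbitrary choices of mine.

```matlab
A0 = rand(256, 256, 16);
f  = @() reshape(A0, [], 1);      % operation under test; swap in @() A0(:) to compare

nWarm = 10;                       % untimed warm-up calls
nReps = 50;                       % timed calls

for k = 1:nWarm
    y = f();                      % output is kept so the call is not a no-op
end

t = zeros(nReps, 1);
for k = 1:nReps
    tStart = tic;
    y = f();
    t(k) = toc(tStart);
end

fprintf('median %.3g s, max %.3g s\n', median(t), max(t));
```

Comparing median(t) with max(t) gives a quick sense of whether outliers like the ones seen at durations(1,1) are still present after warming up.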

Walter Roberson on 11 Aug 2021

I had the hypothesis that the 5 might have to do with my having 4 cores, or with the number of priming iterations I did, so I tested on my system that has more cores, and I did more priming iterations. The result was the same: durations(1,1) still had the major peak, and durations(5,1) was reliably a secondary peak.

Adam Danz on 12 Aug 2021

I noticed that when I re-run it within a script without clearing variables, the second peak at x=5 vanishes. Still curious but out of ideas.

Answer 3

Matt J on 26 May 2022

Edited: Matt J on 26 May 2022

I was just told by Tech Support that the issue was fixed in R2022a, but it doesn't appear that way:

Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
A0=complex(A0,A0);
timeit(@() A0(:), 0)
ans = 0.0530
timeit(@() use_reshape_null(A0, N), 0)
ans = 6.5199e-06
timeit(@() use_reshape_N(A0, N), 0)
ans = 6.8033e-06
function use_reshape_null(A, N)
    B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
   B = reshape(A, N, 1);
end
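
Given these timings, one possible workaround in code that flattens complex arrays in a hot loop is a small helper built on reshape. The name flatten is hypothetical and not a MATLAB built-in. Going through an anonymous function adds some call overhead of its own, so writing reshape(x, [], 1) inline is the leaner option.

```matlab
flatten = @(x) reshape(x, [], 1);     % hypothetical helper name, stands in for x(:)

A0 = complex(rand(64, 64, 32), rand(64, 64, 32));
v  = flatten(A0);

assert(iscolumn(v) && isequal(v, A0(:)));   % same shape and values as A0(:)
```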
