Help to make a code faster/memory efficient

Dear all, I am trying to improve the computation speed/memory usage of a code I've done for a electromagnetism problem. I have converted the arrays to single values which has helped a bit and also tried converting to gpuArrays but this seems slower. I've copied the most time consuming part of my code, other parts are coded similarly so any suggestion in this piece of code would be helpful. It seems the cross product calculation consumes a lot of time, I think maybe using something like pagefun, arrayfun? but I have no idea how to convert this to that. Any ideas would be deeply appreciated.
magnetic=zeros(19128960,3,'single');
for u = 1:2592
for n = 1:7380
q =1+(n-1)+(7380*(u-1));
magne1=cross((PosS(n,:)-PosSeg1(u,:)),(ExtPosSeg1(u,:)-PosSeg1(u,:)));
magnetic(q,1)=magne1(:,1);
magnetic(q,2)=magne1(:,2);
magnetic(q,3)=magne1(:,3);
end
end

 Accepted Answer

Cedric
Cedric on 11 Sep 2017
Edited: Cedric on 11 Sep 2017
Check this out:
n1 = 7380 ;
n2 = 2592 ;
PosS = rand( n1, 3 ) ;
PosSeg1 = rand( n2, 3 ) ;
ExtPosSeg1 = rand( n2, 3 ) ;
Loop-based:
tic ;
magnetic=zeros(n1*n2, 3,'single');
for u = 1:n2
for n = 1:n1
q =1+(n-1)+(n1*(u-1));
magne1=cross((PosS(n,:)-PosSeg1(u,:)),(ExtPosSeg1(u,:)-PosSeg1(u,:)));
magnetic(q,1)=magne1(:,1);
magnetic(q,2)=magne1(:,2);
magnetic(q,3)=magne1(:,3);
end
end
toc
outputs
Elapsed time is 86.585071 seconds.
and here is a vector version
tic ;
magnetic_CW = reshape( permute( cross( ...
bsxfun( @minus, permute( PosS, [3,2,1] ), PosSeg1 ), ...
repmat( ExtPosSeg1 - PosSeg1, 1, 1, n1 )), [2,3,1] ), 3, [] ).' ;
toc
outputs
Elapsed time is 2.267693 seconds.
Both outputs are equal:
isequal( magnetic, single( magnetic_CW ))
ans =
logical
1
No need to convert to single, however, I did it for the comparison.
Cheers,
Cedric
NOTES
  1. It looks awfully complicated because I wanted to avoid creating intermediary variables, but what we do is just a cross product on two 3D arrays, one containing PosS-PosSeg1 for all columns of both (hence the call to BSXFUN), and the second being the corresponding 3D array of ExtPosSeg1-PosSeg1. All the rest is about permuting/reshaping/repeating so dimensions are appropriate.
  2. You may not need to work with 2D arrays actually, and you can probably eliminate the external RESHAPE and PERMUTE operations.

5 Comments

Fernando
Fernando on 11 Sep 2017
Edited: Fernando on 11 Sep 2017
Looks good to me, in just this type of operation was taking about 41 s, this seems to make it about 1 s. I had some other similar operations in the same for loop and the whole thing takes about 7 mins. This will probably make the whole thing faster.
I should say I had seen this function but I am a bit unsure about how to use it and I was not sure if it was applicable to what I am doing. I am going to try to understand it, thanks a lot this will help a lot as I am trying to make an analytical approximation to a FEM calculation in COMSOL that takes a bit of time to solve.
PD. I didn't get what you meant in note 2.
If you split the one liner and create a few intermediary variables, to understand:
A = bsxfun( @minus, permute( PosS, [3,2,1] ), PosSeg1 ) ;
B = repmat( ExtPosSeg1 - PosSeg1, 1, 1, n1 ) ;
C = cross( A, B ) ;
magnetic_CW = reshape( permute( C, [2,3,1] ), 3, [] ).' ;
you see that we use the ability of CROSS to work on 3D arrays. What comes before is for building two 3D arrays with appropriate content, and what comes after is for reshaping the output so it matches the shape of your original output.
These operations take time, not as much as compute the cross product but still. If you look inside the CROSS function ( open cross ) you will see more calls to (I)PERMUTE and RESHAPE.
What happens is quite a few conversions 2D->3D->2D->3D... Often, the fastest computation that you can get is by working on the largest possible arrays (so MALTAB and underlying libs can optimize). The point of note 2 is just that the 3D array C may be already shaped in a way that is advantageous for the next computation, so you may want to evaluate if it is worth reshaping is back to your original shape at this stage, or if it is better to perform further computations first and reshaping later.
The most complicated part to understand is the call to BSXFUN. To better understand, look at this:
>> bsxfun( @times, 1:5, [10;20;30] )
ans =
10 20 30 40 50
20 40 60 80 100
30 60 90 120 150
where the first array is 1x5 and the second 3x1. You can see what the implicit expansion does (here when the operation is a product, but you can pass handles to any appropriate functions).
In your case (think in 3D and xyz axes) I rotate PosS that is 7380x3x1 (in the xy plan) around the y axis (by permuting dims 1 and 3), so it becomes vertical (in the yz plan). So now I have the rotated/permuted PosS that is 1x3x7380 and PosSeg1 that is 2592x3x1 (in the xy plan) and I can perform an expansion (here the operation is minus). The expansion "fills the volume defined by the two arrays" in a sense.
The outcome is a 2592x3x7830 array with all relevant differences.
The next step is to create the corresponding 3D array of ExtPosSeg1-PosSeg1 ..
Oh I understand now.
Thanks a lot, I have managed to implement this vectorization in the different calculations of that for loop in my code, now it works 140 times faster in that part. Thanks a lot.

Sign in to comment.

More Answers (0)

Categories

Find more on Parallel Computing in Help Center and File Exchange

Asked:

on 11 Sep 2017

Edited:

on 12 Sep 2017

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!