How can I speed up these for loops?

I have a group of 3 nested for loops that will calculate how much light is hitting a rectangular array of points if I know where the fixtures are located, in relation to the array of points. The results are used to determine how to best arrange a group of the same fixture in a space so there is a uniform amount of light at 5 different light levels, and at 8 different mounting heights, and it generally takes 10 to 20 calculations to find the best arrangement. At worst case scenario this is running 5*8*20 (800) times.
In the following code 'rows' and 'columns' are vectors of the same size that contains x and y coordinates of the of calculation points I need. 'xFixtureLocations' and 'yFixtureLocations' are vectors of the same size that contain the x and y coordinates of the fixtures. Finally, there is a struct 'IESdata' that contains 3 variables 'HorizAngles' (a vector of size 37), 'VertAngles' (a vector of size 37), and, 'photoTable' (an array of size m x n) that describes the intensity of the light in the steradian centered on the 'HorizAngles(m)' and 'VertAngles(n)' for the fixture in the calculation.
As of right now, I know that the calculation that is generated by this system is correct because I have compared it to programs that have been verified by the industry, namely AGI32. I have tried to parallelize the code by making the first loop a 'parfor' but I did not see a decent of improvement in how long it takes to run.
Irr = zeros(length(rows),length(columns),length(xFixtureLocations));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPt*180/pi, thetaPt*180/pi,'*nearest', 0.);
Irr(i1,i2,i3) = round((Ipt*cos(thetaPt)/dsq)*(Multiplier/1000),3);
end
end
end
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));

1 Comment

This would likely be easier to visualize with a (small) sample dataset to accompany it and I don't have time at the moment to pore over it in enough detail to really work out the details, but...
It strikes me you could generate the [X,Y] location arrays for the two sets of points (calculational and fixtures, respectively) and then use pdist2 to return the distance and then just do the calculation on that result. Seems as though could basically remove all the looping...
As said, a small sample case and code that could run in its entirety would surely help folks to do something without having to develop a baseline from which to start as well...

Sign in to comment.

 Accepted Answer

You're calling interp2 800 times with a single point each. Just moving that outside the loop should be able to speed things up considerably:
[phiPtall, thetaPtall] = deal(zeros(length(rows), length(columns), length(xFixtureLocations)));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
phiPtall(i1,i2,i3) = phiPt*180/pi;
thetaPtall(i1,i2,i3) = thetaPt*180/pi;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPtall, thetaPtall,'*nearest', 0.);
Irr = round((Ipt.*cos(thetaPtall)./dsq).*(Multiplier./1000),3);
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));
As dpb mentioned, the calculations of the angles also look like they can be vectorized pretty well, but that would be more easily tested with some sample data.

5 Comments

I have attached a copy of the data that gets generated right before this loop in my code, as well as the results of running this through my original version of the loop, as well as your version of the loop.
Unfortunately, your version doesn't seem to work the same as there are a lot of negative numbers in my result, and that shouldn't happen. I think this is happening because of the line:
Irr(i1,i2,i3) = round((Ipt*cos(thetaPt)/dsq)*(Multiplier/1000),3);
Which can be replaced with:
Irr(i1,i2) = Irr (i1,i2) + round((Ipt*cos(thetaPt)/dsq)*(Multiplier/1000),3);
I don't believe that's the way interp2 works.
ps. I have the Multiplier set to 1, and all orientations set to 0 in this example.
Sorry I found the actual reason. thetaPtall is in Degrees, and cos needs everything to be in Radians. Add in deg2rad and everything works. In fact, it worked so well that a program that took over 2 hours to finish prior, now only took 4 minutes.
To compute the cosine of an angle in degrees, consider using cosd instead of converting using deg2rad yourself.
Timothy Plummer
Timothy Plummer on 12 Jun 2017
Edited: Timothy Plummer on 12 Jun 2017
So after iterating through this solution a lot, and expanding out the size of the x and y locations by almost eight-fold, I am back at the program taking more than 2 hours to full program to compute. If there is anything that can be done to make this loop even better that would be great.
I have attached some data that is generated before one attempted calculation and the results of that calculation.
I think your best bet would be to do away with the loop altogether. That may depend on the size of your data, though... the one-loop solution may make more sense if you're dealing with large arrays. I tested your 3-loop option against a fully vectorized option and a 1-loop option:
% Load input
A = load('~/Downloads/TestingData.mat');
rows = A.rows;
columns = A.columns;
xFixtureLocations = A.xFixtureLocations;
yFixtureLocations = A.yFixtureLocations;
IESdata = A.IESdata;
% Some counters
nr = length(rows);
nc = length(columns);
nfix = length(xFixtureLocations);
[m,n] = size(IESdata.photoTable);
% Additional input not provided
orientation = zeros(1, nfix);
Multiplier = 1;
MountHeight = 2;
% 3 loops
tic;
for ii = 1:500
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt(i1,i2,i3) = atan(r/MountHeight);
if x==0
phiPt(i1,i2,i3) = 0;
else
phiPt(i1,i2,i3) = atan2(y,x);
end
phiPt(i1,i2,i3) = phiPt(i1,i2,i3) + pi + orientation(i3);
phiPt(i1,i2,i3) = mod(phiPt(i1,i2,i3),2*pi)-pi;
dsq(i1,i2,i3) = r^2+(MountHeight)^2;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr{1} = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(1) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(1) = toc;
% No loops
tic
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
x = bsxfun(@minus, xg, permute(xFixtureLocations, [1 3 2]));
y = bsxfun(@minus, yg, permute(yFixtureLocations, [1 3 2]));
r = sqrt(x.^2 + y.^2);
thetaPt = atan(r./MountHeight);
phiPt = zeros(size(y));
phiPt(x~=0) = atan2(y(x~=0), x(x~=0));
phiPt = bsxfun(@plus, phiPt + pi, permute(orientation, [1 3 2]));
phiPt = mod(phiPt, 2*pi) - pi;
dsq = r.^2 + MountHeight.^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(2) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(2) = toc;
% One loop
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for ifix = 1:nfix
x = xg - xFixtureLocations(ifix);
y = yg - yFixtureLocations(ifix);
r = sqrt(x.^2 + y.^2);
thetaPt(:,:,ifix) = atan(r./MountHeight);
tmp = zeros(size(y));
tmp(x~=0) = atan2(y(x~=0),x(x~=0));
phiPt(:,:,ifix) = tmp;
phiPt(:,:,ifix) = phiPt(:,:,ifix) + pi + orientation(ifix);
dsq(:,:,ifix) = r.^2 + MountHeight.^2;
end
phiPt = mod(phiPt, 2*pi) - pi;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(3) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(3) = toc;
fprintf('3 loops: %f\n0 loops: %f\n1 loop : %f\n', t);
(The 500-times loop is just for timing purposes).
These were the results on my computer:
3 loops: 0.514277
0 loops: 0.351616
1 loop : 0.852183

Sign in to comment.

More Answers (0)

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!