How can I modify this nested loop to work as a parfor without requiring a ridiculously large array?
I am trying to nest several for loops in a parfor loop. I have even read the documentation and several other queries/replies before posting here.
I have a large dataset that I need to iterate over, calculating a property which would result in a prohibitively large array if I sent each answer to a separate element, and doing this on one cpu will take a prohibitively long time.
Instead, what I am doing is reducing the array by another property, and then combining the calculated results in bins of this second property (essentially making a histogram).
The parfor section of my code looks like this:
parfor i = 1:box(1)
    for j = 1:box(2)
        for k = 1:box(3)
            for l = 1:box(1)
                for m = 1:box(2)
                    for n = 1:box(3)
                        prop(horrendousfunction(i,j,k,l,m,n)) = prop(horrendousfunction(i,j,k,l,m,n)) + data(i,j,k)*data(l,m,n);
                    end
                end
            end
        end
    end
end
Trialling this on one CPU as an ordinary for loop, with each of i, j, k, l, m and n running over 1:15, works fine and takes a few minutes.
The data array is the initial large array, but as written the code will iterate over every element of it many times within each parfor step, and therefore the data transmission I/O overhead shouldn't be too onerous with respect to the overall computation time. (Expected complete running time is ~1 month on 1 cpu, and I have several of these to do)
A few other (small, negligible I/O) arrays are also required.
horrendousfunction is essentially a way of mapping (i,j,k,l,m,n) into a single index h which has been previously split into bins earlier in the code. I will eventually want to plot(h,prop).
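As a minimal sketch of the binning described here: each contribution is added into the bin selected by its index h, so the output only ever has as many elements as there are bins. The bin count N, the indices h and the values vals below are made-up illustration data, not taken from the real problem.

```matlab
N    = 5;                                % number of bins (assumed)
h    = [2 5 2 1 5 5];                    % made-up bin index of each contribution
vals = [1.0 0.5 2.0 3.0 0.25 0.25];      % made-up contributions

% Accumulate contribution by contribution, as the nested loops do.
prop = zeros(1, N);
for t = 1:numel(h)
    prop(h(t)) = prop(h(t)) + vals(t);
end

% The same binned sum in one call, if all indices are available at once.
propAcc = accumarray(h(:), vals(:), [N 1]).';
```

Both prop and propAcc hold one total per bin, so the storage cost depends on N and not on the number of (i,j,k,l,m,n) combinations.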
Now, after spending some time trawling the documentation and various previous questions on the topic, I realise that my initial efforts contravene the basic parfor assumption that j, k, l, m and n (as components of the index of prop) should be constant within the parfor loop. I see that if I wanted to flesh out a behemoth array A(i,j,k,l,m,n) I could do so by filling a dummy variable inside the loop and then assigning it to the slice A(i,:,:,:,:,:) at the end of each iteration, but this would create the stupidly large array that I wish to avoid (I don't have enough HDD to store it).
Simply calculating trick = horrendousfunction(i,j,k,l,m,n) within the loop is also not an option, because I can't use trick as the index of prop either, since its value changes within the loop too.
Predetermining the complete mapping of [i,j,k,l,m,n] -> h is also not an option, since that would require a behemoth data array of its own, and require said array to be I/O'd to all the matlabpool workers. (And calculating it would take about as long as the original problem anyway) Also, the mapping is context-dependent, so I can't use the same mapping across the related family of problems I wish to solve.
Is there another way I can write the internals of this loop to take advantage of MATLAB's parallelism without requiring a behemoth output array?
8 Comments
cr
on 13 Oct 2013
Is prop a 3-D matrix? What is the output (and its size) of horrendousfunction?
Walter Roberson
on 13 Oct 2013
How much of horrendousfunction can you pre-calculate without all of the inputs? For example if it involved 7*i^2 - 3*i*j - sin(m*pi) then you could break this up into a vector
mpi = sin((1:box(2))*pi);
and right within the "j" loop, calculate and store ijportion = 7*i^2 - 3*i*j. Then, although you still need the values of all the variables to do the full calculation, you could use ijportion - mpi(m), which would be much more efficient than re-evaluating the whole expression in the innermost loop.
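A small sketch of this hoisting idea, using the illustrative formula 7*i^2 - 3*i*j - sin(m*pi) from the comment (it is not the real horrendousfunction, and the box sizes are made up). The sine term is stored once per m and subtracted, matching the sign in the formula, and the result is checked against the direct evaluation.

```matlab
box = [4 3 2];                        % small illustrative sizes (assumption)
mpi = sin((1:box(2))*pi);             % depends on m only: computed once up front

direct  = zeros(box(1), box(2), box(2));
hoisted = zeros(box(1), box(2), box(2));
for i = 1:box(1)
    for j = 1:box(2)
        ijportion = 7*i^2 - 3*i*j;    % depends on i and j only: computed once per (i,j)
        for m = 1:box(2)
            direct(i,j,m)  = 7*i^2 - 3*i*j - sin(m*pi);   % full evaluation every time
            hoisted(i,j,m) = ijportion - mpi(m);          % reuse the stored pieces
        end
    end
end

maxDiff = max(abs(direct(:) - hoisted(:)));   % largest disagreement between the two
```

How much this saves depends on how cleanly the real mapping separates into pieces that depend on only some of the six indices.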
Daniel
on 13 Oct 2013
Daniel
on 14 Oct 2013
You still haven't said what the typical magnitudes of box(i) will be.
Do you mean that N is typically 100 and therefore prod(box) is typically 4e5? So the box(i) are around 75? That doesn't sound like a challengingly large data set. Less than 2MB if data(i) are single precision floats. You could comfortably broadcast a separate copy of "data" to each lab very quickly and without straining memory.
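The arithmetic behind that estimate can be checked in a few lines, assuming (as guessed above, not confirmed by the poster) that each box dimension is about 75:

```matlab
box = [75 75 75];                 % assumed sizes, taken from the guess above
numElements = prod(box);          % 421875 elements, roughly 4e5
bytesSingle = numElements * 4;    % single precision uses 4 bytes per element
bytesDouble = numElements * 8;    % double precision uses 8 bytes per element
mbSingle    = bytesSingle / 1e6;  % about 1.7 MB
mbDouble    = bytesDouble / 1e6;  % about 3.4 MB
```

Even in double precision the array is only a few megabytes, so sending a full copy to every worker should be cheap compared with the loop itself.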
Daniel
on 15 Oct 2013
Accepted Answer
More Answers (1)
Edric Ellis
on 14 Oct 2013
Does it work to do something like this:
prop = zeros(1, N);
parfor ...
    ...
    tmp = zeros(1, N);
    tmp(horrendousfunction(i,j,k,l,m,n)) = data(i,j,k)*data(l,m,n);
    prop = prop + tmp;
end
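A fuller sketch of this reduction pattern, under stated assumptions: binIndex is a hypothetical stand-in for horrendousfunction, and N, box and data are made-up small values. As a variation on the snippet above, tmp is accumulated across all the inner loops and added to prop once per parfor iteration, so that only N numbers are combined per value of i.

```matlab
N    = 8;                                   % number of bins (assumed)
box  = [3 2 2];                             % tiny sizes for illustration
data = reshape(1:prod(box), box);           % made-up data
% Hypothetical mapping to a bin index in 1..N; the real one is problem-specific.
binIndex = @(i,j,k,l,m,n) 1 + mod(i + 2*j + 3*k + l + 2*m + 3*n, N);

n1 = box(1); n2 = box(2); n3 = box(3);      % plain scalars for the loop ranges
prop = zeros(1, N);
parfor i = 1:n1
    tmp = zeros(1, N);                      % temporary, private to this iteration
    for j = 1:n2
        for k = 1:n3
            for l = 1:n1
                for m = 1:n2
                    for n = 1:n3
                        % feval keeps binIndex(i,...) from being read as array indexing
                        h = feval(binIndex, i, j, k, l, m, n);
                        tmp(h) = tmp(h) + data(i,j,k) * data(l,m,n);
                    end
                end
            end
        end
    end
    prop = prop + tmp;                      % reduction variable: order of addition is free
end
```

Because prop only appears in the form prop = prop + tmp, parfor can treat it as a reduction variable, while data is read-only and is broadcast to the workers. Summing the bins gives sum(data(:))^2, which is a convenient sanity check for any choice of mapping.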