MATLAB Answers


parfor with very different computation times

Asked by Paolo Bocchini on 14 Apr 2011
Hi, I have been using matlab for a while with embarrassingly parallel codes. Now I have a problem to be solved N times independently. The issue is that each execution may require very different CPU times, therefore parfor is not efficient.
For instance, say that I have 9 problems with 8 cores. Problem 1 and 2 take 2 hours each. The others take 1 second. If I use parfor and problems 1 and 2 end up to core number 1, the entire execution will take 4 hours.
I would like to have something that assigns a new problem to a core as soon as it finishes the previous one. In this case, the previous example would take only 2 hours and 1 second.
I studied a little bit, but it seems that parfor cannot do what I want, and neither spmd+while. Could anybody give me a hint to start with?
Last note: there is no way to know the CPU time of each problem before the execution.
Thank you, Paolo


Sign in to comment.

1 Answer

Answer by jiro
on 15 Apr 2011

Normally, parfor does load balancing. parfor breaks up the iterations into different-sized chunks and keeps sending more work to workers that finish early. This works well when you have many iterations (number of iterations >> number of workers). However, in your example, you're looking at a case where the number of iterations is close to the number of workers. So even with the inherent load balancing capability, you may run into your scenario.
In that case, take a look at Jobs and Tasks. It requires a little more programming on your part, but you get more control of how things are distributed. It will distribute each task (iteration) one by one.


Thank you for your answer! The point is that when the number of iterations is much larger than the number of workers, the load will be balanced by the law of large numbers. I mean that, on average, the most expensive iterations will be equally distributed among the workers for a matter of probability.
This smarter balancing is useful only when the number of workers is not much larger than the number of iterations. I think this is a very basic feature that could be included in parfor in the next releases.
Anyway, in the meanwhile I will look at Jobs and Tasks, as you suggest. Thank you again!
on 16 Apr 2011
In fact, the load balancing doesn't happen just because of a "matter of probability". PARFOR uses some techniques to try to load balance. For example, let's look at a loop with 10 iterations and assume 4 workers. PARFOR may break up the iterations into different-sized chunks, e.g. 2-2-2-1-1-1-1. The key is that it creates some large chunks and some small chunks. Then it sends chunks to the workers from the larger ones first. As soon as a worker finishes a job, it receives the next chunk. This essentially ensures that workers don't sit around idle. By having smaller chunks towards the end, it tries to even out the finishing time.
PARFOR also creates chunks (as opposed to sending one at a time) to minimize communication. That's why when you have few iterations, it could potentially put two long iterations together as one chunk. PARFOR is meant for a simple parallelism with no customizability. To be able to control how things are distributed, Jobs and Tasks are the way.
Feel free to submit an enhancement through the website:

Sign in to comment.