What determines the increase in speed for parfor vs. for?
The question is simply that. I know that for loops with a lot of overhead, parfor can be slower than for. In other cases it can be faster, because multiple calculations are done in parallel.
I ask because my parfor loop runs much faster than I'd expect. When I use
tic
for run = 1:n_runs
    function_to_call
end
toc
the code takes around 140-145 seconds to run for n_runs = 12. My parallel pool has 12 threads, which makes me think the parfor version should run about 12 times faster, give or take a little to account for parfor overhead. What actually happens when I run
tic
parfor run = 1:n_runs
    function_to_call
end
toc
is that the code takes about 1 second to run.
What is the explanation here? Am I missing something obvious, or is it something deeper?
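To make the mismatch concrete, here is a rough back-of-the-envelope check using only the timings quoted above. The value 142.5 s is an assumed midpoint standing in for "140-145 seconds"; this is a sketch of ideal scaling, not a measurement.

```matlab
% Back-of-the-envelope check using the numbers quoted in the question.
t_serial  = 142.5;   % seconds; assumed midpoint of the reported 140-145 s
n_workers = 12;      % pool size reported in the question
t_parfor  = 1;       % seconds; reported parfor time

t_ideal = t_serial / n_workers;   % best case: perfect scaling, no overhead
speedup = t_serial / t_parfor;    % speedup implied by the reported times

fprintf('Ideal parfor time: %.1f s\n', t_ideal);
fprintf('Implied speedup:   %.1fx on %d workers\n', speedup, n_workers);
```

With perfect scaling the parfor loop would be expected to take roughly 12 seconds, so a 1 second run implies a speedup of around 140x on 12 workers, which is far more than parallelism alone would normally account for.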
7 Comments
Just a thought: what happens if you use profile instead, to see the rate-limiting steps in your function?
profile on
% your snippet
profile viewer
Daniel Pollard
on 9 Dec 2020
Bjorn Gustavsson
on 9 Dec 2020
Your problem seems to be "embarrassingly parallelizable" (I've forgotten the correct technical term), judging from the code snippets, which indicate that there are n_runs independent calls to function_to_call without any coupling between them. If that's the case, you should be able to reduce it to n_runs = 1, right? If that's possible, you might learn something from peppering the code with tic-toc for quick-and-dirty manual profiling. I'm well aware that this is a poor method in general, but it might give some indication.
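The tic-toc idea could look roughly like the sketch below. Here work is a hypothetical stand-in for function_to_call (whose contents aren't shown in this thread), and the matrix size is arbitrary. It records the time of each iteration in both loops so the per-call cost can be compared directly.

```matlab
% Hypothetical stand-in for function_to_call; replace with the real call.
work = @() sum(svd(rand(200)));

n_runs = 12;
t_for = zeros(1, n_runs);   % per-iteration times, serial loop
t_par = zeros(1, n_runs);   % per-iteration times, parfor loop

for run = 1:n_runs
    t_start = tic;
    s = work();             % result is unused; only the timing matters here
    t_for(run) = toc(t_start);
end

parfor run = 1:n_runs
    t_start = tic;          % tic/toc pair is evaluated on the worker
    s = work();
    t_par(run) = toc(t_start);
end

% One row per loop type, one column per iteration.
disp([t_for; t_par]);
```

If the per-iteration times inside parfor are much smaller than inside for, the difference is in what each call does, not in how the iterations are distributed.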
Daniel Pollard
on 9 Dec 2020
Rik
on 14 Dec 2020
@Bjorn, that will be true in general, but not always. It is also possible that the function pulls a filename off of some queue (which must be outside of Matlab, otherwise parfor probably doesn't work correctly). The situation where this function would interact with an external device doesn't seem suited for parallelization, but doing something based on a random stream is also not out of the question.
Sindar
on 15 Dec 2020
Roughly what does function_to_call do? I could imagine some cases where Matlab might do some trickery with parfor but not with for.
For example, if the effect of the function is redundant (overwriting the same variables or files), for might assume that you have some reason to do it in sequence, whereas parfor explicitly knows that order doesn't matter, so it can just skip to the last iteration. (A human could see this; I doubt Matlab could)
Perhaps more realistically, some aspect of memory management might be similarly sped up. ("oh, order doesn't matter, so I can load this array in once and share it to each thread" or "let's just keep this file stream open")
[caveat: I know just enough about parallelization to be dangerous; I expect you'll explain the function and I'll have no clue why it's faster, but someone else might]
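One way to test whether both loops really do the same work is to have each iteration return something and compare the results. A minimal sketch, assuming function_to_call can be made to return an output; work below is a hypothetical, deterministic stand-in, not the real function.

```matlab
% Hypothetical deterministic stand-in for function_to_call.
work = @(k) sum((1:1000*k).^2);

n_runs = 12;
out_for = zeros(1, n_runs);   % results from the serial loop
out_par = zeros(1, n_runs);   % results from the parfor loop

for run = 1:n_runs
    out_for(run) = work(run);
end

parfor run = 1:n_runs
    out_par(run) = work(run);
end

% true if both loops produced identical results for every iteration
same_results = isequal(out_for, out_par);
disp(same_results);
```

If the outputs differ, that points at what the two loops do differently. Code that draws random numbers will generally not match between the loops, since pool workers use their own random streams.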
Daniel Pollard
on 15 Dec 2020
Answers (0)