PCT: Parfor vs For with one core
We are writing an application in which some parts are meant to run in parallel. We also want to tune the app so that it uses the optimal number of cores (even if that number is 1).
My question is: how does the execution speed of these two loops compare
parfor i = 1:X
for i = 1:X
when only one core is being used, keeping everything else constant? Is the parfor slower because it checks if a matlabpool is open? Is the speed of execution the same for both?
I can tell you that parfor will be slower due to overhead:
[t1, t2] = deal(0);
a = 1;
b = 2;
x = 3;
% Sum the times over 1000 repetitions of each loop:
for ii = 1:1000
    tic                     % start timer for the plain for-loop
    for jj = 1:1000
        y = a*x+b;
    end
    t1 = t1+toc;
    tic                     % start timer for the parfor-loop
    parfor kk = 1:1000
        yp = a*x+b;
    end
    t2 = t2+toc;
end
fprintf(1,'\nFOR: %fs\nPARFOR: %fs\n',t1,t2);
fprintf(2,'\nSlowdown: %f\n\n', t2./t1);
I'm seeing a slowdown factor in the high 80s, i.e. the parfor version takes more than 80 times as long as the plain for version.
Yeah, this is how I've always done it:
UseParallel = flag_for_parallel_use;
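The rest of that snippet is not shown in the thread. A minimal sketch of what such a flag-based switch might look like, assuming the loop body is a simple element-wise computation (the variable names n, x and y, and the hard-coded value of UseParallel, are illustrative and not from the original post):

```matlab
% Hypothetical flag; in the post above it is set from flag_for_parallel_use.
UseParallel = false;

n = 1000;
a = 1;
b = 2;
x = linspace(0, 1, n);   % illustrative input data
y = zeros(1, n);         % preallocate the output

if UseParallel
    % Parallel path: y and x are sliced, a and b are broadcast.
    parfor k = 1:n
        y(k) = a*x(k) + b;
    end
else
    % Serial path: plain for-loop, no parfor overhead.
    for k = 1:n
        y(k) = a*x(k) + b;
    end
end
```

Duplicating the loop body is the price of this approach; moving the body into a function called from both branches keeps the two paths in sync.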
More Answers (2)
Jan on 28 Jan 2013
I do not have experience with the Parallel Computing Toolbox. Can you open a pool with 2 workers on a single-core node? A process that alternates between file access and computation, e.g. reading a movie file, could profit from multiple threads on a single core. Another example is compression software: 7zip, for instance, runs faster with two threads even on a single-core processor.
This special-casing may not be strictly necessary, as "parfor" reverts to serial behavior if there is no matlabpool available. The documentation states:
"If the parfor-loop cannot run on workers in a MATLAB pool, MATLAB executes the loop on the client in a serial manner. In this situation, the parfor semantics are preserved in that the loop iterations can execute in any order."
With regard to the thread question, each MATLAB worker is its own process -- you can see the number of MATLAB instances increase as a matlabpool is started. And to be extra pedantic, core count is independent of the number of worker MATLABs started -- the placement of the workers is left entirely to the operating system, and "one core per worker" is merely a "best guess default". There are situations where you would want different ratios, depending on a number of factors.
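The quoted documentation notes that iterations may execute in any order even when the loop falls back to serial execution, so the loop body should not depend on iteration order. A minimal sketch of an order-independent loop, assuming a simple sum as the workload (the variable name total is illustrative):

```matlab
% total is a reduction variable: the result does not depend on
% the order in which the iterations are executed.
total = 0;
parfor k = 1:100
    total = total + k;
end
fprintf('Sum of 1..100: %d\n', total);
```

The same loop gives the same result whether it runs on workers or on the client.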
Using Sean's code I was able to see the difference. I couldn't come up with a way to do it better, unfortunately.