Error with parallel toolbox

I tried running code on a cluster using batch, which fills a 10000-by-10000 matrix with random numbers.
I have a script runTest:
job = batch('test');
wait(job);
load(job);
save("A","A",'-v7.3');
where test.m is the following:
parfor i=1:10000
    for j=1:10000
        A(i,j) = rand;
    end
end
However, when I run this on a cluster, it says: "Error using batch (line 179). Reference to non-existent field 'AlternativeConstructorIndex'."
How do I go about fixing this? It runs fine on my own machine, and runs fine on the cluster when I don't go via the batch function (but then won't run in parallel, and seems to just ignore the parfor loop, or perform the same thing on each processor rather than splitting the tasks between them). The parallel computing toolbox is definitely installed (I checked via an interactive session).

Answers (1)

OCDER on 27 Aug 2018
Edited: OCDER on 27 Aug 2018
Do not use batch and scripts if not needed.
"but then won't run in parallel, and seems to just ignore the parfor loop, or perform the same thing on each processor rather than splitting the tasks between them"
This indicates that the cluster's default parallel computing setting is NOT to create a parpool automatically when a parfor statement is first encountered. To check the setting on the cluster, try this:
ps = parallel.Settings;
ps.Pool.AutoCreate   % displays the current value:
%   0: a parpool will NOT start automatically
%   1: a parpool will start automatically when parfor is encountered
% You can set it to true:
if ~ps.Pool.AutoCreate
    ps.Pool.AutoCreate = true; % But this isn't good practice. See below.
end
To ensure that MATLAB starts a parpool, call the parpool command explicitly. https://www.mathworks.com/help/distcomp/parpool.html
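Pool = gcp('nocreate');   % get the current pool without creating one
if isempty(Pool)
    parpool();            % start a pool using the default profile
end
A = test();
save('A.mat','A','-v7.3');
% test.m is this:
function A = test()
A = zeros(10000);         % preallocate the output matrix
parfor i=1:10000
    for j=1:10000
        A(i,j) = rand;
    end
end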
I try to avoid using scripts in parallel computing, as it seems there are unique debugging issues.
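If it is unclear whether parfor is really splitting the work, one rough way to check is to record which worker handled each iteration. This is only a sketch: the variable names n and workerIds are made up for illustration, and it assumes getCurrentTask returns empty when the loop body runs on the client instead of on a pool worker.

```matlab
n = 16;                          % hypothetical small iteration count
workerIds = zeros(1, n);         % sliced output: one entry per iteration
parfor i = 1:n
    t = getCurrentTask();        % empty when running on the client (no pool)
    if isempty(t)
        workerIds(i) = 0;        % 0 marks "ran on the client"
    else
        workerIds(i) = t.ID;     % ID of the worker task that ran iteration i
    end
end
fprintf('Iterations ran on %d distinct worker(s).\n', numel(unique(workerIds)));
```

If this reports a single worker (or only zeros), the loop is most likely running serially.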

5 Comments

I just checked, and ps.Pool.AutoCreate is set to 1 on the cluster. I'll have a go anyway at the other things you suggested, and let you know how it goes. Thanks!
Interesting... I wonder if the cluster configuration prevents an application from hogging more CPUs. I'm not sure about the server configuration stuff. Are you able to start a parpool session on the cluster? Try parpool(2) for starters.
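A minimal sketch of that test, assuming the default cluster profile is the one you want to check. It wraps parpool(2) in try/catch so that any failure message gets printed instead of stopping the session:

```matlab
pool = gcp('nocreate');              % returns [] if no pool is running
if isempty(pool)
    try
        pool = parpool(2);           % request two workers from the default profile
    catch err
        fprintf('parpool failed: %s\n', err.message);
    end
end
if ~isempty(pool)
    fprintf('Pool is running with %d worker(s).\n', pool.NumWorkers);
end
```

The worker count it prints can be compared with what the profile is supposed to provide.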
Alexander Holmes on 27 Aug 2018
Edited: Alexander Holmes on 27 Aug 2018
I don't think it's working. In the interactive session it throws an error about not being able to find 'qstat'. The code still runs, but it doesn't do anything in parallel (or it only uses 1 worker, even though the profile says otherwise).
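One way to narrow down the 'qstat' message is to ask the shell that MATLAB uses whether it can see the command at all. This is a sketch that assumes a Unix-like cluster node where the which command is available; it only checks the PATH and does not fix anything:

```matlab
[status, out] = system('which qstat');   % assumes a Unix-like shell
if status == 0
    fprintf('qstat found at: %s', out);
else
    disp('qstat is not on the PATH seen by MATLAB. Current PATH:');
    disp(getenv('PATH'));
end
```

If qstat is found in a normal terminal but not here, the environment MATLAB was launched with probably differs from your login shell.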
Is that the error you are receiving? It seems like a MATLAB installation/configuration issue on the cluster. One fix is in that answer link.
I'd contact support. Either way that sounds like a bug.


Release: R2017a
Asked: on 23 Aug 2018
Commented: on 28 Aug 2018
