MATLAB Answers

Batch processing a function

Charlotte Findlay on 15 Jul 2019
Commented: Charlotte Findlay on 16 Jul 2019
I want to batch process the following for loop in my script (which contains a call to a function I wrote called 'interp'). I need the loop to run 120 times, split across 4 workers, and the function needs to receive the correct data layer from each 3D array [e.g. squeeze(bearing_rec(a,:,:))]. I would like the output to be produced at the end of each iteration and saved as a .mat file.
Would anyone be able to help me with how to code this?
for a = 1:length(sourcelat) % sourcelat = 120
    integral_area_size = squeeze(integral_area(1,:,:)); % integral area = 5396 x 5396
    [integral_rec, Hc_rec] = interp(integral_area_size, rng, squeeze(bearing_rec(a,:,:)), squeeze(range_rec(a,:,:)), squeeze(H3(a,:,:)), squeeze(min_H(a,:,:)));
    % Need to print output for each here
    save(['F:/Outputs/Output_' num2str(a) '.mat'],'integral_rec', 'Hc_rec')
end


Answers (2)

Stephane Dauvillier on 15 Jul 2019
You need to use the batch function:
j = batch('yourScriptName','Pool',4);
And replace the for loop with a parfor loop.
Be careful: if you want your batch job to run on 4 workers, you will need 5 cores, as one is dedicated to launching the job.
Be careful²: if you have already started a MATLAB pool, you will have fewer workers available for your batch job. Suppose you have 12 workers in total and you have already opened a pool on 8 of them. Your batch job can then use only 12-8 = 4 workers, and since one of those is dedicated to launching the job, only 3 remain for the pool.
Be careful³: the save function cannot be called directly inside the body of a parfor loop.
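On that last point, a common workaround is to move the save call into a small function and call that function from the loop. The following is only a minimal sketch: saveResult and outDir are hypothetical names that do not come from the original script, a temporary folder stands in for F:/Outputs, and magic(a) is a placeholder for the real computation.

```matlab
% Sketch: write one MAT-file per iteration from inside a parfor loop.
% saveResult and outDir are hypothetical names used for illustration only.
outDir = tempname;          % temporary folder, stands in for F:/Outputs
mkdir(outDir);
nIter = 4;                  % the real loop would run 120 iterations

parfor a = 1:nIter
    result = magic(a);      % placeholder for the real computation
    saveResult(fullfile(outDir, sprintf('Output_%d.mat', a)), result);
end

% Count the files that were written.
files = dir(fullfile(outDir, 'Output_*.mat'));
numFiles = numel(files);

function saveResult(fileName, result)
    % save only sees this function's own workspace, which parfor permits.
    save(fileName, 'result');
end
```

Because each iteration writes to its own file name, the iterations stay independent of one another, which is what parfor requires.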

  1 Comment

Charlotte Findlay on 15 Jul 2019
Hi Stephane,
Okay, thanks for this. So if the script has other, non-looped parts that create the inputs for the for loop, the batch call wouldn't be affected by this?
Also, is there an alternative to 'save' that I can use? I need the outputs to be saved in .mat files after each of the 120 iterations finishes, so if I can't use save I'm not sure how else I would save the outputs integral_rec and Hc_rec!



Stephane Dauvillier on 16 Jul 2019
With the batch function, your script will be executed on your PC as it is. The only difference is that your current MATLAB session won't be frozen while it runs.
If your script doesn't use parallel constructs, your batch job won't run in parallel either.
To launch the script execute on a pool of 3 workers (+1 for running the job):
job=batch("execute","Pool",3);
If you want to see the state of your job:
job.State
When the state is finished, you can retrieve the results:
Result = load(job)
N = Result.N ;
foo = Result.foo;
This assumes the file execute contains the following:
N=10;
% Use a cell array to collect your results
foo = cell(N,1);
parfor i=1:N
    foo{i} = peaks(i) ;
end
So if you want to save your results in MAT-files, just do the following (at the end of your script or after retrieving the results):
for i = 1:N
    data = foo{i} ;
    save(['mySaved',num2str(i,'%02i')],'data')
end
Note that the job (and its results) will persist even after you close MATLAB. To clear it:
delete(job)

  5 Comments

Charlotte Findlay on 16 Jul 2019
Okay I've tried running this but I get the following error:
>> job=batch('Interp_Hc_H3_20190612_MultipleSites.m', 'Pool', 3);
Error using batch (line 187)
An unexpected error occurred accessing properties: "CaptureDiary" "CreateDateTime" "CreateTime" "DependentFiles" "Diary" "Error" "ErrorIdentifier"
"ErrorMessage" "FinishDateTime" "FinishTime" "Function" "InputArguments" "DiagnosticWarnings" "Name" "NumOutputArguments" "OutputArguments"
"StartDateTime" "StartTime" "StateEnum" "Worker"
Caused by:
Error using parallel.internal.cluster.FileSerializer>iSaveMat (line 278)
Data too large to be saved.
Is this because the script I have has data that needs to be run before the parfor loop?
Stephane Dauvillier on 16 Jul 2019
OK, I didn't look at the size of your matrices.
One double takes 8 bytes, so 120 matrices of 5400 by 5400 take a little less than 28 GB (I assume this is too much for your PC).
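The estimate can be reproduced with a few lines of arithmetic. This sketch uses the rounded 5400 by 5400 size from the comment above (the question gives 5396 by 5396, which changes the result only slightly) and assumes double precision; the variable names are illustrative.

```matlab
% Rough memory estimate for 120 double-precision matrices of 5400 x 5400.
bytesPerDouble = 8;
nRows = 5400;
nCols = 5400;
nLayers = 120;

bytesPerLayer = nRows * nCols * bytesPerDouble;   % one 2-D layer
totalBytes = bytesPerLayer * nLayers;             % all 120 layers

fprintf('One layer : %.2f GB\n', bytesPerLayer / 1e9);
fprintf('All layers: %.2f GB\n', totalBytes / 1e9);
```

This works out to about 0.23 GB per layer and 27.99 GB for all 120 layers, per array of that size.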
OK, so what you can do is the following:
parfor a = 1:length(sourcelat)
    myTask(a, integral_area, rng, bearing_rec, range_rec, H3, min_H)
end
The idea is to create a function that does what you want (I may have forgotten some inputs). Inside that function you will be able to save your data in a MAT-file:
function myTask(a, integral_area, rng, bearing_rec, range_rec, H3, min_H)
    % squeeze is a built-in function, so it does not need to be passed in.
    % rng is passed in because the call to interp uses it.
    integral_area_size = squeeze(integral_area(1,:,:));
    [integral_rec, Hc_rec] = interp(integral_area_size, rng, squeeze(bearing_rec(a,:,:)), squeeze(range_rec(a,:,:)), squeeze(H3(a,:,:)), squeeze(min_H(a,:,:)));
    % One MAT-file per iteration: mySaved01.mat, mySaved02.mat, ...
    save(['mySaved',num2str(a,'%02i')],'integral_rec','Hc_rec')
end
By doing so, you need memory for only 3 matrices of 5400 by 5400 at a time rather than 120, which should fit in memory.
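One possible refinement, offered as a sketch rather than a tested fix for this dataset: passing whole 3D arrays into myTask makes them broadcast variables, so every worker receives a full copy of each array. Indexing them with the loop variable in the parfor body instead lets MATLAB treat them as sliced variables and send only layer a to the worker handling that iteration. The example below uses small random arrays, and processLayer and saveLayer are hypothetical stand-ins for the real interp call and the save step.

```matlab
% Sketch: slice the 3-D inputs inside the parfor body so that each
% iteration only receives its own layer. All data here is synthetic, and
% processLayer/saveLayer are hypothetical stand-ins for the real code.
nLayers = 4;
n = 5;
bearing_rec = rand(nLayers, n, n);
range_rec   = rand(nLayers, n, n);
outDir = tempname;              % temporary output folder for the demo
mkdir(outDir);

parfor a = 1:nLayers
    bearingLayer = squeeze(bearing_rec(a, :, :));   % sliced: only layer a
    rangeLayer   = squeeze(range_rec(a, :, :));
    layerResult  = processLayer(bearingLayer, rangeLayer);
    saveLayer(fullfile(outDir, sprintf('mySaved%02d.mat', a)), layerResult);
end

% Load one file back to confirm what was written.
check = load(fullfile(outDir, 'mySaved02.mat'));

function out = processLayer(bearingLayer, rangeLayer)
    out = bearingLayer + rangeLayer;    % placeholder for the real interp call
end

function saveLayer(fileName, layerResult)
    save(fileName, 'layerResult');      % save is allowed inside a function
end
```

Whether this is enough for arrays of the size discussed here depends on the available RAM, since the client session still has to hold the full arrays.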
Charlotte Findlay on 16 Jul 2019
Hi Stephane,
I'm still having issues with this and I think I'll struggle regardless so I might have to abandon ship and just do the process one after the other as I had been previously.
Really appreciate your help though and I'm sure your method would have worked if I wasn't looking at such huge amounts of data.
Charlotte
