Dear community,
I have 80 datasets with 5,000 data points each, and I want to fit a Gaussian process (GP) to each of them. Is there a good way to parallelize this (HPC user here)?
My first rough idea is the following (it will run inside a batch script):
p = gcp;                          % handle to the already running parallel pool
F(1:80) = parallel.FevalFuture;   % preallocate the array of futures
for i = 1:80
    F(i) = parfeval(p, @fitrgp, 1, X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'all', ...
        'HyperparameterOptimizationOptions', ...
        struct( ...
            'MaxObjectiveEvaluations', 500, ...
            'Optimizer', 'bayesopt', ...
            'Verbose', 0, ...
            'MaxTime', 60*60, ...
            'Repartition', true, ...
            'UseParallel', true, ...
            'KFold', 15));
end
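For collecting the fitted models afterwards I would then do something like this (using a cell array, since I am not sure the RegressionGP objects concatenate cleanly):

mdls = cell(1, 80);
for k = 1:80
    % fetchNext returns results as the futures finish, in completion order
    [idx, mdl] = fetchNext(F);
    mdls{idx} = mdl;
end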
How does the 'UseParallel' option scale? Does it take effect at all if the fit runs on a single worker (i.e. inside parfeval)? And is there any way to have multiple workers working on one fitrgp evaluation?
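In case that kind of nesting does not work, my fallback would be to keep the inner optimization serial and only parallelize over the datasets, roughly like this (assuming X and Y are 1x80 cell arrays as above):

mdls = cell(1, 80);
parfor i = 1:80
    % one dataset per worker; the inner bayesopt runs serially on that worker
    mdls{i} = fitrgp(X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'all', ...
        'HyperparameterOptimizationOptions', struct( ...
            'MaxObjectiveEvaluations', 500, ...
            'Optimizer', 'bayesopt', ...
            'Verbose', 0, ...
            'MaxTime', 60*60, ...
            'Repartition', true, ...
            'UseParallel', false, ...   % avoid nested parallelism inside parfor
            'KFold', 15));
end

Would that be the more sensible layout for my core count, or am I giving up too much by disabling the inner parallelism?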
Best regards and thank you,
Robert
PS: I have up to ~500 cores available, and each dataset has at most 4 predictors.