On which data is the ML model trained after hyperparameter optimization? / Applying the tuned hyperparameters to new training data

Dear Matlab-Community,
I would be glad if someone could help me out with two questions. Let us consider the "fitcdiscr" function.
1.) On which data is the final ML model trained after hyperparameter optimization? Concretely, which of the following is true:
a) The final model is trained only on the defined training set, which is 80% of my data. The optimal hyperparameters after i iterations are taken. (That means we have i model trainings using hold-out validation.)
b) The final model is trained on the entire data set with the optimal hyperparameters after i iterations. (That means we have i + 1 model trainings using hold-out validation.)
2.) How can I directly reuse the optimized hyperparameters of my model to train it on new data?
I have added a code snippet below.
I am grateful for any hints!
Thank you
Denys
Partition = cvpartition(Data.Response,"HoldOut",0.2, 'Stratify',true);
TrainingSetting.Discr.OptimizationOptions = struct('CVPartition',Partition,'MaxObjectiveEvaluations',30); % 20% Hold-Out-Partition
Model = fitcdiscr(X, Y,'HyperparameterOptimizationOptions',TrainingSetting.Discr.OptimizationOptions)

Answers (1)

Alan Weiss on 9 Dec 2022
With the settings you show, the software does not perform any cross validation. You need to set the OptimizeHyperparameters argument to something other than the default 'none' when you call fitcdiscr.
Assuming you set something such as 'auto', which as documented varies 'Delta' and 'Gamma' to minimize cross-validation loss, what happens is that the software first tries to minimize the cross-validation loss, and then performs one more step to fit the data using the resulting hyperparameters.
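One way to check which rows the returned model was fitted on is to compare its NumObservations property with the size of the hold-out training set. The following is a minimal sketch, not the poster's actual setup: it assumes the fisheriris sample data in place of the poster's X and Y, and it adds the 'ShowPlots' and 'Verbose' options only to keep the run quiet.

```matlab
% Sketch only: fisheriris stands in for the poster's X and Y (assumption).
load fisheriris                  % meas: 150x4 predictors, species: class labels
X = meas;
Y = species;

rng(1);                          % fix the random partition
Partition = cvpartition(Y, 'HoldOut', 0.2);   % 20% of the rows held out for validation

% Hold-out partition and evaluation budget for the optimizer.
Options = struct('CVPartition', Partition, ...
                 'MaxObjectiveEvaluations', 30, ...
                 'ShowPlots', false, ...
                 'Verbose', 0);

% 'auto' varies Delta and Gamma; without OptimizeHyperparameters no search is run.
Model = fitcdiscr(X, Y, ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', Options);

% Compare the row counts to see what the returned model was trained on.
fprintf('Rows passed to fitcdiscr:      %d\n', size(X, 1));
fprintf('Rows in hold-out training set: %d\n', Partition.TrainSize);
fprintf('Rows used by returned model:   %d\n', Model.NumObservations);
```

If the last number matches the first, the final fit used all rows passed to fitcdiscr (case b in the question); if it matches the second, only the hold-out training rows were used (case a).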
Alan Weiss
MATLAB mathematical toolbox documentation
  2 Comments
Denys Romanenko on 9 Dec 2022
Dear Alan,
Thank you for your reply!
By "... to fit the data using the resulting hyperparameters", do you mean that the entire data set, consisting of the training and validation sets, is used for the final training of the model?
My code snippet was probably a bit poor, as it did not show that the "Partition" variable is a cvpartition hold-out validation object. Hence, after editing my code above, hold-out validation should be performed?
Thank you
Regards,
Denys
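Regarding the second question in the original post (reusing the tuned hyperparameters on new data), one possible approach is to read Delta and Gamma from the optimized model and pass them to a plain fitcdiscr call on the new data. This is a hedged sketch under stated assumptions: the split of fisheriris into "old" and "new" rows is purely illustrative, the variable names are hypothetical, and the discriminant type is left at its default because 'auto' only varies Delta and Gamma.

```matlab
% Sketch only: an arbitrary split of fisheriris plays the role of old and new data.
load fisheriris
rng(1);
idx  = randperm(size(meas, 1));
Xold = meas(idx(1:100), :);     Yold = species(idx(1:100));
Xnew = meas(idx(101:end), :);   Ynew = species(idx(101:end));

% Step 1: tune Delta and Gamma on the old data with 20% hold-out validation.
Tuned = fitcdiscr(Xold, Yold, ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', struct( ...
        'Holdout', 0.2, ...
        'MaxObjectiveEvaluations', 15, ...
        'ShowPlots', false, ...
        'Verbose', 0));

% Step 2: train on the new data with the same hyperparameters, no new search.
NewModel = fitcdiscr(Xnew, Ynew, ...
    'Delta', Tuned.Delta, ...
    'Gamma', Tuned.Gamma);

fprintf('Delta = %g, Gamma = %g\n', NewModel.Delta, NewModel.Gamma);
```

Whether hyperparameters tuned on one data set remain a good choice for another depends on how similar the two sets are, so re-running the optimization on the new data may still be worthwhile.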

Release

R2022b
