Train Classifier Using Hyperparameter Optimization in Classification Learner App
This example shows how to tune hyperparameters of a classification support vector machine (SVM) model by using hyperparameter optimization in the Classification Learner app. Compare the test set performance of the trained optimizable SVM to that of the best-performing preset SVM model.
In the MATLAB® Command Window, load the ionosphere data set, and create a table containing the data. Separate the table into training and test sets.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default') % For reproducibility of the data split
partition = cvpartition(Y,'Holdout',0.15);
idxTrain = training(partition); % Indices for the training set
tblTrain = tbl(idxTrain,:);
tblTest = tbl(~idxTrain,:);
Open Classification Learner. Click the Apps tab, and then click the arrow at the right of the Apps section to open the apps gallery. In the Machine Learning and Deep Learning group, click Classification Learner.
On the Classification Learner tab, in the File section, select New Session > From Workspace.
In the New Session from Workspace dialog box, select the tblTrain table from the Data Set Variable list.
As shown in the dialog box, the app selects the response and predictor variables. The default response variable is Y. The default validation option is 5-fold cross-validation, to protect against overfitting. For this example, do not change the default settings.
To accept the default options and continue, click Start Session.
Train all preset SVM models. On the Classification Learner tab, in the Model Type section, click the arrow to open the gallery. In the Support Vector Machines group, click All SVMs. In the Training section, click Train. The app trains one of each SVM model type and displays the models in the Models pane.
If you have Parallel Computing Toolbox™, you can train all the SVM models (All SVMs) simultaneously by selecting the Use Parallel button in the Training section before clicking Train. After you click Train, the Opening Parallel Pool dialog box opens and remains open while the app opens a parallel pool of workers. During this time, you cannot interact with the software. After the pool opens, the app trains the SVM models simultaneously.
The app displays a validation confusion matrix for the first model (Model 1.1). Blue values indicate correct classifications, and red values indicate incorrect classifications. The Models pane on the left shows the validation accuracy for each model.
Validation introduces some randomness into the results. Your model validation results can vary from the results shown in this example.
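For comparison, the following is a minimal command-line sketch of one preset model with the same 5-fold cross-validation. It is not the code the app runs. It assumes that the Medium Gaussian SVM preset corresponds to a Gaussian kernel with a kernel scale of sqrt(P), where P is the number of predictors, a box constraint of 1, and standardized data. The variable names numPredictors, cvMdl, and valAccuracy are illustrative.

```matlab
% Recreate the training set from the first step of this example.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default') % For reproducibility of the data split
partition = cvpartition(Y,'Holdout',0.15);
tblTrain = tbl(training(partition),:);

% Assumed settings for a medium Gaussian SVM (see the text above).
numPredictors = width(tblTrain) - 1; % All columns except the response Y
cvMdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'KernelScale',sqrt(numPredictors), ...
    'BoxConstraint',1, ...
    'Standardize',true, ...
    'KFold',5); % 5-fold cross-validation, as in the app default

% Cross-validated accuracy estimate (1 minus the classification error).
valAccuracy = 1 - kfoldLoss(cvMdl);
```

Because the fold assignment is random, valAccuracy can differ from the validation accuracy that the app reports for the corresponding model.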
Select an optimizable SVM model to train. On the Classification Learner tab, in the Model Type section, click the arrow to open the gallery. In the Support Vector Machines group, click Optimizable SVM. The app disables the Use Parallel button when you select an optimizable model.
Select the model hyperparameters to optimize. In the Model Type section, select Advanced > Advanced. The app opens a dialog box in which you can select Optimize check boxes for the hyperparameters that you want to optimize. By default, all the check boxes for the available hyperparameters are selected. For this example, clear the Optimize check boxes for Kernel function and Standardize data. By default, the app disables the Optimize check box for Kernel scale whenever the kernel function has a fixed value other than Gaussian. Select a Gaussian kernel function, and select the Optimize check box for Kernel scale. Click OK.
In the Training section, click Train.
The app displays a Minimum Classification Error Plot as it runs the optimization process. At each iteration, the app tries a different combination of hyperparameter values and updates the plot with the minimum validation classification error observed up to that iteration, indicated in dark blue. When the app completes the optimization process, it selects the set of optimized hyperparameters, indicated by a red square. For more information, see Minimum Classification Error Plot.
The app lists the optimized hyperparameters in both the Optimization Results section to the right of the plot and the Optimized Hyperparameters section of the Current Model Summary pane.
In general, the optimization results are not reproducible.
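A comparable optimization can be sketched at the command line with the OptimizeHyperparameters option of fitcsvm. This sketch rests on several assumptions: the box constraint is the other hyperparameter left selected for optimization, the fixed Standardize data setting is true, and the acquisition function is switched to 'expected-improvement-plus', which does not depend on run time, so that setting the random seed makes the run repeatable. The variable names optMdl, results, and best are illustrative.

```matlab
% Recreate the training set from the first step of this example.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default') % Seeds both the data split and the optimization
partition = cvpartition(Y,'Holdout',0.15);
tblTrain = tbl(training(partition),:);

% Fix the kernel function and standardization; optimize only the
% box constraint and kernel scale (assumed selection, see above).
optMdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'Standardize',true, ...
    'OptimizeHyperparameters',{'BoxConstraint','KernelScale'}, ...
    'HyperparameterOptimizationOptions',struct( ...
        'MaxObjectiveEvaluations',30, ... % Matches the default Iterations value
        'KFold',5, ...                    % 5-fold cross-validation loss
        'AcquisitionFunctionName','expected-improvement-plus', ...
        'ShowPlots',false, ...
        'Verbose',0));

% Inspect the Bayesian optimization results stored with the model.
results = optMdl.HyperparameterOptimizationResults;
best = bestPoint(results); % Table of the selected hyperparameter values
minObservedError = results.MinObjective; % Smallest observed validation error
```

The returned optMdl is trained on all of tblTrain with the selected hyperparameter values. To view a plot similar to the one in the app, set 'ShowPlots' to true.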
Compare the trained preset SVM models to the trained optimizable model. In the Models pane, the app highlights the highest Accuracy (Validation) by outlining it in a box. In this example, the trained optimizable SVM model outperforms the six preset models.
A trained optimizable model does not always have a higher accuracy than the trained preset models. If a trained optimizable model does not perform well, you can try to get better results by running the optimization for longer. In the Model Type section, select Advanced > Optimizer Options. In the dialog box, increase the Iterations value. For example, you can double-click the default value of 30 and enter a value of 60. Then click OK.
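At the command line, the closest counterpart of the Iterations value is assumed here to be the MaxObjectiveEvaluations field of the HyperparameterOptimizationOptions structure. The following sketch runs 60 evaluations. The choice of optimized hyperparameters and the names opts and longMdl are illustrative.

```matlab
% Recreate the training set from the first step of this example.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default')
partition = cvpartition(Y,'Holdout',0.15);
tblTrain = tbl(training(partition),:);

% Run the optimization for 60 evaluations instead of 30.
opts = struct('MaxObjectiveEvaluations',60, ...
    'ShowPlots',false,'Verbose',0);
longMdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'Standardize',true, ...
    'OptimizeHyperparameters',{'BoxConstraint','KernelScale'}, ...
    'HyperparameterOptimizationOptions',opts);

% Number of hyperparameter combinations that were evaluated.
numEvaluations = longMdl.HyperparameterOptimizationResults.NumObjectiveEvaluations;
```

A longer run takes more time and does not guarantee a lower validation error.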
Because hyperparameter tuning often leads to overfitted models, check the performance of the optimizable SVM model on a test set and compare it to the performance of the best preset SVM model. Begin by importing test data into the app.
On the Classification Learner tab, in the Testing section, select Test Data > From Workspace.
In the Import Test Data dialog box, select the tblTest table from the Test Data Set Variable list.
As shown in the dialog box, the app identifies the response and predictor variables.
Compute the accuracy of the best preset model and the optimizable model on the test data.
First, in the Models pane, click the star icons next to the Medium Gaussian SVM model and the Optimizable SVM model.
For each model, select the model in the Models pane, and then select Test All > Test Selected in the Testing section. The app computes the test set performance of the model trained on the full data set, including training and validation data.
Sort the models based on the test set accuracy. In the Models pane, open the Sort by list and select the test set accuracy option.
In this example, the trained optimizable model does not perform as well as the trained preset model on the test set data.
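The same comparison can be sketched at the command line by training both models on the full training set and scoring them on the held-out test set. The preset settings (kernel scale of sqrt(P), standardized data) and the optimized hyperparameters are assumptions, and the names presetMdl, optMdl, and testAccuracy are illustrative.

```matlab
% Recreate the training and test sets from the first step of this example.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default')
partition = cvpartition(Y,'Holdout',0.15);
idxTrain = training(partition);
tblTrain = tbl(idxTrain,:);
tblTest = tbl(~idxTrain,:);

% Preset-style model with assumed medium Gaussian settings.
presetMdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'KernelScale',sqrt(width(tblTrain) - 1), ...
    'Standardize',true);

% Model with optimized box constraint and kernel scale.
optMdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'Standardize',true, ...
    'OptimizeHyperparameters',{'BoxConstraint','KernelScale'}, ...
    'HyperparameterOptimizationOptions', ...
    struct('ShowPlots',false,'Verbose',0));

% Test accuracy: fraction of test observations labeled correctly.
testAccuracy = @(mdl) mean(strcmp(predict(mdl,tblTest),tblTest.Y));
presetTestAccuracy = testAccuracy(presetMdl);
optTestAccuracy = testAccuracy(optMdl);
```

Which model scores higher depends on the data split and on the optimization run, so the ordering can differ from the one reported in this example.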
Visually compare the test set performance of the models. For each of the starred models, select the model in the Models pane. On the Classification Learner tab, in the Plots section, click the arrow to open the gallery, and then click Confusion Matrix (Test) in the Test Results group.
Rearrange the layout of the plots to better compare them. First, close the validation confusion matrix for Model 1.1. Then, on the Classification Learner tab, in the Plots section, click the Layout button and select Compare models. Click the Hide plot options button in the top right of the plots to make more room for the plots.
To return to the original layout, you can click the Layout button in the Plots section and select Single model (Default).
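Outside the app, a test set confusion matrix can be computed with confusionmat and plotted with confusionchart. The following sketch uses a single Gaussian SVM with assumed settings. The names mdl, predictedLabels, and C are illustrative.

```matlab
% Recreate the training and test sets from the first step of this example.
load ionosphere
tbl = array2table(X);
tbl.Y = Y;
rng('default')
partition = cvpartition(Y,'Holdout',0.15);
idxTrain = training(partition);
tblTrain = tbl(idxTrain,:);
tblTest = tbl(~idxTrain,:);

% Train a Gaussian SVM with assumed settings on the full training set.
mdl = fitcsvm(tblTrain,'Y', ...
    'KernelFunction','gaussian', ...
    'KernelScale',sqrt(width(tblTrain) - 1), ...
    'Standardize',true);

% Predict the test set labels and tabulate them against the true labels.
predictedLabels = predict(mdl,tblTest);
C = confusionmat(tblTest.Y,predictedLabels); % Rows: true class, columns: predicted class

% Plot the confusion matrix.
confusionchart(tblTest.Y,predictedLabels);
```

To compare two models side by side, you can repeat the prediction and plotting steps for each model in a separate figure.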
- Hyperparameter Optimization in Classification Learner App
- Train Classification Models in Classification Learner App
- Select Data and Validation for Classification Problem
- Choose Classifier Options
- Assess Classifier Performance in Classification Learner
- Export Classification Model to Predict New Data
- Bayesian Optimization Workflow