trainingProgressMonitor

Monitor and plot training progress for deep learning custom training loops

Since R2022b

    Description

    Use a TrainingProgressMonitor object to track training progress when using a custom training loop.

    You can use a TrainingProgressMonitor object to:

    • Create animated custom metric plots and record custom metrics during training.

    • Display and record training information during training.

    • Stop training early.

    • Track training progress with a progress bar.

    • Track elapsed time.

    This image shows an example of the Training Progress window during training. For more information about configuring the Training Progress window and an example showing how to generate this figure, see Monitor Custom Training Loop Progress.

    Training Progress window. The figure contains plots of the loss and accuracy for both the training and validation data, and information about the training progress, status, elapsed time, epoch number, execution environment, iteration, and learning rate.

    Creation

    Description

    monitor = trainingProgressMonitor creates a TrainingProgressMonitor object that you can use to track the training progress and create training plots.

    monitor = trainingProgressMonitor(Name=Value) sets the Metrics, Info, Visible, Progress, Status, and XLabel properties using one or more name-value arguments.
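
    As a minimal sketch of the name-value syntax, the following code sets several properties when it creates the monitor and then changes one of them with dot notation. The metric names, information names, and status strings are illustrative choices, not required values.

```matlab
% Create a monitor and set several properties at construction time.
% The metric, information, and status names are illustrative only.
monitor = trainingProgressMonitor( ...
    Metrics=["TrainingLoss","ValidationLoss"], ...
    Info=["Epoch","LearningRate"], ...
    XLabel="Iteration", ...
    Status="Configuring", ...
    Visible="off");     % hidden here; use "on" to show the window

% Writable properties can still be changed later with dot notation.
monitor.Status = "Running";
```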

    Properties

    Metrics: Metric names, specified as a string scalar, character vector, string array, or cell array of character vectors. Valid names begin with a letter and contain only letters, digits, and underscores. Each metric appears in its own training subplot. To plot more than one metric in a single subplot, use the groupSubPlot function.

    Example: ["TrainingLoss","ValidationLoss"];

    Data Types: char | string | cell
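
    The naming rule can be checked before you assign names to the monitor. This is a small sketch that assumes the rule exactly as stated above (a leading letter followed by letters, digits, or underscores). The candidate names and the variable names are hypothetical, and the check does not call any monitor function.

```matlab
% Candidate names; the last two break the naming rule on purpose.
candidates = ["TrainingLoss","Val_Loss2","2ndLoss","val-loss"];

% A valid name starts with a letter, followed by letters, digits, or underscores.
namePattern = "^[A-Za-z][A-Za-z0-9_]*$";
isValid = arrayfun(@(name) ~isempty(regexp(name,namePattern,"once")),candidates);

% Keep only the names that satisfy the rule.
validNames = candidates(isValid);
```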

    Info: Information names, specified as a string scalar, character vector, string array, or cell array of character vectors. Valid names begin with a letter and contain only letters, digits, and underscores. These names appear in the Training Progress window but do not appear as training plots.

    Example: ["GradientDecayFactor","SquaredGradientDecayFactor"];

    Data Types: char | string | cell

    This property is read-only.

    Stop: Request to stop training, specified as a numeric or logical 0 (false) or 1 (true). The value of this property changes to 1 when you click the Stop button in the Training Progress window. The Stop button appears only if you set the Visible property to 'on' or 1 (true).

    Data Types: logical

    Visible: State of visibility, specified as 'on' or 'off', or as numeric or logical 1 (true) or 0 (false). A value of 'on' is equivalent to true, and 'off' is equivalent to false. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type matlab.lang.OnOffSwitchState.

    • 'on' — Display the Training Progress window.

    • 'off' — Hide the Training Progress window without deleting it. You still can access the properties of an invisible object.

    Example: 'off'
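
    As a brief illustration of using the Visible value as a logical value, this sketch creates a hidden monitor, branches on the property, and then shows the window. The displayed messages are arbitrary.

```matlab
% Create the monitor without showing the Training Progress window.
monitor = trainingProgressMonitor(Visible="off");

% The on/off value can be used directly as a condition.
if monitor.Visible
    disp("Training Progress window is shown.");
else
    disp("Training Progress window is hidden.");
end

% Logical true is equivalent to "on".
monitor.Visible = true;
```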

    Progress: Training progress percentage, specified as a scalar or dlarray object in the range [0, 100].

    Example: 17;

    XLabel: Horizontal axis label in the training plot, specified as a string scalar or character vector.

    Example: "Iteration";

    Data Types: char | string | cell

    Status: Training status, specified as a string scalar or character vector.

    Example: "Running";

    Data Types: char | string | cell

    This property is read-only.

    MetricData: Metric values, specified as a structure. Use the Metrics property to specify the field names for the structure. Each field contains a matrix with two columns. The first column contains the custom training loop step values and the second column contains the metric values recorded by the recordMetrics function.

    Data Types: struct
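
    The two-column layout can be read back after recording a few values. This sketch assumes that the read-only metric values property is named MetricData, and it records a made-up loss curve rather than values from a real network.

```matlab
% Record three placeholder loss values on a hidden monitor.
monitor = trainingProgressMonitor(Metrics="TrainingLoss",Visible="off");
for iteration = 1:3
    recordMetrics(monitor,iteration,TrainingLoss=1/iteration);
end

% Assumed property name: MetricData. Each field is an N-by-2 matrix.
data = monitor.MetricData.TrainingLoss;
steps = data(:,1);     % training loop step values
values = data(:,2);    % recorded metric values
```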

    This property is read-only.

    InfoData: Information values, specified as a structure. Use the Info property to specify the field names for the structure. Each field is a column vector that contains the values updated by the updateInfo function.

    Data Types: struct

    Object Functions

    groupSubPlot     Group metrics in training plot
    recordMetrics    Record metric values for custom training loops
    updateInfo       Update information values for custom training loops

    Examples

    Use a TrainingProgressMonitor object to track training progress and produce training plots for custom training loops.

    Create a TrainingProgressMonitor object. The monitor automatically tracks the start time and the elapsed time. The timer starts when you create the object.

    Tip

    To ensure that the elapsed time accurately reflects the training time, make sure you create the TrainingProgressMonitor object close to the start of your custom training loop.

    monitor = trainingProgressMonitor;

    Before you start the training, specify names for the information and metric values.

    monitor.Info = ["LearningRate","Epoch","Iteration"];
    monitor.Metrics = ["TrainingLoss","ValidationLoss","TrainingAccuracy","ValidationAccuracy"];

    Specify the horizontal axis label for the training plot. Group the training and validation loss in one subplot, and group the training and validation accuracy in another subplot.

    monitor.XLabel = "Iteration";
    groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
    groupSubPlot(monitor,"Accuracy",["TrainingAccuracy","ValidationAccuracy"]);
    

    During training:

    • Evaluate the Stop property at the start of each step in your custom training loop. When you click the Stop button in the Training Progress window, the Stop property changes to 1. Training stops only if your loop checks this property and exits when it is 1.

    • Update the information values. The updated values appear in the Training Progress window.

    • Record the metric values. The recorded values appear in the training plot.

    • Update the training progress percentage based on the fraction of iterations completed.

    Note

    The following example code is a template. You must edit this training loop to compute your metric and information values. For a complete example that you can run in MATLAB, see Monitor Custom Training Loop Progress During Training.

    epoch = 0;
    iteration = 0;
    
    monitor.Status = "Running";
    
    while epoch < maxEpochs && ~monitor.Stop
        epoch = epoch + 1;
    
        while hasData(mbq) && ~monitor.Stop
            iteration = iteration + 1;
    
            % Add code to calculate metric and information values.
            % lossTrain = ...
    
            updateInfo(monitor, ...
                LearningRate=learnRate, ...
                Epoch=string(epoch) + " of " + string(maxEpochs), ...
                Iteration=string(iteration) + " of " + string(numIterations));
    
            recordMetrics(monitor,iteration, ...
                TrainingLoss=lossTrain, ...
                TrainingAccuracy=accuracyTrain, ...
                ValidationLoss=lossValidation, ...
                ValidationAccuracy=accuracyValidation);
    
            monitor.Progress = 100*iteration/numIterations;
        end
    end
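
    For a version of this template that runs without a network or data, the following sketch replaces the mini-batch queue with a fixed number of iterations per epoch and uses a synthetic, exponentially decaying loss. The variable numIterationsPerEpoch, the loss formula, and the final status strings are assumptions made for illustration only.

```matlab
% Toy settings (hypothetical values; no network or data is involved).
maxEpochs = 3;
numIterationsPerEpoch = 5;
numIterations = maxEpochs*numIterationsPerEpoch;

monitor = trainingProgressMonitor( ...
    Metrics="TrainingLoss", ...
    Info="Epoch", ...
    XLabel="Iteration");

epoch = 0;
iteration = 0;
monitor.Status = "Running";

while epoch < maxEpochs && ~monitor.Stop
    epoch = epoch + 1;
    step = 0;

    while step < numIterationsPerEpoch && ~monitor.Stop
        step = step + 1;
        iteration = iteration + 1;

        % Synthetic loss standing in for a real training computation.
        lossTrain = exp(-0.2*iteration);

        updateInfo(monitor,Epoch=string(epoch) + " of " + string(maxEpochs));
        recordMetrics(monitor,iteration,TrainingLoss=lossTrain);
        monitor.Progress = 100*iteration/numIterations;
    end
end

% Report how the loop ended.
if monitor.Stop
    monitor.Status = "Stopped by user";
else
    monitor.Status = "Training complete";
end
```

    Because both loops test the Stop property, clicking the Stop button ends the run after the current iteration instead of waiting for the epoch to finish.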

    The Training Progress window shows animated plots of the metrics, along with the information values, training progress bar, and elapsed time.

    Training Progress window. The first plot shows the training and validation loss and the second plot shows the training and validation accuracy.

    A TrainingProgressMonitor object has the same properties and object functions as an experiments.Monitor object. Therefore, you can easily adapt your plotting code for use in an Experiment Manager training function.

    How you monitor training depends on where you are training.

    • If you are using a custom training loop script, you must create and manage a TrainingProgressMonitor object yourself.

    • If you are using a custom training experiment, Experiment Manager creates an experiments.Monitor object for each trial of your experiment. By default, Experiment Manager saves the experiments.Monitor object as the variable monitor.

    In Experiment Manager, you can use the experiments.Monitor object in place of the TrainingProgressMonitor object in your custom training loop code.

    For example, suppose your training script creates a TrainingProgressMonitor object to track and plot training and validation loss.

    monitor = trainingProgressMonitor( ...
        Metrics=["TrainingLoss","ValidationLoss"], ...
        XLabel="Iteration");
    
    groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
    
    iteration = 1;
    recordMetrics(monitor,iteration,TrainingLoss=loss,ValidationLoss=lossVal);

    To adapt this code for use in Experiment Manager with an experiments.Monitor object:

    • Convert any code that sets properties using Name=Value syntax to use dot notation.

    • Delete the call to trainingProgressMonitor, because Experiment Manager creates the monitor for you.

    Use the adapted code inside your Experiment Manager training function.

    % Inside custom training experiment training function
    
    monitor.Metrics = ["TrainingLoss","ValidationLoss"];
    monitor.XLabel = "Iteration";
    
    groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);
    
    iteration = 1;
    recordMetrics(monitor,iteration,TrainingLoss=loss,ValidationLoss=lossVal);
    

    Note

    Experiment Manager passes the monitor object as the second input argument of the training function. Make sure that the name of this second input argument matches the variable name that your code uses for the monitor object. For more information, see Configure Custom Training Experiment.
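
    The following sketch shows how the adapted code might sit inside a training function. The function name trainLossExperiment, the hyperparameter field NumIterations, the output field FinalLoss, and the placeholder loss values are all hypothetical. The only detail taken from the note above is that the monitor arrives as the second input argument.

```matlab
function output = trainLossExperiment(params,monitor)
% Hypothetical training function for a custom training experiment.
% The second input argument is the monitor that Experiment Manager supplies.

monitor.Metrics = ["TrainingLoss","ValidationLoss"];
monitor.XLabel = "Iteration";
groupSubPlot(monitor,"Loss",["TrainingLoss","ValidationLoss"]);

numIterations = params.NumIterations;   % hypothetical hyperparameter name
iteration = 0;
loss = NaN;

while iteration < numIterations && ~monitor.Stop
    iteration = iteration + 1;

    % Placeholder values; replace with your own loss computations.
    loss = 1/iteration;
    lossVal = 1.1/iteration;

    recordMetrics(monitor,iteration,TrainingLoss=loss,ValidationLoss=lossVal);
    monitor.Progress = 100*iteration/numIterations;
end

output.FinalLoss = loss;   % hypothetical output field
end
```

    Because the two monitor types share the same properties and object functions, you can try a function like this outside Experiment Manager by passing it a TrainingProgressMonitor object, for example trainLossExperiment(struct(NumIterations=4),trainingProgressMonitor).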

    Tips

    • The information values appear in the Training Progress window and the training plot shows a record of the metric values. Use information values for text and for numerical values that you want to display in the training window but not in the training plot.

    • When you click the Stop button in the Training Progress window, the Stop property is set to 1 (true). Training stops only if your loop checks the Stop property and exits when it is 1. For example, to enable early stopping, include the following code in your custom training loop.

      while epoch < maxEpochs && ~monitor.Stop
          epoch = epoch + 1;
          % Custom training loop code.
      end

    • The elapsed time updates each time you call recordMetrics or updateInfo, and when you update the Progress property.

    Version History

    Introduced in R2022b