
RegressionNeuralNetwork

Neural network model for regression

Since R2021a

    Description

    A RegressionNeuralNetwork object is a trained, feedforward, and fully connected neural network for regression. The first fully connected layer of the neural network has a connection from the network input (predictor data X), and each subsequent layer has a connection from the previous layer. Each fully connected layer multiplies the input by a weight matrix (LayerWeights) and then adds a bias vector (LayerBiases). An activation function follows each fully connected layer, excluding the last (Activations and OutputLayerActivation). The final fully connected layer produces the network's output, namely predicted response values. For more information, see Neural Network Structure.
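
    For example, the following is a minimal sketch of the feedforward computation implied by this structure, assuming a trained model Mdl with all-numeric predictors, the default ReLU activations, and a hypothetical predictor column vector x (already standardized if the model was trained with standardization). Use the predict object function to obtain predictions in practice.

    x = zeros(numel(Mdl.PredictorNames),1);   % hypothetical predictor vector
    a = x;
    for k = 1:numel(Mdl.LayerWeights)-1
        z = Mdl.LayerWeights{k}*a + Mdl.LayerBiases{k};   % fully connected layer
        a = max(z,0);                                     % ReLU activation
    end
    yhat = Mdl.LayerWeights{end}*a + Mdl.LayerBiases{end} % final layer, no activation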

    Creation

    Create a RegressionNeuralNetwork object by using fitrnet.

    Properties


    Neural Network Properties

    This property is read-only.

    Sizes of the fully connected layers in the neural network model, returned as a positive integer vector. The ith element of LayerSizes is the number of outputs in the ith fully connected layer of the neural network model.

    LayerSizes does not include the size of the final fully connected layer. This layer always has one output.

    Data Types: single | double

    This property is read-only.

    Learned layer weights for fully connected layers, returned as a cell array. The ith entry in the cell array corresponds to the layer weights for the ith fully connected layer. For example, Mdl.LayerWeights{1} returns the weights for the first fully connected layer of the model Mdl.

    LayerWeights includes the weights for the final fully connected layer.

    Data Types: cell

    This property is read-only.

    Learned layer biases for fully connected layers, returned as a cell array. The ith entry in the cell array corresponds to the layer biases for the ith fully connected layer. For example, Mdl.LayerBiases{1} returns the biases for the first fully connected layer of the model Mdl.

    LayerBiases includes the biases for the final fully connected layer.

    Data Types: cell

    This property is read-only.

    Activation functions for the fully connected layers of the neural network model, returned as a character vector or cell array of character vectors with the following values:

    • 'relu': Rectified linear unit (ReLU) function — Performs a threshold operation on each element of the input, where any value less than zero is set to zero, that is,

      f(x) = x if x ≥ 0, and f(x) = 0 if x < 0

    • 'tanh': Hyperbolic tangent (tanh) function — Applies the tanh function to each input element

    • 'sigmoid': Sigmoid function — Performs the following operation on each input element:

      f(x) = 1/(1 + e^(-x))

    • 'none': Identity function — Returns each input element without performing any transformation, that is, f(x) = x

    • If Activations contains only one activation function, then it is the activation function for every fully connected layer of the neural network model, excluding the final fully connected layer, which does not have an activation function (OutputLayerActivation).

    • If Activations is a cell array of activation functions, then the ith element is the activation function for the ith fully connected layer of the neural network model.

    Data Types: char | cell
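
    For example, a minimal sketch of specifying a different activation function for each fully connected layer when training, assuming the carsTrain table and MPG response from the first example below:

    Mdl = fitrnet(carsTrain,"MPG","LayerSizes",[20 10], ...
        "Activations",["tanh" "relu"],"Standardize",true);
    Mdl.Activations   % cell array with one activation function per fully connected layer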

    This property is read-only.

    Activation function for final fully connected layer, returned as 'none'.

    This property is read-only.

    Parameter values used to train the RegressionNeuralNetwork model, returned as a NeuralNetworkParams object. ModelParameters contains parameter values such as the name-value arguments used to train the regression neural network model.

    Access the properties of ModelParameters by using dot notation. For example, access the function used to initialize the fully connected layer weights of a model Mdl by using Mdl.ModelParameters.LayerWeightsInitializer.

    Convergence Control Properties

    This property is read-only.

    Convergence information, returned as a structure array.

    Field                   Description
    Iterations              Number of training iterations used to train the neural network model
    TrainingLoss            Training mean squared error (MSE) for the returned model, or resubLoss(Mdl) for model Mdl
    Gradient                Gradient of the loss function with respect to the weights and biases at the iteration corresponding to the returned model
    Step                    Step size at the iteration corresponding to the returned model
    Time                    Total time spent across all iterations (in seconds)
    ValidationLoss          Validation MSE for the returned model
    ValidationChecks        Maximum number of times in a row that the validation loss was greater than or equal to the minimum validation loss
    ConvergenceCriterion    Criterion for convergence
    History                 See TrainingHistory

    Data Types: struct
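
    For example, a brief sketch of inspecting the convergence information for a trained model Mdl, such as the one trained in the first example below:

    info = Mdl.ConvergenceInfo;
    info.ConvergenceCriterion   % criterion that stopped training
    info.Iterations             % number of training iterations
    info.TrainingLoss           % training MSE of the returned model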

    This property is read-only.

    Training history, returned as a table.

    Column              Description
    Iteration           Training iteration
    TrainingLoss        Training mean squared error (MSE) for the model at this iteration
    Gradient            Gradient of the loss function with respect to the weights and biases at this iteration
    Step                Step size at this iteration
    Time                Time spent during this iteration (in seconds)
    ValidationLoss      Validation MSE for the model at this iteration
    ValidationChecks    Running total of times that the validation loss is greater than or equal to the minimum validation loss

    Data Types: table
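
    For example, a brief sketch of plotting the recorded training loss against the iteration number for a trained model Mdl:

    history = Mdl.TrainingHistory;
    plot(history.Iteration,history.TrainingLoss)
    xlabel("Iteration")
    ylabel("Training MSE")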

    This property is read-only.

    Solver used to train the neural network model, returned as 'LBFGS'. To create a RegressionNeuralNetwork model, fitrnet uses a limited-memory Broyden-Fletcher-Goldfarb-Shanno quasi-Newton algorithm (LBFGS) as its loss function minimization technique, where the software minimizes the mean squared error (MSE).

    Predictor Properties

    This property is read-only.

    Predictor variable names, returned as a cell array of character vectors. The order of the elements of PredictorNames corresponds to the order in which the predictor names appear in the training data.

    Data Types: cell

    This property is read-only.

    Categorical predictor indices, returned as a vector of positive integers. Assuming that the predictor data contains observations in rows, CategoricalPredictors contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty ([]).

    Data Types: double
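
    For example, a brief sketch of listing the names of the categorical predictors for a trained model Mdl, assuming the predictor data has observations in rows so that the indices correspond to predictor variables:

    catNames = Mdl.PredictorNames(Mdl.CategoricalPredictors)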

    This property is read-only.

    Expanded predictor names, returned as a cell array of character vectors. If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.

    Data Types: cell

    Since R2023b

    This property is read-only.

    Predictor means, returned as a numeric vector. If you set Standardize to 1 or true when you train the neural network model, then the length of the Mu vector is equal to the number of expanded predictors (see ExpandedPredictorNames). The vector contains 0 values for dummy variables corresponding to expanded categorical predictors.

    If you set Standardize to 0 or false when you train the neural network model, then the Mu value is an empty vector ([]).

    Data Types: double

    Since R2023b

    This property is read-only.

    Predictor standard deviations, returned as a numeric vector. If you set Standardize to 1 or true when you train the neural network model, then the length of the Sigma vector is equal to the number of expanded predictors (see ExpandedPredictorNames). The vector contains 1 values for dummy variables corresponding to expanded categorical predictors.

    If you set Standardize to 0 or false when you train the neural network model, then the Sigma value is an empty vector ([]).

    Data Types: double
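
    For example, a minimal sketch of reproducing the standardization that the model applies internally, assuming the model was trained with "Standardize",true on an all-numeric predictor matrix X with observations in rows:

    mu = Mdl.Mu(:)';        % force row orientation
    sigma = Mdl.Sigma(:)';
    XStandardized = (X - mu)./sigma;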

    This property is read-only.

    Unstandardized predictors used to train the neural network model, returned as a numeric matrix or table. X retains its original orientation, with observations in rows or columns depending on the value of the ObservationsIn name-value argument in the call to fitrnet.

    Data Types: single | double | table

    Response Properties

    This property is read-only.

    Response variable name, returned as a character vector.

    Data Types: char

    This property is read-only.

    Response values used to train the model, returned as a numeric vector. Each row of Y represents the response value of the corresponding observation in X.

    Data Types: single | double

    Response transformation function, specified as 'none' or a function handle. ResponseTransform describes how the software transforms raw response values.

    For a MATLAB® function or a function that you define, enter its function handle. For example, you can enter Mdl.ResponseTransform = @function, where function accepts a numeric vector of the original responses and returns a numeric vector of the same size containing the transformed responses.

    Data Types: char | function_handle
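
    For example, a brief sketch of assigning a hypothetical custom transformation that exponentiates the raw predicted response values:

    Mdl.ResponseTransform = @(y) exp(y);   % hypothetical transform; must return a vector of the same size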

    Other Data Properties

    This property is read-only.

    Cross-validation optimization of hyperparameters, specified as a BayesianOptimization object or a table of hyperparameters and associated values. This property is nonempty if the 'OptimizeHyperparameters' name-value pair argument is nonempty when you create the model. The value of HyperparameterOptimizationResults depends on the setting of the Optimizer field in the HyperparameterOptimizationOptions structure when you create the model.

    Value of Optimizer Field          Value of HyperparameterOptimizationResults
    'bayesopt' (default)              Object of class BayesianOptimization
    'gridsearch' or 'randomsearch'    Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)

    This property is read-only.

    Number of observations in the training data stored in X and Y, returned as a positive numeric scalar.

    Data Types: double

    This property is read-only.

    Observations of the original training data stored in the model, returned as a logical vector. This property is empty if all observations are stored in X and Y.

    Data Types: logical

    This property is read-only.

    Observation weights used to train the model, returned as an n-by-1 numeric vector. n is the number of observations (NumObservations).

    The software normalizes the observation weights specified in the Weights name-value argument so that the elements of W sum up to 1.

    Data Types: single | double
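
    For example, a brief sketch of verifying that the stored weights are normalized and match the number of observations for a trained model Mdl:

    abs(sum(Mdl.W) - 1) < 1e-10           % weights sum to 1, up to round-off
    numel(Mdl.W) == Mdl.NumObservations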

    Object Functions


    compact                  Reduce size of machine learning model
    crossval                 Cross-validate machine learning model
    lime                     Local interpretable model-agnostic explanations (LIME)
    partialDependence        Compute partial dependence
    plotPartialDependence    Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots
    shapley                  Shapley values
    loss                     Loss for regression neural network
    predict                  Predict responses using regression neural network
    resubLoss                Resubstitution regression loss
    resubPredict             Predict responses for training data using trained regression model

    Examples


    Train a neural network regression model, and assess the performance of the model on a test set.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Acceleration, Displacement, and so on, as well as the response variable MPG.

    load carbig
    cars = table(Acceleration,Displacement,Horsepower, ...
        Model_Year,Origin,Weight,MPG);

    Remove rows of cars where the table has missing values.

    cars = rmmissing(cars);

    Categorize the cars based on whether they were made in the USA.

    cars.Origin = categorical(cellstr(cars.Origin));
    cars.Origin = mergecats(cars.Origin,["France","Japan",...
        "Germany","Sweden","Italy","England"],"NotUSA");

    Partition the data into training and test sets. Use approximately 80% of the observations to train a neural network model, and 20% of the observations to test the performance of the trained model on new data. Use cvpartition to partition the data.

    rng("default") % For reproducibility of the data partition
    c = cvpartition(height(cars),"Holdout",0.20);
    trainingIdx = training(c); % Training set indices
    carsTrain = cars(trainingIdx,:);
    testIdx = test(c); % Test set indices
    carsTest = cars(testIdx,:);

    Train a neural network regression model by passing the carsTrain training data to the fitrnet function. For better results, specify to standardize the predictor data.

    Mdl = fitrnet(carsTrain,"MPG","Standardize",true)
    Mdl = 
      RegressionNeuralNetwork
               PredictorNames: {'Acceleration'  'Displacement'  'Horsepower'  'Model_Year'  'Origin'  'Weight'}
                 ResponseName: 'MPG'
        CategoricalPredictors: 5
            ResponseTransform: 'none'
              NumObservations: 314
                   LayerSizes: 10
                  Activations: 'relu'
        OutputLayerActivation: 'none'
                       Solver: 'LBFGS'
              ConvergenceInfo: [1x1 struct]
              TrainingHistory: [1000x7 table]
    
    
    

    Mdl is a trained RegressionNeuralNetwork model. You can use dot notation to access the properties of Mdl. For example, you can specify Mdl.TrainingHistory to get more information about the training history of the neural network model.

    Evaluate the performance of the regression model on the test set by computing the test mean squared error (MSE). Smaller MSE values indicate better performance.

    testMSE = loss(Mdl,carsTest,"MPG")
    testMSE = 6.8780
    

    Specify the structure of the neural network regression model, including the size of the fully connected layers.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a matrix X containing the predictor variables Acceleration, Cylinders, and so on. Store the response variable MPG in the variable Y.

    load carbig
    X = [Acceleration Cylinders Displacement Weight];
    Y = MPG;

    Delete rows of X and Y where either array has missing values.

    R = rmmissing([X Y]);
    X = R(:,1:end-1);
    Y = R(:,end);

    Partition the data into training data (XTrain and YTrain) and test data (XTest and YTest). Reserve approximately 20% of the observations for testing, and use the rest of the observations for training.

    rng("default") % For reproducibility of the partition
    c = cvpartition(length(Y),"Holdout",0.20);
    trainingIdx = training(c); % Indices for the training set
    XTrain = X(trainingIdx,:);
    YTrain = Y(trainingIdx);
    testIdx = test(c); % Indices for the test set
    XTest = X(testIdx,:);
    YTest = Y(testIdx);

    Train a neural network regression model. Specify to standardize the predictor data, and to have 30 outputs in the first fully connected layer and 10 outputs in the second fully connected layer. By default, both layers use a rectified linear unit (ReLU) activation function. You can change the activation functions for the fully connected layers by using the Activations name-value argument.

    Mdl = fitrnet(XTrain,YTrain,"Standardize",true, ...
        "LayerSizes",[30 10])
    Mdl = 
      RegressionNeuralNetwork
                 ResponseName: 'Y'
        CategoricalPredictors: []
            ResponseTransform: 'none'
              NumObservations: 319
                   LayerSizes: [30 10]
                  Activations: 'relu'
        OutputLayerActivation: 'none'
                       Solver: 'LBFGS'
              ConvergenceInfo: [1x1 struct]
              TrainingHistory: [1000x7 table]
    
    
    

    Access the weights and biases for the fully connected layers of the trained model by using the LayerWeights and LayerBiases properties of Mdl. The first two elements of each property correspond to the values for the first two fully connected layers, and the third element corresponds to the values for the final fully connected layer for regression. For example, display the weights and biases for the first fully connected layer.

    Mdl.LayerWeights{1}
    ans = 30×4
    
        0.0122    0.0116   -0.0094    0.1174
       -0.4400   -1.5674   -0.1234   -2.2396
        0.3370    0.2628   -1.9752    0.2937
       -2.9872   -3.1024   -0.9050   -1.5978
        0.7721    2.2010    1.3134    0.2364
        0.1718    1.8862   -3.0548   -0.4272
        0.9583   -0.0591   -0.9272   -0.3960
        1.6701   -0.1617   -1.2640    0.7811
       -0.7890   -0.8045    0.2993    1.5391
        0.2053   -2.3423    1.7768    1.1690
          ⋮
    
    
    Mdl.LayerBiases{1}
    ans = 30×1
    
       -0.4448
       -1.0814
       -0.5026
       -0.9984
        0.2245
       -2.1709
        1.6112
        1.3802
       -1.2855
        0.1969
          ⋮
    
    

    The final fully connected layer has one output. The number of layer outputs corresponds to the first dimension of the layer weights and layer biases.

    size(Mdl.LayerWeights{end})
    ans = 1×2
    
         1    10
    
    
    size(Mdl.LayerBiases{end})
    ans = 1×2
    
         1     1
    
    

    To estimate the performance of the trained model, compute the test set mean squared error (MSE) for Mdl. Smaller MSE values indicate better performance.

    testMSE = loss(Mdl,XTest,YTest)
    testMSE = 16.8576
    

    Compare the predicted test set response values to the true response values. Plot the predicted miles per gallon (MPG) along the vertical axis and the true MPG along the horizontal axis. Points on the reference line indicate correct predictions. A good model produces predictions that are scattered near the line.

    testPredictions = predict(Mdl,XTest);
    plot(YTest,testPredictions,".")
    hold on
    plot(YTest,YTest)
    hold off
    xlabel("True MPG")
    ylabel("Predicted MPG")

    Extended Capabilities

    Version History

    Introduced in R2021a
