
idNeuralNetwork

Multilayer neural network mapping function for nonlinear ARX models and Hammerstein-Wiener models (requires Statistics and Machine Learning Toolbox or Deep Learning Toolbox)

Since R2023b

Description

An idNeuralNetwork object creates a neural network function and is a nonlinear mapping object for estimating nonlinear ARX models and Hammerstein-Wiener models. This mapping object lets you create neural networks using the regression networks of Statistics and Machine Learning Toolbox™ and the deep and shallow networks of Deep Learning Toolbox™.

Mathematically, idNeuralNetwork is a function that maps m inputs X(t) = [x1(t), x2(t), …, xm(t)]^T to a single scalar output y(t) using the following relationship:

y(t) = y0 + X(t)^T P L + S(X(t)^T Q)

Here:

  • X(t) is an m-by-1 vector of inputs, or regressors.

  • y0 is the output offset, a scalar.

  • P and Q are m-by-p and m-by-q projection matrices, respectively.

  • L is a p-by-1 vector of weights.

  • S(.) represents a neural network object of one of the following types:

    • RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) object — Network object created using fitrnet (Statistics and Machine Learning Toolbox)

    • dlnetwork (Deep Learning Toolbox) object — Deep learning network object

    • network (Deep Learning Toolbox) object — Shallow network object created using a command such as feedforwardnet (Deep Learning Toolbox)

See Examples for more information.

Use idNeuralNetwork as the value, or, for multiple-output systems, one of the values, of the OutputFcn property of an idnlarx model or of the InputNonlinearity and OutputNonlinearity properties of an idnlhw object. For example, specify idNeuralNetwork when you estimate an idnlarx model with the following command.

sys = nlarx(data,regressors,idNeuralNetwork)
When nlarx estimates the model, it essentially estimates the parameters of the idNeuralNetwork function.

You can use a similar approach when you specify input or output nonlinearities using the nlhw command. For example, specify idNeuralNetwork as both the input and output nonlinearities with the following command.

sys = nlhw(data,orders,idNeuralNetwork,idNeuralNetwork)
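For example, the following sketch runs a complete nlhw estimation. It reuses the twotankdata set from the Examples section; the orders [2 2 1] and layer sizes are illustrative choices, not recommendations.

load twotankdata u y
z = iddata(y,u,0.8);                   % package the data with a 0.8 time-unit sample time
inputNL = idNeuralNetwork(5,"tanh");   % input nonlinearity
outputNL = idNeuralNetwork(5,"tanh");  % output nonlinearity
sys = nlhw(z,[2 2 1],inputNL,outputNL);
compare(z,sys)                         % assess the fit to the estimation data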

Creation

Description

Create Regression Network or Deep Learning Network

NW = idNeuralNetwork creates an idNeuralNetwork object NW that uses a single hidden layer of ten rectified linear unit (ReLU) activations.

The specific type of network that NW represents depends on the toolboxes you have access to.

  • If you have access to Statistics and Machine Learning Toolbox, then idNeuralNetwork uses fitrnet (Statistics and Machine Learning Toolbox) to create a RegressionNeuralNetwork (Statistics and Machine Learning Toolbox)-based map.

  • If Statistics and Machine Learning Toolbox is not available but you have access to Deep Learning Toolbox, then idNeuralNetwork uses dlnetwork (Deep Learning Toolbox) to create a deep learning network map.

For idnlhw models, the number of inputs to the network is 1. For idnlarx models, the number of inputs is unknown, as this number is determined during estimation. NW also uses a parallel linear function and an offset element.

For multiple-output nonlinear ARX or Hammerstein-Wiener models, create a separate idNeuralNetwork object for each output. Each element of the output function must represent a single-output network object.
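For example, the following sketch outlines a two-output estimation. The data set data2 and the regressor specification regressors2 are placeholders, and the sketch assumes that, as with other mapping objects, you can concatenate the networks into an array:

% One single-output network per model output (data2 and regressors2 are placeholders).
NW1 = idNeuralNetwork(5,"tanh");
NW2 = idNeuralNetwork(3,"relu");
sys = nlarx(data2,regressors2,[NW1; NW2]);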


NW = idNeuralNetwork(LayerSizes) uses numel(LayerSizes) layers. Each ith element in LayerSizes specifies the number of activations in the corresponding ith layer.


NW = idNeuralNetwork(LayerSizes,Activations) specifies the types of activation to use in each layer. The combination of the Activations specification and the available toolboxes determines which type of neural network NW uses.


NW = idNeuralNetwork(LayerSizes,Activations,UseLinearFcn) specifies whether NW uses a linear function as a subcomponent.


NW = idNeuralNetwork(LayerSizes,Activations,UseLinearFcn,UseOffset) specifies whether NW uses an offset term.


NW = idNeuralNetwork(___,NetworkType=type) forces the use of either a regression neural network from Statistics and Machine Learning Toolbox or a deep network from Deep Learning Toolbox. Specify type as "RegressionNeuralNetwork" or "dlnetwork". The specification in this syntax overrides the default automatic activation-based selection of network type described in Activations. Setting type to "auto" is equivalent to using the default selection.

You can use this syntax with any of the previous input-argument combinations.
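For example, the following commands create two networks with identical layer specifications; the second forces a deep network representation:

NW1 = idNeuralNetwork(5,"relu");                          % automatic network-type selection
NW2 = idNeuralNetwork(5,"relu",NetworkType="dlnetwork");  % force a deep network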


Use Existing Shallow Neural Network

NW = idNeuralNetwork(shallownet) creates NW using the network (Deep Learning Toolbox) object shallownet.

shallownet is typically the output of feedforwardnet (Deep Learning Toolbox), cascadeforwardnet (Deep Learning Toolbox), or linearlayer (Deep Learning Toolbox).


NW = idNeuralNetwork(shallownet,[],UseLinearFcn) specifies whether NW uses a linear function as a subcomponent.


NW = idNeuralNetwork(shallownet,[],UseLinearFcn,UseOffset) specifies whether NW uses an offset term.


Input Arguments


LayerSizes — Sizes of the network layers

Sizes of the network layers, specified as a row vector of positive integers whose length equals the number of layers. Each integer specifies the number of activations in the corresponding layer. For instance, a value of [10 5 2] corresponds to a three-layer network, with ten activations in the first layer, five in the second, and two in the third.

Activations — Activation types for each layer

Activation types to use in each layer, specified as a string array with the same length as LayerSizes. The activation types can be divided into two groups.

  1. Four activation types are available in both Statistics and Machine Learning Toolbox and Deep Learning Toolbox. These activation types are "relu", "tanh", "sigmoid", and "none". Use "none" to specify a linear layer.

  2. The remaining activation types are available only in Deep Learning Toolbox, and consist of "leakyRelu", "clippedRelu", "elu", "gelu", "swish", "softplus", "scaling", and "softmax". "scaling" and "softplus" also require the Reinforcement Learning Toolbox™.

You can also specify hyperparameter values for "leakyRelu", "clippedRelu", "elu", and "scaling". For example:

  • "leakyRelu(0.2)" specifies a leaky ReLu activation layer with a scaling value of 0.2.

  • "clippedRelu(5)" specifies a clipped ReLu activation layer with a ceiling value of 5.

  • "elu(2)" specifies an ELU activation layer with the Alpha property equal to 2.

  • "scaling(0.2,4)" specifies a scaling activation layer with a scale of 0.2 and a bias of 4.

To apply the same set of activations to all layers, specify Activations as a scalar string. To use the default activation, specify Activations as [].
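For example, the following command sketches a two-layer network that combines a parameterized leaky ReLU with a clipped ReLU. Both are group 2 activations, so this specification requires Deep Learning Toolbox:

NW = idNeuralNetwork([10 5],["leakyRelu(0.2)","clippedRelu(5)"]);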

The choice of Activations combined with the availability of Statistics and Machine Learning Toolbox and Deep Learning Toolbox determines which network to use, as the following table shows. In the toolbox availability columns, an entry of "—" indicates that the corresponding toolbox availability does not impact the network selection for that row.

| Activations | Statistics and Machine Learning Toolbox Available | Deep Learning Toolbox Available | Network Type |
| --- | --- | --- | --- |
| Group 1 only | Yes | — | RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) from fitrnet (Statistics and Machine Learning Toolbox) |
| Group 1 only | No | Yes | Deep network from dlnetwork (Deep Learning Toolbox) |
| At least one activation from group 2 | — | Yes | Deep network from dlnetwork (Deep Learning Toolbox) |

For more information about these activations, as well as additional toolbox requirements for using them, see the Activations properties in RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) and lbfgsupdate (Deep Learning Toolbox), and the Activation Layers section in List of Deep Learning Layers (Deep Learning Toolbox).

shallownet — Shallow network

Shallow network, specified as a network (Deep Learning Toolbox) object.

shallownet is typically the output of feedforwardnet (Deep Learning Toolbox), cascadeforwardnet (Deep Learning Toolbox), or linearlayer (Deep Learning Toolbox).

This argument sets the value of the NW.Network property.

UseLinearFcn — Option to use the linear function subcomponent

Option to use the linear function subcomponent, specified as true or false. This argument sets the value of the NW.LinearFcn.Use property.

UseOffset — Option to use an offset term

Option to use an offset term, specified as true or false. This argument sets the value of the NW.Offset.Use property.

Properties


Inputs — Input signal names

Input signal names for the inputs to the mapping object, specified as a 1-by-m cell array, where m is the number of input signals. This property is determined during estimation.

Outputs — Output signal name

Output signal name for the output of the mapping object, specified as a 1-by-1 cell array. This property is determined during estimation.

LinearFcn — Parameters of the linear function

Parameters of the linear function, specified as follows:

  • Use — Option to use the linear function in the mapping object, specified as a scalar logical. The default value is true.

  • Value — Linear weights that compose L', specified as a 1-by-p vector.

  • InputProjection — Input projection matrix P, specified as an m-by-p matrix, that transforms the detrended input vector of length m into a vector of length p. For Hammerstein-Wiener models, InputProjection is equal to 1.

  • Free — Option to update entries of Value during estimation, specified as a 1-by-p logical vector. The software honors the Free specification only if the starting value of Value is finite. The default value is true.

Offset — Parameters of the offset term

Parameters of the offset term, specified as follows:

  • Use — Option to use the offset in the mapping object, specified as a scalar logical. The default value is true.

  • Value — Offset value, specified as a scalar.

  • Free — Option to update Value during estimation, specified as a scalar logical. The software honors the Free specification of false only if the value of Value is finite. The default value is true.
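For example, the following sketch fixes the offset at a known operating point so that estimation does not update it. The value 0.5 is illustrative:

NW = idNeuralNetwork;
NW.Offset.Value = 0.5;   % illustrative operating-point offset
NW.Offset.Free = false;  % honored because Value is finite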

Network — Parameters of the network function

Parameters of the idNeuralNetwork network function, which consist of the properties Parameters, Inputs, and Outputs.

Parameters contains the learnable hyperparameters and initial hyperparameter values used by the network:

  • Learnables — Vector of tunable parameters that represent the weights and biases for the network. For each tunable item, you can set the initial value and specify whether the value is fixed or free during training.

  • InputProjection — Parameters of the input projection matrix Q, determined during training as an m-by-q matrix. Q transforms the detrended input vector (X − X̄) of length m into a vector of length q. Typically, Q has the same dimensions as the linear projection matrix P. In this case, q is equal to p, which is the number of linear weights.

    For Hammerstein-Wiener models, InputProjection is equal to 1.

EstimationOptions — Estimation options

Estimation options for the idNeuralNetwork model, specified as a structure with the fields Solver and SolverOptions. The set of estimation options for the model depends on what type of network idNeuralNetwork represents. The following tables each present the estimation options for one network type.

To specify an estimation option, use dot notation after creating NW. For example, to reduce the iteration limit for a regression neural network to 500, use NW.EstimationOptions.SolverOptions.IterationLimit = 500.

For more information on any of these options, see the corresponding name-value argument in fitrnet (Statistics and Machine Learning Toolbox).
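For example, the following sketch reduces the iteration budget of a regression network and, for a deep network, selects the ADAM solver. The Solver assignment assumes that the field accepts the solver names listed in the tables below:

% Regression network: reduce the LBFGS iteration limit.
NW1 = idNeuralNetwork([10 10]);
NW1.EstimationOptions.SolverOptions.IterationLimit = 500;

% Deep network (the swish activation requires Deep Learning Toolbox):
NW2 = idNeuralNetwork(10,"swish");
NW2.EstimationOptions.Solver = "ADAM";                % assumed to accept the listed solver names
NW2.EstimationOptions.SolverOptions.MaxEpochs = 200;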

Regression Neural Network

When NW uses a regression neural network, the solver is fixed to "LBFGS". For more information on the solver, see the Solver property in RegressionNeuralNetwork (Statistics and Machine Learning Toolbox).

Solver: LBFGS

| Option | Specification and Default Value |
| --- | --- |
| IterationLimit | Positive integer (default 1000) |
| GradientTolerance | 0 or positive number (default 1e-6) |
| LossTolerance | 0 or positive number (default 1e-6) |
| StepTolerance | 0 or positive number (default 1e-6) |
| Lambda (regularization penalty) | Nonnegative scalar (default 0) |
| LayerWeightsInitializer (weight value initialization scheme) | 'glorot' (default) or 'he' |
| LayerBiasesInitializer (bias value initialization scheme) | 'zeros' or 'ones' (default) |

Deep Learning Network

When NW uses a deep learning network (dlnetwork (Deep Learning Toolbox)), the solver choices are LBFGS (default), SGDM, ADAM, and RMSProp. The following tables show the solver options for each solver type. For more information on the solvers and options, see the Algorithms section in trainingOptions (Deep Learning Toolbox) and the Update Learnable Parameters section in Define Custom Training Loops, Loss Functions, and Networks (Deep Learning Toolbox).

Solver: LBFGS

| Option | Specification and Default Value |
| --- | --- |
| LineSearchMethod | "weak-wolfe" (default), "strong-wolfe", or "backtracking" |
| MaxNumLineSearchIterations | Positive, finite integer (default 20) |
| HistorySize | Positive integer (default 10) |
| InitialInverseHessianFactor | Positive, finite, real number (default 1) |
| MaxIterations | Positive, finite integer (default 100) |
| GradientTolerance | 0 or positive number (default 1e-6) |
| StepTolerance | 0 or positive number (default 1e-6) |
| Lambda (regularization penalty) | Nonnegative scalar (default 0) |
| LayerWeightsInitializer (weight value initialization scheme) | 'glorot' (default) or 'he' |
| LayerBiasesInitializer (bias value initialization scheme) | 'zeros' or 'ones' (default) |

Solver: SGDM

| Option | Specification and Default Value |
| --- | --- |
| LearnRate | Positive scalar (default 0.01) |
| Momentum | Positive scalar less than or equal to 1 (default 0.95) |
| Lambda (regularization penalty) | Nonnegative scalar (default 0) |
| MaxEpochs | Positive integer (default 100) |
| MiniBatchSize | Positive integer (default 1000) |
| LayerWeightsInitializer (weight value initialization scheme) | 'glorot' (default) or 'he' |
| LayerBiasesInitializer (bias value initialization scheme) | 'zeros' or 'ones' (default) |

Solver: ADAM

| Option | Specification and Default Value |
| --- | --- |
| LearnRate | Positive scalar (default 0.001) |
| GradientDecayFactor | Positive scalar less than 1 (default 0.9) |
| SquaredGradientDecayFactor | Positive scalar less than 1 (default 0.999) |
| Lambda (regularization penalty) | Nonnegative scalar (default 0) |
| MaxEpochs | Positive integer (default 100) |
| MiniBatchSize | Positive integer (default 1000) |
| LayerWeightsInitializer (weight value initialization scheme) | 'glorot' (default) or 'he' |
| LayerBiasesInitializer (bias value initialization scheme) | 'zeros' or 'ones' (default) |

Solver: RMSProp

| Option | Specification and Default Value |
| --- | --- |
| LearnRate | Positive scalar (default 0.001) |
| SquaredGradientDecayFactor | Positive scalar less than 1 (default 0.9) |
| Lambda (regularization penalty) | Nonnegative scalar (default 0) |
| MaxEpochs | Positive integer (default 100) |
| MiniBatchSize | Positive integer (default 1000) |
| LayerWeightsInitializer (weight value initialization scheme) | 'glorot' (default) or 'he' |
| LayerBiasesInitializer (bias value initialization scheme) | 'zeros' or 'ones' (default) |

Shallow Network

When NW uses an existing shallow neural network, the solvers are equivalent to the training functions in Deep Learning Toolbox. The options are the same for all training function choices. The default training function is trainlm. For information on available shallow network training functions and their associated algorithms, see Train and Apply Multilayer Shallow Neural Networks (Deep Learning Toolbox).

Solver: Any shallow network training function

| Option | Specification and Default Value |
| --- | --- |
| showWindow | Boolean scalar (default 1) |
| showCommandLine | Boolean scalar (default 0) |
| show | Positive integer (default 25) |
| epochs | Positive integer (default 1000) |
| time | Positive scalar (default Inf) |
| goal | Nonnegative scalar (default 0) |
| min_grad | Positive scalar (default 1e-07) |
| max_fail | Positive integer (default 6) |
| mu | Positive scalar (default 0.001) |
| mu_dec | Positive number less than or equal to 1 (default 0.1) |
| mu_inc | Number greater than 1 (default 10) |
| mu_max | Positive scalar (default 1e+10) |
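For example, the following sketch caps the training epochs and suppresses the training window for a shallow network. The option names come from the table above; the field path assumes shallow options live in SolverOptions like those of the other network types:

snet = feedforwardnet(10);
NW = idNeuralNetwork(snet);
NW.EstimationOptions.SolverOptions.epochs = 500;    % cap the training epochs
NW.EstimationOptions.SolverOptions.showWindow = 0;  % suppress the training progress window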

Examples


Create an idNeuralNetwork object with default properties.

NW = idNeuralNetwork
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Uninitialized regression neural network
         Contains 1 hidden layers using "relu" activations.
         (uses Statistics and Machine Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: '<Regression neural network parameters>'
            LinearFcn: '<Linear function parameters>'
               Offset: '<Offset parameters>'
    EstimationOptions: [1x1 struct]

NW is a regression neural network with a single layer of relu activations.

Specify a network that uses two hidden layers of sizes 5 and 3 respectively. Specify that both layers use tanh activations.

NW = idNeuralNetwork([5 3],"tanh");
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
disp(NW.Network)
Regression neural network parameters

    Parameters: '<Learnables and hyperparameters>'
        Inputs: {1x0 cell}
       Outputs: {1x0 cell}

This example assumes that you have access to Statistics and Machine Learning Toolbox, but it also runs with only Deep Learning Toolbox. If you have access to both toolboxes, then NW is a regression network. If you have access to only Deep Learning Toolbox, then NW is a deep network.

Create a network that contains three hidden layers. The first layer uses 10 relu activations, the second layer uses 5 tanh activations, and the third layer uses 2 swish activations.

NW = idNeuralNetwork([10 5 2],["relu", "tanh", "swish"])
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Deep learning network
         Contains 3 hidden layers using "relu", "tanh", "swish" activations.
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: '<Deep learning network parameters>'
            LinearFcn: '<Linear function parameters>'
               Offset: '<Offset parameters>'
    EstimationOptions: [1x1 struct]

The swish network requires Deep Learning Toolbox. Therefore, NW is a deep network whether or not you also have Statistics and Machine Learning Toolbox.

Create an idNeuralNetwork object that has no linear function or offset.

UseLinear = false;
UseOffset = false;
NW = idNeuralNetwork(5,"relu",UseLinear,UseOffset);
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
disp(NW.LinearFcn)
Linear Function: not in use
              Value: [1x0 double]
               Free: [1x0 logical]
                Use: 0
             Inputs: {1x0 cell}
            Outputs: {1x0 cell}
    InputProjection: []
disp(NW.Offset)
Output Offset: not in use
      Use: 0
    Value: NaN
     Free: 1

NW does not use the linear function or offset.

Create a network function with default settings, but enforce that the function be based on the deep network architecture.

NW = idNeuralNetwork(NetworkType="dlnetwork")
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Deep learning network
         Contains 1 hidden layers using "relu" activations.
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: '<Deep learning network parameters>'
            LinearFcn: '<Linear function parameters>'
               Offset: '<Offset parameters>'
    EstimationOptions: [1x1 struct]

The network specification overrides the default selection of a regression network.

Construct an idNeuralNetwork object that uses a shallow network from Deep Learning Toolbox.

Create a feedforward (shallow) network that uses three hidden layers with four, six, and one neurons, respectively.

snet = feedforwardnet([4 6 1]);

Specify the transfer functions for the hidden layers. The output layer uses the default transfer function 'purelin'.

snet.layers{1}.transferFcn = 'logsig';
snet.layers{2}.transferFcn = 'radbas';
snet.layers{3}.transferFcn = 'purelin';

Incorporate snet into the idNeuralNetwork object NW.

NW = idNeuralNetwork(snet)
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Uninitialized shallow network
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: '<Shallow network parameters>'
            LinearFcn: '<Linear function parameters>'
               Offset: '<Offset parameters>'
    EstimationOptions: [1x1 struct]

Identify a nonlinear ARX model that uses a regression neural network to describe the regressor-to-output mapping.

Load the data, which consists of the column vectors u and y. Convert the data into a timetable tt with a sample time of 0.8 minutes.

load twotankdata u y
tt = timetable(y,u,TimeStep=minutes(0.8));

Split tt into estimation (training) and validation data sets tte and ttv.

tte = tt(1:1000,:);
ttv = tt(1001:2000,:);

Specify estimation and search options.

opt = nlarxOptions(Focus="simulation",Display="on",SearchMethod="fmincon");
opt.SearchOptions.MaxIterations = 10;

Create a regression network NW that uses two hidden layers with five activations each. Use sigmoid activations in the first layer and tanh activations in the second layer.

NW = idNeuralNetwork([5 5],["sigmoid","tanh"]);
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.

Estimate a nonlinear ARX model that uses NW as the output function and uses y as the output variable.

sys = nlarx(tte,[2 2 1],NW,opt,OutputName="y");

Using the validation data, compare the measured output with the model output.

compare(ttv,sys)

The comparison plot shows the validation data (y) against the model response, with a fit of 88.55%.

The nonlinear model shows a good fit to the validation data.

Identify a nonlinear ARX model that uses a cascade-forward shallow network.

Load the data, which consists of the input and output arrays u and y, respectively. Convert the data into the timetable tte with a sample time of 0.8 minutes.

load twotankdata
tte = timetable(u,y,TimeStep=minutes(0.8));

Create a cascade-forward shallow network with a single hidden layer.

cnet = cascadeforwardnet(20);

Construct an idNeuralNetwork object NW that incorporates cnet and excludes the linear and offset elements.

NW = idNeuralNetwork(cnet,[],false,false); 

Specify estimation and search options.

opt = nlarxOptions(SearchMethod="gna");
opt.SearchOptions.MaxIterations = 2;

Estimate the nlarx model sys and compare the model output with the measured data.

sys = nlarx(tte,[2 2 1],NW,opt);

During estimation, a neural network training window opens and shows the training progress.

compare(tte,sys)

The comparison plot shows the measured output (y) against the model response, with a fit of 87.61%.

Algorithms

The learnable parameters of the idNeuralNetwork function are determined during estimation of the nonlinear ARX and Hammerstein-Wiener models, using nlarx and nlhw commands, respectively.

The software initializes these parameters using the following steps:

  1. Determine the linear function coefficients L and the offset y0, if in use and free, by performing a least-squares fit to the data.

  2. Initialize the learnable parameters of the network function by fitting the residuals of the linear and offset terms from step 1. The initialization scheme depends on the type of the underlying network:

    • For RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) networks, use fitrnet (Statistics and Machine Learning Toolbox).

    • For dlnetwork (Deep Learning Toolbox) networks, perform initialization by training the network using the specified solver in NW.EstimationOptions.

    • For network (Deep Learning Toolbox) networks, perform initialization by training the network using the specified solver in NW.EstimationOptions.

After initialization, the software updates the parameters using a nonlinear least-squares optimization solver (see SearchMethod in nlarxOptions and SearchOptions in nlhwOptions) to minimize the chosen objective, as the following objective summaries describe:

  • For nonlinear ARX models, the objective is either prediction-error minimization or simulation-error minimization, depending on whether the Focus option in nlarxOptions is "prediction" or "simulation".

  • For Hammerstein-Wiener models, the objective is simulation-error-norm minimization.

See nlarxOptions and nlhwOptions for more information on how to configure the objective and search method.
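For example, the following sketch configures a prediction-focused estimation that refines the initialized network with a Levenberg-Marquardt search limited to 20 iterations. The data set data and the orders are placeholders:

opt = nlarxOptions(Focus="prediction",SearchMethod="lm");
opt.SearchOptions.MaxIterations = 20;           % limit the nonlinear least-squares updates
sys = nlarx(data,[2 2 1],idNeuralNetwork,opt);  % data is a placeholder estimation data set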

Version History

Introduced in R2023b
