
Loss of linear model for incremental learning on batch of data

`loss` returns the regression or classification loss of a configured incremental learning model for linear regression (`incrementalRegressionLinear` object) or linear binary classification (`incrementalClassificationLinear` object).

To measure model performance on a data stream and store the results in the output model, call `updateMetrics` or `updateMetricsAndFit`.

The performance of an incremental model on streaming data is measured in three ways:

- Cumulative metrics measure the performance since the start of incremental learning.
- Window metrics measure the performance on a specified window of observations. The metrics are updated every time the model processes the specified window.
- The `loss` function measures the performance on a specified batch of data only.

Load the human activity data set. Randomly shuffle the data.

```
load humanactivity
n = numel(actid);
rng(1); % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);
```

For details on the data set, enter `Description` at the command line.

Responses can be one of five classes: Sitting, Standing, Walking, Running, or Dancing. Dichotomize the response by identifying whether the subject is moving (`actid` > 2).

```
Y = Y > 2;
```

Create an incremental linear SVM model for binary classification. Configure it for `loss` by specifying the class names, prior class distribution (uniform), and arbitrary coefficient and bias values. Specify a metrics window size of 1000 observations.

```
p = size(X,2);
Beta = randn(p,1);
Bias = randn(1);
Mdl = incrementalClassificationLinear('Beta',Beta,'Bias',Bias,...
    'ClassNames',unique(Y),'Prior','uniform','MetricsWindowSize',1000);
```

`Mdl` is an `incrementalClassificationLinear` model. All its properties are read-only. Instead of specifying arbitrary values, you can take either of these actions to configure the model:

- Train an SVM model using `fitcsvm` or `fitclinear` on a subset of the data (if available), and then convert the model to an incremental learner by using `incrementalLearner`.
- Incrementally fit `Mdl` to data by using `fit`.
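
As a minimal sketch of the first option, assuming Statistics and Machine Learning Toolbox is available, the following code trains a linear model with `fitclinear` on a small synthetic sample, converts it with `incrementalLearner`, and calls `loss` on a batch. The data and the variable names `Xsub`, `Ysub`, `TTMdl`, and `IncrementalMdl` are illustrative and are not part of the example on this page.

```matlab
rng(1);                                % For reproducibility
Xsub = randn(200,3);                   % Hypothetical subset of predictor data
Ysub = Xsub(:,1) + 0.5*Xsub(:,2) > 0;  % Hypothetical binary (logical) labels

TTMdl = fitclinear(Xsub,Ysub);               % Traditionally trained linear SVM
IncrementalMdl = incrementalLearner(TTMdl);  % Convert to an incremental learner

% A converted model is already configured, so loss can be called directly.
% With the default loss function, L is the misclassification rate on the batch.
L = loss(IncrementalMdl,Xsub,Ysub);
```

Converting a trained model avoids choosing arbitrary `Beta` and `Bias` values, at the cost of needing some data before incremental learning starts.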

Simulate a data stream, and perform the following actions on each incoming chunk of 50 observations:

- Call `updateMetrics` to measure the cumulative performance and the performance within a window of observations. Overwrite the previous incremental model with a new one to track performance metrics.
- Call `loss` to measure the model performance on the incoming chunk.
- Call `fit` to fit the model to the incoming chunk. Overwrite the previous incremental model with a new one fitted to the incoming observation.
- Store all performance metrics to see how they evolve during incremental learning.

```
% Preallocation
numObsPerChunk = 50;
nchunk = floor(n/numObsPerChunk);
ce = array2table(zeros(nchunk,3),'VariableNames',["Cumulative" "Window" "Loss"]);

% Incremental learning
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend   = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    Mdl = updateMetrics(Mdl,X(idx,:),Y(idx));
    ce{j,["Cumulative" "Window"]} = Mdl.Metrics{"ClassificationError",:};
    ce{j,"Loss"} = loss(Mdl,X(idx,:),Y(idx));
    Mdl = fit(Mdl,X(idx,:),Y(idx));
end
```

`Mdl` is an `incrementalClassificationLinear` model object trained on all the data in the stream. During incremental learning and after the model is warmed up, `updateMetrics` checks the performance of the model on the incoming observation, and then the `fit` function fits the model to that observation. `loss` is agnostic of the metrics warm-up period, so it measures the classification error for all iterations.

To see how the performance metrics evolved during training, plot them.

```
figure;
plot(ce.Variables);
xlim([0 nchunk]);
ylim([0 0.05])
ylabel('Classification Error')
xline(Mdl.MetricsWarmupPeriod/numObsPerChunk,'r-.');
legend(ce.Properties.VariableNames)
xlabel('Iteration')
```

During the metrics warm-up period (the area to the left of the red line), the yellow line represents the classification error on each incoming chunk of data. After the metrics warm-up period, `Mdl` tracks the cumulative and window metrics. The cumulative and batch losses converge as the `fit` function fits the incremental model to the incoming data.

Fit an incremental learning model for regression to streaming data, and compute the mean absolute deviation (MAD) on the incoming data batches.

Load the robot arm data set. Obtain the sample size `n` and the number of predictor variables `p`.

```
load robotarm
n = numel(ytrain);
p = size(Xtrain,2);
```

For details on the data set, enter `Description` at the command line.

Create an incremental linear model for regression. Configure the model as follows:

- Specify a metrics warm-up period of 1000 observations.
- Specify a metrics window size of 500 observations.
- Track the mean absolute deviation (MAD) to measure the performance of the model. Create an anonymous function that measures the absolute error of each new observation. Create a structure array containing the name `MeanAbsoluteError` and its corresponding function.
- Configure the model to predict responses by specifying that all regression coefficients and the bias are 0.

```
maefcn = @(z,zfit,w)(abs(z - zfit));
maemetric = struct("MeanAbsoluteError",maefcn);

Mdl = incrementalRegressionLinear('MetricsWarmupPeriod',1000,'MetricsWindowSize',500,...
    'Metrics',maemetric,'Beta',zeros(p,1),'Bias',0,'EstimationPeriod',0)
```

```
Mdl = 
  incrementalRegressionLinear

               IsWarm: 0
              Metrics: [2x2 table]
    ResponseTransform: 'none'
                 Beta: [32x1 double]
                 Bias: 0
              Learner: 'svm'
```

`Mdl` is an `incrementalRegressionLinear` model object configured for incremental learning.

Perform incremental learning. At each iteration:

- Simulate a data stream by processing a chunk of 50 observations.
- Call `updateMetrics` to compute cumulative and window metrics on the incoming chunk of data. Overwrite the previous incremental model with the returned model, which stores the updated metrics.
- Call `loss` to compute the MAD on the incoming chunk of data. Whereas the cumulative and window metrics require that custom losses return the loss for each observation, `loss` requires the loss on the entire chunk. Compute the mean of the absolute deviation.
- Call `fit` to fit the incremental model to the incoming chunk of data.
- Store the cumulative, window, and chunk metrics to see how they evolve during incremental learning.

```
% Preallocation
numObsPerChunk = 50;
nchunk = floor(n/numObsPerChunk);
mae = array2table(zeros(nchunk,3),'VariableNames',["Cumulative" "Window" "Chunk"]);

% Incremental fitting
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend   = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    Mdl = updateMetrics(Mdl,Xtrain(idx,:),ytrain(idx));
    mae{j,1:2} = Mdl.Metrics{"MeanAbsoluteError",:};
    mae{j,3} = loss(Mdl,Xtrain(idx,:),ytrain(idx),'LossFun',@(x,y,w)mean(maefcn(x,y,w)));
    Mdl = fit(Mdl,Xtrain(idx,:),ytrain(idx));
end
```

`Mdl` is an `incrementalRegressionLinear` model object trained on all the data in the stream. During incremental learning and after the model is warmed up, `updateMetrics` checks the performance of the model on the incoming observation, and the `fit` function fits the model to that observation.

Plot the performance metrics to see how they evolved during incremental learning.

```
figure;
h = plot(mae.Variables);
ylabel('Mean Absolute Deviation')
xline(Mdl.MetricsWarmupPeriod/numObsPerChunk,'r-.');
xlabel('Iteration')
legend(h,mae.Properties.VariableNames)
```

The plot suggests the following:

- `updateMetrics` computes performance metrics after the metrics warm-up period only.
- `updateMetrics` computes the cumulative metrics during each iteration.
- `updateMetrics` computes the window metrics after processing 500 observations.
- Because `Mdl` was configured to predict observations from the beginning of incremental learning, `loss` can compute the MAD on each incoming chunk of data.

`Mdl` — Incremental learning model
`incrementalClassificationLinear` model object | `incrementalRegressionLinear` model object

Incremental learning model, specified as an `incrementalClassificationLinear` or `incrementalRegressionLinear` model object. You can create `Mdl` directly or by converting a supported, traditionally trained machine learning model using the `incrementalLearner` function. For more details, see the corresponding reference page.

You must configure `Mdl` to compute its loss on a batch of observations.

- If `Mdl` is a converted, traditionally trained model, you can compute its loss without any modifications.
- Otherwise, `Mdl` must satisfy the following criteria, which you can specify directly or by fitting `Mdl` to data using `fit` or `updateMetricsAndFit`.
  - If `Mdl` is an `incrementalRegressionLinear` model, its model coefficients `Mdl.Beta` and bias `Mdl.Bias` must be nonempty arrays.
  - If `Mdl` is an `incrementalClassificationLinear` model, its model coefficients `Mdl.Beta` and bias `Mdl.Bias` must be nonempty arrays, the class names `Mdl.ClassNames` must contain two classes, and the prior class distribution `Mdl.Prior` must contain known values.
  - Regardless of object type, if you configure the model so that functions standardize predictor data, the predictor means `Mdl.Mu` and standard deviations `Mdl.Sigma` must be nonempty arrays.

`X` — Batch of predictor data
floating-point matrix

Batch of predictor data with which to compute the loss, specified as a floating-point matrix of *n* observations and `Mdl.NumPredictors` predictor variables. The value of the `'ObservationsIn'` name-value pair argument determines the orientation of the variables and observations.

The length of the observation labels `Y` and the number of observations in `X` must be equal; `Y(j)` is the label of observation *j* in `X`.

**Note**

`loss` supports only floating-point input predictor data. If the input model `Mdl` represents a converted, traditionally trained model fit to categorical data, use `dummyvar` to convert each categorical variable to a numeric matrix of dummy variables, and concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.

**Data Types:** `single` | `double`

`Y` — Batch of labels
categorical array | character array | string array | logical vector | floating-point vector | cell array of character vectors

Batch of labels with which to compute the loss, specified as a categorical, character, or string array, logical or floating-point vector, or cell array of character vectors for classification problems; or a floating-point vector for regression problems.

The length of the observation labels `Y` and the number of observations in `X` must be equal; `Y(j)` is the label of observation *j* in `X`.

For classification problems:

- `loss` supports binary classification only.
- When the `ClassNames` property of the input model `Mdl` is nonempty, the following conditions apply:
  - If `Y` contains a label that is not a member of `Mdl.ClassNames`, `loss` issues an error.
  - The data type of `Y` and `Mdl.ClassNames` must be the same.

**Data Types:** `char` | `string` | `cell` | `categorical` | `logical` | `single` | `double`

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `'ObservationsIn','columns','Weights',W` specifies that the columns of the predictor matrix correspond to observations, and the vector `W` contains observation weights to apply.

`'LossFun'` — Loss function
string vector | function handle | cell vector | structure array | ...

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or function handle.

**Classification problems**: The following table lists the available loss functions when `Mdl` is an `incrementalClassificationLinear` model. Specify one using its corresponding character vector or string scalar.

Name | Description |
---|---|
`"binodeviance"` | Binomial deviance |
`"classiferror"` (default) | Misclassification rate in decimal |
`"exponential"` | Exponential loss |
`"hinge"` | Hinge loss |
`"logit"` | Logistic loss |
`"quadratic"` | Quadratic loss |

For more details, see Classification Loss.

Logistic regression learners return posterior probabilities as classification scores, but SVM learners do not (see `predict`).

To specify a custom loss function, use function handle notation. The function must have this form:

`lossval = lossfcn(C,S,W)`

- The output argument `lossval` is an *n*-by-1 floating-point vector, where `lossval(j)` is the classification loss of observation `j`.
- You specify the function name (`lossfcn`).
- `C` is an *n*-by-2 logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in the `ClassNames` property. Create `C` by setting `C(p,q)` = `1`, if observation `p` is in class `q`, for each observation in the specified data. Set the other element in row `p` to `0`.
- `S` is an *n*-by-2 numeric matrix of predicted classification scores. `S` is similar to the `score` output of `predict`, where rows correspond to observations in the data and the column order corresponds to the class order in the `ClassNames` property. `S(p,q)` is the classification score of observation `p` being classified in class `q`.
- `W` is an *n*-by-1 numeric vector of observation weights.
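
The following sketch shows one way to write a function that matches the `lossfcn(C,S,W)` form described above, here a per-observation hinge loss. The name `hingefcn` and the sample values of `C`, `S`, and `W` are hypothetical, and the sketch assumes that the second column of `S` holds the positive-class score. It runs in plain MATLAB so that the shapes of the inputs and output are easy to inspect.

```matlab
% Hypothetical custom loss: hinge loss for each observation.
% C: n-by-2 logical class membership, S: n-by-2 scores, W: n-by-1 weights.
% W is accepted to match the required signature but is not used here.
hingefcn = @(C,S,W) max(0, 1 - (2*C(:,2) - 1).*S(:,2));

% Small hand-built inputs (illustrative values only)
C = logical([1 0; 0 1; 0 1]);       % Observation 1 is in class 1; 2 and 3 are in class 2
S = [0.5 -0.5; -2 2; 0.3 -0.3];     % Column 2 is the positive-class score
W = ones(3,1)/3;                    % Equal observation weights

lossval = hingefcn(C,S,W);          % 3-by-1 vector, one loss per observation
```

A handle of this form would be passed as `'LossFun',hingefcn` when calling `loss` on a classification model.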

**Regression problems**: The following table lists the available loss functions when `Mdl` is an `incrementalRegressionLinear` model. Specify one using its corresponding character vector or string scalar.

Name | Description | Learners Supporting Metric |
---|---|---|
`"epsiloninsensitive"` | Epsilon insensitive loss | `'svm'` |
`"mse"` (default) | Weighted mean squared error | `'svm'` and `'leastsquares'` |

For more details, see Regression Loss.

To specify a custom loss function, use function handle notation. The function must have this form:

`lossval = lossfcn(Y,YFit,W)`

- The output argument `lossval` is a floating-point scalar.
- You specify the function name (`lossfcn`).
- `Y` is a length *n* numeric vector of observed responses.
- `YFit` is a length *n* numeric vector of corresponding predicted responses.
- `W` is an *n*-by-1 numeric vector of observation weights.
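
As an illustration of the `lossfcn(Y,YFit,W)` form, the following sketch defines a weighted mean absolute deviation that returns a scalar. The name `wmadfcn` and the sample values are hypothetical. Dividing by `sum(W)` is a choice made here so that the result does not depend on whether the weights are already normalized.

```matlab
% Hypothetical custom regression loss: weighted mean absolute deviation.
% Returns a scalar, as required for regression problems.
wmadfcn = @(Y,YFit,W) sum(W.*abs(Y - YFit))/sum(W);

% Small hand-built inputs (illustrative values only)
Y    = [1; 2; 4];      % Observed responses
YFit = [1.5; 2; 3];    % Predicted responses
W    = [1; 1; 2];      % Observation weights

lossval = wmadfcn(Y,YFit,W);   % (1*0.5 + 1*0 + 2*1)/4 = 0.625
```

A handle of this form would be passed as `'LossFun',wmadfcn` when calling `loss` on a regression model.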

**Example:** `'LossFun',"mse"`

**Example:** `'LossFun',@lossfcn`

**Data Types:** `char` | `string` | `function_handle`

`'ObservationsIn'` — Predictor data observation dimension
`'rows'` (default) | `'columns'`

Predictor data observation dimension, specified as the comma-separated pair consisting of `'ObservationsIn'` and `'columns'` or `'rows'`.

**Data Types:** `char` | `string`

`'Weights'` — Batch of observation weights
floating-point vector of positive values

Batch of observation weights, specified as the comma-separated pair consisting of `'Weights'` and a floating-point vector of positive values. `loss` weighs the observations in the input data with the corresponding values in `Weights`. The size of `Weights` must equal *n*, which is the number of observations in the input data.

By default, `Weights` is `ones(n,1)`.

For more details, see Observation Weights.

**Data Types:** `double` | `single`
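
The following sketch, which assumes Statistics and Machine Learning Toolbox is available, shows how the `'ObservationsIn'` and `'Weights'` arguments might be combined in one call. The model coefficients, data, and weights are made-up values chosen only for illustration, and the least-squares learner is specified here so that the default `"mse"` loss applies without any further configuration.

```matlab
% Hypothetical regression model configured directly with known coefficients
Mdl = incrementalRegressionLinear('Learner','leastsquares', ...
    'Beta',[1; 2],'Bias',0);

% Illustrative batch: 4 observations (rows) and 2 predictors (columns)
X = [1 0; 0 1; 1 1; 2 1];
y = [1; 2; 4; 3];
w = [1; 1; 2; 1];               % Positive observation weights, one per observation

% Observations in rows (default orientation)
Lrows = loss(Mdl,X,y,'Weights',w);

% Same batch with observations in columns; the predictor matrix is transposed
Lcols = loss(Mdl,X',y,'ObservationsIn','columns','Weights',w);
```

Both calls describe the same batch, so the two losses are expected to agree.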

*Classification loss* functions measure the predictive inaccuracy of classification models. When you compare the same type of loss among many models, a lower loss indicates a better predictive model.

Consider the following scenario.

- *L* is the weighted average classification loss.
- *n* is the sample size.
- For binary classification:
  - $$y_j$$ is the observed class label. The software codes it as –1 or 1, indicating the negative or positive class (or the first or second class in the `ClassNames` property), respectively.
  - $$f(X_j)$$ is the positive-class classification score for observation (row) *j* of the predictor data *X*.
  - $$m_j = y_j f(X_j)$$ is the classification score for classifying observation *j* into the class corresponding to $$y_j$$. Positive values of $$m_j$$ indicate correct classification and do not contribute much to the average loss. Negative values of $$m_j$$ indicate incorrect classification and contribute significantly to the average loss.
- The weight for observation *j* is $$w_j$$.

Given this scenario, the following table describes the supported loss functions that you can specify by using the `'LossFun'` name-value pair argument.

Loss Function | Value of `LossFun` | Equation |
---|---|---|
Binomial deviance | `"binodeviance"` | $$L=\sum_{j=1}^{n} w_j \log\left\{1+\exp\left[-2m_j\right]\right\}.$$ |
Exponential loss | `"exponential"` | $$L=\sum_{j=1}^{n} w_j \exp\left(-m_j\right).$$ |
Misclassification rate in decimal | `"classiferror"` | $$L=\sum_{j=1}^{n} w_j I\left\{\widehat{y}_j \ne y_j\right\}.$$ $$\widehat{y}_j$$ is the class label corresponding to the class with the maximal score. |
Hinge loss | `"hinge"` | $$L=\sum_{j=1}^{n} w_j \max\left\{0,1-m_j\right\}.$$ |
Logit loss | `"logit"` | $$L=\sum_{j=1}^{n} w_j \log\left(1+\exp\left(-m_j\right)\right).$$ |
Quadratic loss | `"quadratic"` | $$L=\sum_{j=1}^{n} w_j \left(1-m_j\right)^{2}.$$ |

This figure compares the loss functions over the score *m* for one observation. Some functions are normalized to pass through the point (0,1).
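
To make the table concrete, the following plain MATLAB sketch evaluates each formula for a few made-up margins `m` and weights `w` that already sum to 1. The variable names and values are illustrative only; the code transcribes the equations in the table and is not the internal implementation of `loss`.

```matlab
% Illustrative margins m_j = y_j*f(X_j) and normalized weights w_j
m = [2; 0; -1];          % Correct, on the boundary, and misclassified
w = [0.5; 0.25; 0.25];   % Weights that sum to 1

Lbinodeviance = sum(w.*log(1 + exp(-2*m)));
Lexponential  = sum(w.*exp(-m));
Lclassiferror = sum(w.*(m < 0));   % Simplification: a negative margin counts as an error
Lhinge        = sum(w.*max(0,1 - m));
Llogit        = sum(w.*log(1 + exp(-m)));
Lquadratic    = sum(w.*(1 - m).^2);
```

For these values the hinge loss is 0.75 and the quadratic loss is 1.75. The quadratic loss is larger because it also penalizes the confidently correct observation with margin 2.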

*Regression loss* functions measure the predictive inaccuracy of regression models. When you compare the same type of loss among many models, a lower loss indicates a better predictive model.

Consider the following scenario.

- *L* is the weighted average regression loss.
- *n* is the sample size.
- $$y_j$$ is the observed response of observation *j*.
- $$f(X_j) = \beta_0 + x_j\beta$$ is the predicted value of observation *j* of the predictor data *X*, where $$\beta_0$$ is the bias and $$\beta$$ is the vector of coefficients.
- The weight for observation *j* is $$w_j$$.

Given this scenario, the following table describes the supported loss functions that you can specify by using the `'LossFun'` name-value pair argument.

Loss Function | Value of `LossFun` | Equation |
---|---|---|
Epsilon insensitive loss | `"epsiloninsensitive"` | $$L=\max\left[0,\left|y-f\left(x\right)\right|-\epsilon \right].$$ |
Mean squared error | `"mse"` | $$L=\left[y-f\left(x\right)\right]^{2}.$$ |
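
As a small worked example of the two equations, the following plain MATLAB sketch evaluates them for each observation in a made-up batch. The responses, predictions, and the value of `epsilon` are illustrative assumptions, not values taken from any model on this page.

```matlab
% Illustrative observed responses, predictions, and epsilon
y       = [1; 2; 3];
f       = [1.05; 2.5; 3];
epsilon = 0.1;            % Hypothetical half-width of the insensitive band

% Per-observation losses, transcribed from the table
Leps = max(0, abs(y - f) - epsilon);   % Errors inside the band cost nothing
Lsq  = (y - f).^2;                     % Squared error
```

Only the second observation, whose error of 0.5 exceeds `epsilon`, has a nonzero epsilon-insensitive loss (approximately 0.4), whereas the squared error is nonzero for every imperfect prediction.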

For classification problems, if the prior class probability distribution is known (in other words, the prior distribution is not empirical), `loss` normalizes observation weights to sum to the prior class probabilities in the respective classes. This action implies that observation weights are the respective prior class probabilities by default.

For regression problems or if the prior class probability distribution is empirical, the software normalizes the specified observation weights to sum to 1 each time you call `loss`.
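
The two normalization schemes can be sketched in plain MATLAB as follows. The weights, labels, and prior are made-up values, and the code only illustrates the arithmetic described above; it is not the internal implementation of `loss`.

```matlab
% Illustrative raw weights, binary labels, and a known (uniform) prior
w     = [1; 2; 1; 4];
y     = logical([0; 0; 1; 1]);
prior = [0.5 0.5];              % Prior for class false, then class true

% Regression or empirical prior: weights are normalized to sum to 1
wEmp = w/sum(w);

% Known prior: within each class, weights are scaled to sum to that class prior
wPrior  = zeros(size(w));
classes = [false true];
for k = 1:2
    inClass = (y == classes(k));
    wPrior(inClass) = prior(k)*w(inClass)/sum(w(inClass));
end
```

With a known prior, the relative weights within a class are preserved, but the total weight assigned to each class is fixed by the prior rather than by the class frequencies in the batch.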

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

- Use `saveLearnerForCoder`, `loadLearnerForCoder`, and `codegen` (MATLAB Coder) to generate code for the `loss` function. Save a trained model by using `saveLearnerForCoder`. Define an entry-point function that loads the saved model by using `loadLearnerForCoder` and calls the `loss` function. Then use `codegen` to generate code for the entry-point function.
- To generate single-precision C/C++ code for `loss`, specify the name-value argument `'DataType','single'` when you call the `loadLearnerForCoder` function.
- This table contains notes about the arguments of `loss`. Arguments not included in this table are fully supported.

Argument | Notes and Limitations |
---|---|
`Mdl` | For usage notes and limitations of the model object, see `incrementalClassificationLinear` or `incrementalRegressionLinear`. |
`X` | Batch-to-batch, the number of observations can be a variable size, but must equal the number of observations in `Y`. The number of predictor variables must equal `Mdl.NumPredictors`. `X` must be `single` or `double`. |
`Y` | Batch-to-batch, the number of observations can be a variable size, but must equal the number of observations in `X`. For classification problems, all labels in `Y` must be represented in `Mdl.ClassNames`. `Y` and `Mdl.ClassNames` must have the same data type. |
`'LossFun'` | The specified function cannot be an anonymous function. |

- If you configure `Mdl` to shuffle data (`Mdl.Shuffle` is `true`, or `Mdl.Solver` is `'sgd'` or `'asgd'`), the `fit` function randomly shuffles each incoming batch of observations before it fits the model to the batch. The order of the shuffled observations might not match the order generated by MATLAB®. Therefore, if you fit `Mdl` before computing the loss, the loss computed in MATLAB and the loss computed by the generated code might not be equal.
- Use a homogeneous data type for all floating-point input arguments and object properties, specifically, either `single` or `double`.

For more information, see Introduction to Code Generation.
