loss

Regression loss for Gaussian kernel regression model

Description

L = loss(Mdl,X,Y) returns the mean squared error (MSE) for the Gaussian kernel regression model Mdl using the predictor data in X and the corresponding responses in Y.

L = loss(Mdl,X,Y,Name,Value) uses additional options specified by one or more name-value pair arguments. For example, you can specify a regression loss function and observation weights. Then, loss returns the weighted regression loss using the specified loss function.

Examples

Train a Gaussian kernel regression model on a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. If you want to run the example using the local MATLAB session when you have Parallel Computing Toolbox, you can change the global execution environment by using the mapreducer function.
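For instance, the following call switches to the local session (a minimal sketch; the 0 argument directs mapreducer to run tall array calculations without a parallel pool):

mapreducer(0); % Execute tall array calculations in the local MATLAB session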

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA' values as missing data so that datastore replaces them with NaN values. Select a subset of the variables to use. Create a tall table on top of the datastore.

varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
    'SelectedVariableNames',varnames);
t = tall(ds);
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 4).

Specify DepTime and ArrTime as the predictor variables (X) and ActualElapsedTime as the response variable (Y). Select the observations for which ArrTime is later than DepTime.

daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);     % Response data
X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data

Standardize the predictor variables.

Z = zscore(X); % Standardize the data

Train a default Gaussian kernel regression model with the standardized predictors. Set 'Verbose',0 to suppress diagnostic messages.

[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)
Mdl = 
  RegressionKernel
            PredictorNames: {'x1'  'x2'}
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303


FitInfo = struct with fields:
                  Solver: 'LBFGS-tall'
            LossFunction: 'epsiloninsensitive'
                  Lambda: 8.5385e-06
           BetaTolerance: 1.0000e-03
       GradientTolerance: 1.0000e-05
          ObjectiveValue: 30.7814
       GradientMagnitude: 0.0191
    RelativeChangeInBeta: 0.0228
                 FitTime: 124.6182
                 History: []

Mdl is a trained RegressionKernel model, and the structure array FitInfo contains optimization details.

Estimate how well the trained model fits the training data by computing the resubstitution mean squared error and epsilon-insensitive error.

lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error
lossMSE =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error
lossEI =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

Evaluate the tall arrays and bring the results into memory by using gather.

[lossMSE,lossEI] = gather(lossMSE,lossEI)
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: Completed in 2.7 sec
Evaluation completed in 4 sec
lossMSE = 2.8851e+03
lossEI = 28.0050

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the carbig data set.

load carbig

Specify the predictor variables (X) and the response variable (Y).

X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;

Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.

R = rmmissing([X Y]); 
X = R(:,1:4); 
Y = R(:,end); 

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

rng(10)  % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices

Standardize the training data and train the regression kernel model.

Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain); % Standardize the training data
tr_sigma(tr_sigma==0) = 1;
Mdl = fitrkernel(Ztrain,Ytrain)
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617


Mdl is a RegressionKernel model.

Create an anonymous function that measures Huber loss ($\delta = 1$), that is,

$$L = \frac{1}{\sum_{j=1}^{n} w_j} \sum_{j=1}^{n} w_j \ell_j,$$

where

$$\ell_j = \begin{cases} 0.5\,\hat{e}_j^{\,2}, & |\hat{e}_j| \le 1 \\ |\hat{e}_j| - 0.5, & |\hat{e}_j| > 1. \end{cases}$$

$\hat{e}_j$ is the residual for observation j. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the 'LossFun' name-value pair argument.

huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);

Estimate the training set regression loss using the Huber loss function.

eTrain = loss(Mdl,Ztrain,Ytrain,'LossFun',huberloss)
eTrain = 1.7210

Standardize the test data using the same mean and standard deviation of the training data columns. Estimate the test set regression loss using the Huber loss function.

Xtest = X(idxTest,:);
Ztest = (Xtest-tr_mu)./tr_sigma; % Standardize the test data
Ytest = Y(idxTest);

eTest = loss(Mdl,Ztest,Ytest,'LossFun',huberloss)
eTest = 1.3062

Input Arguments

Kernel regression model, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.

Predictor data, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must be equal to the number of predictors used to train Mdl.

Data Types: single | double

Response data, specified as an n-dimensional numeric vector. The length of Y and the number of observations in X must be equal.

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights) returns the weighted regression loss using the epsilon-insensitive loss function.

Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or a function handle.

  • The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $f(x) = T(x)\beta + b$.

    • $x$ is an observation (row vector) from $p$ predictor variables.

    • $T(\cdot)$ is a transformation of an observation (row vector) for feature expansion. $T(x)$ maps $x \in \mathbb{R}^p$ to a high-dimensional space $\mathbb{R}^m$.

    • $\beta$ is a vector of $m$ coefficients.

    • $b$ is the scalar bias.

    Value                   Description
    'epsiloninsensitive'    Epsilon-insensitive loss: $\ell[y,f(x)] = \max[0, |y - f(x)| - \varepsilon]$
    'mse'                   MSE: $\ell[y,f(x)] = [y - f(x)]^2$

    'epsiloninsensitive' is appropriate for SVM learners only.

  • Specify your own function by using function handle notation.

    Let n be the number of observations in X. Your function must have this signature:

    lossvalue = lossfun(Y,Yhat,W)

    • The output argument lossvalue is a scalar.

    • You choose the function name (lossfun).

    • Y is an n-dimensional vector of observed responses. loss passes its input argument Y to your function for Y.

    • Yhat is an n-dimensional vector of predicted responses, which is similar to the output of predict.

    • W is an n-by-1 numeric vector of observation weights.

    Specify your function using 'LossFun',@lossfun.
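
    For example, a weighted mean absolute error satisfies this form (a minimal sketch; the name maeloss is illustrative, not part of the toolbox):

    maeloss = @(Y,Yhat,W)sum(W.*abs(Y-Yhat))/sum(W); % Weighted mean absolute error
    L = loss(Mdl,X,Y,'LossFun',maeloss);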

Data Types: char | string | function_handle

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector of positive values. loss weighs the observations in X with the corresponding values in Weights. The size of Weights must equal n, the number of observations (rows in X). If you supply the observation weights, loss computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

loss normalizes Weights to sum to 1.
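
For example, to supply explicit per-observation weights (an illustrative sketch; w is a hypothetical weight vector):

w = rand(size(X,1),1) + 0.5;   % One positive weight per observation (illustrative)
L = loss(Mdl,X,Y,'Weights',w); % loss normalizes w to sum to 1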

Data Types: double | single

Output Arguments

Regression loss, returned as a numeric scalar. The interpretation of L depends on Weights and LossFun. For example, if you use the default observation weights and specify 'epsiloninsensitive' as the loss function, then L is the epsilon-insensitive loss.

More About

Weighted Mean Squared Error

The weighted mean squared error is calculated as follows:

$$\mathrm{mse} = \frac{\sum_{j=1}^{n} w_j \left( f(x_j) - y_j \right)^2}{\sum_{j=1}^{n} w_j},$$

where:

  • $n$ is the number of observations.

  • $x_j$ is the jth observation (row of predictor data).

  • $y_j$ is the observed response to $x_j$.

  • $f(x_j)$ is the response prediction of the Gaussian kernel regression model Mdl to $x_j$.

  • $w$ is the vector of observation weights.

By default, the vector of observation weights w is ones(n,1)/n; that is, each observation weight equals 1/n. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
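
To check this definition numerically, you can compare loss against a manual computation with predict (a minimal sketch, assuming an in-memory model Mdl trained on predictors X with responses Y):

yhat = predict(Mdl,X);                   % Predicted responses
w = ones(size(Y))/numel(Y);              % Default uniform weights
mseManual = sum(w.*(yhat - Y).^2)/sum(w) % Matches loss(Mdl,X,Y)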

Epsilon-Insensitive Loss Function

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$$\mathrm{Loss}_{\varepsilon} = \begin{cases} 0, & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise.} \end{cases}$$

The mean epsilon-insensitive loss is calculated as follows:

$$\mathrm{Loss} = \frac{\sum_{j=1}^{n} w_j \max\left(0, \left|y_j - f(x_j)\right| - \varepsilon\right)}{\sum_{j=1}^{n} w_j},$$

where:

  • $n$ is the number of observations.

  • $x_j$ is the jth observation (row of predictor data).

  • $y_j$ is the observed response to $x_j$.

  • $f(x_j)$ is the response prediction of the Gaussian kernel regression model Mdl to $x_j$.

  • $w$ is the vector of observation weights.

By default, the vector of observation weights w is ones(n,1)/n; that is, each observation weight equals 1/n. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
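
A similar manual check applies to the epsilon-insensitive loss, using the trained model's Epsilon property (same assumptions as the sketch above):

yhat = predict(Mdl,X);      % Predicted responses
w = ones(size(Y))/numel(Y); % Default uniform weights
eiManual = sum(w.*max(0,abs(Y - yhat) - Mdl.Epsilon))/sum(w)
% Matches loss(Mdl,X,Y,'LossFun','epsiloninsensitive')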

Extended Capabilities

Tall Arrays
This function supports tall arrays for out-of-memory data, as demonstrated in the first example.

Introduced in R2018a