# loss

Loss of *k*-nearest neighbor classifier

## Description

`L = loss(mdl,tbl,ResponseVarName)` returns a scalar representing how well `mdl` classifies the data in `tbl` when `tbl.ResponseVarName` contains the true classifications. If `tbl` contains the response variable used to train `mdl`, then you do not need to specify `ResponseVarName`.

When computing the loss, the `loss` function normalizes the class probabilities in `tbl.ResponseVarName` to the class probabilities used for training, which are stored in the `Prior` property of `mdl`.

The meaning of the classification loss (`L`) depends on the loss function and weighting scheme, but, in general, better classifiers yield smaller classification loss values. For more details, see Classification Loss.
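The table-based syntax can be sketched as follows. This is a minimal illustration, assuming the Fisher iris data set that ships with the toolbox; the table variable names (`SepalLength`, `Species`, and so on) are chosen here for illustration and are not part of the data set.

```matlab
% Load the Fisher iris data and store it in a table.
% The variable names below are illustrative assumptions.
load fisheriris
tbl = array2table(meas,'VariableNames', ...
    {'SepalLength','SepalWidth','PetalLength','PetalWidth'});
tbl.Species = species;

% Train a 5-nearest-neighbor classifier from the table.
mdl = fitcknn(tbl,'Species','NumNeighbors',5);

% Compute the resubstitution loss, naming the response variable explicitly.
L1 = loss(mdl,tbl,'Species');

% tbl contains the response variable used for training,
% so the response variable name can be omitted.
L2 = loss(mdl,tbl);
```

Both calls evaluate the classifier on the same data it was trained on, so the result is an optimistic estimate of the loss on new data.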

`L = loss(mdl,tbl,Y)` returns a scalar representing how well `mdl` classifies the data in `tbl` when `Y` contains the true classifications.

When computing the loss, the `loss` function normalizes the class probabilities in `Y` to the class probabilities used for training, which are stored in the `Prior` property of `mdl`.

`L = loss(mdl,X,Y)` returns a scalar representing how well `mdl` classifies the data in `X` when `Y` contains the true classifications.

When computing the loss, the `loss` function normalizes the class probabilities in `Y` to the class probabilities used for training, which are stored in the `Prior` property of `mdl`.

`L = loss(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, you can specify the loss function and the classification weights.
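As a short sketch of the name-value syntax, the following compares several built-in loss functions on the same data. It assumes the Fisher iris data set and a 5-nearest-neighbor model; the output variable names are illustrative.

```matlab
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);

% Default loss function ('mincost').
Lmincost = loss(mdl,meas,species);

% Misclassification rate and quadratic loss, selected with 'LossFun'.
Lerr  = loss(mdl,meas,species,'LossFun','classiferror');
Lquad = loss(mdl,meas,species,'LossFun','quadratic');
```

The values are not directly comparable across loss functions; compare the same loss across different models instead.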

## Examples

### Loss Calculation

Create a *k*-nearest neighbor classifier for the Fisher iris data, where *k* = 5.

Load the Fisher iris data set.

`load fisheriris`

Create a classifier for five nearest neighbors.

`mdl = fitcknn(meas,species,'NumNeighbors',5);`

Examine the loss of the classifier for a mean observation classified as `'versicolor'`.

```
X = mean(meas);
Y = {'versicolor'};
L = loss(mdl,X,Y)
```

L = 0

All five nearest neighbors classify as `'versicolor'`.
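A single observation gives only a coarse picture of classifier quality. One common alternative, sketched here under the assumption that a 30% hold-out split is acceptable for this data set, is to compute the loss on observations that were not used for training. The split fraction and the seed are arbitrary choices.

```matlab
load fisheriris
rng(1)                                   % fix the seed so the split is repeatable

% Reserve 30% of the observations for testing (stratified by class).
c = cvpartition(species,'HoldOut',0.3);
idxTrain = training(c);
idxTest  = test(c);

% Train on the training part only, then evaluate on the held-out part.
mdl = fitcknn(meas(idxTrain,:),species(idxTrain),'NumNeighbors',5);
Ltest = loss(mdl,meas(idxTest,:),species(idxTest));
```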

## Input Arguments

`mdl`

— *k*-nearest neighbor classifier model

`ClassificationKNN` object

*k*-nearest neighbor classifier model, specified as a `ClassificationKNN` object.

`tbl`

— Sample data

table

Sample data used to train the model, specified as a table. Each row of `tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `tbl` can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `tbl` contains the response variable used to train `mdl`, then you do not need to specify `ResponseVarName` or `Y`.

If you train `mdl` using sample data contained in a `table`, then the input data for `loss` must also be in a table.

**Data Types:** `table`

`ResponseVarName`

— Response variable name

name of a variable in `tbl`

Response variable name, specified as the name of a variable in `tbl`. If `tbl` contains the response variable used to train `mdl`, then you do not need to specify `ResponseVarName`.

You must specify `ResponseVarName` as a character vector or string scalar. For example, if the response variable is stored as `tbl.response`, then specify it as `'response'`. Otherwise, the software treats all columns of `tbl`, including `tbl.response`, as predictors.

The response variable must be a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.

**Data Types:** `char` | `string`

`X`

— Predictor data

numeric matrix

Predictor data, specified as a numeric matrix. Each row of `X` represents one observation, and each column represents one variable.

**Data Types:** `single` | `double`

`Y`

— Class labels

categorical array | character array | string array | logical vector | numeric vector | cell array of character vectors

Class labels, specified as a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. Each row of `Y` represents the classification of the corresponding row of `X`.

**Data Types:** `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`

### Name-Value Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `loss(mdl,tbl,'response','LossFun','exponential','Weights','w')` returns the weighted exponential loss of `mdl` classifying the data in `tbl`. Here, `tbl.response` is the response variable, and `tbl.w` is the weight variable.

`LossFun`

— Loss function

`'mincost'` (default) | `'binodeviance'` | `'classiferror'` | `'exponential'` | `'hinge'` | `'logit'` | `'quadratic'` | function handle

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

The following table lists the available loss functions.

Value | Description |
---|---|
`'binodeviance'` | Binomial deviance |
`'classiferror'` | Misclassified rate in decimal |
`'exponential'` | Exponential loss |
`'hinge'` | Hinge loss |
`'logit'` | Logistic loss |
`'mincost'` | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
`'quadratic'` | Quadratic loss |

`'mincost'` is appropriate for classification scores that are posterior probabilities. By default, *k*-nearest neighbor models return posterior probabilities as classification scores (see `predict`).

You can specify a function handle for a custom loss function using `@` (for example, `@lossfun`). Let *n* be the number of observations in `X` and *K* be the number of distinct classes (`numel(mdl.ClassNames)`). Your custom loss function must have this form:

`function lossvalue = lossfun(C,S,W,Cost)`

- `C` is an *n*-by-*K* logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in `mdl.ClassNames`. Construct `C` by setting `C(p,q) = 1`, if observation `p` is in class `q`, for each row. Set all other elements of row `p` to `0`.
- `S` is an *n*-by-*K* numeric matrix of classification scores. The column order corresponds to the class order in `mdl.ClassNames`. The argument `S` is a matrix of classification scores, similar to the output of `predict`.
- `W` is an *n*-by-1 numeric vector of observation weights. If you pass `W`, the software normalizes the weights to sum to `1`.
- `Cost` is a *K*-by-*K* numeric matrix of misclassification costs. For example, `Cost = ones(K) - eye(K)` specifies a cost of `0` for correct classification and `1` for misclassification.
- The output argument `lossvalue` is a scalar.
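The custom loss signature described above can be illustrated with a small sketch. The loss used here, the weighted average of one minus the score assigned to the true class, is a hypothetical choice made for illustration and is not one of the built-in loss functions. The example assumes the Fisher iris data set.

```matlab
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);

% Hypothetical custom loss with the required signature (C,S,W,Cost):
%   C    - n-by-K logical class-membership matrix
%   S    - n-by-K matrix of classification scores
%   W    - n-by-1 vector of observation weights
%   Cost - K-by-K misclassification cost matrix (not used by this loss)
% sum(C.*S,2) picks the score of the true class for each observation.
lossfun = @(C,S,W,Cost) sum(W .* (1 - sum(C .* S, 2))) / sum(W);

L = loss(mdl,meas,species,'LossFun',lossfun);
```

Because the scores of a *k*-nearest neighbor model are posterior probabilities, this particular loss lies between 0 and 1, with 0 meaning that every observation receives a score of 1 for its true class.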

For more details on loss functions, see Classification Loss.

**Data Types:** `char` | `string` | `function_handle`

`Weights`

— Observation weights

`ones(size(X,1),1)` (default) | numeric vector | name of a variable in `tbl`

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `tbl`.

If you specify `Weights` as a numeric vector, then the size of `Weights` must be equal to the number of rows in `X` or `tbl`.

If you specify `Weights` as the name of a variable in `tbl`, the name must be a character vector or string scalar. For example, if the weights are stored as `tbl.w`, then specify `Weights` as `'w'`. Otherwise, the software treats all columns of `tbl`, including `tbl.w`, as predictors.

`loss` normalizes the weights so that observation weights in each class sum to the prior probability of that class. When you supply `Weights`, `loss` computes the weighted classification loss.

**Example:** `'Weights','w'`

**Data Types:** `single` | `double` | `char` | `string`
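The following sketch passes a numeric weight vector to `loss` and then reproduces, by hand, the per-class normalization described above. The weight values are arbitrary, and the manual normalization is an illustration of the stated rule rather than the toolbox's internal code.

```matlab
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);
n = size(meas,1);

% Illustrative weights that increase from 1 to 2 across the observations.
w = linspace(1,2,n)';
Lw = loss(mdl,meas,species,'Weights',w);

% Manual sketch of the normalization: within each class, rescale the
% weights so that they sum to the prior probability of that class.
[~,classIdx] = ismember(species,mdl.ClassNames);
wNorm = zeros(n,1);
for k = 1:numel(mdl.ClassNames)
    inClass = (classIdx == k);
    wNorm(inClass) = w(inClass) / sum(w(inClass)) * mdl.Prior(k);
end
```

One consequence of this rule is that multiplying all weights of one class by a constant does not change the result; only the relative weights within a class matter.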

## Algorithms

### Classification Loss

*Classification loss* functions measure the predictive
inaccuracy of classification models. When you compare the same type of loss among many
models, a lower loss indicates a better predictive model.

Consider the following scenario.

- *L* is the weighted average classification loss.
- *n* is the sample size.
- For binary classification:
  - $$y_j$$ is the observed class label. The software codes it as -1 or 1, indicating the negative or positive class (or the first or second class in the `ClassNames` property), respectively.
  - $$f(X_j)$$ is the positive-class classification score for observation (row) *j* of the predictor data *X*.
  - $$m_j = y_j f(X_j)$$ is the classification score for classifying observation *j* into the class corresponding to $$y_j$$. Positive values of $$m_j$$ indicate correct classification and do not contribute much to the average loss. Negative values of $$m_j$$ indicate incorrect classification and contribute significantly to the average loss.
- For algorithms that support multiclass classification (that is, *K* ≥ 3):
  - $$y_j^*$$ is a vector of *K* - 1 zeros, with 1 in the position corresponding to the true, observed class $$y_j$$. For example, if the true class of the second observation is the third class and *K* = 4, then $$y_2^* = [0\ 0\ 1\ 0]'$$. The order of the classes corresponds to the order in the `ClassNames` property of the input model.
  - $$f(X_j)$$ is the length *K* vector of class scores for observation *j* of the predictor data *X*. The order of the scores corresponds to the order of the classes in the `ClassNames` property of the input model.
  - $$m_j = {y_j^*}' f(X_j)$$. Therefore, $$m_j$$ is the scalar classification score that the model predicts for the true, observed class.
- The weight for observation *j* is $$w_j$$. The software normalizes the observation weights so that they sum to the corresponding prior class probability. The software also normalizes the prior probabilities so they sum to 1. Therefore, $$\sum_{j=1}^{n} w_j = 1.$$
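The quantities in this scenario can be traced on a small example. The labels and scores below are made-up values chosen for illustration; no toolbox function is required.

```matlab
% Binary case: labels coded as -1/+1 and hypothetical positive-class scores.
y = [1; -1; 1; -1];
f = [0.8; -0.3; -0.2; 0.6];
m = y .* f;                  % margins; positive entries are correct classifications

% Multiclass case with K = 4: the true class of the observation is class 3.
K = 4;
trueClass = 3;
yStar = zeros(K,1);
yStar(trueClass) = 1;        % indicator vector [0 0 1 0]'
fj = [0.1; 0.2; 0.6; 0.1];   % hypothetical class scores for this observation
mj = yStar' * fj;            % score that the model assigns to the true class
```

In the binary case the first two observations have positive margins (0.8 and 0.3) and the last two have negative margins (-0.2 and -0.6). In the multiclass case `mj` is 0.6, the third entry of `fj`.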

Given this scenario, the following table describes the supported loss functions that you can specify by using the `'LossFun'` name-value pair argument.

Loss Function | Value of `LossFun` | Equation |
---|---|---|
Binomial deviance | `'binodeviance'` | $$L=\sum_{j=1}^{n} w_j \log\left\{1+\exp\left[-2m_j\right]\right\}.$$ |
Misclassified rate in decimal | `'classiferror'` | $$L=\sum_{j=1}^{n} w_j I\left\{\widehat{y}_j \ne y_j\right\}.$$ $$\widehat{y}_j$$ is the class label corresponding to the class with the maximal score, and $$I\{\cdot\}$$ is the indicator function. |
Cross-entropy loss | `'crossentropy'` | The weighted cross-entropy loss is $$L=-\sum_{j=1}^{n}\frac{\tilde{w}_j \log(m_j)}{Kn},$$ where the weights $$\tilde{w}_j$$ are normalized to sum to *n* instead of 1. |
Exponential loss | `'exponential'` | $$L=\sum_{j=1}^{n} w_j \exp\left(-m_j\right).$$ |
Hinge loss | `'hinge'` | $$L=\sum_{j=1}^{n} w_j \max\left\{0,1-m_j\right\}.$$ |
Logit loss | `'logit'` | $$L=\sum_{j=1}^{n} w_j \log\left(1+\exp\left(-m_j\right)\right).$$ |
Minimal expected misclassification cost | `'mincost'` | The software computes the weighted minimal expected classification cost using this procedure for observations *j* = 1, ..., *n*. (1) Estimate the expected misclassification cost of classifying the observation $$X_j$$ into the class *k*: $$\gamma_{jk}=\left(f\left(X_j\right)' C\right)_k.$$ $$f(X_j)$$ is the column vector of class posterior probabilities for binary and multiclass classification for the observation $$X_j$$. *C* is the cost matrix stored in the `Cost` property of the model. (2) For observation *j*, predict the class label corresponding to the minimal expected misclassification cost: $$\widehat{y}_j=\underset{k=1,\ldots,K}{\text{argmin}}\,\gamma_{jk}.$$ (3) Using *C*, identify the cost incurred ($$c_j$$) for making the prediction. The weighted average of the minimal expected misclassification cost loss is $$L=\sum_{j=1}^{n} w_j c_j.$$ If you use the default cost matrix (whose element value is 0 for correct classification and 1 for incorrect classification), then the `'mincost'` loss is equivalent to the `'classiferror'` loss. |
Quadratic loss | `'quadratic'` | $$L=\sum_{j=1}^{n} w_j \left(1-m_j\right)^{2}.$$ |
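The margin-based formulas in the table can be evaluated directly. The sketch below uses made-up binary margins and equal weights that sum to 1. The misclassification rate line assumes that the predicted class is the sign of the score, so that an observation is misclassified exactly when its margin is negative.

```matlab
m = [0.8; 0.3; -0.2; -0.6];     % hypothetical margins for four observations
w = ones(4,1) / 4;              % equal weights that sum to 1

Lbinodeviance = sum(w .* log(1 + exp(-2*m)));
Lexponential  = sum(w .* exp(-m));
Lhinge        = sum(w .* max(0, 1 - m));
Llogit        = sum(w .* log(1 + exp(-m)));
Lquadratic    = sum(w .* (1 - m).^2);

% Assumption: binary problem where a negative margin means misclassification.
Lclassiferror = sum(w .* (m < 0));
```

For these values the hinge loss is 0.925, the quadratic loss is 1.1325, and the misclassification rate is 0.5, because two of the four margins are negative.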

This figure compares the loss functions (except `'crossentropy'` and `'mincost'`) over the score *m* for one observation. Some functions are normalized to pass through the point (0,1).

### True Misclassification Cost

Two costs are associated with KNN classification: the true misclassification cost per class and the expected misclassification cost per observation.

You can set the true misclassification cost per class by using the `'Cost'` name-value pair argument when you run `fitcknn`. The value `Cost(i,j)` is the cost of classifying an observation into class `j` if its true class is `i`. By default, `Cost(i,j) = 1` if `i ~= j`, and `Cost(i,j) = 0` if `i = j`. In other words, the cost is `0` for correct classification and `1` for incorrect classification.
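As a brief sketch, a non-default cost matrix can be supplied at training time. The cost values below are hypothetical, and the class order is fixed explicitly with `'ClassNames'` so that the rows and columns of the matrix are unambiguous.

```matlab
load fisheriris

% Hypothetical cost matrix: rows are true classes, columns are predicted
% classes. Classifying a true 'virginica' as 'versicolor' costs 5.
C = [0 1 1;
     1 0 1;
     1 5 0];

mdl = fitcknn(meas,species,'NumNeighbors',5, ...
    'ClassNames',{'setosa','versicolor','virginica'},'Cost',C);

% The default 'mincost' loss now reflects the custom costs.
L = loss(mdl,meas,species);
```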

### Expected Cost

Two costs are associated with KNN classification: the true misclassification cost per class and the expected misclassification cost per observation. The third output of `predict` is the expected misclassification cost per observation.

Suppose you have `Nobs` observations that you want to classify with a trained classifier `mdl`, and you have `K` classes. You place the observations into a matrix `Xnew` with one observation per row. The command

`[label,score,cost] = predict(mdl,Xnew)`

returns a matrix `cost` of size `Nobs`-by-`K`, among other outputs. Each row of the `cost` matrix contains the expected (average) cost of classifying the observation into each of the `K` classes. `cost(n,j)` is

$$\sum_{i=1}^{K}\widehat{P}\left(i|Xnew(n)\right)C\left(j|i\right),$$

where

- *K* is the number of classes.
- $$\widehat{P}\left(i|Xnew(n)\right)$$ is the posterior probability of class *i* for observation *Xnew*(*n*).
- $$C\left(j|i\right)$$ is the true misclassification cost of classifying an observation as *j* when its true class is *i*.
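The expected-cost formula, and the way the `'mincost'` loss uses it, can be traced by hand. The posterior probabilities and true classes below are made-up values; with a trained model, the `score` output of `predict` would play the role of `P`. The default 0/1 cost matrix is assumed.

```matlab
% Hypothetical posterior probabilities: one row per observation, K = 3 classes.
P = [0.8 0.2 0.0;
     0.2 0.2 0.6;
     0.0 0.6 0.4];
C = ones(3) - eye(3);          % default cost: 0 if correct, 1 otherwise
yTrue = [1; 3; 3];             % true class indices (illustrative)
w = ones(3,1) / 3;             % equal weights that sum to 1

% Expected cost of assigning observation n to class j:
% sum over i of P(n,i) * C(i,j), which is the matrix product P*C.
expectedCost = P * C;

% Predict the class with the minimal expected cost.
[~,yHat] = min(expectedCost,[],2);

% Cost actually incurred for each prediction, then the weighted average.
c = C(sub2ind(size(C),yTrue,yHat));
L = sum(w .* c);
```

Here the predicted classes are 1, 3, and 2, so only the third observation is misclassified and the loss is 1/3, which matches the misclassification rate as expected for the default cost matrix.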

## Extended Capabilities

### Tall Arrays

Calculate with arrays that have more rows than fit in memory.

This function fully supports tall arrays. For more information, see Tall Arrays.

### GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Usage notes and limitations:

`loss` does not support GPU arrays for `ClassificationKNN` models with the following specifications:

- The `'NSMethod'` property is specified as `'kdtree'`.
- The `'Distance'` property is specified as a function handle.
- The `'IncludeTies'` property is specified as `true`.

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

## See Also

`ClassificationKNN` | `fitcknn` | `edge` | `margin`

**Introduced in R2012a**
