# predict

Predict responses for new observations from a naive Bayes classification model for incremental learning

## Syntax

```
label = predict(Mdl,X)
[label,Posterior,Cost] = predict(Mdl,X)
```

## Description


`label = predict(Mdl,X)` returns the predicted responses (or labels) `label` of the observations in the predictor data `X` from the naive Bayes classification model for incremental learning `Mdl`.


`[label,Posterior,Cost] = predict(Mdl,X)` also returns the posterior probabilities (`Posterior`) and predicted (expected) misclassification costs (`Cost`) corresponding to the observations (rows) in `X`. For each observation in `X`, the predicted class label corresponds to the minimum expected classification cost among all classes.

## Examples


Load the human activity data set.

`load humanactivity`

For details on the data set, enter `Description` at the command line.

Fit a naive Bayes classification model to the entire data set.

```
TTMdl = fitcnb(feat,actid)
```

```
TTMdl = 
  ClassificationNaiveBayes

              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
           NumObservations: 24075
         DistributionNames: {1×60 cell}
    DistributionParameters: {5×60 cell}

  Properties, Methods
```

`TTMdl` is a `ClassificationNaiveBayes` model object representing a traditionally trained model.

Convert the traditionally trained model to a naive Bayes classification model for incremental learning.

```
IncrementalMdl = incrementalLearner(TTMdl)
```

```
IncrementalMdl = 
  incrementalClassificationNaiveBayes

                    IsWarm: 1
                   Metrics: [1×2 table]
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
         DistributionNames: {1×60 cell}
    DistributionParameters: {5×60 cell}

  Properties, Methods
```

`IncrementalMdl` is an `incrementalClassificationNaiveBayes` model object prepared for incremental learning.

• The `incrementalLearner` function initializes the incremental learner by passing the learned conditional predictor distribution parameters to it, along with other information that `TTMdl` learned from the training data.

• `IncrementalMdl` is warm (`IsWarm` is `1`), which means that incremental learning functions can start tracking performance metrics.

An incremental learner created from converting a traditionally trained model can generate predictions without further processing.

Predict class labels for all observations using both models.

```
ttlabels = predict(TTMdl,feat);
illabels = predict(IncrementalMdl,feat);
sameLabels = sum(ttlabels ~= illabels) == 0
```

```
sameLabels = logical
   1
```

Both models predict the same labels for each observation.

Load the human activity data set. Randomly shuffle the data.

```
load humanactivity
n = numel(actid);
rng(10); % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);
```

For details on the data set, enter `Description` at the command line.

Create a naive Bayes classification model for incremental learning; specify the class names. Prepare it for `predict` by fitting the model to the first 10 observations.

```
Mdl = incrementalClassificationNaiveBayes('ClassNames',unique(Y));
initobs = 10;
Mdl = fit(Mdl,X(1:initobs,:),Y(1:initobs));
canPredict = size(Mdl.DistributionParameters,1) == numel(Mdl.ClassNames)
```

```
canPredict = logical
   1
```

`Mdl` is an `incrementalClassificationNaiveBayes` model. All its properties are read-only. The model is configured to generate predictions.

Simulate a data stream, and perform the following actions on each incoming chunk of 100 observations.

1. Call `predict` to compute class posterior probabilities for each observation in the incoming chunk of data.

2. Incrementally measure how well the model predicts whether a subject is dancing (`Y` is 5). For each observation in the chunk, compute the difference between the posterior probability of class 5 and the maximum posterior probability among the other classes, pass these scores to `perfcurve`, and record the AUC of the resulting ROC curve.

3. Call `fit` to fit the model to the incoming chunk. Overwrite the previous incremental model with a new one fitted to the incoming observations.

```
numObsPerChunk = 100;
nchunk = floor((n - initobs)/numObsPerChunk) - 1;
Posterior = zeros(nchunk*numObsPerChunk,numel(Mdl.ClassNames));
auc = zeros(nchunk,1);
classauc = 5;

% Incremental learning
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1 + initobs);
    iend   = min(n,numObsPerChunk*j + initobs);
    idx = ibegin:iend;
    [~,Posterior(idx,:)] = predict(Mdl,X(idx,:));
    diffscore = Posterior(idx,classauc) - max(Posterior(idx,setdiff(Mdl.ClassNames,classauc)),[],2);
    [~,~,~,auc(j)] = perfcurve(Y(idx),diffscore,Mdl.ClassNames(classauc));
    Mdl = fit(Mdl,X(idx,:),Y(idx));
end
```

`Mdl` is an `incrementalClassificationNaiveBayes` model object trained on all the data in the stream.

Plot the AUC on the incoming chunks of data.

```
plot(auc)
ylabel('AUC')
xlabel('Iteration')
```

The AUC suggests that the classifier correctly predicts dancing subjects well during incremental learning.

## Input Arguments


Naive Bayes classification model for incremental learning, specified as an `incrementalClassificationNaiveBayes` model object. You can create `Mdl` directly or by converting a supported, traditionally trained machine learning model using the `incrementalLearner` function. For more details, see `incrementalClassificationNaiveBayes`.

You must configure `Mdl` to predict labels for a batch of observations.

• If `Mdl` is a converted, traditionally trained model, you can predict labels without any modifications.

• Otherwise, `Mdl.DistributionParameters` must be a cell matrix with `Mdl.NumPredictors` > 0 columns and at least one row, where each row corresponds to a class name in `Mdl.ClassNames`.

Batch of predictor data for which to predict labels, specified as an n-by-`Mdl.NumPredictors` floating-point matrix.

Each row of `X` corresponds to one observation, and each column corresponds to one predictor variable.

Note

`predict` supports only floating-point input predictor data. If the input model `Mdl` represents a converted, traditionally trained model fit to categorical data, use `dummyvar` to convert each categorical variable to a numeric matrix of dummy variables, and concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.

Data Types: `single` | `double`
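As a sketch of the conversion the note above describes (the variable names and values here are hypothetical, not part of the reference page):

```matlab
% Hypothetical example: convert a categorical predictor to dummy
% variables and concatenate with the numeric predictors before predict.
color = categorical(["red";"blue";"red";"green"]); % categorical predictor
numericX = [1.5 2.0; 0.3 1.1; 2.2 0.7; 1.0 1.9];   % other numeric predictors

D = dummyvar(color);       % one column of 0s and 1s per category level
Xfull = [numericX D];      % floating-point matrix suitable for predict
% labels = predict(Mdl,Xfull) % Mdl must be trained on this same layout
```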

## Output Arguments


Predicted responses (or labels), returned as a categorical or character array; floating-point, logical, or string vector; or cell array of character vectors with n rows. n is the number of observations in `X`, and `label(j)` is the predicted response for observation `j`.

`label` has the same data type as the class names stored in `Mdl.ClassNames`. (The software treats string arrays as cell arrays of character vectors.)

Class posterior probabilities, returned as an n-by-`numel(Mdl.ClassNames)` floating-point matrix. `Posterior(j,k)` is the posterior probability that observation `j` is in class `k`. `Mdl.ClassNames` specifies the order of the classes.

Expected misclassification costs, returned as an n-by-`numel(Mdl.ClassNames)` floating-point matrix.

`Cost(j,k)` is the expected misclassification cost of the observation in row `j` of `X` predicted into class `k` (`Mdl.ClassNames(k)`).

## More About

### Misclassification Cost

A misclassification cost is the relative severity of a classifier labeling an observation into the wrong class.

There are two types of misclassification costs: true and expected. Let K be the number of classes.

• True misclassification cost — A K-by-K matrix, where element (i,j) indicates the misclassification cost of predicting an observation into class j if its true class is i. The software stores the misclassification cost in the property `Mdl.Cost`, and uses it in computations. By default, `Mdl.Cost(i,j)` = 1 if `i` ≠ `j`, and `Mdl.Cost(i,j)` = 0 if `i` = `j`. In other words, the cost is `0` for correct classification and `1` for any incorrect classification.

• Expected misclassification cost — A K-dimensional vector, where element k is the weighted average misclassification cost of classifying an observation into class k, weighted by the class posterior probabilities:

$$c_k = \sum_{j=1}^{K} \hat{P}\left(Y = j \mid x_1,\ldots,x_P\right)\mathrm{Cost}_{jk}.$$

In other words, the software classifies observations into the class corresponding to the lowest expected misclassification cost.
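As an illustration of this formula (a minimal sketch with hypothetical posterior values, not part of the reference page), the expected costs for a three-class problem with the default cost matrix can be computed as:

```matlab
% Hypothetical three-class example of expected misclassification cost.
posterior = [0.7 0.2 0.1];     % P(Y = j | x) for j = 1..K
Cost = ones(3) - eye(3);       % default: 0 on diagonal, 1 elsewhere

expectedCost = posterior*Cost; % c_k = sum_j P(Y = j | x)*Cost(j,k)
[~,khat] = min(expectedCost);  % predicted class: minimum expected cost
% expectedCost = [0.3 0.8 0.9], so khat = 1
```

With the default 0-1 cost matrix, minimizing expected cost is equivalent to choosing the class with the maximum posterior probability.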

### Posterior Probability

The posterior probability is the probability that an observation belongs in a particular class, given the data.

For naive Bayes, the posterior probability that the class is k for a given observation (x1,...,xP) is

$$\hat{P}\left(Y = k \mid x_1,\ldots,x_P\right) = \frac{P\left(X_1,\ldots,X_P \mid y = k\right)\pi\left(Y = k\right)}{P\left(X_1,\ldots,X_P\right)},$$

where:

• $P\left(X_1,\ldots,X_P \mid y = k\right)$ is the conditional joint density of the predictors given they are in class k. `Mdl.DistributionNames` stores the distribution names of the predictors.

• π(Y = k) is the class prior probability distribution. `Mdl.Prior` stores the prior distribution.

• $P\left(X_1,\ldots,X_P\right)$ is the joint density of the predictors. The classes are discrete, so $P\left(X_1,\ldots,X_P\right) = \sum_{k=1}^{K} P\left(X_1,\ldots,X_P \mid y = k\right)\pi\left(Y = k\right).$
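A minimal numeric sketch of this computation (the prior and conditional density values below are hypothetical, chosen only to illustrate the normalization):

```matlab
% Hypothetical two-class illustration of Bayes' rule for the posterior.
prior = [0.6 0.4];          % pi(Y = k) for each class
condDensity = [0.05 0.20];  % assumed P(x1,...,xP | y = k) at one observation

joint = condDensity.*prior; % numerator of Bayes' rule for each class
posterior = joint/sum(joint); % divide by P(x1,...,xP) = sum of joints
% joint = [0.03 0.08], so posterior = [3/11 8/11] ≈ [0.2727 0.7273]
```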

## Version History

Introduced in R2021a