classperf
Evaluate classifier performance
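Syntax
classperf
cp = classperf(groundTruth)
cp = classperf(groundTruth,classifierOutput)
classperf(cp,classifierOutput)
classperf(cp,classifierOutput,testIdx)
classperf(___,Name,Value)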
Description
classperf without input arguments displays the properties of a classperformance object. For more information, see classperformance Properties.
cp = classperf(groundTruth) creates an empty classperformance object cp using the true labels groundTruth for every observation in your data set.
cp = classperf(groundTruth,classifierOutput) creates a classperformance object cp using the true labels groundTruth, and then updates the object properties based on the results of the classifier classifierOutput. Use this syntax when you want to know the classifier performance on a single validation run.
classperf(cp,classifierOutput) updates the classperformance object cp with the results of a classifier classifierOutput. Use this syntax to update the performance of the classifier iteratively, such as inside a for loop for multiple cross-validation runs.
classperf(cp,classifierOutput,testIdx) uses testIdx to compare the results of the classifier to the true labels and update the object cp. testIdx represents a subset of the true labels (ground truth) in the current validation.
classperf(___,Name,Value) specifies additional options with one or more Name,Value pair arguments. Specify these options after all other input arguments.
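The difference between the single-run syntax and the iterative syntax can be sketched with a few made-up labels. The vectors and variable names below (predicted, cpSingle, cpIter) are illustrative assumptions, not data from this page.

```matlab
% Hypothetical toy labels: six observations, three classes.
groundTruth = [1 1 2 2 2 3]';
predicted   = [1 2 2 2 2 3]';   % one mistake, at observation 2

% Single validation run: create the object and update it in one call.
cpSingle = classperf(groundTruth, predicted);
disp(cpSingle.ErrorRate)        % incorrectly classified / classified

% Iterative use: create an empty object, then update it once per run.
cpIter = classperf(groundTruth);
for run = 1:3
    classperf(cpIter, predicted);   % updates cpIter in place
end
disp(cpIter.ValidationCounter)      % number of updates so far
disp(cpIter.ErrorRate)              % accumulated over all runs
```

Because every run here reuses the same predictions, the accumulated error rate matches the single-run value. In a real cross-validation loop the predictions differ from run to run.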
Examples
Perform 10-Fold Cross-Validation
Create indices for the 10-fold cross-validation and classify measurement data for the Fisher iris data set. The Fisher iris data set contains width and length measurements of petals and sepals from three species of irises.
Load the data set.
load fisheriris
Create indices for the 10-fold cross-validation.
indices = crossvalind('Kfold',species,10);
Initialize an object to measure the performance of the classifier.
cp = classperf(species);
Perform the classification using the measurement data and report the error rate, which is the number of incorrectly classified samples divided by the total number of classified samples.
for i = 1:10
    test = (indices == i);
    train = ~test;
    class = classify(meas(test,:),meas(train,:),species(train,:));
    classperf(cp,class,test);
end
cp.ErrorRate
ans = 0.0200
Suppose you want to use the observation data from the setosa and virginica species only and exclude the versicolor species from cross-validation.
labels = {'setosa','virginica'};
indices = crossvalind('Kfold',species,10,'Classes',labels);
indices now contains zeros for the rows that belong to the versicolor species.
Perform the classification again.
for i = 1:10
    test = (indices == i);
    train = ~test;
    class = classify(meas(test,:),meas(train,:),species(train,:));
    classperf(cp,class,test);
end
cp.ErrorRate
ans = 0.0160
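As a rough check on the error rate definition used in this example, the following sketch counts misclassifications by hand during a fresh 10-fold run and compares the result with the ErrorRate property. The names foldIdx, cpCheck, numWrong, and numClassified are introduced here for illustration, and the call to rng only makes the fold assignment repeatable.

```matlab
load fisheriris
rng(0);                                  % repeatable fold assignment
foldIdx = crossvalind('Kfold', species, 10);
cpCheck = classperf(species);            % separate object, leaves cp untouched

numWrong = 0;
numClassified = 0;
for k = 1:10
    testMask  = (foldIdx == k);
    trainMask = ~testMask;
    predicted = classify(meas(testMask,:), meas(trainMask,:), species(trainMask));
    classperf(cpCheck, predicted, testMask);
    % Count mismatches between predicted and true labels for this fold.
    numWrong = numWrong + sum(~strcmp(predicted, species(testMask)));
    numClassified = numClassified + numel(predicted);
end

manualErrorRate = numWrong / numClassified;
disp([manualErrorRate, cpCheck.ErrorRate])   % the two values should agree
```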
Classify Fisher Iris Data Using K-Nearest Neighbor
Load the data set.
load fisheriris
X = meas;
Y = species;
X is a numeric matrix that contains four measurements (sepal length, sepal width, petal length, and petal width) for 150 irises. Y contains the true class names (labels) of the corresponding iris species.
Initialize the classperformance object using the true labels.
cp = classperf(Y)
cp =
  classperformance with properties:
    ClassLabels: {3x1 cell}
    GroundTruth: [150x1 double]
    NumberOfObservations: 150
    ValidationCounter: 0
    SampleDistribution: [150x1 double]
    ErrorDistribution: [150x1 double]
    SampleDistributionByClass: [3x1 double]
    ErrorDistributionByClass: [3x1 double]
    CountingMatrix: [4x3 double]
    CorrectRate: NaN
    ErrorRate: NaN
    LastCorrectRate: 0
    LastErrorRate: 0
    InconclusiveRate: NaN
    ClassifiedRate: NaN
    Sensitivity: NaN
    Specificity: NaN
    PositivePredictiveValue: NaN
    NegativePredictiveValue: NaN
    PositiveLikelihood: NaN
    NegativeLikelihood: NaN
    Prevalence: NaN
    DiagnosticTable: [2x2 double]
    Label: ''
    Description: ''
    ControlClasses: [2x1 double]
    TargetClasses: 1
Perform the classification using the k-nearest neighbor classifier. Cross-validate the model 10 times by using 145 samples as the training set and 5 samples as the test set. After each cross-validation run, update the classifier performance object with the results.
for i = 1:10
    [train,test] = crossvalind('LeaveMOut',Y,5);
    mdl = fitcknn(X(train,:),Y(train),'NumNeighbors',3);
    predictions = predict(mdl,X(test,:));
    classperf(cp,predictions,test);
end
Report the classification error rate, which is the number of incorrectly classified samples divided by the total number of classified samples.
cp.ErrorRate
ans = 0.0467
Input Arguments
groundTruth — True labels
vector of integers | logical vector | string vector | cell array of character vectors
True labels for all observations in your data set, specified as a vector of integers, logical vector, string vector, or cell array of character vectors.
classifierOutput — Classification results
vector of integers | logical vector | string vector | cell array of character vectors
Classification results from a classifier, specified as a vector of integers, logical vector, string vector, or cell array of character vectors. When classifierOutput is a cell array of character vectors or string vector, an empty character vector or string represents an inconclusive result. For a vector of integers, NaN represents an inconclusive result.
- If you do not specify testIdx, classifierOutput must be the same size and data type as groundTruth.
- If you specify testIdx as a vector of integers, classifierOutput must have the same number of elements as testIdx.
- If testIdx is a logical vector, the number of elements in classifierOutput must equal sum(testIdx).
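The handling of inconclusive results can be sketched with a small cell array in which one prediction is left empty. The labels and the name cpInc are made up for illustration. The rates shown rely on the classperformance property definitions (InconclusiveRate is the share of nonclassified samples among all samples, and ClassifiedRate is the share of classified samples).

```matlab
% Hypothetical toy data: four observations, two classes.
groundTruth = {'a'; 'a'; 'b'; 'b'};
predicted   = {'a'; '';  'b'; 'a'};   % '' marks observation 2 as inconclusive

% No testIdx is given, so predicted matches groundTruth in size and type.
cpInc = classperf(groundTruth, predicted);

fprintf('Inconclusive rate: %.2f\n', cpInc.InconclusiveRate);
fprintf('Classified rate:   %.2f\n', cpInc.ClassifiedRate);
```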
cp — Classifier performance information
classperformance object
Classifier performance information, specified as a classperformance object. For details, see classperformance Properties.
testIdx — Subset of true labels
vector of integers | logical vector
Subset of true labels (groundTruth), specified as a vector of integers or logical vector. The testIdx argument indicates a subset of true labels (from a test set). The function uses testIdx as an index vector to get a subset of labels from groundTruth, such as groundTruth(testIdx).
- If testIdx is a logical vector, its length must equal the total number of observations (cp.NumberOfObservations).
- If testIdx is a vector of integers, it cannot contain duplicate integers, and each integer must be greater than 0 but less than or equal to the total number of observations.
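The two forms of testIdx can be compared with a minimal sketch. The data and the names testMask, testPos, cpMask, and cpPos are illustrative assumptions. The sketch updates two separate objects with the same predictions, once through a logical mask and once through the equivalent integer positions.

```matlab
% Hypothetical toy data: six observations, two classes.
groundTruth = [1 1 1 2 2 2]';

% The same test set expressed two ways.
testMask = logical([0 1 0 0 1 1])';   % length equals numel(groundTruth)
testPos  = find(testMask);            % [2; 5; 6], unique and within 1..6

% Predictions for the three test observations, in index order.
predicted = [1 2 1]';                 % numel(predicted) equals sum(testMask)

cpMask = classperf(groundTruth);
classperf(cpMask, predicted, testMask);

cpPos = classperf(groundTruth);
classperf(cpPos, predicted, testPos);

% Both updates compare predicted against groundTruth([2 5 6]).
sameResult = isequal(cpMask.ErrorRate, cpPos.ErrorRate);
disp(sameResult)
```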
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: cp = classperf(groundTruth,classifierOutput,'Positive',[1 2 3]) specifies the labels for the target (diseased) classes.
Positive — Labels for target classes
vector of integers | logical vector | string vector | cell array of character vectors
Labels for the target classes, specified as the comma-separated pair consisting of 'Positive' and a vector of integers, logical vector, string vector, or cell array of character vectors.
- If groundTruth is a vector of integers, the positive label and negative label (specified by the 'Negative' name-value pair argument) must be vectors of integers.
- If groundTruth is a string vector or cell array of character vectors, the positive label and negative label can be string vectors, cell arrays of character vectors, or vectors of positive integers. The entries must be a subset of grp2idx(groundTruth).
By default, the positive label corresponds to the first class returned by grp2idx(groundTruth) and the negative label corresponds to all other classes.
The function uses the positive label to set the TargetClasses property of the cp object.
The positive and negative labels are disjoint subsets of unique(groundTruth). For example, suppose you have a data set that contains data from six patients. Five patients have ovarian, lung, prostate, skin, or brain cancer, and one patient does not have cancer. Then ClassLabels = {'Ovarian', 'Lung', 'Prostate', 'Skin', 'Brain', 'Healthy'}. You can test a classifier for lung cancer only by setting the positive label to [2] and the negative label to [1 3 4 5 6]. Alternatively, you can test for any type of cancer by setting the positive label to [1 2 3 4 5] and the negative label to [6].
In clinical tests, the function counts inconclusive values (empty character vector '' or NaN) as false negatives to calculate the specificity and as false positives to calculate the sensitivity. The function does not count any tested observation whose true class is not within the union of the positive and negative labels. However, if the true class of a tested observation is within the union but its predicted class is not covered by groundTruth, the function counts that observation as inconclusive.
Example: 'Positive',[1 2]
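A reduced version of the cancer scenario can be sketched as follows. The three-class labels, the predictions, and the name cpLung are hypothetical. The sketch assumes that grp2idx numbers the classes of a cell array in order of first appearance, so that Lung is class 2.

```matlab
% Hypothetical labels; first-appearance order gives
% Ovarian = 1, Lung = 2, Healthy = 3.
groundTruth = {'Ovarian'; 'Ovarian'; 'Lung'; 'Lung'; 'Healthy'; 'Healthy'};
predicted   = {'Ovarian'; 'Lung';    'Lung'; 'Lung'; 'Healthy'; 'Ovarian'};

% Test for lung cancer only: Lung is the target, the other classes are controls.
cpLung = classperf(groundTruth, predicted, ...
    'Positive', {'Lung'}, 'Negative', {'Ovarian', 'Healthy'});

disp(cpLung.TargetClasses)     % index of the positive class
disp(cpLung.ControlClasses)    % indices of the negative classes
disp(cpLung.Sensitivity)       % share of true Lung samples predicted as Lung
disp(cpLung.Specificity)       % share of true control samples predicted as a control class
```

Passing the labels as names rather than indices avoids depending on the class order, which is a reasonable choice when groundTruth is a cell array or string vector.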
Negative — Labels for control classes
vector of integers | logical vector | string vector | cell array of character vectors
Labels for the control classes, specified as the comma-separated pair consisting of 'Negative' and a vector of integers, logical vector, string vector, or cell array of character vectors.
- If groundTruth is a vector of integers, the positive label and negative label (specified by the 'Negative' name-value pair argument) must be vectors of integers.
- If groundTruth is a string vector or cell array of character vectors, the positive label and negative label can be string vectors, cell arrays of character vectors, or vectors of positive integers. The entries must be a subset of grp2idx(groundTruth).
By default, the positive label corresponds to the first class returned by grp2idx(groundTruth) and the negative label corresponds to all other classes.
The function uses the negative label to set the ControlClasses property of the cp object.
For details on how the function uses the positive and negative labels, see Positive.
Example: 'Negative',[3]
Version History
Introduced before R2006a
See Also
classperformance Properties | crossvalind | classify | grp2idx