k-nearest neighbor classifier template

t = templateKNN() returns a k-nearest neighbor (KNN) learner template suitable for training ensembles or error-correcting output code (ECOC) multiclass models.
If you specify a default template, then the software uses default values for all input arguments during training.
Specify t as a learner in fitcensemble or fitcecoc.
t = templateKNN(Name,Value) creates a template with additional options specified by one or more name-value pair arguments.
For example, you can specify the nearest neighbor search method, the number of nearest neighbors to find, or the distance metric.
If you display t in the Command Window, then all options appear empty ([]), except those that you specify using name-value pair arguments. During training, the software uses default values for empty options.
Create a nondefault k-nearest neighbor template for use in fitcensemble.
Load Fisher's iris data set.
load fisheriris
Create a template for a 5-nearest neighbor search, and specify to standardize the predictors.
t = templateKNN('NumNeighbors',5,'Standardize',1)
t = 
  Fit template for classification KNN.

       NumNeighbors: 5
           NSMethod: ''
           Distance: ''
         BucketSize: ''
        IncludeTies: []
     DistanceWeight: []
          BreakTies: []
           Exponent: []
                Cov: []
              Scale: []
    StandardizeData: 1
            Version: 1
             Method: 'KNN'
               Type: 'classification'
All properties of the template object are empty except for NumNeighbors, Method, StandardizeData, and Type. When you specify t as a learner, the software fills in the empty properties with their respective default values.
Specify t as a weak learner for a classification ensemble.
Mdl = fitcensemble(meas,species,'Method','Subspace','Learners',t);
Display the in-sample (resubstitution) misclassification error.
L = resubLoss(Mdl)
L = 0.0600
Create a nondefault k-nearest neighbor template for use in fitcecoc.
Load Fisher's iris data set.
load fisheriris
Create a template for a 5-nearest neighbor search, and specify to standardize the predictors.
t = templateKNN('NumNeighbors',5,'Standardize',1)
t = 
  Fit template for classification KNN.

       NumNeighbors: 5
           NSMethod: ''
           Distance: ''
         BucketSize: ''
        IncludeTies: []
     DistanceWeight: []
          BreakTies: []
           Exponent: []
                Cov: []
              Scale: []
    StandardizeData: 1
            Version: 1
             Method: 'KNN'
               Type: 'classification'
All properties of the template object are empty except for NumNeighbors, Method, StandardizeData, and Type. When you specify t as a learner, the software fills in the empty properties with their respective default values.
Specify t as a binary learner for an ECOC multiclass model.
Mdl = fitcecoc(meas,species,'Learners',t);
By default, the software trains Mdl using the one-versus-one coding design.
Display the in-sample (resubstitution) misclassification error.
L = resubLoss(Mdl,'LossFun','classiferror')
L = 0.0467
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'NumNeighbors',4,'Distance','minkowski' specifies a 4-nearest neighbor classifier template using the Minkowski distance measure.

'BreakTies' - Tie-breaking algorithm
'smallest' (default) | 'nearest' | 'random'
Tie-breaking algorithm used by the predict method if multiple classes have the same smallest cost, specified as the comma-separated pair consisting of 'BreakTies' and one of the following:
'smallest' - Use the smallest index among tied groups.
'nearest' - Use the class with the nearest neighbor among tied groups.
'random' - Use a random tiebreaker among tied groups.
By default, ties occur when multiple classes have the same number of nearest points among the K nearest neighbors.
Example: 'BreakTies','nearest'
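The three tie-breaking rules can be illustrated on a toy vote. This is a minimal sketch and not the toolbox implementation: the variable names and data are hypothetical, and it assumes that "smallest index" refers to the class index.

```matlab
% Toy data (hypothetical): K = 4 neighbors of one query point,
% listed in order of increasing distance, with class indices 1..3.
neighborClass = [2 1 1 2];           % class index of each neighbor
neighborDist  = [0.1 0.2 0.3 0.4];   % distance of each neighbor

% Count the votes per class and find the classes that tie for the most votes.
counts = accumarray(neighborClass(:), 1, [3 1]);
tied   = find(counts == max(counts));

% 'smallest': pick the smallest class index among the tied classes.
pickSmallest = min(tied);

% 'nearest': pick the tied class that owns the closest neighbor.
d = neighborDist;
d(~ismember(neighborClass, tied)) = Inf;   % ignore neighbors of non-tied classes
[~, iMin] = min(d);
pickNearest = neighborClass(iMin);

% 'random': pick one of the tied classes at random.
pickRandom = tied(randi(numel(tied)));
```

Here classes 1 and 2 each receive two votes, so the 'smallest' rule selects class 1, while the 'nearest' rule selects class 2 because the closest neighbor belongs to class 2.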

'BucketSize' - Maximum data points in node
50 (default) | positive integer value
Maximum number of data points in the leaf node of the kd-tree, specified as the comma-separated pair consisting of 'BucketSize' and a positive integer value. This argument is meaningful only when NSMethod is 'kdtree'.
Example: 'BucketSize',40
Data Types: single | double

'Cov' - Covariance matrix
nancov(X) (default) | positive definite matrix of scalar values
Covariance matrix, specified as the comma-separated pair consisting of 'Cov' and a positive definite matrix of scalar values representing the covariance matrix when computing the Mahalanobis distance. This argument is only valid when 'Distance' is 'mahalanobis'.
You cannot simultaneously specify 'Standardize' and either of 'Scale' or 'Cov'.
Data Types: single | double
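To see what role the covariance matrix plays, the following sketch computes a Mahalanobis distance by hand with base MATLAB functions. The data and variable names are hypothetical, and cov is used in place of nancov on the assumption that the data contain no NaN values.

```matlab
% Hypothetical predictor data: 5 observations, 2 predictors.
X = [2 0; 0 2; -2 0; 0 -2; 1 1];
C = cov(X);                      % sample covariance (assumed stand-in for nancov(X))

q     = [1 0];                   % query point
delta = X(1,:) - q;              % coordinate differences to the first observation

% Mahalanobis distance: sqrt(delta * inv(C) * delta').
dM = sqrt(delta / C * delta');

% With the identity matrix as covariance, the result reduces to the Euclidean distance.
dI = sqrt(delta / eye(2) * delta');
```

A matrix such as C could then be supplied through 'Cov' together with 'Distance','mahalanobis'.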

'Distance' - Distance metric
'cityblock' | 'chebychev' | 'correlation' | 'cosine' | 'euclidean' | 'hamming' | function handle | ...
Distance metric, specified as the comma-separated pair consisting of 'Distance' and a valid distance metric name or function handle. The allowable distance metric names depend on your choice of a neighbor-searcher method (see NSMethod).

NSMethod | Distance Metric Names
exhaustive | Any distance metric of ExhaustiveSearcher
kdtree | 'cityblock', 'chebychev', 'euclidean', or 'minkowski'

This table includes valid distance metrics of ExhaustiveSearcher.

Distance Metric Names | Description
'cityblock' | City block distance.
'chebychev' | Chebychev distance (maximum coordinate difference).
'correlation' | One minus the sample linear correlation between observations (treated as sequences of values).
'cosine' | One minus the cosine of the included angle between observations (treated as vectors).
'euclidean' | Euclidean distance.
'hamming' | Hamming distance, percentage of coordinates that differ.
'jaccard' | One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ.
'mahalanobis' | Mahalanobis distance, computed using a positive definite covariance matrix C. The default value of C is the sample covariance matrix of X, as computed by nancov(X). To specify a different value for C, use the 'Cov' name-value pair argument.
'minkowski' | Minkowski distance. The default exponent is 2. To specify a different exponent, use the 'Exponent' name-value pair argument.
'seuclidean' | Standardized Euclidean distance. Each coordinate difference between X and a query point is scaled, meaning divided by a scale value S. The default value of S is the standard deviation computed from X, S = nanstd(X). To specify another value for S, use the 'Scale' name-value pair argument.
'spearman' | One minus the sample Spearman's rank correlation between observations (treated as sequences of values).
@distfun | Distance function handle.
    function D2 = distfun(ZI,ZJ)
    % calculation of distance
    ...

If you specify CategoricalPredictors as 'all', then the default distance metric is 'hamming'. Otherwise, the default distance metric is 'euclidean'.
For definitions, see Distance Metrics.
Example: 'Distance','minkowski'
Data Types: char | string | function_handle
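A custom distance function following the distfun(ZI,ZJ) signature might look like the sketch below. The argument shapes are an assumption, since the description above is truncated: ZI is taken to be a 1-by-N row (one observation), ZJ an M-by-N matrix of observations, and D2 an M-by-1 vector of distances. The metric itself is just Euclidean distance, chosen for illustration.

```matlab
% Hypothetical custom metric written as an anonymous function.
% Assumed shapes: ZI is 1-by-N, ZJ is M-by-N, D2 is M-by-1.
distfun = @(ZI,ZJ) sqrt(sum(bsxfun(@minus, ZJ, ZI).^2, 2));

ZI = [0 0];                 % one observation
ZJ = [3 4; 6 8; 0 1];       % three observations to compare against
D2 = distfun(ZI, ZJ);       % distance from ZI to each row of ZJ
```

For this input, D2 is [5; 10; 1]. A handle like distfun could then be given as the value of 'Distance'; because a function handle is not one of the kd-tree metrics, the exhaustive search method applies.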

'DistanceWeight' - Distance weighting function
'equal' (default) | 'inverse' | 'squaredinverse' | function handle
Distance weighting function, specified as the comma-separated pair consisting of 'DistanceWeight' and either a function handle or one of the values in this table.

Value | Description
'equal' | No weighting
'inverse' | Weight is 1/distance
'squaredinverse' | Weight is 1/distance^2
@fcn | fcn is a function that accepts a matrix of nonnegative distances, and returns a matrix the same size containing nonnegative distance weights. For example, 'squaredinverse' is equivalent to @(d)d.^(-2).

Example: 'DistanceWeight','inverse'
Data Types: char | string | function_handle
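The weighting options in the table can be reproduced directly on a matrix of distances. In this sketch the distance values are made up, and the variable names are hypothetical.

```matlab
% Hypothetical matrix of nonnegative distances.
d = [0.5 1 2; 4 0.25 1];

wInverse = 1 ./ d;          % what 'inverse' describes: 1/distance
wSquared = 1 ./ d.^2;       % what 'squaredinverse' describes: 1/distance^2

% A custom weighting function must return a matrix of the same size.
fcn     = @(d) d.^(-2);
wCustom = fcn(d);           % matches wSquared
```

Any handle with this shape contract, for example a weight that decays exponentially with distance, could be passed as the 'DistanceWeight' value.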

'Exponent' - Minkowski distance exponent
2 (default) | positive scalar value
Minkowski distance exponent, specified as the comma-separated pair consisting of 'Exponent' and a positive scalar value. This argument is only valid when 'Distance' is 'minkowski'.
Example: 'Exponent',3
Data Types: single | double
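The effect of the exponent is easiest to see by computing the Minkowski distance by hand. The helper below is a hypothetical illustration using the standard definition of the metric, not a toolbox function.

```matlab
% Minkowski distance between two row vectors for exponent p (hypothetical helper).
minkowski = @(x,y,p) sum(abs(x - y).^p)^(1/p);

x = [1 2 3];
y = [4 6 3];

d1 = minkowski(x, y, 1);    % p = 1: city block distance, 3 + 4 + 0
d2 = minkowski(x, y, 2);    % p = 2: Euclidean distance, sqrt(3^2 + 4^2)
d3 = minkowski(x, y, 3);    % p = 3: (3^3 + 4^3)^(1/3)
```

Larger exponents give more weight to the largest coordinate difference, so d3 is closer to 4 (the maximum difference) than d1 or d2.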

'IncludeTies' - Tie inclusion flag
false (default) | true
Tie inclusion flag, specified as the comma-separated pair consisting of 'IncludeTies' and a logical value indicating whether predict includes all the neighbors whose distance values are equal to the Kth smallest distance. If IncludeTies is true, predict includes all these neighbors. Otherwise, predict uses exactly K neighbors.
Example: 'IncludeTies',true
Data Types: logical
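The difference between the two settings can be sketched with a sorted list of distances. The data are hypothetical, and the code only mimics the neighbor selection described above; it is not how predict is implemented.

```matlab
% Hypothetical sorted distances from one query point to five training points.
dist = [0.2 0.5 0.5 0.5 0.9];
K    = 2;

kth = dist(K);                   % Kth smallest distance

% IncludeTies false: exactly K neighbors are used.
idxExact = 1:K;

% IncludeTies true: every neighbor at or below the Kth smallest distance is used.
idxTies = find(dist <= kth);
```

Here idxExact is [1 2], while idxTies is [1 2 3 4] because three points share the second smallest distance.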

'NSMethod' - Nearest neighbor search method
'kdtree' | 'exhaustive'
Nearest neighbor search method, specified as the comma-separated pair consisting of 'NSMethod' and 'kdtree' or 'exhaustive'.
'kdtree' - Creates and uses a kd-tree to find nearest neighbors. 'kdtree' is valid when the distance metric is one of the following:
'euclidean'
'cityblock'
'minkowski'
'chebychev'
'exhaustive' - Uses the exhaustive search algorithm. When predicting the class of a new point xnew, the software computes the distance values from all points in X to xnew to find nearest neighbors.
The default is 'kdtree' when X has 10 or fewer columns, X is not sparse, and the distance metric is a 'kdtree' type; otherwise, 'exhaustive'.
Example: 'NSMethod','exhaustive'
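The default selection rule stated above can be written out as a short check. This is only a restatement of the rule in code with hypothetical variable names, not the logic the software actually runs.

```matlab
% Hypothetical inputs: a dense 20-by-4 predictor matrix and a metric name.
X        = rand(20, 4);
distance = 'euclidean';

% Metrics for which the kd-tree search is valid.
kdMetrics = {'euclidean', 'cityblock', 'minkowski', 'chebychev'};

% kd-tree is the default only if all three conditions hold.
useKd = size(X, 2) <= 10 && ~issparse(X) && any(strcmp(distance, kdMetrics));

if useKd
    nsMethod = 'kdtree';
else
    nsMethod = 'exhaustive';
end
```

With four dense columns and the Euclidean metric, nsMethod is 'kdtree'; switching distance to 'cosine' or making X sparse would give 'exhaustive'.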

'NumNeighbors' - Number of nearest neighbors to find
1 (default) | positive integer value
Number of nearest neighbors in X to find for classifying each point when predicting, specified as the comma-separated pair consisting of 'NumNeighbors' and a positive integer value.
Example: 'NumNeighbors',3
Data Types: single | double

'Scale' - Distance scale
nanstd(X) (default) | vector of nonnegative scalar values
Distance scale, specified as the comma-separated pair consisting of 'Scale' and a vector containing nonnegative scalar values with length equal to the number of columns in X. Each coordinate difference between X and a query point is scaled by the corresponding element of Scale. This argument is only valid when 'Distance' is 'seuclidean'.
You cannot simultaneously specify 'Standardize' and either of 'Scale' or 'Cov'.
Data Types: single | double
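The scaling described above can be sketched as a hand computation of the standardized Euclidean distance. The data are hypothetical, and std is used in place of nanstd on the assumption that X contains no NaN values.

```matlab
% Hypothetical predictor data with very different column scales.
X = [1 10; 2 20; 3 30];
S = std(X, 0, 1);               % per-column scale (assumed stand-in for nanstd(X))
q = [2 40];                     % query point

% Divide each coordinate difference by the matching scale, then take the
% Euclidean norm of the scaled differences for every row of X.
scaledDiff = bsxfun(@rdivide, bsxfun(@minus, X, q), S);
D = sqrt(sum(scaledDiff.^2, 2));
```

For this input S is [1 10], so the second column no longer dominates the distance. A vector of the same length as S could be supplied through 'Scale' to override the default.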

'Standardize' - Flag to standardize predictors
false (default) | true
Flag to standardize the predictors, specified as the comma-separated pair consisting of 'Standardize' and true (1) or false (0).
If you set 'Standardize',true, then the software centers and scales each column of the predictor data (X) by the column mean and standard deviation, respectively.
The software does not standardize categorical predictors, and throws an error if all predictors are categorical.
You cannot simultaneously specify 'Standardize',1 and either of 'Scale' or 'Cov'.
It is good practice to standardize the predictor data.
Example: 'Standardize',true
Data Types: logical
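The centering and scaling step can be reproduced by hand to see what the predictors look like after standardization. This sketch uses hypothetical data and base MATLAB functions; it assumes the sample standard deviation (normalized by N-1) and no missing values.

```matlab
% Hypothetical predictor data: two columns on different scales.
X = [1 10; 2 20; 3 30];

mu    = mean(X, 1);             % column means
sigma = std(X, 0, 1);           % column standard deviations

% Center each column by its mean and scale it by its standard deviation.
Z = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
```

After this step both columns of Z equal [-1; 0; 1], so each predictor contributes equally to a Euclidean distance regardless of its original units.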

t - kNN classification template
kNN classification template suitable for training ensembles or error-correcting output code (ECOC) multiclass models, returned as a template object. Pass t to fitcensemble or fitcecoc to specify how to create the KNN classifier for the ensemble or ECOC model, respectively.
If you display t in the Command Window, then all unspecified options appear empty ([]). However, the software replaces empty options with their corresponding default values during training.

See also: ClassificationKNN | ExhaustiveSearcher | fitcecoc | fitcensemble