edge
Classification edge for classification tree model
Description
E = edge(tree,Tbl,ResponseVarName) returns the classification edge E for the trained classification tree model tree using the predictor data in table Tbl and the class labels in Tbl.ResponseVarName. The classification edge is a numeric scalar value that represents the weighted average value of the classification margin.
E = edge(___,Weights=weights) computes the edge using the observation weights specified in weights in addition to any of the input argument combinations in the previous syntaxes.
Examples
Compute Classification Margin and Edge
Compute the classification margin and edge for a tree trained on the first two columns of the Fisher iris data, and view the last 11 margin values.
load fisheriris
X = meas(:,1:2);
tree = fitctree(X,species);
E = edge(tree,X,species)
E =
0.6299
M = margin(tree,X,species);
M(end-10:end)
ans = 0.1111 0.1111 0.1111 -0.2857 0.6364 0.6364 0.1111 0.7500 1.0000 0.6364 0.2000
A classification tree trained on all four predictors yields a higher edge.
tree = fitctree(meas,species);
E = edge(tree,meas,species)
E =
0.9384
M = margin(tree,meas,species);
M(end-10:end)
ans = 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565 0.9565
Input Arguments
tree — Trained classification tree
ClassificationTree model object | CompactClassificationTree model object
Trained classification tree, specified as a ClassificationTree model object trained with fitctree, or a CompactClassificationTree model object created with compact.
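As a minimal sketch of the two accepted model types, the following code trains a full tree, compacts it, and computes the edge with both objects. The variable names ctree, E_full, and E_compact are illustrative only. Because compacting removes the training data but keeps the fitted tree, the two edge values are expected to agree.

```matlab
load fisheriris
tree = fitctree(meas,species);          % full ClassificationTree model
ctree = compact(tree);                  % CompactClassificationTree model (no training data stored)
E_full = edge(tree,meas,species);       % edge from the full model
E_compact = edge(ctree,meas,species);   % edge from the compact model
```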
X — Predictor data
numeric matrix
Tbl — Sample data
table
Sample data, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and observation weights. Tbl must contain all the predictors used to train tree. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.
If Tbl contains the response variable used to train tree, then you do not need to specify ResponseVarName or Y.
If you train tree using sample data contained in a table, then the input data for edge must also be in a table.
Data Types: table
ResponseVarName — Response variable name
name of variable in Tbl
Response variable name, specified as the name of a variable in Tbl. If Tbl contains the response variable used to train tree, then you do not need to specify ResponseVarName.
You must specify ResponseVarName as a character vector or string scalar. For example, if the response variable is stored as Tbl.Response, then specify it as "Response". Otherwise, the software treats all columns of Tbl, including Tbl.Response, as predictors.
The response variable must be a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.
Data Types: char | string
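The table-based syntax can be sketched as follows, assuming the Fisher iris data is packed into a table. The variable names SL, SW, PL, PW, and Species are hypothetical names chosen for this illustration. The tree is trained on the table, so the data passed to edge is also a table.

```matlab
load fisheriris
% Build a table of predictors; the column names are illustrative assumptions.
Tbl = array2table(meas,'VariableNames',{'SL','SW','PL','PW'});
Tbl.Species = species;                  % add the response variable as a table column
tree = fitctree(Tbl,'Species');         % train on the table
E = edge(tree,Tbl,'Species');           % name the response variable explicitly
```

Naming the response variable keeps the software from treating Tbl.Species as a predictor.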
Y — Class labels
categorical array | character array | string array | logical vector | numeric vector | cell array of character vectors
Class labels, specified as a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors. Y must be of the same type as the class labels used to train tree, and its number of elements must equal the number of rows of X.
Data Types: categorical | char | string | logical | single | double | cell
weights — Observation weights
ones(size(X,1),1) (default) | numeric vector | name of variable in Tbl
Observation weights, specified as a numeric vector or the name of a variable in Tbl.
If you specify weights as a numeric vector, then the size of weights must be equal to the number of rows in X or Tbl.
If you specify weights as the name of a variable in Tbl, then the name must be a character vector or string scalar. For example, if the weights are stored as Tbl.W, then specify weights as "W". Otherwise, the software treats all columns of Tbl, including Tbl.W, as predictors.
When you supply weights, edge computes the weighted classification edge. The software weighs the observations in each row of X or Tbl with the corresponding weight in weights.
Data Types: single | double | char | string
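A short sketch of passing a numeric weight vector follows. The weight values here are arbitrary and chosen only for illustration. Because the edge is a weighted average, multiplying every weight by the same positive constant is expected to leave the result unchanged, which the last line checks.

```matlab
load fisheriris
tree = fitctree(meas,species);
n = size(meas,1);
w = linspace(1,2,n)';                                   % arbitrary illustrative weights, one per row
E_weighted = edge(tree,meas,species,'Weights',w);       % weighted edge
E_scaled = edge(tree,meas,species,'Weights',5*w);       % same weights, rescaled by a constant
scaleDiff = abs(E_weighted - E_scaled);                 % expected to be zero up to rounding
```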
More About
Margin
The classification margin is the difference between the classification score for the true class and maximal classification score for the false classes. Margin is a column vector with the same number of rows as the matrix X.
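The margin definition above can be reproduced by hand from the scores that predict returns. This is a sketch for illustration, not the internal implementation. It assumes the columns of the score matrix follow the order of tree.ClassNames; the names trueIdx, linIdx, and M_manual are hypothetical.

```matlab
load fisheriris
tree = fitctree(meas,species);
[~,score] = predict(tree,meas);                      % N-by-K matrix of class scores
[~,trueIdx] = ismember(species,tree.ClassNames);     % column index of the true class per row
n = size(meas,1);
linIdx = sub2ind(size(score),(1:n)',trueIdx);        % linear index of each true-class score
trueScore = score(linIdx);
falseScore = score;
falseScore(linIdx) = -Inf;                           % exclude the true class from the maximum
M_manual = trueScore - max(falseScore,[],2);         % true score minus best false score
M = margin(tree,meas,species);
maxDiff = max(abs(M - M_manual));                    % expected to be zero up to rounding
```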
Score (tree)
For trees, the score of a classification of a leaf node is the posterior probability of the classification at that node. The posterior probability of the classification at a node is the number of training sequences that lead to that node with the classification, divided by the number of training sequences that lead to that node.
For an example, see Posterior Probability Definition for Classification Tree.
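As a hedged illustration of the score definition, the following sketch compares the scores returned by predict with the class probabilities stored for the node that each observation reaches. It assumes the default score transformation and that the rows of tree.ClassProbability are indexed by node number; nodeProb and maxDiff are illustrative names.

```matlab
load fisheriris
tree = fitctree(meas,species);
[~,score,node] = predict(tree,meas);         % scores and the node reached by each observation
nodeProb = tree.ClassProbability(node,:);    % class probabilities stored at those nodes
maxDiff = max(abs(score(:) - nodeProb(:)));  % expected to be zero up to rounding
rowSums = sum(score,2);                      % posterior probabilities sum to 1 in each row
```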
Edge
The edge is the weighted mean value of the classification margin. The weights are the class probabilities in tree.Prior. If you supply weights, those weights are normalized to sum to the prior probabilities in the respective classes, and are then used to compute the weighted average.
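The weighting described above can be sketched by hand. This is an illustration under the stated definition, not the internal implementation. It starts from unit weights, rescales them within each class so that they sum to the corresponding entry of tree.Prior, and then takes the weighted mean of the margins. The names classIdx, inClass, and E_manual are hypothetical.

```matlab
load fisheriris
tree = fitctree(meas,species);
M = margin(tree,meas,species);
[~,classIdx] = ismember(species,tree.ClassNames);    % class index of each observation
w = ones(size(M));                                   % default observation weights
for k = 1:numel(tree.ClassNames)
    inClass = (classIdx == k);
    % Rescale so the weights in class k sum to the prior probability of class k.
    w(inClass) = w(inClass) * tree.Prior(k) / sum(w(inClass));
end
E_manual = sum(w .* M) / sum(w);                     % weighted mean of the margins
E = edge(tree,meas,species);
```

For the iris data, the classes are equally sized and the default priors are the class frequencies, so the weighted mean reduces to the plain mean of M.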
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The edge function fully supports tall arrays. For more information, see Tall Arrays.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
The edge function does not support decision tree models trained with surrogate splits.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2011a
See Also
margin | loss | predict | fitctree | ClassificationTree | CompactClassificationTree