plotDependence
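Syntax
plotDependence(explainer,predictor)
plotDependence(explainer,predictor,Name=Value)
plotDependence(ax,___)
p = plotDependence(___)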
Description
plotDependence(explainer,predictor) returns a dependence plot for the predictor specified by predictor and the Shapley values in the shapley object explainer. The plot contains Shapley values for the query points in explainer.QueryPoints.
If predictor specifies a categorical predictor (explainer.CategoricalPredictors), then the function displays a box plot of the corresponding Shapley values for each category. Each box plot displays the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers.
If predictor specifies a noncategorical predictor, then the function displays a scatter plot of the corresponding Shapley values.
If explainer.BlackboxModel is a classification model, the function displays a plot for class explainer.BlackboxModel.ClassNames(1) by default.
plotDependence(explainer,predictor,Name=Value) specifies additional options using one or more name-value arguments. For example, use color to display a second predictor in the plot by specifying the ColorPredictor name-value argument.
plotDependence(ax,___) displays the dependence plot in the target axes ax. Specify ax as the first argument in any of the previous syntaxes.
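The target-axes syntax is handy for placing several dependence plots in one figure. The following is a minimal sketch, not taken from this page: it assumes the fisheriris sample data and a classification tree trained with fitctree, and it relies on the default predictor names x1 through x4 that the model assigns to matrix predictor data.

```matlab
% Assumed setup (not from this page): Fisher iris data and a classification tree.
load fisheriris
mdl = fitctree(meas,species);
rng("default") % For reproducibility
explainer = shapley(mdl,QueryPoints=meas);

% Draw two dependence plots side by side by passing each tile's axes
% as the first argument.
tiledlayout(1,2)
ax1 = nexttile;
plotDependence(ax1,explainer,"x1")
ax2 = nexttile;
plotDependence(ax2,explainer,"x3")
```

Both plots show the first class in mdl.ClassNames, because ClassName is not specified.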
p = plotDependence(___) returns a BoxChart or Scatter object. Use p to query or modify the properties (BoxChart Properties or Scatter Properties) of an object after you create it.
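As a rough illustration of working with the returned object, the sketch below changes two Scatter properties after plotting. The data set (fisheriris), the fitctree model, and the chosen property values are assumptions made for this example only.

```matlab
% Assumed setup (not from this page): Fisher iris data and a classification tree.
load fisheriris
mdl = fitctree(meas,species);
rng("default") % For reproducibility
explainer = shapley(mdl,QueryPoints=meas);

% x3 is a noncategorical predictor, so the function draws a scatter plot
% and p is a Scatter object.
p = plotDependence(explainer,"x3");

% Modify the plot after creating it.
p.Marker = "diamond";
p.SizeData = 20;
```

For a categorical predictor, p would be a BoxChart object instead, and the properties to set would come from BoxChart Properties.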
Examples
Shapley Dependence Plot for One Predictor
Train a classification model and create a shapley object. Use the fit object function to compute the Shapley values for the specified query points. Then for each predictor, visualize the dependence of the Shapley values on the predictor values by using the plotDependence object function.
Load the CreditRating_Historical data set. The data set contains customer IDs and their financial ratios, industry labels, and credit ratings.
tbl = readtable("CreditRating_Historical.dat");
Display the first three rows of the table.
head(tbl,3)
     ID      WC_TA    RE_TA    EBIT_TA    MVE_BVTD    S_TA     Industry    Rating
    _____    _____    _____    _______    ________    _____    ________    ______
    62394    0.013    0.104     0.036      0.447      0.142       3        {'BB'}
    48608    0.232    0.335     0.062      1.969      0.281       8        {'A' }
    42444    0.311    0.367     0.074      1.935      0.366       1        {'A' }
Train a blackbox model of credit ratings by using the fitcecoc function. Use the variables from the second through seventh columns in tbl as the predictor variables. A recommended practice is to specify the class names to set the order of the classes.
blackbox = fitcecoc(tbl,"Rating", ...
    PredictorNames=tbl.Properties.VariableNames(2:7), ...
    CategoricalPredictors="Industry", ...
    ClassNames={'AAA','AA','A','BBB','BB','B','CCC'});
Create a shapley object that explains the predictions for multiple query points. For faster computation, shapley subsamples 100 observations from the predictor data in blackbox to compute the Shapley values. Specify the sampled observations as the query points in the call to the fit object function.
rng("default") % For reproducibility
explainer = shapley(blackbox);
queryPoints = explainer.X(explainer.SampledObservationIndices,:);
explainer = fit(explainer,queryPoints);
Visualize the Shapley values for a specified predictor by using the plotDependence object function.
predictor = "MVE_BVTD";
plotDependence(explainer,predictor)
By default, the function shows the Shapley values for the first class, AAA. For noncategorical predictors, the function displays a scatter plot, where the x-axis corresponds to the predictor values and the y-axis corresponds to the Shapley values for the predictor.
For class AAA, the Shapley values for the MVE_BVTD predictor tend to increase as the predictor values increase from 0 to 4. For MVE_BVTD values greater than 4, the corresponding Shapley values tend to remain constant (between 1.5 and 2).
For categorical predictors, plotDependence displays box plots for each category in the categorical predictor. The function determines categorical predictors based on the CategoricalPredictors property of the shapley object.
Visualize the Shapley values for the categorical predictor Industry. Specify the class.
class = "A";
plotDependence(explainer,"Industry",ClassName=class)
For class A, the distribution of the Shapley values varies across different industries. For example, industry 3 has exclusively positive Shapley values, whereas industry 9 has exclusively negative Shapley values.
Shapley Dependence Plot with Additional Color Predictor
Train a regression model and create a shapley object using multiple query points. Then for each predictor, visualize the dependence of the Shapley values on the predictor values. Use color to see the dependence on a second predictor.
Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s.
load carbig
Create a table containing the predictor variables Acceleration, Cylinders, and so on, as well as the response variable MPG.
tbl = table(Acceleration,Cylinders,Displacement, ...
Horsepower,Model_Year,Weight,MPG);
Removing missing values in a training set helps to reduce memory consumption and speed up training for the fitrkernel function. Remove missing values in tbl.
tbl = rmmissing(tbl);
Train a blackbox model of MPG by using the fitrkernel function. Specify the Cylinders and Model_Year variables as categorical predictors. Standardize the remaining predictors.
mdl = fitrkernel(tbl,"MPG",CategoricalPredictors=[2 5], ...
    Standardize=true);
Create a shapley object that explains the predictions for multiple query points. Because mdl does not contain training data, specify to compute Shapley values using the predictor data in tbl. For faster computation, specify to subsample 200 observations from tbl. Use all observations in tbl as query points.
explainer = shapley(mdl,tbl,NumObservationsToSample=200, ...
QueryPoints=tbl);
Visualize the Shapley values for a specific predictor by using the plotDependence object function. Use color to display a second predictor. Note that if you want to specify a color predictor, the x-axis predictor must be a noncategorical predictor.
predictor = "Weight";
colorPredictor = "Horsepower";
plotDependence(explainer,predictor,ColorPredictor=colorPredictor)
For Weight values between 2000 and 4000, the corresponding Shapley values tend to decrease as the Weight values increase. Based on the color of the points in the plot, Horsepower values tend to increase as Weight values increase.
Input Arguments
explainer — Object explaining blackbox model
shapley object
Object explaining the blackbox model, specified as a shapley object. explainer must contain Shapley values; that is, explainer.Shapley must be nonempty.
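One way to satisfy this requirement is to check the property and call fit when needed. The sketch below is illustrative only: the fisheriris data and the fitctree model are assumptions, and the check simply follows the statement above that explainer.Shapley must be nonempty.

```matlab
% Assumed setup (not from this page): Fisher iris data and a classification tree.
load fisheriris
mdl = fitctree(meas,species);
rng("default") % For reproducibility

% Create the explainer without query points, so no Shapley values exist yet.
explainer = shapley(mdl);

% Compute Shapley values for all observations if none are stored.
if isempty(explainer.Shapley)
    explainer = fit(explainer,meas);
end

plotDependence(explainer,"x1")
```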
predictor — Predictor variable
positive integer scalar | character vector | string scalar
Predictor variable to plot, specified as a positive integer scalar, character vector, or string scalar.
If you specify a positive integer scalar, it must be the index value corresponding to a column in the predictor data explainer.X.
If you specify a character vector or string scalar, it must be the name of a predictor variable. When explainer.BlackboxModel is a machine learning model object, the name must match one of the names in the PredictorNames property of the model (explainer.BlackboxModel.PredictorNames). When explainer.BlackboxModel is a custom model specified as a function handle, the name must match one of the variable names in explainer.X.
Example: "x1"
Data Types: single | double | char | string
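The two ways of identifying a predictor can be compared directly. This sketch assumes the fisheriris data and a fitctree model (neither appears on this page); because the predictor data is a numeric matrix, the model uses the default predictor names x1 through x4, so column 3 and "x3" refer to the same predictor.

```matlab
% Assumed setup (not from this page): Fisher iris data and a classification tree.
load fisheriris
mdl = fitctree(meas,species);
rng("default") % For reproducibility
explainer = shapley(mdl,QueryPoints=meas);

tiledlayout(1,2)
plotDependence(nexttile,explainer,3)      % Third column of explainer.X
plotDependence(nexttile,explainer,"x3")   % Same predictor, specified by name
```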
ax — Axes for plot
Axes object
Axes for the plot, specified as an Axes object. If you do not specify ax, then plotDependence creates the plot using the current axes. For more information on creating an Axes object, see axes.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: plotDependence(explainer,"x1",ColorPredictor="x3",ColorMap="abyss") creates a scatter plot of Shapley values for the numeric predictor x1 and uses the x3 predictor to color the points with the abyss colormap.
ClassName — Class label to plot
explainer.BlackboxModel.ClassNames(1) (default) | numeric scalar | logical scalar | character vector | string scalar | categorical scalar
Class label to plot, specified as a numeric scalar, logical scalar, character vector, string scalar, or categorical scalar. The value and data type of ClassName must match one of the class names in the ClassNames property of the machine learning model in explainer (explainer.BlackboxModel.ClassNames). The software accepts character vectors, string scalars, and categorical scalars interchangeably.
This argument is valid only when the machine learning model (BlackboxModel) in explainer is a classification model.
Example: ClassName="AAA"
Data Types: single | double | logical | char | string | categorical
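To compare classes, one option is to loop over the class names and draw one plot per class. The sketch below is an illustration under assumptions not found on this page: it uses the fisheriris data and a fitctree model, whose class names are stored as a cell array of character vectors.

```matlab
% Assumed setup (not from this page): Fisher iris data and a classification tree.
load fisheriris
mdl = fitctree(meas,species);
rng("default") % For reproducibility
explainer = shapley(mdl,QueryPoints=meas);

% One dependence plot per class for the same predictor.
classNames = explainer.BlackboxModel.ClassNames;
tiledlayout(1,numel(classNames))
for k = 1:numel(classNames)
    ax = nexttile;
    plotDependence(ax,explainer,"x3",ClassName=classNames{k})
    title(ax,classNames{k})
end
```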
ColorPredictor — Predictor variable to plot using color
[] (default) | positive integer scalar | character vector | string scalar
Predictor variable to plot using color, specified as a positive integer scalar, character vector, or string scalar.
If you specify a positive integer scalar, it must be the index value corresponding to a column in the predictor data explainer.X.
If you specify a character vector or string scalar, it must be the name of a predictor variable. When explainer.BlackboxModel is a machine learning model object, the name must match one of the names in the PredictorNames property of the model (explainer.BlackboxModel.PredictorNames). When explainer.BlackboxModel is a custom model specified as a function handle, the name must match one of the variable names in explainer.X.
For more information on how plotDependence maps color predictor values to the colormap, see Color Assignment for Color Predictor Values.
This argument is valid only when the variable predictor is not a categorical predictor.
Example: ColorPredictor="x2"
Data Types: single | double | char | string
ColorMap — Colormap for plot
"default" (default) | "bluered" | colormap name | three-column matrix of RGB triplets
Colormap for the plot, specified as "default", "bluered", a colormap name, or a three-column matrix of RGB triplets.
A value of "default" sets the colormap to the default colormap for the target axes ax, and a value of "bluered" sets the colormap to a color scale that ranges from blue to red.
A colormap name specifies a predefined colormap, and a three-column matrix of RGB triplets specifies a custom colormap. For more information on the available colormaps and the creation of a matrix of RGB triplets, see the map input argument of the colormap function.
This argument is valid only when the variable predictor is not a categorical predictor, and the color predictor variable ColorPredictor is specified.
Example: ColorMap="parula"
Example: ColorMap="bluered"
Data Types: single | double | char | string
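The following sketch contrasts the "bluered" option with a custom colormap given as a three-column RGB matrix. The carsmall data, the fitrtree model, the choice of predictors, and the blue-to-yellow map are all assumptions made for illustration.

```matlab
% Assumed setup (not from this page): carsmall data and a regression tree.
load carsmall
tbl = table(Weight,Horsepower,Acceleration,MPG);
tbl = rmmissing(tbl);
mdl = fitrtree(tbl,"MPG");
rng("default") % For reproducibility
explainer = shapley(mdl,tbl,QueryPoints=tbl);

% Custom colormap: 64 RGB triplets running from blue [0 0 1] to yellow [1 1 0].
n = 64;
cmap = [linspace(0,1,n)' linspace(0,1,n)' linspace(1,0,n)'];

tiledlayout(1,2)
plotDependence(nexttile,explainer,"Weight", ...
    ColorPredictor="Horsepower",ColorMap="bluered")
plotDependence(nexttile,explainer,"Weight", ...
    ColorPredictor="Horsepower",ColorMap=cmap)
```

Each row of cmap is one color, and every entry must lie between 0 and 1.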
Output Arguments
p — Dependence plot
BoxChart object | Scatter object
Dependence plot, returned as a BoxChart or Scatter object.
If predictor specifies a categorical predictor, then p is a BoxChart object. For more information, see BoxChart Properties.
If predictor specifies a noncategorical predictor, then p is a Scatter object. For more information, see Scatter Properties.
More About
Shapley Values
In game theory, the Shapley value of a player is the average marginal contribution of the player in a cooperative game. In the context of machine learning prediction, the Shapley value of a feature for a query point explains the contribution of the feature to a prediction (response for regression or score of each class for classification) at the specified query point.
The Shapley value of a feature for a query point is the contribution of the feature to the deviation from the average prediction. For a query point, the sum of the Shapley values for all features corresponds to the total deviation of the prediction from the average. That is, the sum of the average prediction and the Shapley values for all features corresponds to the prediction for the query point.
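The additive property described above can be written compactly. The notation here is an assumption introduced for illustration: $f(x)$ is the prediction (the response, or the score of one class) at query point $x$, $\bar{f}$ is the average prediction, $M$ is the number of features, and $\varphi_i(x)$ is the Shapley value of feature $i$ at $x$.

```latex
\[
  \sum_{i=1}^{M} \varphi_i(x) = f(x) - \bar{f}
  \qquad\Longleftrightarrow\qquad
  f(x) = \bar{f} + \sum_{i=1}^{M} \varphi_i(x)
\]
```

A dependence plot shows one term of this sum, $\varphi_i(x)$, against the value of feature $i$ across all query points.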
For more details, see Shapley Values for Machine Learning Model.
Tips
Use plotDependence when explainer contains Shapley values for many query points.
Algorithms
Color Assignment for Color Predictor Values
plotDependence maps color predictor values (ColorPredictor) to the colormap (ColorMap) as follows:
If the color predictor is numeric, the function maps the minimum and maximum values to the appropriate colormap endpoints, and maps the remaining values to the interior of the colormap range.
If the color predictor is nonnumeric, the function maps categories to discrete colors in the colormap.
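The two rules above can be pictured with a small standalone sketch. It is not the function's actual implementation: the linear rescaling of interior values and the even spacing of category colors are assumptions, and the sample values are hypothetical.

```matlab
cmap = parula(256);               % Any N-by-3 colormap
n = size(cmap,1);

% Numeric color predictor: the minimum and maximum map to the colormap
% endpoints, and the remaining values fall in between (assumed linear).
v = [46 88 150 230];              % Hypothetical predictor values
t = (v - min(v)) ./ (max(v) - min(v));   % Rescale to [0,1]
idx = 1 + round(t*(n - 1));       % Row index into the colormap
rgbNumeric = cmap(idx,:);         % One RGB triplet per value

% Nonnumeric color predictor: each category gets one discrete color
% (assumed evenly spaced across the colormap).
c = categorical(["4" "6" "8" "6"]);      % Hypothetical categories
numCats = numel(categories(c));
catRows = round(linspace(1,n,numCats));  % One colormap row per category
rgbCategorical = cmap(catRows(double(c)),:);
```

Observations that share a category, such as the two "6" entries here, receive the same color.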
Version History
Introduced in R2024b