Answer to the initial question:
The short answer is that you can obtain the Shapley values of a model created by newff by passing a function handle to the shapley function (more info below). Also consider whether it would be more convenient to use the Classification Learner app and fitcnet, provided the classification neural networks trained by fitcnet meet your use case.
The shapley documentation lists the model object types that are supported as the "blackbox model" argument: https://www.mathworks.com/help/stats/shapley.html#mw_c2327b12-104d-48ef-8a71-1f0e8769549b . Models created with newff are not on that list. However, an arbitrary model (such as one created with newff) can be used with the shapley function by supplying a function handle for model prediction. The documentation section referenced above describes this option: "Function handle — You can specify a function handle that accepts predictor data and returns a column vector containing a prediction for each observation in the predictor data. The prediction is a predicted response for regression or a predicted score of a single class for classification. You must provide the predictor data using X."
See the "Specify Blackbox Model Using Function Handle" section of the shapley documentation page for an example: https://www.mathworks.com/help/stats/shapley.html#mw_c8688099-08ba-41d5-8a6c-0785a609b341
Other observations:
- The function newff is quite old and was "Obsoleted in R2010b NNET 7.0". Consider using either feedforwardnet from Deep Learning Toolbox or fitcnet from Statistics and Machine Learning Toolbox instead. Models created with fitcnet can be passed directly as the "blackbox" model to the shapley function, so if the neural networks available through fitcnet cover your use case, that could be the easiest path to a Shapley analysis. You can also use the Classification Learner app to build fitcnet models.
- The example provided in the question has very little data, with only one training example for each target class. More observation data will be needed for meaningful training, validation, and testing.
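As a minimal sketch of the fitcnet path mentioned above, the following trains a small classification network and passes it straight to shapley. The fisheriris sample data set is a stand-in chosen for illustration, not the data from the question, and the code assumes Statistics and Machine Learning Toolbox in R2021a or later.

```matlab
% Stand-in data: 150 observations, 4 predictors, 3 classes.
load fisheriris
rng(1);                                   % for reproducible training

% Train a classification neural network with default settings.
mdl = fitcnet(meas, species);

% A fitcnet model is a supported blackbox model, so no function handle
% is needed. The training data stored in mdl serves as the reference data.
queryPoint = meas(1, :);
explainer = shapley(mdl, QueryPoint=queryPoint);

% One row per predictor, one Shapley value column per class.
disp(explainer.ShapleyValues)
plot(explainer)
```

Because the model object stores its own training data, the predictor matrix does not have to be passed to shapley separately.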
Answer using the data provided in the first comment:
The data and model in the ANN1LogSig.mat file attached to the first comment show that this is a regression problem with 10 predictors. Using this data, we can (1) get the Shapley summary swarmchart for the net model provided, and (2) use the Regression Learner app to easily build many models on this data and do Shapley analysis on them.
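(1) Get the Shapley summary swarmchart for the net model provided, using R2024a or later.
After loading the data and model, the swarmchart can be created in just a few lines of code.
data = load('ANN1LogSig.mat');
% Wrap the network in a function handle. This assumes the network is stored in the MAT-file as "net". The network expects observations in columns, whereas shapley passes them in rows, hence the transposes.
f = @(x) data.net(x')';
explainer = shapley(f, data.trainInputs', QueryPoints=data.testInputs');
figure(1); swarmchart(explainer, ColorMap="BlueRed");   % Shapley summary swarmchart over all query points
figure(2); plot(explainer);                              % bar plot summarizing all query points
figure(3); plot(explainer, QueryPointIndices=1);         % Shapley values for the first query point only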
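The function-handle pattern can also be tried without the attached MAT-file. The sketch below uses synthetic data and a small feedforwardnet as a stand-in for the newff model; the data, the network size, and all variable names are assumptions made for illustration. It assumes Deep Learning Toolbox and Statistics and Machine Learning Toolbox, R2021a or later.

```matlab
% Synthetic regression data with observations in rows (shapley convention).
rng(0);
X = rand(200, 3);
y = 2*X(:,1) - X(:,2) + 0.1*randn(200, 1);

% Shallow networks expect observations in columns, so transpose for training.
net = feedforwardnet(5);
net.trainParam.showWindow = false;        % suppress the training GUI
net = train(net, X', y');

% Function handle: takes rows-as-observations, returns a column vector
% with one prediction per observation, as shapley requires.
f = @(Xq) net(Xq')';

% With a function handle, the predictor data X must be passed explicitly.
explainer = shapley(f, X, QueryPoint=X(1, :));
disp(explainer.ShapleyValues)
```

The two transposes in f are the only adaptation needed: one converts the rows that shapley supplies into the columns the network expects, and the other turns the network's row of outputs back into a column vector.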
(2) Use Regression Learner app to easily build many models on this data, and do Shapley analysis.
data = load('ANN1LogSig.mat');
allDataMatrix = data.ALL_DataNormal;
trainDataMatrix = allDataMatrix(1:605, :);
testDataMatrix = allDataMatrix(606:711, :);
In the session start dialog for Regression Learner, choose the trainDataMatrix variable and select 10-fold cross-validation. Do not set aside a test set in the session start dialog. After the session starts, import testDataMatrix using the "Test" tab in the app.
Regression Learner makes it very easy to train many types of models. Use the "All" preset to try many model types, then try the various "Optimizable" presets to optimize hyperparameters for some model types. After training and testing many models (without writing any code), the "Compare Results" plot shows that the Gaussian Process Regression models have the best cross-validation RMSE.
After choosing Model 7 because it has the lowest validation RMSE, we can check its performance on the test data and compare it with the test RMSE of the net model in ANN1LogSig.mat.
Model 7 has a test RMSE of 0.0337, which is less than half of the test RMSE of 0.0754 given by the net model in ANN1LogSig.mat.
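For reference, a test RMSE is the square root of the mean squared difference between predictions and true responses on the test set. The sketch below shows the calculation on made-up numbers; the values and variable names are illustrative only and are not taken from ANN1LogSig.mat.

```matlab
% Hypothetical true responses and model predictions for four test points.
yTrue = [0.10; 0.40; 0.35; 0.80];
yPred = [0.12; 0.37; 0.30; 0.85];

% Root mean squared error over the test observations.
testRMSE = sqrt(mean((yPred - yTrue).^2));
fprintf('Test RMSE: %.4f\n', testRMSE);
```

Applying the same formula to the predictions of each model on the same test rows gives numbers that can be compared directly, as done for the two models above.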
Starting in R2024b, additional Shapley plots, including the Shapley Summary swarmchart, are available in the Classification Learner and Regression Learner apps. Users can try this in the R2024b prerelease, which is scheduled to be available in the second half of June 2024. Below is a view of a Shapley Summary plot of the GPR model mentioned above in the Regression Learner app. In the meantime, one can always export a model from the Classification Learner or Regression Learner app and then use the command-line shapley and shapley.swarmchart commands to create the Shapley Summary plot, as illustrated above.
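The command-line route for a Gaussian process regression model might look like the sketch below. It trains the model with fitrgp on synthetic data rather than exporting it from the app, so the data and variable names are assumptions. A model exported from Regression Learner arrives as a struct, and the model object stored inside it would take the place of gprModel here. Multiple query points and swarmchart assume R2024a or later.

```matlab
% Synthetic regression data standing in for the training matrix.
rng(0);
X = rand(150, 4);
y = sin(2*pi*X(:,1)) + 0.5*X(:,2) + 0.05*randn(150, 1);

% Gaussian process regression model, a supported blackbox model type.
gprModel = fitrgp(X, y);

% Compute Shapley values for several query points at once.
explainer = shapley(gprModel, QueryPoints=X(1:25, :));

% Shapley summary swarmchart: one swarm of points per predictor.
swarmchart(explainer)
```

Using a subset of rows as query points keeps the computation short; passing more rows gives a denser swarmchart at the cost of run time.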
If this answer helps you, please remember to accept the answer.