Is removing features in the Classification Learner app considered sensitivity analysis?

Hi,
I am training a predictive model with the Classification Learner app.
I want to know which features contribute the most to my model's accuracy.
Is removing features one by one in the "Feature Selection" pane considered a sensitivity analysis of the model?

Answers (1)

Shubham on 28 Feb 2024
Hi Maryam,
In the context of predictive modeling, sensitivity analysis generally refers to the study of how various inputs of a model influence its output. In the case of a classification model, sensitivity analysis can help determine which features (inputs) have the most significant impact on the model's accuracy.
Removing features one by one from the "Feature Selection" pane in the Classification Learner app in MATLAB can be part of a sensitivity analysis, but it's a somewhat rudimentary approach. This method is often referred to as "feature ablation" or "sequential feature removal." It involves assessing the change in model performance as each feature is removed, which can help identify features that are important for the model's predictions.
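As a rough illustration of the one-at-a-time removal described above, here is a minimal sketch in plain Python (not MATLAB, and independent of the Classification Learner app). Everything in it is an assumption made for illustration: the synthetic data, the toy nearest-centroid classifier, and the helper names `make_data`, `fit_centroids`, `select`, and `ablation`. The point it shows is that the model is retrained for every removed feature and that the drop in held-out accuracy is what gets compared.

```python
import random

def make_data(n=200, seed=0):
    """Synthetic two-class data: feature 0 is informative, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(4.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

def fit_centroids(X, y):
    """Train a toy nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)

def select(X, keep):
    """Keep only the feature columns listed in `keep`."""
    return [[row[j] for j in keep] for row in X]

def ablation(X_train, y_train, X_test, y_test):
    """Remove each feature in turn, retrain from scratch, and record the accuracy drop."""
    all_idx = list(range(len(X_train[0])))
    base = accuracy(fit_centroids(X_train, y_train), X_test, y_test)
    drops = {}
    for j in all_idx:
        keep = [k for k in all_idx if k != j]
        model = fit_centroids(select(X_train, keep), y_train)  # retrained without feature j
        drops[j] = base - accuracy(model, select(X_test, keep), y_test)
    return base, drops

X, y = make_data()
X_train, y_train = X[:140], y[:140]   # simple hold-out split
X_test, y_test = X[140:], y[140:]
base, drops = ablation(X_train, y_train, X_test, y_test)
print("baseline accuracy:", base)
for j, d in drops.items():
    print(f"feature {j}: accuracy drop when removed = {d:.3f}")
```

In this toy setup only feature 0 separates the classes, so removing it should cost far more accuracy than removing either noise feature. On real data, a single hold-out split is noisy, so cross-validated performance is a safer basis for comparison.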
However, there are a few points to consider:
  1. Features can be correlated, and removing one feature might not show its full effect if another correlated feature still exists in the model.
  2. Removing features one by one doesn't account for interactions between features. Sometimes, the combination of features is important, and individual features may not seem useful when considered alone.
  3. Each time you remove a feature, the model should be re-trained from scratch to accurately assess the impact of that feature on the model's performance.
  4. The importance of features can depend on the type of model used. Some models might rely heavily on certain features, while others might not.
  5. It's important to consider the right metrics when evaluating model performance. Accuracy alone might not be sufficient, especially for imbalanced datasets. Precision, recall, F1-score, ROC AUC, and other metrics might be more informative in some cases.
Beyond these caveats, two more systematic methods are worth considering:
  1. Permutation importance: Randomly shuffle the values of one feature at a time, without retraining, and measure the change in the model's performance on held-out data. A significant decrease in performance indicates that the model relies on that feature.
  2. Partial dependence plots (PDPs): These plots show how the predicted response depends on one or a few features, marginalizing over the values of all other features.
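Permutation importance, as described in the list above, can be sketched in a few lines. This is a plain-Python illustration rather than MATLAB code, and it makes several assumptions: the data are synthetic, the model is a toy nearest-centroid classifier, and the function name `permutation_importance` and its `n_repeats` parameter are hypothetical helpers, not a library API. Unlike feature removal, the model is trained once and only the evaluation data are shuffled.

```python
import random

def fit_centroids(X, y):
    """Train a toy nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled; the model is not retrained."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        total_drop = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total_drop += base - accuracy(model, X_perm, y)
        importances.append(total_drop / n_repeats)
    return importances

# Synthetic data (an assumption): feature 0 is informative, features 1 and 2 are noise.
rng = random.Random(1)
X, y = [], []
for i in range(200):
    label = i % 2
    X.append([rng.gauss(4.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
    y.append(label)

model = fit_centroids(X[:140], y[:140])                   # train once
importances = permutation_importance(model, X[140:], y[140:])  # evaluate on held-out rows
for j, imp in enumerate(importances):
    print(f"feature {j}: mean accuracy drop when shuffled = {imp:.3f}")
```

Averaging over several shuffles reduces the noise of a single permutation. Note that correlated features can still share importance between them, so the caveat about correlation applies here as well.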
In summary, while sequentially removing features and observing the change in accuracy can provide some insights, it is a limited approach. For a more robust understanding of feature importance and the sensitivity of your model to its inputs, you should consider using a combination of the above methods. These methods are more systematic and can provide a deeper understanding of how each feature contributes to the model's predictions.

Release

R2020a
