Why does the SVM-based fitcecoc function make unexplainable misclassifications when 'FitPosterior' is true?
I am using the fitcecoc function with an SVM template (RBF kernel) and the 'onevsone' coding design. Each input is a 16-dimensional vector of floating-point values, and the output should be one of 12 different class labels.
I know from my training and testing datasets that a few classes overlap, so I expect some degree of misclassification.
I noticed that when 'FitPosterior' is false, the overall ECOC model (~70% accurate) makes misclassifications that can be explained by the few overlapping classes. I verified this by removing one overlapping class and retraining the whole ECOC model, after which the performance improved.
Interestingly, when I set 'FitPosterior' to true to get probabilities (not just hard output labels), the overall performance of the ECOC model improved (~84% accurate), but some misclassifications persisted. The difference this time is that these misclassifications no longer involve the overlapping classes. Instead, the model assigns test instances to very different classes (with little to no overlap).
To wrap up, I have difficulty understanding:
(1) Why did the performance improve with 'FitPosterior' enabled compared to with it disabled? Why was this improved performance associated with reduced explainability and bizarre misclassifications (no overlap between the confused classes)?
(2) How does 'FitPosterior' work as an algorithm? Is there any way to have some control over how this "Posterior Probability Estimation" gets trained?
Sahil Jain on 22 Dec 2021
Hi Omar. By default, the software minimizes the Kullback-Leibler divergence to estimate class posterior probabilities. As an alternative to KL divergence, quadratic programming can be used (requires Optimization Toolbox). To learn more about the algorithm, please refer to the Algorithms section of the documentation for the "predict" function. To understand the behaviour of the algorithm, I'd suggest going through the references linked in that section.
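To make the above concrete, here is a minimal sketch of how the two models could be compared and how the posterior estimation method can be selected at prediction time. It uses the fisheriris sample data as a stand-in for your 16-dimensional, 12-class data, and the variable names (MdlScore, MdlPost, and so on) are placeholders. Two points in it are my reading of the documentation rather than something verified on your data: the posterior method is chosen through the 'PosteriorMethod' argument of predict, and the predicted label comes from the negated average binary loss rather than from the Posterior output, with the default binary loss differing between the two models. Please check both against the "predict" reference page for your release.

```matlab
% Stand-in data: fisheriris ships with Statistics and Machine Learning Toolbox.
% Replace X and Y with your own 16-D predictors and 12-class labels.
load fisheriris
X = meas;
Y = species;

rng(1);                                    % reproducible hold-out split
cvp = cvpartition(Y, 'HoldOut', 0.3);
Xtrain = X(training(cvp), :);  Ytrain = Y(training(cvp));
Xtest  = X(test(cvp), :);      Ytest  = Y(test(cvp));

% Same SVM template for both models (settings here are assumptions).
t = templateSVM('KernelFunction', 'rbf', 'KernelScale', 'auto', ...
                'Standardize', true);

% Model 1: raw SVM scores. Model 2: scores mapped to posterior probabilities.
MdlScore = fitcecoc(Xtrain, Ytrain, 'Learners', t, 'Coding', 'onevsone');
MdlPost  = fitcecoc(Xtrain, Ytrain, 'Learners', t, 'Coding', 'onevsone', ...
                    'FitPosterior', true);

% The default binary loss used for decoding may differ between the models.
disp(MdlScore.BinaryLoss)
disp(MdlPost.BinaryLoss)

% Labels from each model; the 4th output is only available with FitPosterior.
labelScore = predict(MdlScore, Xtest);
[labelPost, negLoss, ~, postKL] = predict(MdlPost, Xtest, ...
    'PosteriorMethod', 'kl', 'NumKLInitializations', 5);

% Uncomment if Optimization Toolbox is installed:
% [~, ~, ~, postQP] = predict(MdlPost, Xtest, 'PosteriorMethod', 'qp');

% Confusion matrices show which class pairs each model confuses.
confScore = confusionmat(Ytest, labelScore);
confPost  = confusionmat(Ytest, labelPost);

% Check how often the predicted label matches the largest posterior.
[~, idx] = max(postKL, [], 2);
labelFromPost = MdlPost.ClassNames(idx);
agreement = mean(strcmp(labelFromPost, labelPost));
fprintf('Label vs. max-posterior agreement: %.2f\n', agreement);
```

If the agreement is noticeably below 1 on your data, the odd misclassifications may come from the loss-based decoding step rather than from the posterior estimates themselves. In that case, passing the same 'BinaryLoss' value explicitly to predict for both models would be one way to separate the two effects.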