Confusion Matrix of cross validation of an ECOC SVM classifier
Hi, I am using MATLAB 2015 with the Statistics and Machine Learning Toolbox. I want to run a 10-fold cross-validation for an ECOC SVM classifier with 19 classes, and I want to report the test result as a confusion matrix. I used the following piece of code:
template = templateSVM('KernelFunction', 'rbf', 'KernelScale', 'auto', 'BoxConstraint', 1, 'Standardize', 1);
trainedClassifier = fitcecoc(Features, Labels, 'Learners', template, 'Coding', 'onevsone');
partModel = crossval(trainedClassifier, 'KFold', 10);
Accuracy = 1 - kfoldLoss(partModel, 'LossFun', 'ClassifError');
[validationPredictions, validationScores] = kfoldPredict(partModel);
Is confmat the average confusion matrix over the 10 folds that are held out during cross-validation? In other words, should I use this confusion matrix as the test performance of the classifier? If not, how should I test the performance of the classifier? Thanks in advance, Mehrdad
Daniel on 23 Jun 2016
Edited: Daniel on 23 Jun 2016
Yes. If you examine the cross-validated model returned by crossval, you will see a property called Trained. It contains k classifiers, where k is the number of folds you specified. Each classifier is trained on its own training partition, which is defined by the Partition property of the cross-validated model. The kfoldPredict function collects the held-out predictions from all k folds into the validationPredictions variable, so your confusion matrix shows the classification results for all of your data.
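To make the structure described above concrete, here is a minimal sketch of how one might inspect the cross-validated model. It assumes the fisheriris sample data (which ships with the Statistics and Machine Learning Toolbox) as a stand-in for the asker's Features and Labels; the names ecocModel and heldOutCount are illustrative and do not come from the original post.

```matlab
% Stand-in data (assumption): replace with your own Features and Labels.
load fisheriris
Features = meas;      % 150-by-4 numeric matrix
Labels   = species;   % 150-by-1 cell array of class names

rng(1);               % make the random fold assignment reproducible
template  = templateSVM('KernelFunction', 'rbf', 'KernelScale', 'auto', ...
                        'BoxConstraint', 1, 'Standardize', 1);
ecocModel = fitcecoc(Features, Labels, 'Learners', template, 'Coding', 'onevsone');
partModel = crossval(ecocModel, 'KFold', 10);

numFolds = partModel.KFold;        % number of folds requested
disp(numel(partModel.Trained));    % one compact classifier per fold

% Count how many times each observation lands in a held-out (test) fold.
heldOutCount = zeros(size(Labels, 1), 1);
for k = 1:numFolds
    heldOutCount = heldOutCount + test(partModel.Partition, k);
end
disp(all(heldOutCount == 1));      % every observation is held out exactly once
```

Because every row is held out exactly once, kfoldPredict can return one out-of-fold prediction per observation.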
For nice plots, use the plotconfusion function. If your labels are non-numeric, I have had good luck with the heatmap plotting toolbox, which can be found on the File Exchange.
Tom Lane on 5 Jul 2015
The confusion matrix is one measure of classifier accuracy. You should supply the confusionmat function with the known labels and the predicted values. It appears you want to supply Labels in place of Features(:,1). Your other argument looks okay: the predicted value for row J is computed using the classifier for which row J is part of the 10% held out, with training done on the other 90%.
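The call suggested above might look like the sketch below. It again uses fisheriris as stand-in data (an assumption, not the asker's 19-class set), and the names ecocModel, confMat, classOrder and pooledAccuracy are illustrative only.

```matlab
% Stand-in data (assumption): replace with your own Features and Labels.
load fisheriris
Features = meas;
Labels   = species;

rng(1);   % reproducible fold assignment
template  = templateSVM('KernelFunction', 'rbf', 'KernelScale', 'auto', ...
                        'BoxConstraint', 1, 'Standardize', 1);
ecocModel = fitcecoc(Features, Labels, 'Learners', template, 'Coding', 'onevsone');
partModel = crossval(ecocModel, 'KFold', 10);

% Out-of-fold prediction for every observation.
validationPredictions = kfoldPredict(partModel);

% Known labels first, predictions second:
% rows are the true classes, columns are the predicted classes.
[confMat, classOrder] = confusionmat(Labels, validationPredictions);

% The matrix pools all folds, so its entries sum to the number of observations.
pooledAccuracy = trace(confMat) / sum(confMat(:));
disp(classOrder);
disp(confMat);
disp(pooledAccuracy);
```

The result is a single pooled matrix of counts rather than an average of ten per-fold matrices. The pooled accuracy computed from it should be close to 1 - kfoldLoss(partModel, 'LossFun', 'ClassifError'), which by default averages the loss over the folds.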