In the following code, Why is the classification accuracy (acc1, acc2) calculated differently?
Show older comments
The following function (knn_test) takes the following parameters: X indicates dataset sampels, Y indicates dataset labels, and filterindex corresponds column filter. If I want to filter the data, filterIndex is set to 1, whichever column is to be filtered.
I want to validate two holdout method. One with crossval function, and the other with cvpartition function. But when I call this method, acc1 and acc2 variables show different values.
I added break point to the code and debugged the code. I examined CVKNNModels' partition property and it was the same with the c particion in the Model 2.
What could have gone wrong with the following code?
Why did these two accuracy variables take different values?
If I want to use this function for holdout classification, which model should I choose?
Thanks.
function [acc1,acc2]=knn_test(X,Y,filterIndex)
columnfilterIndex = find(filterIndex==1);
%Model 1
tra1Data = X(:,[columnfilterIndex]);
tra1Label=Y;
KNNModel1=fitcknn(tra1Data, tra1Label, 'Distance', 'Euclidean', 'NumNeighbors', 3, 'DistanceWeight', 'Equal', 'Standardize', true);
rng('default');
CVKNNModel = crossval(KNNModel1,'holdout',0.3);
loss=kfoldLoss(CVKNNModel);
acc1=1-loss;
%Model 2
rng('default');
c = cvpartition(Y,'HoldOut',0.3);
tra2Data=X(c.training,[columnfilterIndex]);
tra2Label=Y(c.training,:);
test2Data=X(c.test,[columnfilterIndex]);
test2Label=Y(c.test,:);
KNNModel2 = fitcknn(tra2Data,tra2Label,'Distance', 'Euclidean','NumNeighbors',3, 'DistanceWeight', 'Equal','Standardize', true);
pre_test = predict(KNNModel2,test2Data);
correctPredictions = (pre_test == test2Label);
acc2 = sum(correctPredictions)/length(correctPredictions);
%perf=classperf(uint8(test2Label),uint8(pre_test));
%acc2=perf.CorrectRate;
2 Comments
michio
on 12 Nov 2019
Could you provide a script that can reproduce the issue? I run the following and acc1 and acc2 are the same.
load ionosphere
[acc1,acc2]=knn_test(X,Y,ones(size(X,2),1))
where (note the line: correctPredictions = (string(pre_test) == string(test2Label)); to avoid error)
function [acc1,acc2]=knn_test(X,Y,filterIndex)
columnfilterIndex = find(filterIndex==1);
%Model 1
tra1Data = X(:,[columnfilterIndex]);
tra1Label=Y;
KNNModel1=fitcknn(tra1Data, tra1Label, 'Distance', 'Euclidean', 'NumNeighbors', 3, 'DistanceWeight', 'Equal', 'Standardize', true);
rng('default');
CVKNNModel = crossval(KNNModel1,'holdout',0.3);
loss=kfoldLoss(CVKNNModel);
acc1=1-loss;
%Model 2
rng('default');
c = cvpartition(Y,'HoldOut',0.3);
tra2Data=X(c.training,[columnfilterIndex]);
tra2Label=Y(c.training,:);
test2Data=X(c.test,[columnfilterIndex]);
test2Label=Y(c.test,:);
KNNModel2 = fitcknn(tra2Data,tra2Label,'Distance', 'Euclidean','NumNeighbors',3, 'DistanceWeight', 'Equal','Standardize', true);
pre_test = predict(KNNModel2,test2Data);
% correctPredictions = (pre_test == test2Label);
correctPredictions = (string(pre_test) == string(test2Label));
acc2 = sum(correctPredictions)/length(correctPredictions);
%perf=classperf(uint8(test2Label),uint8(pre_test));
%acc2=perf.CorrectRate;
Accepted Answer
More Answers (1)
Y. K.
on 12 Nov 2019
0 votes
Categories
Find more on Classification Trees in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!