Why does MATLAB produce different results with the KNN classifier for the same dataset?

7 views (last 30 days)
Hi everyone, I use MATLAB to code some metaheuristic algorithms for feature selection. Before MATLAB, I mostly used Weka.
To satisfy my curiosity, I selected all features and sent them to the KNN classifier to compare the results against Weka. Let's say we have 50 features in the dataset and all of them are selected. I ran the code 10 times, and here is the error rate KNN produces in MATLAB with K=3 and 10-fold cross-validation:
1) 0.24765
2) 0.22571
3) 0.23197
4) 0.26019
5) 0.23511
6) 0.23511
7) 0.25078
8) 0.23197
9) 0.23511
10) 0.24138
Here is what Weka produces for every run: 0.26333
Why does MATLAB produce different results on each run, and why are they different from Weka's? I used the same dataset with the same features and the same parameters (K=3 and 10-fold). I am confused. Here is the code snippet where the error rate from KNN is generated:
function errorrate = jFitnessFunction(feat,label,X)
% feat:  feature matrix
% label: class labels
% X:     selected features' indexes (binary vector)
k = 3;      % k-value of KNN
kfold = 10; % number of cross-validation folds
errorrate = jwrapperKNN(feat(:,X==1),label,k,kfold);
end

% Perform KNN with k-fold cross-validation
function ER = jwrapperKNN(feat,label,k,kfold)
Model = fitcknn(feat,label,'NumNeighbors',k,'Distance','euclidean');
C = crossval(Model,'KFold',kfold);
% Error rate
ER = kfoldLoss(C);
end

Answers (1)

Walter Roberson on 7 Dec 2020
K-fold cross-validation is random.
crossval partitions your data into 10 folds at random on every call, so each run trains and tests on different splits and reports a slightly different error rate. Weka, by contrast, seeds its cross-validation shuffle with a fixed default random seed, so its folds, and therefore its result, are identical on every run. To get repeatable numbers in MATLAB, fix the random number generator seed (rng) or pass an explicit cvpartition to crossval.
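A minimal sketch of how to make the cross-validation reproducible (assuming your feat matrix and label vector from the question; the variable names are illustrative):

```matlab
% Reproducible 10-fold cross-validation for KNN (sketch).
rng(1);                                    % fix the global random seed
cvp = cvpartition(label,'KFold',10);       % one fixed, stratified partition
Model = fitcknn(feat,label,'NumNeighbors',3,'Distance','euclidean');
C = crossval(Model,'CVPartition',cvp);     % reuse the same folds on every run
ER = kfoldLoss(C);                         % identical error rate each time
```

With a fixed seed or a saved cvpartition, repeated runs return the same error rate; the remaining gap versus Weka comes from the two tools drawing different (but equally valid) fold assignments.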
