How to resolve the loss function error?
I am trying to use the MATLAB code below.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
xlabel('Number of trees')
ylabel('Test classification error')
But it gives the following errors:
Error using classreg.learning.internal.classCount
You passed an unknown class '19.746' of type double.
Error in classreg.learning.classif.ClassificationModel/cleanRows (line 271)
C = classreg.learning.internal.classCount(this.ClassSummary.ClassNames,Y);
Error in classreg.learning.classif.ClassificationModel/prepareDataForLoss (line 364)
[X,C,W,Y,rowData] = cleanRows(this,X,Y,W,rowData,obsInRows);
Error in classreg.learning.classif.CompactClassificationEnsemble/loss (line 388)
[X,C,W] = prepareDataForLoss(this,X,Y,W,[],true,true);
Error in Test_RF_Classify (line 25)
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
I request you to kindly suggest how I can resolve it. I am attaching the code along with the input file.
I will appreciate your kind help.
Sanchit
Answers (1)
the cyclist
on 25 Jul 2023
Edited: the cyclist
on 25 Jul 2023
There are a number of potential errors with your code, but I would say that the most fundamental one is that you are using a classification algorithm, but you have a numerical response variable.
Classification algorithms are used to predict categorical or nominal variables (e.g. "Prefers to watch Barbie" vs. "Prefers to watch Oppenheimer").
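[The specific MATLAB error arises because the value 19.746 appears in Ytest but was not one of the classes the model saw in Ytrain. loss only accepts labels that are listed in the model's ClassNames, and with a continuous response almost every test value is a "class" the model has never seen.]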
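One way to confirm this diagnosis is to list the test labels that never occur in the training labels. The following is only a sketch on made-up data (the variable names and the random response are assumptions, not the data from the question). With a trained model, bag.ClassNames would play the role of unique(Ytrain).

```matlab
% Hypothetical stand-in for a continuous response like the one in the question.
rng(1);                                   % reproducible random numbers
Y = round(20 * rand(100, 1), 3);          % nearly every value is its own "class"

% Simple 70/30 split of the labels.
idx    = randperm(100);
Ytrain = Y(idx(1:70));
Ytest  = Y(idx(71:end));

% Test labels that never occurred in the training labels.
% A classifier's loss() rejects such labels with "You passed an unknown class ...".
unknownLabels = setdiff(unique(Ytest), unique(Ytrain));
fprintf('%d of %d distinct test labels are unknown to the model\n', ...
        numel(unknownLabels), numel(unique(Ytest)));
```

If the count is close to the number of distinct test labels, the response is effectively continuous and a classifier is the wrong tool.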
You might try to use fitrensemble to fit a regression model instead. Your code will actually run to completion if you do so. But I did not look at your code enough to see if the results would be sensible.
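As a rough illustration of the regression route, the sketch below fits a bagged regression ensemble and plots the cumulative test loss. The data are synthetic and the variable names are assumptions; substitute the predictors and response from your own file. It requires Statistics and Machine Learning Toolbox, like the original code.

```matlab
% Hypothetical stand-in data; replace with the predictors and response from your file.
rng(1);                                        % reproducible data and split
X = rand(200, 4);
Y = 10*X(:,1) + 5*X(:,2).^2 + randn(200, 1);   % continuous response

% Hold out 30% of the rows for testing.
c      = cvpartition(size(X, 1), 'HoldOut', 0.3);
Xtrain = X(training(c), :);
Ytrain = Y(training(c));
Xtest  = X(test(c), :);
Ytest  = Y(test(c));

% Bagged regression ensemble of 200 trees.
t   = templateTree('Reproducible', true);
bag = fitrensemble(Xtrain, Ytrain, 'Method', 'Bag', ...
                   'NumLearningCycles', 200, 'Learners', t);

% For a regression ensemble, loss() returns mean squared error by default,
% not a misclassification rate.
testMSE = loss(bag, Xtest, Ytest, 'Mode', 'cumulative');

figure
plot(testMSE)
xlabel('Number of trees')
ylabel('Test mean squared error')
```

Note that the y-axis is now a mean squared error in the squared units of Y, so it is not directly comparable to a classification error between 0 and 1.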
6 Comments
Sanchit
on 26 Jul 2023
the cyclist
on 26 Jul 2023
That's not really a MATLAB question, right?
You would need to explain the "rule" for which values of Y should become 1, which should become 2, and which should become 3.
Also, be aware that some fitcensemble methods (e.g. 'AdaBoostM1', 'LogitBoost', 'GentleBoost') only do binary classification (i.e. two categories). 'Method','Bag' does handle three categories.
Is this a school assignment? It seems like the data are not actually very meaningful.
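To make the idea of a "rule" concrete, here is a minimal sketch that maps a continuous Y to three classes. The cut points (33rd and 67th percentiles) and the class names are purely assumed for illustration; the real thresholds have to come from the problem itself.

```matlab
% Hypothetical continuous response; replace with the last column of your data.
rng(1);
Y = 25 * rand(300, 1);

% Assumed rule (for illustration only): cut at the 33rd and 67th percentiles.
cuts  = prctile(Y, [33 67]);
edges = [-Inf; cuts(:); Inf];

% discretize returns the bin index (1, 2, or 3) of each value.
Yclass = discretize(Y, edges);

% Optional: make the labels explicitly categorical before training a classifier.
Yclass = categorical(Yclass, 1:3, {'low', 'medium', 'high'});
countcats(Yclass)
```

Because the rule is applied to the whole of Y before splitting, every class label exists independently of which rows end up in the training or test set.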
Sanchit
on 26 Jul 2023
data = readmatrix('sample.csv'); % I used readmatrix instead of readtable
X = data(:, 1:end-1);
Y = data(:, end);
% Redefine Y as requested: 1 = at or below the 90th percentile, 2 = above it
Y90 = prctile(Y,90);
% Compute both masks before relabeling, so the new labels are never compared against Y90
isLow = Y <= Y90;
isHigh = Y > Y90;
Y(isLow) = 1;
Y(isHigh) = 2;
rng(1); % For reproducibility; set before randsample so the split is repeatable
numGroundTruth = numel(Y);
numTrainingSamples = round(0.7 * numGroundTruth);
trainingIndexes = randsample(numGroundTruth, numTrainingSamples);
testIndexes = setdiff((1:numGroundTruth)', trainingIndexes);
Xtrain = X(trainingIndexes, :);
Xtest = X(testIndexes, :);
Ytrain = Y(trainingIndexes, :);
Ytest = Y(testIndexes, :);
%Create a bagged classification ensemble of 200 trees from the training data.
t = templateTree('Reproducible',true); % For reproducibility of random predictor selections
bag = fitcensemble(Xtrain,Ytrain,'Method','Bag','NumLearningCycles',200,'Learners',t)
%Plot the loss (misclassification) of the test data as a function of the number of trained trees in the ensemble.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
xlabel('Number of trees')
ylabel('Test classification error')
%Cross Validation, Generate a five-fold cross-validated bagged ensemble.
cv = fitcensemble(X,Y,'Method','Bag','NumLearningCycles',200,'Kfold',5,'Learners',t)
%Examine the cross-validation loss as a function of the number of trees in the ensemble.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
hold on
plot(kfoldLoss(cv,'mode','cumulative'),'r.')
hold off
xlabel('Number of trees')
ylabel('Classification error')
legend('Test','Cross-validation','Location','NE')
%Out-of-Bag Estimates
%Generate the loss curve for out-of-bag estimates, and plot it along with the other curves.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
hold on
plot(kfoldLoss(cv,'mode','cumulative'),'r.')
plot(oobLoss(bag,'mode','cumulative'),'k--')
hold off
xlabel('Number of trees')
ylabel('Classification error')
legend('Test','Cross-validation','Out of bag','Location','NE')
%The out-of-bag estimates are again comparable to those of the other methods.
the cyclist
on 28 Jul 2023
You should open a new question for that, and accept the answer here if you found it helpful.