How to resolve the loss function error?

I am trying to use the MATLAB function below.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
xlabel('Number of trees')
ylabel('Test classification error')
But this gives the following errors:
Error using classreg.learning.internal.classCount
You passed an unknown class '19.746' of type double.
Error in classreg.learning.classif.ClassificationModel/cleanRows (line 271)
C = classreg.learning.internal.classCount(this.ClassSummary.ClassNames,Y);
Error in classreg.learning.classif.ClassificationModel/prepareDataForLoss (line 364)
[X,C,W,Y,rowData] = cleanRows(this,X,Y,W,rowData,obsInRows);
Error in classreg.learning.classif.CompactClassificationEnsemble/loss (line 388)
[X,C,W] = prepareDataForLoss(this,X,Y,W,[],true,true);
Error in Test_RF_Classify (line 25)
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
Could you please suggest how to resolve this? I am attaching the code along with the input file.
I would appreciate your help.
Sanchit

Answers (1)

the cyclist
the cyclist on 25 Jul 2023
Edited: the cyclist on 25 Jul 2023
There are a number of potential errors with your code, but I would say that the most fundamental one is that you are using a classification algorithm, but you have a numerical response variable.
Classification algorithms are used to predict categorical or nominal variables (e.g. "Prefers to watch Barbie" vs. "Prefers to watch Oppenheimer").
[The specific MATLAB error arises because the category '19.746' did not appear in the training set, so it cannot appear in the test set.]
You might try to use fitrensemble to fit a regression model instead. Your code will actually run to completion if you do so. But I did not look at your code enough to see if the results would be sensible.
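To make the suggestion concrete, here is a minimal sketch of the regression route. It assumes Xtrain/Ytrain/Xtest/Ytest splits like the ones shown in the comments below; the variable name mdl is illustrative:

```matlab
% Sketch: fit a regression ensemble instead of a classification one,
% since the response (e.g. 19.746) is continuous.
t = templateTree('Reproducible',true);
mdl = fitrensemble(Xtrain,Ytrain,'Method','Bag','NumLearningCycles',200,'Learners',t);

% For regression ensembles, loss returns mean squared error rather than
% misclassification rate, so the same plotting pattern works.
figure
plot(loss(mdl,Xtest,Ytest,'mode','cumulative'))
xlabel('Number of trees')
ylabel('Test MSE')
```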

6 Comments

Thank you very much for your valuable feedback. How do I convert the dependent variable into the numbers 1, 2, or 3 so that it performs classification?
I appreciate your suggestions.
Sanchit
That's not really a MATLAB question, right?
You would need to explain the "rule" for which values of Y should become 1, which should become 2, and which should become 3.
Also, be aware that fitcensemble with its default 'AdaBoostM1' method only does binary classification (i.e. two categories), not three categories.
Is this a school assignment? It seems like the data are not actually very meaningful.
May I request that you take values greater than or equal to the 90th percentile as 2, and values less than the 90th percentile as 1, so that the problem becomes binary? I wanted to be sure the MATLAB classification code executes smoothly. I am very sorry for the inconvenience, and I would be grateful for your help.
Sanchit
data = readmatrix('sample.csv'); % I used readmatrix instead of readtable
X = data(:, 1:end-1);
Y = data(:, end);
% Redefine Y as requested: values >= 90th percentile become class 2, the rest class 1
Y90 = prctile(Y,90);
isHigh = (Y >= Y90);   % compute the mask before overwriting Y
Y(~isHigh) = 1;
Y(isHigh)  = 2;
rng(1); % For reproducibility (seed before any random sampling)
numGroundTruth = numel(Y);
numTrainingSamples = round(0.7 * numGroundTruth);
trainingIndexes = randsample(numGroundTruth, numTrainingSamples);
testIndexes = setdiff((1:numGroundTruth)', trainingIndexes);
Xtrain = X(trainingIndexes, :);
Xtest = X(testIndexes, :);
Ytrain = Y(trainingIndexes, :);
Ytest = Y(testIndexes, :);
%Create a bagged classification ensemble of 200 trees from the training data.
t = templateTree('Reproducible',true); % For reproducibility of random predictor selections
bag = fitcensemble(Xtrain,Ytrain,'Method','Bag','NumLearningCycles',200,'Learners',t)
bag =

  ClassificationBaggedEnsemble
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: [1 2]
           ScoreTransform: 'none'
          NumObservations: 32
               NumTrained: 200
                   Method: 'Bag'
             LearnerNames: {'Tree'}
     ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                  FitInfo: []
       FitInfoDescription: 'None'
                FResample: 1
                  Replace: 1
         UseObsForLearner: [32×200 logical]

  Properties, Methods
%Plot the loss (misclassification) of the test data as a function of the number of trained trees in the ensemble.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
xlabel('Number of trees')
ylabel('Test classification error')
%Cross Validation, Generate a five-fold cross-validated bagged ensemble.
cv = fitcensemble(X,Y,'Method','Bag','NumLearningCycles',200,'Kfold',5,'Learners',t)
cv =

  ClassificationPartitionedEnsemble
    CrossValidatedModel: 'Bag'
         PredictorNames: {'x1' 'x2' 'x3' 'x4' 'x5' 'x6' 'x7' 'x8' 'x9'}
           ResponseName: 'Y'
        NumObservations: 46
                  KFold: 5
              Partition: [1×1 cvpartition]
      NumTrainedPerFold: [200 200 200 200 200]
             ClassNames: [1 2]
         ScoreTransform: 'none'

  Properties, Methods
%Examine the cross-validation loss as a function of the number of trees in the ensemble.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
hold on
plot(kfoldLoss(cv,'mode','cumulative'),'r.')
hold off
xlabel('Number of trees')
ylabel('Classification error')
legend('Test','Cross-validation','Location','NE')
%Out-of-Bag Estimates
%Generate the loss curve for out-of-bag estimates, and plot it along with the other curves.
figure
plot(loss(bag,Xtest,Ytest,'mode','cumulative'))
hold on
plot(kfoldLoss(cv,'mode','cumulative'),'r.')
plot(oobLoss(bag,'mode','cumulative'),'k--')
hold off
xlabel('Number of trees')
ylabel('Classification error')
legend('Test','Cross-validation','Out of bag','Location','NE')
%The out-of-bag estimates are again comparable to those of the other methods.
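Beyond the loss curves above, a sketch of a per-class check on the held-out test set (not part of the original answer; it reuses the bag, Xtest, and Ytest variables defined earlier):

```matlab
% Sketch: evaluate the final ensemble with a confusion matrix and
% overall accuracy on the held-out test set.
Ypred = predict(bag,Xtest);
C = confusionmat(Ytest,Ypred);        % rows = true class, columns = predicted
accuracy = sum(diag(C)) / sum(C(:));  % fraction of correct predictions
fprintf('Test accuracy: %.3f\n',accuracy)
```

With a 90th-percentile split the classes are heavily imbalanced (roughly 9:1), so the confusion matrix is more informative than accuracy alone.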
Sanchit
Sanchit on 28 Jul 2023
Edited: Sanchit on 28 Jul 2023
Thank you very much for your kind help. In another problem, I want to compute the daily mean of 8 variables over Lat × Lon × Time. My netCDF file contains these 8 variables at the 03, 06, 09, and 12 GMT observation times over 9 years of data. I am attaching the MATLAB code with the input file. Please have a look at the code and suggest how to fix it.
I would be grateful for your help.
Sanchit
You should open a new question for that, and accept the answer here if you found it helpful.


Release: R2023a
Asked: 25 Jul 2023
Commented: 28 Jul 2023
