Regression Learner Results don't match predict function results

Hello there,
I used the Regression Learner app to find the best model for my data set. The Fine Tree model performed best, so I used the export option and tried using the model in my script. I tried the following:
1.
Mdl = fitrtree(X_train, Y_train, 'MinLeafSize', 4, 'Surrogate', 'off');
fitted = predict(Mdl, X_test);
or
2.
[Mdl, ~] = trainRegressionModel(X_train, bt_Y_train);
fitted = Mdl.predictFcn(X_test);
The first case uses the training call I pulled out of the function produced by Regression Learner -> Generate Function. The second case uses that generated function as is.
I use my own custom function to calculate model performance (error scores). See below:
function [mse, rmse, r2, mae] = fit_error(tru, fitted)
    s = sum((tru - fitted).^2);
    n = length(tru);
    mse = s/n;
    rmse = sqrt(s/n);
    r2 = 1 - s/sum((tru - mean(tru)).^2);
    mae = sum(abs(fitted - tru))/n;
end
The results I get from the Regression Learner app, for example R2 and RMSE, are 1.0 and 0.22477 respectively. However, when I use the exported function in my code I get an R2 of 0.28 and an RMSE of 4.14. I don't understand why there is such a huge difference in results. Has anyone experienced the same problem?
I use exactly the same data set. I have also tried splitting it differently. I do 5-fold cross-validation both in my script and in Regression Learner. In my script I don't use the built-in cross-validation function; instead I do a circular rotation over 5 even chunks of the data, use 4 for training and 1 for testing, repeat that 5 times, and then average the 5 results.
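For reference, the rotation scheme described above can be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: the variable names X and Y, the chunk boundaries, and the choice of RMSE as the summary metric are all assumptions.

```matlab
% Manual 5-fold cross-validation by circular rotation over even chunks.
% Assumes X (n-by-p predictors) and Y (n-by-1 response) are already loaded.
k = 5;
n = size(X, 1);
edges = round(linspace(0, n, k + 1));    % boundaries of k (nearly) even chunks
rmse = zeros(k, 1);
for i = 1:k
    testIdx  = (edges(i) + 1):edges(i + 1);   % chunk i held out for testing
    trainIdx = setdiff(1:n, testIdx);         % remaining 4 chunks for training
    Mdl = fitrtree(X(trainIdx, :), Y(trainIdx), ...
        'MinLeafSize', 4, 'Surrogate', 'off');
    fitted = predict(Mdl, X(testIdx, :));
    rmse(i) = sqrt(mean((Y(testIdx) - fitted).^2));
end
meanRMSE = mean(rmse);                   % average of the 5 fold results
```

Note that this scheme uses contiguous chunks, whereas the app shuffles observations into folds at random, which alone can change the scores.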
I use the same script with a different data set (from a different sensor) and get results that match the Regression Learner app, which makes me think there is no error in my code.

Answers (1)

KALASH
KALASH on 4 Apr 2024
Hi Alex,
What I understand from your question is that the results from the Regression Learner app and from the generated model used in your script differ.
Since you stated that you do not face this issue with another dataset, sharing your dataset would make it possible to investigate the problem and provide a solution.
Meanwhile here are some pointers to ensure your model and app’s internal implementation is the same:
  • Hyperparameters: Ensure that parameters like 'MinLeafSize' and 'Surrogate' are the same, as even small changes can lead to significant differences.
  • Data splitting: Check if the split is done in the same way. Ensure that the cross-validation technique and the random seed are the same in the app as well as your script.
  • Feature engineering: Check if any feature selection or transformation was done differently in the app and your script.
  • Model evaluation: Although the root mean square error calculation seems right in your custom function, double check if the app’s evaluation is working in the same way.
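To act on the data-splitting and seed pointers above, one option is to replace the manual chunk rotation with cvpartition and a fixed seed, so the fold assignment is randomized the same way the app randomizes it. This is a sketch under assumptions: the variable names X and Y and the seed value 0 are placeholders, and matching the app exactly may still require exporting the validation scheme from the app itself.

```matlab
% Sketch: 5-fold cross-validation with a fixed seed via cvpartition,
% mirroring the app's randomized fold assignment more closely than
% contiguous chunks. X and Y are assumed predictor/response variables.
rng(0);                                          % fix the random seed
cvp = cvpartition(size(X, 1), 'KFold', 5);       % randomized 5-fold split
rmse = zeros(cvp.NumTestSets, 1);
for i = 1:cvp.NumTestSets
    trainIdx = training(cvp, i);                 % logical index of training rows
    testIdx  = test(cvp, i);                     % logical index of test rows
    Mdl = fitrtree(X(trainIdx, :), Y(trainIdx), ...
        'MinLeafSize', 4, 'Surrogate', 'off');
    fitted = predict(Mdl, X(testIdx, :));
    rmse(i) = sqrt(mean((Y(testIdx) - fitted).^2));
end
meanRMSE = mean(rmse);
```

Comparing this meanRMSE against both the app's reported validation RMSE and the manual-rotation result should show whether the discrepancy comes from how the folds are formed.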
Hope this helps!

Release

R2020a