Validating a Model Created by the fitrauto command
Hello there, I am wondering how to perform k-fold cross-validation on the data used to create a model with the "fitrauto" function. I am trying to partition the data as shown below so that I can validate and test the model generated by the AutoML function. Here is what I have so far. I would also like to know how to call the model once it is created and stored in "mdl", and how to use it to make predictions.
rng(1);
ValidationPartitions = cvpartition(height(AllFinalTrials),"KFold",5)
dataValidate = AllFinalTrials(ValidationPartitions.test);
FianlTrainset = AllFinalTrials(ValidationPartitions.training);
Answers (1)
prabhat kumar sharma
on 2 May 2024
Hi Isabelle,
I understand you want to perform k-fold cross-validation on your dataset for creating a model using the fitrauto function in MATLAB, and then to use the model for making predictions.
You can follow the steps below. Note that fitrauto is designed for automated machine learning (AutoML): it automates model selection and hyperparameter tuning.
1. Performing k-Fold Cross-Validation
You've started correctly by partitioning your dataset with cvpartition. However, cvpartition returns an object that describes the folds, not the data itself. For a k-fold partition, you pass the fold number to training and test to get logical row indices for that fold. A table must also be indexed with both row and column subscripts, as in AllFinalTrials(idx, :). You'll need to loop over the folds to train and validate your model.
You can refer to this documentation to understand cvpartition in detail: https://www.mathworks.com/help/stats/cvpartition.html
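To see what the partition object holds before using it for training, here is a minimal sketch on a small synthetic table. The table T and its variable names x1, x2, and y are hypothetical stand-ins for AllFinalTrials, not your actual data.

```matlab
% Synthetic stand-in for AllFinalTrials (hypothetical data, 20 rows)
rng(1);
T = table(rand(20,1), rand(20,1), rand(20,1), ...
    'VariableNames', {'x1','x2','y'});

cv = cvpartition(height(T), "KFold", 5);

% training(cv,i) and test(cv,i) return logical column vectors with
% one element per row of T, selecting the rows of the i-th fold
for i = 1:cv.NumTestSets
    trainIdx = training(cv, i);
    testIdx  = test(cv, i);
    fprintf('Fold %d: %d training rows, %d test rows\n', ...
        i, nnz(trainIdx), nnz(testIdx));
end
```

Within a fold, every row is in either the training set or the test set, so the two index vectors are complements of each other.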
Here's reference code showing how to structure the loop:
rng(1); % For reproducible partitions
k = 5;
cv = cvpartition(height(AllFinalTrials), "KFold", k);
% Assuming the response (target) variable is the last column of the table
responseName = AllFinalTrials.Properties.VariableNames{end};
foldRMSE = zeros(k, 1);
for i = 1:k
    % Logical row indices for the training and test sets of the i-th fold
    trainIdx = training(cv, i);
    testIdx = test(cv, i);
    % Create the training and test tables for the i-th fold
    TrainSet = AllFinalTrials(trainIdx, :);
    TestSet = AllFinalTrials(testIdx, :);
    % Train the model on the training set of the i-th fold
    % (fitrauto needs to be told which variable is the response)
    mdl = fitrauto(TrainSet, responseName);
    % Predict on the held-out fold; predict picks the predictor
    % variables out of TestSet by name
    predictions = predict(mdl, TestSet);
    % Extract the true responses as a numeric vector rather than a table
    YTest = TestSet.(responseName);
    % Calculate a performance metric for this fold,
    % here the root mean squared error (RMSE)
    foldRMSE(i) = sqrt(mean((predictions - YTest).^2));
    fprintf('Fold %d, RMSE: %.4f\n', i, foldRMSE(i));
end
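Note that the loop overwrites mdl on every iteration, so after the loop mdl is only the model from the last fold. One common pattern is to keep the per-fold RMSE values, report their mean as the cross-validated estimate, and then fit a final model on all rows. The sketch below shows this on synthetic data. The table T, the variable names, the restriction to tree learners, and the small evaluation budget in opts are all assumptions made so the example runs quickly, not recommended settings.

```matlab
% Synthetic stand-in for AllFinalTrials (hypothetical data)
rng(1);
n = 100;
x1 = rand(n,1);
x2 = rand(n,1);
T = table(x1, x2, 3*x1 - 2*x2 + 0.1*randn(n,1), ...
    'VariableNames', {'x1','x2','y'});

k = 5;
cv = cvpartition(n, "KFold", k);
foldRMSE = zeros(k, 1);

% Small search budget so the sketch finishes quickly (assumed settings)
opts = struct('MaxObjectiveEvaluations', 5, ...
    'ShowPlots', false, 'Verbose', 0);

for i = 1:k
    trainSet = T(training(cv, i), :);
    testSet  = T(test(cv, i), :);
    foldMdl = fitrauto(trainSet, 'y', 'Learners', 'tree', ...
        'HyperparameterOptimizationOptions', opts);
    residuals = predict(foldMdl, testSet) - testSet.y;
    foldRMSE(i) = sqrt(mean(residuals.^2));
end
fprintf('Mean RMSE over %d folds: %.4f\n', k, mean(foldRMSE));

% Final model trained on all rows, to be used for new predictions
finalMdl = fitrauto(T, 'y', 'Learners', 'tree', ...
    'HyperparameterOptimizationOptions', opts);
```

The mean RMSE then estimates how the whole fitrauto procedure generalizes, while finalMdl is the model you would keep for predicting on new data.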
2. Using the Model for Making Predictions
After training your model using fitrauto, you can make predictions on new data using the predict function.
Here's a simple example, assuming mdl is your trained model and newData is the new data you want predictions for (the target variable is not needed):
% newData should be a table containing the same predictor variables (same names) as the training data
predictions = predict(mdl, newData);
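As a self-contained illustration of that call, the following sketch trains a model on a synthetic table and predicts for two new observations. The data, the variable names x1, x2, and y, and the reduced search settings are assumptions chosen only to keep the example short and fast.

```matlab
% Synthetic training table (hypothetical data)
rng(1);
n = 100;
x1 = rand(n,1);
x2 = rand(n,1);
y = 3*x1 - 2*x2 + 0.1*randn(n,1);
trainTbl = table(x1, x2, y);

% Small search budget so the sketch finishes quickly (assumed settings)
opts = struct('MaxObjectiveEvaluations', 5, ...
    'ShowPlots', false, 'Verbose', 0);
mdl = fitrauto(trainTbl, 'y', 'Learners', 'tree', ...
    'HyperparameterOptimizationOptions', opts);

% New observations: same predictor names, no response column needed
newData = table([0.2; 0.8], [0.5; 0.1], 'VariableNames', {'x1','x2'});
predictions = predict(mdl, newData);
disp(predictions)
```

predict returns one predicted response per row of newData, so here predictions is a 2-by-1 numeric vector.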
I hope it helps!