How to split a dataset into 2/3 for training and 1/3 for testing, and plot the results?
This is my code, but I get an error and cannot get the correct answer.
Can you help me, please?
clear all, close all, clc
load hald; % Load Portland Cement dataset
A = ingredients;
b = heat;
N=13; %number of row
idx=1:13;
PD=2/3;
%split data for training and testing
Ptrain=idx(1:round(PD*N));Ttrain=idx(1:round(PD*N));
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:);
dataPTrain=hald(Ptrain);
dataPTest=hald(Ptest);
[U,S,V] = svd(A,'econ');
x = V*inv(S)*U'*b; % Solve Ax=b using the SVD
plot(dataPTrain,'k','LineWidth',2); hold on % Plot data
plot(dataPTest,'r-o','LineWidth',1.,'MarkerSize',2); % Plot regression
l1 = legend('Heat data','Regression')
%% Alternative 1 (regress)
x = regress(b,A);
%% Alternative 2 (pinv)
x = pinv(A)*b;
Answers (2)
Sulaymon Eshkabilov
on 15 Jan 2023
You should use a random partition of your total data set, e.g.:
rng("default"); % For reproducibility
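n = size(X,1); % Number of observations (rows of X)
C = cvpartition(n, "HoldOut", 0.35); % "HoldOut" takes the test fraction: 35% for testing, the remaining 65% for training
INDEXtrain = training(C); % Logical index, true for training rows
INDEXtest = ~INDEXtrain; % Logical index, true for test rows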
X_test = X(INDEXtest,:);
Y_test = Y(INDEXtest,:);
X_train = X(INDEXtrain,:);
Y_train = Y(INDEXtrain,:);
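For the hald data in the question, a random split along these lines might look like the sketch below. It assumes X = ingredients and Y = heat (my reading of the question, not stated in the answer), uses a hold-out fraction of 1/3 to match the requested 2/3 and 1/3 split, and fits a plain least-squares model without an intercept, as the original code does. The names idxTrain, idxTest, Ypred and rmseTest are only illustrative.

```matlab
load hald;                                   % provides ingredients (13x4) and heat (13x1)
X = ingredients;                             % assumed predictors
Y = heat;                                    % assumed response

rng("default");                              % for reproducibility
C = cvpartition(size(X,1), "HoldOut", 1/3);  % hold out about 1/3 of the rows for testing
idxTrain = training(C);                      % logical index, true for training rows
idxTest  = test(C);                          % logical index, true for test rows

x = X(idxTrain,:) \ Y(idxTrain);             % least-squares fit on the training rows only
Ypred = X(idxTest,:) * x;                    % predictions for the held-out rows
rmseTest = sqrt(mean((Y(idxTest) - Ypred).^2));  % root-mean-square error on the test rows
```

With only 13 observations the test error will vary a lot from one random split to another, so it may be worth trying several seeds before drawing conclusions.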
Voss
on 15 Jan 2023
This:
Ptest=idx(round(PD*N)+1:end,:);Ttest=idx(round(PD*N)+1:end,:)
should be this:
Ptest=idx(round(PD*N)+1:end);Ttest=idx(round(PD*N)+1:end)
because idx is a row vector, and the way you had it was trying to index beyond row 1 (the only row it has) of idx.
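Putting that fix together with the rest of the question, one possible end-to-end version is sketched below. It is only an illustration under a few assumptions: the first 2/3 of the rows are used for training (a sequential split, as in the question), the model is fitted on the training rows only, and the rows of ingredients and heat are indexed directly instead of indexing hald. The names nTrain, idxTrain, idxTest and bPred are made up for this sketch.

```matlab
load hald;                        % provides ingredients (13x4) and heat (13x1)
A = ingredients;
b = heat;
N = size(A,1);                    % number of observations (13)
nTrain = round(2/3*N);            % 9 rows for training
idxTrain = 1:nTrain;              % row vector, so index it with a single subscript
idxTest  = nTrain+1:N;            % remaining 4 rows for testing

% Fit on the training rows only, using the economy-size SVD
[U,S,V] = svd(A(idxTrain,:),'econ');
x = V*(S\(U'*b(idxTrain)));       % least-squares solution of A_train*x = b_train

% Predict the heat for the held-out rows
bPred = A(idxTest,:)*x;

figure
plot(1:N, b, 'k', 'LineWidth', 2); hold on                 % measured heat, all rows
plot(idxTrain, A(idxTrain,:)*x, 'b--o', 'LineWidth', 1);   % fit on the training rows
plot(idxTest, bPred, 'r-o', 'LineWidth', 1);               % prediction on the test rows
legend('Heat data','Training fit','Test prediction')
xlabel('Observation'); ylabel('Heat')
```

Using S\(U'*b) instead of inv(S)*U'*b gives the same solution here while avoiding an explicit matrix inverse.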