Randomly split data set and make sure that a sample of each class is included

Hello,
I am randomly splitting my data set into training, validation and test sets as shown below:
% Random split Complete Data Set
Q = size(completeRDS,1);
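Q1 = floor(Q*0.75);          % 75% of the samples for training
Q2 = floor((Q-Q1)/2);        % half of the remainder for testing
Q3 = Q-Q1-Q2;                % whatever is left for validation
ind = randperm(Q);
ind1 = ind(1:Q1);
ind2 = ind(Q1+(1:Q2));
ind3 = ind(Q1+Q2+(1:Q3));    % offset by Q1+Q2 so validation does not overlap training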
% Complete Data Set
trainData = completeRDS(ind1,1);
trainSeqLabels = seqLabels(ind1,1);
valData = completeRDS(ind3,1);
valLabels = seqLabels(ind3,1);
testData = completeRDS(ind2,1);
testLabels = seqLabels(ind2,1);
I have just realised that my test set does not include a sample from every class that I am trying to classify: 2 of the classes are missing.
How can I split my data set and make sure that the test set includes at least 1 sample of each class?
Thanks in advance

Answers (1)

MaryD on 8 Jan 2020
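If those are images, I think the best way is to load your data together with the class labels (for example into an imageDatastore) and then use the splitEachLabel(___,'randomized') function, which splits every label by the same proportion instead of splitting the data set as a whole.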
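If the data are not images (the variable names in the question suggest sequences), the same per-class idea can be sketched by hand in base MATLAB. This is only a rough sketch under assumptions: seqLabels is taken to be an N-by-1 categorical array and completeRDS an N-by-1 cell array, the toy data at the top is made up so the snippet runs on its own, and the 12.5% test and validation fractions mirror the 75/12.5/12.5 split in the question.

```matlab
% Toy stand-ins for the asker's variables (assumed shapes: N-by-1).
rng(0);                                          % reproducible shuffle
seqLabels   = categorical([repmat("a",20,1); repmat("b",8,1); repmat("c",4,1)]);
completeRDS = num2cell((1:numel(seqLabels))');   % placeholder data, one cell per sample

ind1 = zeros(0,1);                               % training indices
ind2 = zeros(0,1);                               % test indices
ind3 = zeros(0,1);                               % validation indices
classes = categories(seqLabels);
for k = 1:numel(classes)
    idx = find(seqLabels == classes{k});         % rows that belong to class k
    idx = idx(randperm(numel(idx)));             % shuffle within the class
    n     = numel(idx);
    nTest = max(1, round(0.125*n));              % force at least one test sample per class
    nVal  = min(round(0.125*n), n - nTest);      % validation share, capped by what is left
    ind2 = [ind2; idx(1:nTest)];
    ind3 = [ind3; idx(nTest+1:nTest+nVal)];
    ind1 = [ind1; idx(nTest+nVal+1:end)];        % the rest goes to training
end

% Same assignments as in the question
trainData      = completeRDS(ind1,1);
trainSeqLabels = seqLabels(ind1,1);
valData        = completeRDS(ind3,1);
valLabels      = seqLabels(ind3,1);
testData       = completeRDS(ind2,1);
testLabels     = seqLabels(ind2,1);
```

Note that a class with only one sample ends up entirely in the test set, so very small classes may need different handling. If the Statistics and Machine Learning Toolbox is available, cvpartition with the labels as grouping variable is another way to get a stratified hold-out split, although it does not by itself guarantee a sample of every tiny class.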

Release

R2019b