How to use a spreadsheet datastore in a classification problem

I have 5 Excel spreadsheets, named material1.xlsx, ... material5.xlsx. Each spreadsheet is placed in its own folder, called Material1 ... Material5.
Each spreadsheet has 200 rows, and each row has 100 columns. That is, each spreadsheet holds 200 samples, and each sample has 100 measurements.
For example, if I read the first spreadsheet with xlsread:
T = xlsread('C:\Users\ernes\OneDrive\Documents\MATLAB\Material1\Material1.xlsx');
So sample 170 is
TM = T(170,:);
and, for this example, size(TM) is 1 x 100.
In a nutshell, I want to classify these 5 materials. I first want to train a network that can do this classification task; the folder names also serve as the labels.
How do I do this using spreadsheetDatastore?
I have only trained networks using imageDatastore, for example with the MNIST images. Here is how the images are loaded into the datastore in that case:
path = fullfile(matlabroot,'toolbox','nnet/nndemos/nndatasets/DigitDataset/');
imds = imageDatastore(path,'IncludeSubfolders',true,'LabelSource','foldernames');

 Accepted Answer

As listed in the doc at https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.spreadsheetdatastore.html, the purpose of spreadsheetDatastore is as follows: "Use a spreadsheetDatastore object to manage large collections of spreadsheet files where the collection does not necessarily fit in memory. You can create a spreadsheetDatastore object using the spreadsheetDatastore function, specify its properties, and then import the data using object functions."
Since you have only 5 spreadsheets which are each 200x100 in size, all of the data from all 5 spreadsheets can fit easily in memory. So, you might just want to load all of the data into a single table, and then do classification experiments with the Classification Learner app from the Statistics and Machine Learning Toolbox. Classification Learner includes the ability to train many types of machine learning models, including neural networks. You could also train neural networks using the Deep Learning Toolbox.
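As a minimal sketch of the "load everything into a single table" idea, the code below reads each spreadsheet with readmatrix and records the folder name as the class label. To keep it runnable anywhere, it first writes stand-in files to a temporary folder; the folder name MaterialsDemo, the random data, and the variable names X, labels and T are all assumptions for illustration. With real data, skip the first loop and point rootFolder at the folder that contains Material1 ... Material5.

```matlab
% Stand-in data (hypothetical): 5 folders, one 200x100 spreadsheet in each.
rootFolder = fullfile(tempdir, 'MaterialsDemo');
for k = 1:5
    name = sprintf('Material%d', k);
    d = fullfile(rootFolder, name);
    if ~isfolder(d), mkdir(d); end
    writematrix(rand(200, 100) + k, fullfile(d, [name '.xlsx']));
end

% Read every spreadsheet and use its folder name as the class label.
X = [];
labels = strings(0, 1);
for k = 1:5
    name = sprintf('Material%d', k);
    M = readmatrix(fullfile(rootFolder, name, [name '.xlsx']));
    X = [X; M];                                             %#ok<AGROW>
    labels = [labels; repmat(string(name), size(M, 1), 1)]; %#ok<AGROW>
end

% One table: 100 measurement columns plus one categorical label column.
T = array2table(X);
T.Material = categorical(labels);
```

A table like T can be selected in Classification Learner, with Material chosen as the response variable.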
Even though your data is small enough to fit in memory, if you still want to use spreadsheetDatastore, the doc page https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.spreadsheetdatastore.html should provide the info that you need.
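If you do want to go the datastore route, here is a rough sketch. Unlike imageDatastore, spreadsheetDatastore has no 'LabelSource' option, so the sketch derives each label from the folder part of the file name returned by read. It assumes the spreadsheets have no header row (hence 'ReadVariableNames', false) and relies on the default ReadSize of 'file', so each read returns one whole spreadsheet. The stand-in files, the folder name MaterialsDemo and the variable names are hypothetical.

```matlab
% Stand-in data (hypothetical): 5 folders, one 200x100 spreadsheet in each.
rootFolder = fullfile(tempdir, 'MaterialsDemo');
for k = 1:5
    name = sprintf('Material%d', k);
    d = fullfile(rootFolder, name);
    if ~isfolder(d), mkdir(d); end
    writematrix(rand(200, 100) + k, fullfile(d, [name '.xlsx']));
end

% One datastore over all subfolders; assumes the files have no header row.
ds = spreadsheetDatastore(rootFolder, 'IncludeSubfolders', true, ...
    'FileExtensions', '.xlsx', 'ReadVariableNames', false);

X = [];
labels = strings(0, 1);
while hasdata(ds)
    [t, info] = read(ds);   % one file per read (default ReadSize is 'file')
    % The name of the folder that holds the file is the class label.
    [~, folderName] = fileparts(fileparts(info.Filename));
    X = [X; table2array(t)];                                      %#ok<AGROW>
    labels = [labels; repmat(string(folderName), height(t), 1)];  %#ok<AGROW>
end
labels = categorical(labels);
```

After the loop, X holds one row per sample and labels(i) is the material for X(i,:).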
If this answer helps you, please remember to accept the answer.

5 Comments

Hi Drew, thank you very much for the speedy response. I like the idea of having all the data in one Excel sheet. However, how would the network know which class the data is coming from? For example, I will then have a 1000 x 100 array, with each row representing a single sample. During training I suppose the rows will be shuffled, so how will the network know which class each row came from?
For tabular data that is input to the Classification Learner app, simply add another column to the table to indicate the class label. Or, you can have a separate vector which indicates the class label.
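On the shuffling concern, a small sketch with synthetic stand-in data (the values and variable names are assumptions, not your measurements): as long as the same permutation is applied to the rows of the data and to the label vector, every row keeps its own label, no matter how the rows are reordered.

```matlab
% Synthetic stand-in: 1000 samples x 100 measurements, 200 per material.
rng(0);
X = rand(1000, 100);
labels = categorical(repelem("Material" + (1:5)', 200));  % 1000x1 labels

% Shuffle rows and labels with the SAME index vector so they stay aligned.
idx = randperm(size(X, 1));
Xs = X(idx, :);
labelsS = labels(idx);

% Simple 80/20 hold-out split of the shuffled data.
nTrain = round(0.8 * numel(idx));
XTrain = Xs(1:nTrain, :);      YTrain = labelsS(1:nTrain);
XTest  = Xs(nTrain+1:end, :);  YTest  = labelsS(nTrain+1:end);
```

If the label is stored as an extra table column instead, shuffling the table rows moves the label along with the measurements automatically.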
For neural networks using dlnetwork, here is an example that puts labels into a separate vector, labelsTest: openExample('nnet/MakePredictionsUsingDlnetworkObjectExample')
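For a quick start that is simpler than the dlnetwork example, here is a sketch that uses trainNetwork with a classificationLayer instead (it needs Deep Learning Toolbox). The data is synthetic and shifted per class so there is something to learn; the layer sizes and training options are arbitrary assumptions, not tuned values. The labels are passed as the second argument to trainNetwork, which is how the network knows the class of each row.

```matlab
% Synthetic stand-in: 5 classes, 200 samples each, 100 features per sample.
rng(0);
numFeatures = 100;
numClasses = 5;
classId = repelem((1:numClasses)', 200);       % 1000x1 class index
X = randn(1000, numFeatures) + classId;        % shift each class (made-up data)
Y = categorical("Material" + classId);         % 1000x1 categorical labels

% Small fully connected network for feature (non-image) input.
layers = [
    featureInputLayer(numFeatures, 'Normalization', 'zscore')
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', 'MaxEpochs', 20, 'MiniBatchSize', 64, ...
    'Shuffle', 'every-epoch', 'Verbose', false);

% Row i of X is paired with Y(i); shuffling happens inside training.
net = trainNetwork(X, Y, layers, options);

% Predict the material for one 1x100 sample.
predicted = classify(net, X(170, :));
```

Here predicted is a categorical value such as Material1, which matches the "input one row, get the material back" behaviour asked about in the question.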
Thank you Drew,
Appreciated. Let me get to it and I will give you feedback on how it goes.
Let me attach sample data to help visualize my problem. I cannot wrap my head around how to include categorical data, so I will use 30 samples, with ten samples from each material.
So in short, the entries of Y(1,x1:end) for each material are measurements at energy levels x1 to x10. In this spreadsheet, ten measurements, Y(1,:) to Y(10,:), were made for each material.
So my input to the NN is Y(k,:), and after training, when I input Y(k,:), I want the network to predict which material it is, e.g. material1 or material2, or simply 1 for material 1, 2 for material 2, etc.
Here I am not sure my neural network knows which values are the labels.
@ernest modise comments:
I really need this help. I have searched YouTube, but I only see meaningful applications to image-based problems.

Release

R2023b
