
Creating and using Datastore for LSTM time sequence data

I have more than 10,000 time-sequence data files, each stored as a separate CSV file. Each file holds one sample: measurements of 6300 features at 5 time steps. Each column holds the measurements of one feature, and each row is one time step. The labels are stored sequentially in a separate file.
-0.7 -1.7 -5.09 -4.79 ....
-0.7 -1.7 -5.09 -4.79 ....
-1.06 -1.59 -5.08 -4.76 .....
-1.42 -1.86 -5.61 -4.86 ....
-1.34 -2.01 -5.1 -4.62 .....
numFeatures = 6300;
numHiddenUnits = 100;
numClasses = 3;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
options = trainingOptions('adam', ...
    'MiniBatchSize',20, ...
    'MaxEpochs',10, ...
    'Shuffle','once', ...
    'GradientThreshold',0.001, ...
    'Verbose',1, ...
    'Plots','training-progress');
I want to use this data for LSTM classification, but I cannot load all of it into memory for training.
MATLAB expects the training data as a cell array with one cell per sequence sample.
So how can I load the files and train the network using a datastore for such a large data set?

Accepted Answer

Angelo Yeo on 11 Feb 2024
tabularTextDatastore can manage a large set of CSV files. To quote from the documentation:
Use a TabularTextDatastore object to manage large collections of text files containing column-oriented or tabular data where the collection does not necessarily fit in memory.
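As a rough illustration of the answer above, the sketch below points a tabularTextDatastore at a folder of CSV files and reads one whole file, that is, one sequence sample, per call. The temporary folder, the sample_%05d.csv file names, and the reduced feature count are assumptions made so that the example runs on its own; none of them come from the question. The transpose is there because each file in the question is timeSteps-by-features, while sequenceInputLayer expects features-by-timeSteps.

```matlab
% Hypothetical setup: write a few small CSV files so the sketch is self-contained.
dataFolder = tempname;
mkdir(dataFolder);
numFeatures = 8;      % 6300 in the question; reduced here to keep the demo small
numTimeSteps = 5;
numFiles = 6;
for k = 1:numFiles
    writematrix(randn(numTimeSteps, numFeatures), ...
        fullfile(dataFolder, sprintf('sample_%05d.csv', k)));
end

% 'ReadSize','file' makes each read return one complete file as a table.
ttds = tabularTextDatastore(dataFolder, ...
    'FileExtensions', '.csv', ...
    'ReadVariableNames', false, ...
    'ReadSize', 'file');

% Convert the table to a numeric matrix, transpose it to
% features-by-timeSteps, and wrap it in a cell (one cell per sequence).
seqDs = transform(ttds, @(t) {table2array(t).'});

sample = read(seqDs);
disp(size(sample{1}))   % size of one sequence: features-by-timeSteps
```

Only one file is held in memory per read, so the full collection never has to fit in memory at once.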
  1 Comment
Narayan on 12 Feb 2024
Thank you, Mr. Angelo.
1. I would like to know how the datastore can be used for training so that the mini-batches are selected automatically.
2. What is the best way to store the labels for the training data for classification, so that there is no mismatch between samples and labels when the data is shuffled?
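One possible way to address both points is sketched below; it is not the approach given in the accepted answer. It uses a fileDatastore whose read function returns the sequence together with its label, and the label is looked up by file name, so shuffling the files cannot separate a sample from its label. When a datastore is passed to trainNetwork, mini-batches are assembled from successive reads according to 'MiniBatchSize'. The temporary folder, the file names, the labelMap lookup, the small layer sizes, and the synthetic labels are all assumptions for the demo; in practice the labels would come from the separate label file, matched to the files in a known order.

```matlab
% Hypothetical demo data so the sketch runs on its own (the question uses
% 6300 features and more than 10,000 files).
dataFolder = tempname;
mkdir(dataFolder);
numFeatures = 8;
numTimeSteps = 5;
numFiles = 12;
numClasses = 3;
fileNames = cell(numFiles, 1);
for k = 1:numFiles
    fileNames{k} = sprintf('sample_%05d.csv', k);
    writematrix(randn(numTimeSteps, numFeatures), fullfile(dataFolder, fileNames{k}));
end
labelValues = mod((1:numFiles).', numClasses) + 1;   % stand-in for the label file

% Key each label by file name, so the pairing never depends on read order.
labelMap = containers.Map(fileNames, num2cell(labelValues));

% Each read returns {sequence, label} for one file. The sequence is
% transposed to features-by-timeSteps, as sequenceInputLayer expects.
baseName = @(f) regexp(char(f), '[^\\/]+$', 'match', 'once');
readSample = @(f) {readmatrix(f).', ...
    categorical(labelMap(baseName(f)), 1:numClasses)};
sampleDs = fileDatastore(dataFolder, ...
    'ReadFcn', readSample, ...
    'FileExtensions', '.csv');

layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(10, 'OutputMode', 'last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

% Shuffle only if this datastore type supports it in the installed release.
if isShuffleable(sampleDs)
    shuffleMode = 'every-epoch';
else
    shuffleMode = 'never';
end
options = trainingOptions('adam', ...
    'MiniBatchSize', 4, ...
    'MaxEpochs', 2, ...
    'Shuffle', shuffleMode, ...
    'Verbose', false);

% trainNetwork pulls samples from the datastore and forms the mini-batches.
net = trainNetwork(sampleDs, layers, options);
```

Passing the full category list to categorical for every label keeps the set of classes identical across files, even when a given mini-batch does not contain every class.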
