Error "Invalid training data. Predictors and responses must have the same number of observations" in LSTM

Hi,
I am new to MATLAB and am trying to create an LSTM network for feature classification, but I am getting the error 'Invalid training data. Predictors and responses must have the same number of observations.' The input data was converted into a cell array.
Here is the part of the code:
winetable = readtable('winequality-red.csv');
wine=table2cell(winetable);
fixedacid = (wine(:,1));
volatileacid = (wine(:,2));
citricacid = (wine(:,3));
residualsugar = (wine(:,4));
chlorides = (wine(:,5));
freesulferdioxide = (wine(:,6));
totalsulferdioxide = (wine(:,7));
density = (wine(:,8));
ph = (wine(:,9));
sulphates = (wine(:,10));
alcohol = (wine(:,11));
quality = (wine(:,12));
goodwine= [volatileacid density ph alcohol quality];
%Train the data
XTrain=goodwine(1:1280,1:4);
YTrain=goodwine(1:1280,5);
%Test the data
XTest=goodwine(1281:1599,1:4);
YTest=goodwine(1281:1599,5);
%GRU DATA INPUT
features=4;
responces=6;
numHiddenUnits=100;
classes=["very bad" "bad" "poor" "fine" "good" "very good"]
%LSTM LAYERS
layers = [sequenceInputLayer(features)
lstmLayer(numHiddenUnits)
fullyConnectedLayer(responces)
softmaxLayer
classificationLayer( ...
'classes',classes)];
%OPTIONS FOR THE TRAINING
ops = trainingOptions('adam', ...
'MaxEpochs',1000, ...
'GradientThreshold',0.001, ...
'InitialLearnRate',0.0001);
%MACHINE
net = trainNetwork(XTrain,YTrain, layers, ops);
The error reported is:
Error using trainNetwork (line 184)
"Invalid training data. Predictors and responses must have the same number of observations."
Any help or advice on this problem would be greatly appreciated.
Thank you!

Answers (1)

Sahil Jain on 21 Dec 2021
Hi Ken. There are a few things to consider. Fundamentally, LSTMs are meant for sequence data. In your case, I believe the data is not sequential; each observation consists of four features that predict a class. The "sequenceInputLayer" expects each observation to be a matrix of size (num_features x sequence_length), with the observations passed as an N-by-1 cell array. Since your data has no sequences, this causes a problem: from my understanding, the network is treating "num_features" as 1280 and "sequence_length" as 4. You may need to rethink the use of an LSTM here. Also, since each input is supposed to give one output class, you should set 'OutputMode' to 'last' for the LSTM layer. I would recommend going through the "Sequence Classification Using Deep Learning" example to understand how inputs are passed to LSTMs.
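If you still want to try an LSTM, here is a minimal sketch of the data layout that trainNetwork accepts for sequence-to-label classification. It is not a fix verified against your file: it uses synthetic data as a stand-in (assumed: 4 numeric features and integer quality grades 3 to 8), and the names numObs, X and labels are made up for illustration. Each observation is treated as a sequence of length 1, and the responses are an N-by-1 categorical vector, so predictors and responses end up with the same number of observations.

```matlab
% Synthetic stand-in for the wine data (assumption: 4 features, grades 3..8).
rng(0);
numObs = 200;
numFeatures = 4;
X = rand(numObs, numFeatures);       % one row per observation
labels = randi([3 8], numObs, 1);    % one integer grade per observation

% Predictors: N-by-1 cell array, one cell per observation.
% Each cell is numFeatures-by-1, i.e. a sequence of length 1.
XTrain = cell(numObs, 1);
for i = 1:numObs
    XTrain{i} = X(i, :)';
end

% Responses: N-by-1 categorical vector, one label per cell of XTrain.
YTrain = categorical(labels, 3:8);

layers = [sequenceInputLayer(numFeatures)
    lstmLayer(100, 'OutputMode', 'last')   % one output per sequence
    fullyConnectedLayer(6)                 % six quality grades
    softmaxLayer
    classificationLayer];

% Few epochs only, to keep the sketch quick to run.
ops = trainingOptions('adam', 'MaxEpochs', 5, 'Verbose', false);

net = trainNetwork(XTrain, YTrain, layers, ops);
```

With real data you would build X from the numeric table columns (rather than from table2cell) and build YTrain from the quality column. Since every sequence here has length 1, the LSTM has nothing sequential to learn from, which is why a non-sequence network is probably the better fit for this dataset.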
