Invalid training data for LSTM network.
Hi all,
I am creating an LSTM network and I am getting the error 'Invalid training data. Predictors and responses must have the same number of observations.' I converted the input training data to cell arrays and the output to a categorical array, yet the error persists.
Here is a sample of the code:
DataParts = zeros(size(Train1_inputX1,1), size(Train1_inputX1,2),1,2); %(4050,400,1,2)
DataParts(:,:,:,1) = real(cell2mat(Train1_inputX1));
DataParts(:,:,:,2) = imag(cell2mat(Train1_inputX1)) ;
XTrain=num2cell(reshape(DataParts, [400,1,2,4050])); %Train data
DataParts1 = zeros(size(testX1_input,1), size(testX1_input,2),1, 2);
DataParts1(:,:,:,1) = real(cell2mat(testX1_input));
DataParts1(:,:,:,2) = imag(cell2mat(testX1_input)) ;
Ttrain=num2cell(reshape(DataParts1,[400,1,2,500])); %Test data
DataParts2 = zeros(size(ValX1_input,1), size(ValX1_input,2),1, 2);
DataParts2(:,:,:,1) = real(cell2mat(ValX1_input));
DataParts2(:,:,:,2) = imag(cell2mat(ValX1_input));
Vtrain =num2cell(reshape(DataParts2,[400,1,2,450])); %450 is the number of segments %400 is the number of samples
Valoutfinal= categorical(ValX1_output); %450 values
testoutfinal = categorical(testX1_output); %500 values
Trainoutfinal= categorical(Train1_outputX1);%4050 values
%% NETWORK ARCHITECTURE
inputSize = [400 1 2];
numHiddenUnits = 800;
numClasses = 4;
layers = [ ...
sequenceInputLayer(inputSize,'Name','input')
flattenLayer('Name','flatten')
bilstmLayer(numHiddenUnits ,'OutputMode','last','Name','lstm')
fullyConnectedLayer(numClasses , 'Name','fc')
softmaxLayer('Name','softmax')
classificationLayer('Name','classification')];
% Specify training options.
maxEpochs = 100;
miniBatchSize = 27;
options = trainingOptions('sgdm', ...
'ExecutionEnvironment','cpu', ...
'GradientThreshold',1, ...
'MaxEpochs',maxEpochs, ...
'MiniBatchSize',miniBatchSize, ...
'SequenceLength','longest', ...
'Shuffle','never', ...
'Verbose',0, ...
'Plots','training-progress');
%% Train network
net = trainNetwork(Ttrain,Trainoutfinal,layers,options);
Any help is greatly appreciated.
Thanks a mil.
Accepted Answer
Avadhoot
on 10 Apr 2024
I understand that you are facing the "Invalid training data" error for your LSTM model. This error suggests that there is a mismatch between the number of observations in the predictors and the responses. After going through your code, I have found the following:
1) The reshaping of "DataParts" into "XTrain", "Ttrain" and "Vtrain" is incorrect. num2cell places every element of the array in its own cell, so the result is a 400-by-1-by-2-by-N cell array of scalars rather than N cells that each hold one sequence. For sequence data, each sequence (observation) must sit in its own cell of an N-by-1 cell array. The same correction applies to your validation and test sets.
The code below shows how this can be done:
% Assuming Train1_inputX1 is a cell array where each cell contains one complex sequence
XTrain = cell(size(Train1_inputX1));
for i = 1:numel(Train1_inputX1)
    tempData = Train1_inputX1{i}; % Extract the i-th sequence
    % Assuming each sequence is a 2D matrix where rows are time steps
    tempData = [real(tempData), imag(tempData)]; % Real and imaginary parts as separate columns
    XTrain{i} = tempData.'; % Transpose to make it [Features, TimeSteps]
end
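To make the difference concrete, here is a small sketch on synthetic data. The names numObs, numSamples, dataParts, perScalar and perSequence are made up for illustration, and the random values only stand in for the real and imaginary parts of your segments. It compares what num2cell returns with what the per-sequence loop returns.

```matlab
% Synthetic stand-in: 5 segments of 400 samples, real part in page 1, imaginary in page 2.
numObs = 5;
numSamples = 400;
dataParts = randn(numObs, numSamples, 1, 2);

% Approach from the question: num2cell wraps every scalar in its own cell,
% so the cell array keeps the full 4-D shape of the numeric array.
perScalar = num2cell(reshape(dataParts, [numSamples, 1, 2, numObs]));
disp(size(perScalar))      % 4-D cell array, numSamples*2*numObs cells in total

% One cell per segment, each holding a [Features, TimeSteps] matrix.
complexSeqs = complex(dataParts(:,:,1,1), dataParts(:,:,1,2));   % numObs-by-numSamples
perSequence = cell(numObs, 1);
for i = 1:numObs
    s = complexSeqs(i,:).';                  % column vector, rows are time steps
    perSequence{i} = [real(s), imag(s)].';   % 2-by-numSamples
end
disp(size(perSequence))    % numObs-by-1 cell array
disp(size(perSequence{1})) % 2-by-numSamples
```

Note also that reshape keeps the column-major element order, so reshaping an N-by-400-by-1-by-2 array to [400,1,2,N] does not move each segment into its own slice; indexing rows explicitly, as in the loop, avoids that.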
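2) Ensure that the number of sequences in your predictors matches the number of labels in your responses. In the posted call, trainNetwork(Ttrain,Trainoutfinal,layers,options), "Ttrain" holds the test data (500 segments) while "Trainoutfinal" holds the training labels (4050 values), so the counts cannot match. Pass "XTrain" together with "Trainoutfinal" instead.
After these corrections the error should go away, provided the number of cells in the predictors equals the number of labels in the responses. Because each sequence is then [Features, TimeSteps] with 2 features, the input layer will likely also need to be declared for 2 features (for example sequenceInputLayer(2)) instead of the [400 1 2] input followed by a flatten layer.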
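A quick check along these lines, run just before training, can catch the mismatch early. This is only a sketch: XTrain and YTrain here are hypothetical stand-ins filled with random values, and in your script they would be the prepared predictors and the categorical training labels.

```matlab
% Hypothetical stand-ins for the prepared predictors and labels.
XTrain = arrayfun(@(k) randn(2, 400), (1:6).', 'UniformOutput', false);  % 6-by-1 cell
YTrain = categorical(randi(4, 6, 1));                                     % 6-by-1 labels

% The number of cells must equal the number of labels.
numPredictors = numel(XTrain);
numResponses = numel(YTrain);
assert(numPredictors == numResponses, ...
    'Predictors have %d observations but responses have %d.', ...
    numPredictors, numResponses);

% Every sequence should have the same number of features (rows).
featureDims = cellfun(@(x) size(x, 1), XTrain);
assert(all(featureDims == featureDims(1)), ...
    'Feature dimension differs between sequences.');
```

If the first assertion fails, the message reports both counts, which makes it easy to spot a test or validation set being paired with the training labels.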
I hope it helps.