I am unable to resolve this error: "Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the number of sequences. All sequences must have the same feature dimension and at least one time step." How can I correct

I am unable to resolve this error: "Error using trainNetwork: Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the number of sequences. All sequences must have the same feature dimension and at least one time step." How can I correct this error?

Answers (4)

Shawn Fernandes on 19 Aug 2019
Hi Sanjana,
Good morning.
As you describe it, X_train is a 134949x1 cell array in which each cell opens into an Nx1 cell array, which in turn holds N cells of 70x1 double.
Please expand each Nx1 cell array into its N cells of 70x1 double, across all 134949 outer cells, so that X_train becomes one flat cell array.
For example,
X_train(1:5)
ans =
5×1 cell array
{2×1 cell}
{6×1 cell}
{3×1 cell}
{4×1 cell}
{4×1 cell}
should become a (2 + 6 + 3 + 4 + 4) x 1 = 19 x 1 cell array:
X_train(1:19)
ans =
19×1 cell array
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
{70×1 double}
Make sure that the corresponding expansions of X_train and Y_train match.
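The expansion described above can be sketched as follows. This is only a minimal illustration on made-up data: `innerCounts` and `X_flat` are hypothetical names, and the zero-filled 70x1 vectors stand in for the real encoded data. It assumes every outer cell holds a column cell array, so that `vertcat` can stack them.

```matlab
% Hypothetical toy data mirroring the shapes shown above:
% 5 outer cells holding 2, 6, 3, 4 and 4 inner cells of 70x1 double.
innerCounts = [2 6 3 4 4];
X_train = arrayfun(@(k) repmat({zeros(70,1)}, k, 1), ...
    innerCounts(:), 'UniformOutput', false);

% Stack every inner Kx1 cell array vertically into one flat column cell array.
X_flat = vertcat(X_train{:});

% The flat array holds sum(innerCounts) cells, each a 70x1 double.
disp(size(X_flat))
```

The same call applied to Y_train would keep the two arrays aligned only if both have the same inner cell counts at every index.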
Hope this helps
  1 Comment
Sanjana Sankar on 21 Aug 2019
Hi Shawn.
Thanks again for your response.
My 70x1 cells are actually encoded data for a letter, and the corresponding Y_train would be its phoneme (pronunciation); that is, it's not a one-to-one matching. So I cannot expand the Nx1 cell array, as that will not help in my case. Is there any other way to work around this problem?



Shawn Fernandes on 24 Apr 2018
Hi,
Please ensure that X is a cell array of size N x 1, where N is the number of sequences.
Each cell must hold an M x L numeric matrix, where M is the number of features, which stays fixed across all cells, and L is the number of time steps, which may vary from cell to cell.
The above observation was for LSTM training.
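A quick way to check data against this format before calling trainNetwork is sketched below. It is a rough check under stated assumptions, not the validation that trainNetwork itself performs: the variable names and the random toy data are hypothetical.

```matlab
% Hypothetical toy data: 3 sequences, 12 features each, varying length.
X = {rand(12,20); rand(12,26); rand(12,22)};

% X must be an N x 1 cell array.
isColumnCell = iscell(X) && iscolumn(X);

% Every cell must hold a numeric matrix (not another cell array).
isNumericMat = cellfun(@(s) isnumeric(s) && ismatrix(s), X);

% Rows are features (must match across cells); columns are time steps.
numFeatures = cellfun(@(s) size(s,1), X);
numSteps    = cellfun(@(s) size(s,2), X);

isValid = isColumnCell && all(isNumericMat) && ...
    all(numFeatures == numFeatures(1)) && all(numSteps >= 1);
disp(isValid)
```

If a cell contains another cell array, as with nested data, `isNumericMat` is false for that entry, which points to the cell that needs converting.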
Hope this helps.
  3 Comments
Shawn Fernandes on 24 Jan 2019
Hi Alessio,
Please compare your data format with the data format used in the LSTM example that ships with MATLAB, and proceed from there.
https://www.mathworks.com/help/deeplearning/examples/classify-sequence-data-using-lstm-networks.html
openExample('nnet/ClassifySequenceDataUsingLSTMNetworksExample')
Load the Japanese Vowels training data. XTrain is a cell array containing 270 sequences of dimension 12 of varying length. YTrain is a categorical vector of labels "1","2",...,"9", which correspond to the nine speakers. The entries in XTrain are matrices with 12 rows (one row for each feature) and a varying number of columns (one column for each time step).
[XTrain,YTrain] = japaneseVowelsTrainData;
XTrain(1:5)
ans = 5x1 cell array
{12x20 double}
{12x26 double}
{12x22 double}
{12x20 double}
{12x21 double}
Hope this helps
Sanjana Sankar on 6 Aug 2019
How should Y_train be structured if I'm doing sequence-to-sequence training?
Can it also be of cell structure?
X_train(1:5)
ans =
5×1 cell array
{2×1 cell}
{6×1 cell}
{3×1 cell}
{4×1 cell}
{4×1 cell}
Y_train(1:5)
ans =
5×1 cell array
{2×1 cell}
{5×1 cell}
{3×1 cell}
{4×1 cell}
{4×1 cell}
But I am getting the same error.
"Error using trainNetwork: Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the number of sequences. All sequences must have the same feature dimension and at least one time step."
How do I rectify this?



Shawn Fernandes on 6 Aug 2019
Hi Sanjana,
Please change the contents of each cell in the cell array from cell to double.
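One possible way to do this for nested data is sketched below. It is not necessarily the right encoding for every task: it assumes each inner cell holds 70x1 double vectors and that each vector should be treated as one time step, so every outer cell becomes a 70 x K matrix. The names `innerCounts` and `X_seq` and the random data are hypothetical.

```matlab
% Hypothetical toy data: 5 outer cells holding 2, 6, 3, 4 and 4
% inner cells, each inner cell a 70x1 double vector.
innerCounts = [2 6 3 4 4];
X_train = arrayfun(@(k) repmat({rand(70,1)}, k, 1), ...
    innerCounts(:), 'UniformOutput', false);

% Join the K vectors of each outer cell side by side, giving one
% 70 x K double matrix per cell (70 features, K time steps).
X_seq = cellfun(@(c) horzcat(c{:}), X_train, 'UniformOutput', false);

% Show the size of the second converted sequence.
disp(size(X_seq{2}))
```

This keeps one cell per original entry, so the outer indexing of X_train is preserved rather than flattened.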
  2 Comments
Sanjana Sankar on 19 Aug 2019
Edited: Sanjana Sankar on 19 Aug 2019
Hi Shawn,
Thanks! But my data is in a doubly nested cell. That is, X_train is a 134949x1 cell array in which each cell opens into an Nx1 cell array, which in turn holds N cells of 70x1 double. How should I proceed?



Markus Hohlagschwandtner on 11 Dec 2020
The first column of the training data, which is the test case number, must be numbered consecutively.
