Deep Learning Toolbox - Structuring the Training Data from Imported Data

Hi,
I am attempting to create a ROM of a gas exchange process by training an LSTM network, using the LSTM ROM example as a starting point. My data is in CSV format (e.g., dataraw_03): I ran 4 simulations in ANSYS, varying the equivalence ratio (0.3 to 0.6) across the simulations to capture the dynamics.
I have transposed the 4 data sets (e.g., dataT_03) and removed the simulation time from the data (e.g., data_03). As I understand the LSTM ROM example workflow, I now have to format and combine the 4 data sets into a single 4x1 cell array?
I am stuck on how to get the 4 data sets into a single cell array (i.e., data) so that I can prepare the data for training, which involves downsampling the ANSYS data and partitioning it into training and test sets. Another post-processing issue is that ANSYS uses a variable-step solver, so the 4 data sets vary in length (time).
Any help as to how best to structure and prepare the 4 data sets for training the LSTM network would be great. I have included the code, up until creating the cell array for the training.
Thanks in advance,
Patrick
%Import the ANSYS Raw Data
dataraw_03 = xlsread('export_Ethanol_20%_440t_0.3.csv');
dataraw_04 = xlsread('export_Ethanol_20%_440t_0.4.csv');
dataraw_05 = xlsread('export_Ethanol_20%_440t_0.5.csv');
dataraw_06 = xlsread('export_Ethanol_20%_440t_0.6.csv');
%Transpose the data
dataT_03 = dataraw_03';
dataT_04 = dataraw_04';
dataT_05 = dataraw_05';
dataT_06 = dataraw_06';
%Remove Time from the ANSYS data
data_03 = dataT_03(2:end, 1:end); % Remove the first row, time
data_04 = dataT_04(2:end, 1:end); % Remove the first row, time
data_05 = dataT_05(2:end, 1:end); % Remove the first row, time
data_06 = dataT_06(2:end, 1:end); % Remove the first row, time
%Create a single Cell Array from the 4 Separate ANSYS Simulations (.csv) for LSTM Training
numObservations = 4; % 4 ANSYS Simulations Conducted
EquivRatio = linspace(0.3,0.6,numObservations); % Equivalence ratio swept from 0.3 to 0.6
data = cell(numObservations,1);
% Stuck at this stage???
%for i = 1:numObservations
%EquivRatio = EquivRatio(i);
%data{i} = data_03...;
%end

Accepted Answer

David Ho on 25 Aug 2022
Hello PB75,
If I understand it correctly, you would like to arrange data_03, data_04, data_05 and data_06 into a cell array that can then be prepared as input data for a network with a sequenceInputLayer.
If that's the case, you can concatenate them into a 4x1 cell like this:
data = {data_03; data_04; data_05; data_06};
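If more simulations are added later, the same cell array can also be filled in a loop. The following is only a sketch: it uses synthetic matrices in place of the imported CSV data, and the feature count, run lengths, and the names raw and runLengths are assumptions made for illustration.

```matlab
% Synthetic stand-ins for the imported runs (hypothetical sizes):
% each raw matrix is [numTimeSteps x (1 + numFeatures)], with time in column 1.
numFeatures = 6;                          % assumed number of recorded signals
runLengths = [120 135 128 150];           % variable-step solver gives differing lengths
numObservations = numel(runLengths);

raw = cell(numObservations,1);
for i = 1:numObservations
    t = linspace(0,1,runLengths(i))';     % placeholder time column
    raw{i} = [t, rand(runLengths(i),numFeatures)];
end

% Build the cell array: transpose so rows are features, then drop the time row.
data = cell(numObservations,1);
for i = 1:numObservations
    dataT = raw{i}';
    data{i} = dataT(2:end,:);             % numFeatures x numTimeSteps
end
```

Each cell then holds one sequence with features in rows and time steps in columns, and the sequences are free to differ in length.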
You can then follow the remaining data processing steps from the example that you linked. With the default options, the software will automatically pad the sequences so that all observations in a mini-batch have the same length, meaning that it can handle sequences of differing lengths.
However, if the observations have all been taken over the same time interval, you may wish to interpolate the data using a function such as interp1 before training so that each time step represents the same period.
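As a rough sketch of that resampling step, the code below interpolates one run onto a uniform time grid. The signals, time stamps, and number of grid points are made-up values. Because interp1 needs distinct sample points, repeated time stamps are dropped first (keeping the first occurrence is just one possible choice).

```matlab
% Hypothetical raw run: non-uniform time stamps (with one repeat) and two signals.
t = [0 0.1 0.1 0.25 0.5 0.8 1.0]';
y = [sin(2*pi*t), cos(2*pi*t)];           % numTimeSteps x numFeatures

% interp1 requires distinct sample points, so keep the first occurrence of each time.
[tUnique, idx] = unique(t, 'stable');
yUnique = y(idx,:);

% Resample onto a uniform grid; numSteps is an assumed value that also
% controls how strongly the data is downsampled.
numSteps = 21;
tUniform = linspace(tUnique(1), tUnique(end), numSteps)';
yUniform = interp1(tUnique, yUnique, tUniform, 'linear');

% Arrange as features x time steps, the layout expected for sequence data.
sequence = yUniform';
```

Applying the same grid to every run would give sequences of equal length, so each time step represents the same period across all observations.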
  6 Comments
PB75 on 30 Aug 2022
Hi David, thanks for your answer. Yes, it looks like the data captured in ANSYS has duplicate entries in the time column.
Nanxin on 29 Oct 2022
Hi David,
Thanks for your answer, which also helped me.
Furthermore, I would like to know how to prepare data for prediction when I have many samples. For example, we have data_07, data_08, ... data_04000 (i.e., a large amount of data to be prepared for this model), each of dimension 6x545. How can I input these data and use predictAndUpdateState to update the network state?
Many thanks!

Release

R2022a
