Deep Learning with Time Series and Sequence Data in order to reproduce wind data

Hello,
Thanks for the link. I spent the day trying to make the example work for me. I have learned a lot and solved some problems, but I still don't get a usable result and, despite my efforts, I can't solve my problem. Here are the steps I followed: https://www.mathworks.com/help/deeplearning/ug/time-series-forecasting-using-deep-learning.html
Here are the wind data I give to MATLAB for training, along with what I expect as output (a prediction of the hourly wind speed over a year).
I think the training works; the program runs well up to that point. When I get to the prediction phase, it no longer works. The problem is the YTest vector:
YTest = predict(net,XTest,SequencePaddingDirection="left");
The YTest vector is incomplete, and NaN values appear very quickly. I work with vectors of dimensions 8760x1 (the number of hours in a year). This is the first thing I don't understand.
As a result, the RMSE part doesn't work either, but that is not very serious.
I then tried to run the closed-loop forecasting, but the figure it displays doesn't make sense. This is the main blocker: I don't know how to fix it, and the values are strange. As a reminder, my wind data are between 0 m/s and 30 m/s. Moreover, no forecasted curve appears.
Here is my code. If you have time to look at it, I would be very grateful.
load data_vent
data_vent(1:3);
numChannels = size(data_vent{1},1);
figure
tiledlayout(3,1)
for i = 1:3
nexttile
stackedplot(data_vent{i}')
xlabel("Time Step")
end
numObservations = numel(data_vent);
idxTrain = 1:floor(0.9*numObservations);
idxTest = floor(0.9*numObservations)+1:numObservations;
dataTrain = data_vent(idxTrain);
dataTest = data_vent(idxTest);
for n = 1:numel(dataTrain)
X = dataTrain{n};
XTrain{n} = X(:,1:end-1);
TTrain{n} = X(:,2:end);
end
muX = mean(cat(1,XTrain{:}),1);
sigmaX = std(cat(1,XTrain{:}),0,1);
muT = mean(cat(1,TTrain{:}),1);
sigmaT = std(cat(1,TTrain{:}),0,1);
for n = 1:numel(XTrain)
XTrain{n} = (XTrain{n} - muX) ./ sigmaX;
TTrain{n} = (TTrain{n} - muT) ./ sigmaT;
end
layers = [
sequenceInputLayer(numChannels)
lstmLayer(128)
fullyConnectedLayer(numChannels)
regressionLayer];
options = trainingOptions("adam", ...
MaxEpochs=8760, ...
SequencePaddingDirection="left", ...
Shuffle="every-epoch", ...
Plots="training-progress", ...
Verbose=0);
net = trainNetwork(XTrain,TTrain,layers,options);
for n = 1:size(dataTest,1)
X = dataTest{n};
XTest{n} = (X(:,1:end-1) - muX) ./ sigmaX;
TTest{n} = (X(:,2:end) - muT) ./ sigmaT;
end
YTest = predict(net,XTest,SequencePaddingDirection="left");
for i = 1:size(YTest,1)
rmse(i) = sqrt(mean((YTest{i} - TTest{i}).^2,"all"));
end
figure
histogram(rmse)
xlabel("RMSE")
ylabel("Frequency")
mean(rmse)
idx = 2;
X = XTest{idx};
T = TTest{idx};
figure
stackedplot(X',DisplayLabels="Channel " + (1:numChannels))
xlabel("Time Step")
title("Test Observation " + idx)
net = resetState(net);
offset = 75;
[net,~] = predictAndUpdateState(net,X(:,1:offset));
numTimeSteps = size(X,2);
numPredictionTimeSteps = numTimeSteps - offset;
Y = zeros(numChannels,numPredictionTimeSteps);
for t = 1:numPredictionTimeSteps
Xt = X(:,offset+t);
[net,Y(:,t)] = predictAndUpdateState(net,Xt);
end
figure
t = tiledlayout(numChannels,1);
title(t,"Open Loop Forecasting")
for i = 1:numChannels
nexttile
plot(T(i,:))
hold on
plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
ylabel("Channel " + i)
end
xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])
net = resetState(net);
offset = size(X,2);
[net,Z] = predictAndUpdateState(net,X);
numPredictionTimeSteps = 2000;
Xt = Z(:,end);
Y = zeros(numChannels,numPredictionTimeSteps);
for t = 1:numPredictionTimeSteps
[net,Y(:,t)] = predictAndUpdateState(net,Xt);
Xt = Y(:,t);
end
numTimeSteps = offset + numPredictionTimeSteps;
figure
t = tiledlayout(numChannels,1);
title(t,"Closed Loop Forecasting")
for i = 1:numChannels
nexttile
plot(T(i,1:offset))
hold on
plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
ylabel("Channel " + i)
end
xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])

Answers (1)

Pratik on 6 Nov 2023
Hi Armel,
As I understand it, you are having trouble getting predictions after training the model and want to find the cause.
The code you provided is very similar to the code in the example you referred to, so an error in the model or the training itself is unlikely. I suggest checking the data you are using, the format it is stored in, and how you process it.
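As one way to check the storage format, the sketch below prints the size of each sequence and transposes column vectors. It assumes the layout used in the linked example, where each cell holds a numChannels-by-numTimeSteps array; under that assumption an 8760x1 vector would be read as 8760 channels with a single time step. The variable "data" is a hypothetical stand-in for data_vent, and this may or may not be the cause in your case.

```matlab
% Hypothetical stand-in for data_vent: one sequence stored as an 8760-by-1 column
data = {rand(8760,1)};

for n = 1:numel(data)
    [numRows,numCols] = size(data{n});
    fprintf("Sequence %d is %d-by-%d\n",n,numRows,numCols);

    % Assumption: rows are channels and columns are time steps, as in the
    % linked example. A column vector is transposed into a 1-by-N sequence.
    if numCols == 1 && numRows > 1
        data{n} = data{n}.';
    end
end

size(data{1})   % 1-by-8760 after the transpose
```

If the sizes printed for your own data already show one row per channel, the orientation is not the issue and the transpose is not needed.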
You can also try pre-processing the data to remove inconsistencies or formatting issues.
Since there are ‘NaN’ values in the predicted output, there could be missing data or ‘NaN’ values in the dataset. Working with missing data is a common task in data preprocessing. Although missing values sometimes signify a meaningful event in the data, they often represent unreliable or unusable data points.
To deal with missing data, the “fillmissing” function can be used. For example, the code below uses linear interpolation to fill the missing values. Here ‘A’ is an array with missing values, ‘x’ is the vector of sample points, ‘F’ is the filled array, and ‘TF’ is a logical array that marks the filled entries.
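A minimal sketch of how the dataset could be scanned for ‘NaN’ values before training is shown below. The variable "data" is a hypothetical stand-in for data_vent, assumed to be a cell array of numeric sequences.

```matlab
% Hypothetical stand-in for data_vent: two single-channel sequences
data = {[3.2 4.1 NaN 5.0 4.7], [6.3 5.9 4.4 4.0 3.8]};

% Flag the sequences that contain at least one NaN and count the NaNs in each
hasNaN = cellfun(@(x) any(isnan(x),"all"), data);
numMissing = cellfun(@(x) nnz(isnan(x)), data);

% Report the affected sequences and the first time step with a missing value
for n = reshape(find(hasNaN),1,[])
    firstStep = find(any(isnan(data{n}),1),1);
    fprintf("Sequence %d: %d missing value(s), first at time step %d\n", ...
        n,numMissing(n),firstStep);
end
```

Running the same check on the normalized XTrain and XTest would show whether the ‘NaN’ values are already in the raw data or are introduced during processing.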
[F,TF] = fillmissing(A,'linear','SamplePoints',x);
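A self-contained version of the call above is sketched here. The values in ‘A’ and the hourly sample points in ‘x’ are made up for illustration only.

```matlab
% Hypothetical sample points (hours) and wind speeds with gaps
x = 0:9;
A = [3 4 NaN 6 7 NaN NaN 10 11 12];

% Fill the gaps by linear interpolation over the sample points
[F,TF] = fillmissing(A,'linear','SamplePoints',x);

disp(F)    % the NaN entries are replaced by interpolated values
disp(TF)   % true where an entry of A was filled
```

Linear interpolation is only one option. Whether it suits wind data with long gaps is a judgment call that depends on the dataset.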
For more information, please refer to the documentation for the “fillmissing” function, the documentation on dealing with missing data, and the “Data Preprocessing” documentation.
Hope this helps!
