NARX Neural Network Tool not actually predicting?

I'm trying to predict the next period of some nearly periodic data using the previous few periods (this is a case where we may be missing some data less than a period long due to sensor saturation), so I divided up the data set into sequential training, validation, and test blocks (MATLAB defaults to a random distribution of data) using
net.divideFcn = 'divideblock';
Using MATLAB's Neural Network Time Series Tool and the NARX problem (I have an input series x and target y, and y's history is known), I was wondering if MATLAB is actually predicting the 'test' data set or whether it uses that data as part of training too. So I decided to test it myself.
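One way to check this directly is to inspect the training record tr returned by train; a minimal sketch, reusing the variables from the full code at the bottom of this post:
[net,tr] = train(net,x,t,xi,ai);
tr.testInd(1:5)      % indices of the held-out test samples (never used to update weights)
tr.trainInd(end)     % last training index; with 'divideblock' the three blocks are sequential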
Here's an example using 10 periods of a sine function (I want to predict the last 1.5 periods). In the second picture, I trained an NN using the same input data, except that I zeroed the last 1.5 periods:
My data set is not too different (it's not as smooth as a sine, but it is very periodic). Here are my questions:
  • In the second picture, even when I replaced the input data set with zeroes, why does the NN prediction still follow the zeroes when it's not supposed to look at the red test data at all?
  • Or, why isn't the red prediction in the two pictures identical, if the blue training set and green validation sets are identical?
I may be misunderstanding how this works, so apologies if this is a stupid question.
The full code (everything is default except the block division instead of random):
t = 0:0.01:20*pi;                  % 10 periods of a sine wave
r = sin(t);
r(5341:end) = 0;                   % zero the last ~1.5 periods (second picture only)
X = tonndata(t',false,false);      % input series as a cell array of timesteps
T = tonndata(r',false,false);      % target series as a cell array of timesteps
trainFcn = 'trainlm';              % Levenberg-Marquardt
inputDelays = 1:2;
feedbackDelays = 1:2;
hiddenLayerSize = 10;
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open',trainFcn);
[x,xi,ai,t] = preparets(net,X,{},T);   % shift the series to account for the delays
net.divideFcn = 'divideblock';     % sequential blocks; set this before the ratios,
                                   % since changing divideFcn resets divideParam
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net,tr] = train(net,x,t,xi,ai);
y = net(x,xi,ai);                  % open-loop response on the full series
e = gsubtract(t,y);
performance = perform(net,t,y);
figure, plotresponse(t,y)

Accepted Answer

Greg Heath on 23 Apr 2017
You forgot to initialize the net with nonzero delay values. The obvious choice is the last two values of the val subset:
1. Learning does not involve the test subset.
2. Learning directly uses training subset values to estimate weight values.
3. Learning does not directly use validation subset values to estimate weight values. Instead, learning only uses performance on the validation subset to decide when to stop estimating weights.
4. After learning, each output is estimated from a linear combination of the delayed input and feedback values. With your zeroed targets, the estimates over the test block are
test(1) = r(5341) = 0
test(2) = r(5342) = 0
test(3) = a*r(5342) + b*r(5341) + c*test(2) + d*test(1) = 0
etc.
Therefore, you have to initialize the test subset data with nonzero values.
test(1) = val(5338+1)
test(2) = val(5338+2)
and for k >= 3
test(k) = a*r(5338+k) + b*r(5338+k-1) + c*test(k-1) + d*test(k-2)
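One way to realize this recursion in code is to close the feedback loop, so the network feeds back its own predictions test(k-1), test(k-2) instead of the zeroed targets; a minimal sketch, assuming the variables X, T, and the trained open-loop net from the script in the question (5341 is where the zeroed span begins):
netc = closeloop(net);                     % feedback now comes from the network's own output
Xknown = X(1:5340);  Tknown = T(1:5340);   % history up to the gap (nonzero values)
[xc,xic,aic] = preparets(netc,Xknown,{},Tknown);
[yc,xfc,afc] = netc(xc,xic,aic);           % run over the known data to get the final delay states
Xfuture = X(5341:end);                     % exogenous input over the "missing" span
ypred = netc(Xfuture,xfc,afc);             % multistep-ahead prediction built from previous predictions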
Hope this helps.
Thank you for formally accepting my answer.
Greg
  1 Comment
Chirag Patel on 5 May 2017
Greg,
I've tried initializing the test data using the previous two values of the validation data (and even a continuation of the sine function), but the results are the same: the test estimate still follows the zeros that come after.
How would I go about filling in the rest of the test data if it's supposed to be unknown? That is, if I'm supposed to provide the test data on my own instead of just setting it to zeroes, won't finding the a, b, c, d coefficients amount to knowing the supposedly-missing data?
