NAR network outputting previous (t-1) value. Why?

Hello,

I have a problem with the MATLAB NAR network, and I have noticed the same behavior in other tools (more on that below). In layman's terms:

Basically, I load a time series into MATLAB (for example, a stock's closing price over, say, 5 years) and I would like a NAR network (say, with delay = 5) to learn the relation between the previous five closing prices and the next closing price.

The NAR should learn this one step at a time, looking at the previous 5 prices and calculating their relation to the current price (which the network is also shown). The network advances through all 5 years of data, learning by example (current price vs. last 5 prices).

That all seems well and good. However, when I give the network 5 consecutive prices that it has already seen, it should output (calculate) the next price, which it has also seen during training. This is what I would expect of any network (unless I'm totally wrong here).

But instead, the NAR outputs the previous (t-1) price. So, basically (where p = price):

I expect: (p(t-5), p(t-4), p(t-3), p(t-2), p(t-1)) -> p(t)

but the NAR gives me: (p(t-5), p(t-4), p(t-3), p(t-2), p(t-1)) -> p(t-1)

(this is with data that the NAR has already seen during training)

Why is this?

I also built an Elman network using Encog, and got basically the same results. Tried a Deep Belief network using Accord.NET and the same thing. Tried standard feedforward, Jordan, SVM, RBF, etc. Nothing does it. Why?

They are all acting like naive predictors.
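
One way to put a number on the "naive predictor" impression, instead of judging it from a plot, is to compare a fitted model's error with the error of simply repeating p(t-1). The following is only a minimal sketch under stated assumptions: the series is a synthetic random walk (not stock data), and a linear AR(5) model fitted by least squares stands in for the NAR so that no toolbox is needed. All variable names are made up for illustration.

```matlab
% Synthetic random walk standing in for a price series (assumption: not real data).
rng(0);                            % fixed seed so the run is repeatable
p = 100 + cumsum(randn(2000,1));   % p(t) = p(t-1) + noise

d = 5;                             % number of delays, as in the question
N = numel(p);
X = zeros(N-d, d);
for k = 1:d
    X(:,k) = p(d+1-k : N-k);       % column k holds p(t-k)
end
target = p(d+1:N);                 % p(t)

% Linear AR(d) fit by least squares, a simple stand-in for the NAR.
A = [ones(N-d,1) X];               % intercept plus the d lagged values
w = A \ target;                    % w(1) = intercept, w(2) = weight on p(t-1)
yhat = A * w;

% Naive forecast: repeat the previous value.
naive = X(:,1);

mseModel = mean((target - yhat).^2);
mseNaive = mean((target - naive).^2);
fprintf('weight on p(t-1) = %.3f, model MSE = %.3f, naive MSE = %.3f\n', ...
        w(2), mseModel, mseNaive);
```

If the weight on p(t-1) comes out close to 1 and the two errors are nearly equal, the fitted model is behaving like the naive predictor on that series. For a real experiment, `p` would be replaced by the actual price column and `yhat` by the network output.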

Independently of the code and data above, I've tried a simple time series (1, 2, 3, 4, 5, ..., 2000). All networks learn that perfectly, but not stock prices.

I've also tried using deltas, log, sqrt, etc. with no luck (on stock data).
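
For reference, the delta and log transforms mentioned above can be sketched as follows. This is a minimal illustration with made-up prices (not KO data), and `predictedDelta` is a hypothetical model output, not something produced by a trained network.

```matlab
% Made-up closing prices, for illustration only.
p = [42.10; 42.35; 42.20; 42.80; 43.05; 42.90];

delta  = diff(p);         % delta(k)  = p(k+1) - p(k)
logret = diff(log(p));    % logret(k) = log(p(k+1) / p(k))

% Rebuild the prices from each transform to confirm nothing is lost.
pFromDelta  = p(1) + cumsum(delta);
pFromLogret = p(1) * exp(cumsum(logret));

% A one-step price forecast built from a predicted delta is
% p(t-1) + predictedDelta. If the model predicts a delta near zero,
% the price forecast is again just p(t-1).
predictedDelta = 0;                         % hypothetical model output
priceForecast  = p(end) + predictedDelta;   % equals the last known price here
```

The point of the last two lines is that a model trained on deltas can still look like a naive predictor once its output is converted back to prices, if the deltas it predicts are close to zero.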

I've tried several delays: d=5, d=7, d=10, d=20, d=30, d=40, d=50, d=100. Only d=50 produced something that was not exactly a naive predictor, but its results were significantly off, even on the training data.

These experiments have been made only with training data.

Why? Is stock price data "unlearnable"?

I've seen this question asked in other places, but with no satisfactory answer.

As a side note, all MATLAB code was generated using nnstart.

Thanks!

21 Comments

Molasar on 7 Dec 2016

Edited: Molasar on 7 Dec 2016

Ok.

This is what I have done:

1) Download csv file from Yahoo Finance for Coca-Cola (KO) from 2010-08-02 to 2016-11-22

2) Import csv to SQL Server 2008 R2 database table via SSIS. Table now has 1591 rows and 7 columns (Date, Open, High, Low, Close, Volume, Adj Close)

3) Import a subset of the data to MATLAB like so:

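% Connect through ODBC (data source, user name, and password redacted)
conn = database.ODBCConnection('xxxxxx','yyyyyy','zzzzzz');
% Date bounds, already quoted for SQL. The leading space matters because
% strcat strips trailing whitespace from character array inputs.
fromtarget = ' ''2011-11-14''';
totarget= ' ''2016-11-10''';
% Return query results as a cell array
setdbprefs('DataReturnFormat','cellarray');
sqlquery = strcat('select [Adj Close] from [dddddd].[dbo].[KO] WHERE [Date] BETWEEN ', fromtarget, ' and ', totarget, ' ORDER BY [Date]');
% Run the query and fetch the adjusted closing prices
curs = exec(conn,sqlquery);
curs = fetch(curs);
inputtmp = curs.Data;
% Turn the N-by-1 column into a 1-by-N row so that time steps run along columns
targets = rot90(inputtmp);
% Release the database resources and the temporary variable
close(curs);
close(conn);
clearvars inputtmp;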

I now have a 1x1257 cell array.

4) Type nnstart and press Enter

5) Select Time Series app

6) Select NAR and click Next

7) Select targets from the Targets dropdown and click Next

8) Select 15% validation and 15% testing and click Next

9) Select 20 hidden neurons (no exact reason) and delays = 5 and click Next

10) Leave the Levenberg-Marquardt (LM) training algorithm selected and click Train. Off she goes. Click Next

11) Click Next again

12) Click Next again

13) Select all Save Data checkboxes except MATLAB struct and click Save Results

14) Click Simple Script

15) Click Finish

16) Modify code Line 53 to

figure, plotresponse(t(end-40:end),y(end-40:end))

17) Run code. Save script as KO_NAR.m

I get a response plot. I edit the plot, change Targets to Line, and erase Errors. This is what I get:

Clearly, the NAR response is t-1
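
The shift can also be measured rather than read off the plot, by checking which lag of the target series the output matches best. What follows is a minimal sketch under assumptions: `tgt` and `out` are synthetic stand-ins for `cell2mat(t)` and `cell2mat(y)` from the generated script, and `out` is built here as a pure one-step echo so that the result is known in advance.

```matlab
% Synthetic stand-ins (assumptions, not the KO data or a trained network):
rng(1);
tgt = 50 + cumsum(randn(1,500));   % plays the role of cell2mat(t)
out = [tgt(1) tgt(1:end-1)];       % plays the role of cell2mat(y): a t-1 echo

% Mean squared error between out(n) and tgt(n-k) for several lags k.
maxLag = 5;
err = zeros(1, maxLag+1);
for k = 0:maxLag
    err(k+1) = mean((out(1+k:end) - tgt(1:end-k)).^2);
end

% The lag with the smallest error is the shift of the output.
[~, idx] = min(err);
bestLag = idx - 1;
fprintf('output matches the target best at lag %d\n', bestLag);
```

A best lag of 0 would mean the output is aligned with the targets. A best lag of 1 means the output is closer to the previous target than to the current one, which is the behavior described above.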

(continued...)

Molasar on 7 Dec 2016

Edited: Molasar on 7 Dec 2016

Here is the full code:

% Solve an Autoregression Time-Series Problem with a NAR Neural Network
% Script generated by Neural Time Series app
% Created 06-Dec-2016 21:03:39
%
% This script assumes this variable is defined:
%
%   targets - feedback time series.
T = targets;
% Choose a Training Function
% For a list of all training functions type: help nntrain
% 'trainlm' is usually fastest.
% 'trainbr' takes longer but may be better for challenging problems.
% 'trainscg' uses less memory. Suitable in low memory situations.
trainFcn = 'trainlm';  % Levenberg-Marquardt backpropagation.
% Create a Nonlinear Autoregressive Network
feedbackDelays = 1:5;
hiddenLayerSize = 20;
net = narnet(feedbackDelays,hiddenLayerSize,'open',trainFcn);
% Prepare the Data for Training and Simulation
% The function PREPARETS prepares timeseries data for a particular network,
% shifting time by the minimum amount to fill input states and layer
% states. Using PREPARETS allows you to keep your original time series data
% unchanged, while easily customizing it for networks with differing
% numbers of delays, with open loop or closed loop feedback modes.
[x,xi,ai,t] = preparets(net,{},{},T);
% Setup Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,x,t,xi,ai);
% Test the Network
y = net(x,xi,ai);
e = gsubtract(t,y);
performance = perform(net,t,y)
% View the Network
view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotregression(t,y)
figure, plotresponse(t(end-40:end),y(end-40:end))
%figure, ploterrcorr(e)
%figure, plotinerrcorr(x,e)
% Closed Loop Network
% Use this network to do multi-step prediction.
% The function CLOSELOOP replaces the feedback input with a direct
% connection from the output layer.
netc = closeloop(net);
netc.name = [net.name ' - Closed Loop'];
view(netc)
[xc,xic,aic,tc] = preparets(netc,{},{},T);
yc = netc(xc,xic,aic);
closedLoopPerformance = perform(net,tc,yc)
% Step-Ahead Prediction Network
% For some applications it helps to get the prediction a timestep early.
% The original network returns predicted y(t+1) at the same time it is
% given y(t+1). For some applications such as decision making, it would
% help to have predicted y(t+1) once y(t) is available, but before the
% actual y(t+1) occurs. The network can be made to return its output a
% timestep early by removing one delay so that its minimal tap delay is now
% 0 instead of 1. The new network returns the same outputs as the original
% network, but outputs are shifted left one timestep.
nets = removedelay(net);
nets.name = [net.name ' - Predict One Step Ahead'];
view(nets)
[xs,xis,ais,ts] = preparets(nets,{},{},T);
ys = nets(xs,xis,ais);
stepAheadPerformance = perform(nets,ts,ys)

If I do this exercise with an Encog Elman network, here is what I get:

Clearly, once again, I am getting t-1 from the Elman. Same goes for just about any other network type I have tried...

Why? Any input, insights, opinions, thoughts are welcome.

What I am expecting is a plot with both traces similar or somewhat similar, aligned.

Thanks

Molasar on 5 Jan 2017

Ok, loaded the Data_GlobalIdx2 dataset.

Used column 6 of the Data array.

Same result. Here is my code (mostly generated automatically by MATLAB):

% Solve an Autoregression Time-Series Problem with a NAR Neural Network
% Script generated by Neural Time Series app
% Created 04-Jan-2017 20:23:24
%
% This script assumes this variable is defined:
%
%   targets - feedback time series.
clearvars;
load Data_GlobalIdx2;
Data(:,1:5)=[];
Data(:,2)=[];
targets = rot90(Data);
T = tonndata(targets,true,false);
% Choose a Training Function
% For a list of all training functions type: help nntrain
% 'trainlm' is usually fastest.
% 'trainbr' takes longer but may be better for challenging problems.
% 'trainscg' uses less memory. Suitable in low memory situations.
trainFcn = 'trainlm';  % Levenberg-Marquardt backpropagation.
% Create a Nonlinear Autoregressive Network
feedbackDelays = 1:5;
hiddenLayerSize = 20;
net = narnet(feedbackDelays,hiddenLayerSize,'open',trainFcn);
% Prepare the Data for Training and Simulation
% The function PREPARETS prepares timeseries data for a particular network,
% shifting time by the minimum amount to fill input states and layer
% states. Using PREPARETS allows you to keep your original time series data
% unchanged, while easily customizing it for networks with differing
% numbers of delays, with open loop or closed loop feedback modes.
[x,xi,ai,t] = preparets(net,{},{},T);
% Setup Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,x,t,xi,ai);
% Test the Network
y = net(x,xi,ai);
e = gsubtract(t,y);
performance = perform(net,t,y)
% View the Network
view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotregression(t,y)
figure, plotresponse(t(end-40:end),y(end-40:end))
%figure, ploterrcorr(e)
%figure, plotinerrcorr(x,e)
% Closed Loop Network
% Use this network to do multi-step prediction.
% The function CLOSELOOP replaces the feedback input with a direct
% connection from the output layer.
netc = closeloop(net);
netc.name = [net.name ' - Closed Loop'];
view(netc)
[xc,xic,aic,tc] = preparets(netc,{},{},T);
yc = netc(xc,xic,aic);
closedLoopPerformance = perform(net,tc,yc)
% Step-Ahead Prediction Network
% For some applications it helps to get the prediction a timestep early.
% The original network returns predicted y(t+1) at the same time it is
% given y(t+1). For some applications such as decision making, it would
% help to have predicted y(t+1) once y(t) is available, but before the
% actual y(t+1) occurs. The network can be made to return its output a
% timestep early by removing one delay so that its minimal tap delay is now
% 0 instead of 1. The new network returns the same outputs as the original
% network, but outputs are shifted left one timestep.
nets = removedelay(net);
nets.name = [net.name ' - Predict One Step Ahead'];
view(nets)
[xs,xis,ais,ts] = preparets(nets,{},{},T);
ys = nets(xs,xis,ais);
stepAheadPerformance = perform(nets,ts,ys)

Response plot:

Still shifted. Why?

Molasar on 7 Jan 2017

Edited: Molasar on 7 Jan 2017

I see what you mean; however, I don't think I have really misled with my original statement as the two previously posted plots show an almost identical t-1 response.

I'll give this some thought. But it seems to me a naive predictor will beat a NAR hands down with this kind of problem...

The question now begs to be asked: How then can I get any useful information using a NAR or any other ANN with stock data? Almost certainly a topic for a new thread...

Brendan Hamm on 9 Jan 2017

I would likely consider exogenous variables. Possibly macro variables, volume/momentum, or even data derived from Twitter posts.
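
As a rough illustration of that suggestion, a NARX network takes an exogenous input series alongside the fed-back target. The sketch below assumes the same neural network toolbox used by the generated scripts in this thread. The data are synthetic, and `volume` is a hypothetical exogenous series made up for the example, so this shows only the mechanics and says nothing about whether such a model would help on real stock data.

```matlab
% Synthetic data only (assumption): 'volume' is a hypothetical exogenous
% series, and 'price' is constructed so that it partly depends on it.
rng(0);
n = 300;
volume = rand(1,n);
price  = 50 + cumsum(0.5*[0 volume(1:end-1)] - 0.25 + 0.1*randn(1,n));

X = tonndata(volume,true,false);   % exogenous input series
T = tonndata(price,true,false);    % series to predict

% NARX network: input delays 1:2, feedback delays 1:5, 10 hidden neurons.
net = narxnet(1:2,1:5,10);
net.trainParam.showWindow = false; % train without opening the training GUI

% Shift the data to fill the delay states, then train and simulate.
[x,xi,ai,t] = preparets(net,X,{},T);
net = train(net,x,t,xi,ai);
y = net(x,xi,ai);
performance = perform(net,t,y);
```

The delays and the hidden layer size here are arbitrary choices for the sketch, not recommendations.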
