# forecast

Forecast responses of univariate regression model with ARIMA time series errors

## Syntax

``[Y,YMSE] = forecast(Mdl,numperiods)``
``[Y,YMSE,U] = forecast(Mdl,numperiods)``
``[___] = forecast(___,Name=Value)``
``Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable)``
``Tbl = forecast(Mdl,numperiods,InSample=InSample,PredictorVariables=PredictorVariables)``
``Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable,InSample=InSample,PredictorVariables=PredictorVariables)``
``Tbl = forecast(___,Name=Value)``

## Description

`[Y,YMSE] = forecast(Mdl,numperiods)` returns the `numperiods`-by-1 numeric vector of consecutive forecasted responses `Y` and the corresponding numeric vector of forecast mean square errors (MSE) `YMSE` of the fully specified, univariate regression model with ARIMA time series errors `Mdl`.


`[Y,YMSE,U] = forecast(Mdl,numperiods)` also forecasts a `numperiods`-by-1 numeric vector of unconditional disturbances `U`.

`[___] = forecast(___,Name=Value)` specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. `forecast` returns the output argument combination for the corresponding input arguments. For example, `forecast(Mdl,10,Y0=y0,X0=Pred0,XF=Pred)` specifies the presample response path `y0`, and the presample and forecast sample predictor data `Pred0` and `Pred`, respectively, to forecast a model with a regression component.
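
As a minimal sketch of the three-output syntax, the following code forecasts a hypothetical, fully specified model with ARMA(2,1) errors and no regression component. The coefficient values are illustrative assumptions that mirror the simulated model in the first example on this page. Because the call supplies no presample data, the sketch relies on the documented default of zero-valued presample unconditional disturbances and innovations.

```matlab
% Hypothetical fully specified model: ARMA(2,1) errors, no regression component
Mdl = regARIMA(Intercept=0,AR={0.5 -0.8},MA=-0.5,Variance=0.1);

numperiods = 5;
[Y,YMSE,U] = forecast(Mdl,numperiods);  % no presample data supplied

% Each output has one row per period in the forecast horizon
disp(size(Y))
disp(size(YMSE))
disp(size(U))
```

With zero presample values and a zero intercept, the forecasted disturbances stay at zero, and the first element of `YMSE` equals the innovation variance.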


`Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable)` returns the table or timetable `Tbl` containing a variable for each of the paths of response, forecast MSE, and unconditional disturbance series resulting from forecasting the regression model with ARIMA errors `Mdl` over a `numperiods` forecast horizon. `Presample` is a table or timetable containing presample unconditional disturbance data in the variable specified by `PresampleRegressionDisturbanceVariable`. Alternatively, `Presample` can contain presample error model innovation data in the variable specified by `PresampleInnovationVariable` or a combination of presample response and predictor data in the variables specified by `PresampleResponseVariable` and `PresamplePredictorVariables`. You can specify either alternative instead of `PresampleRegressionDisturbanceVariable` using name-value syntax; `forecast` infers presample unconditional disturbance data from either alternative specification. (since R2023b)


`Tbl = forecast(Mdl,numperiods,InSample=InSample,PredictorVariables=PredictorVariables)` specifies the variables `PredictorVariables` in the in-sample table or timetable of data `InSample` containing the predictor data for the model regression component. (since R2023b)


`Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable,InSample=InSample,PredictorVariables=PredictorVariables)` specifies presample unconditional disturbance data to initialize the error model and in-sample predictor data for the regression component. You can choose different presample data from `Presample` when it is applicable. (since R2023b)


`Tbl = forecast(___,Name=Value)` uses additional options specified by one or more name-value arguments, using any input argument combination in the previous three syntaxes. (since R2023b)

For example, `forecast(Mdl,20,Presample=PSTbl,PresampleResponseVariable="GDP",PresamplePredictorVariables="CPI",InSample=Tbl,PredictorVariables="CPI")` returns a timetable containing variables for the forecasted responses, forecast MSE, and forecasted unconditional disturbance paths, forecasted 20 periods into the future. `forecast` initializes the model by using the presample response and predictor data in the `GDP` and `CPI` variables of the timetable `PSTbl`. `forecast` applies the predictor data in the `PredictorVariables` variables of the table or timetable `Tbl` to the model regression component.


## Examples


Return a vector of responses, forecasted over a 30-period horizon, from the following regression model with ARMA(2,1) errors:

`$\begin{array}{l}\begin{array}{c}{y}_{t}={X}_{t}\left[\begin{array}{c}0.1\\ -0.2\end{array}\right]+{u}_{t}\\ {u}_{t}=0.5{u}_{t-1}-0.8{u}_{t-2}+{\epsilon }_{t}-0.5{\epsilon }_{t-1},\end{array}\end{array}$`

where ${\epsilon }_{t}$ is Gaussian with variance 0.1.

Specify the model. Simulate responses from the model and two predictor series.

```
Mdl0 = regARIMA(Intercept=0,AR={0.5 -0.8},MA=-0.5, ...
    Beta=[0.1; -0.2],Variance=0.1);

rng(1,"twister"); % For reproducibility
T = 130;
numperiods = 30;
Pred = randn(T,2);
y = simulate(Mdl0,T,X=Pred);
```

Fit the model to the first 100 observations, and reserve the remaining 30 observations to evaluate forecast performance.

```
Mdl = regARIMA(2,0,1);
estidx = 1:(T-numperiods);   % Estimation sample indices
fhidx = (T-numperiods+1):T;  % Forecast horizon
EstMdl = estimate(Mdl,y(estidx),X=Pred(estidx,:));
```
```
    Regression with ARMA(2,1) Error Model (Gaussian Distribution):

                   Value      StandardError    TStatistic      PValue
                 _________    _____________    __________    __________

    Intercept    0.0074068      0.012554         0.58999         0.5552
    AR{1}          0.55422      0.087265           6.351     2.1391e-10
    AR{2}         -0.78361      0.080794         -9.6988     3.0499e-22
    MA{1}         -0.46483        0.1394         -3.3345     0.00085446
    Beta(1)       0.092779      0.024497          3.7873     0.00015228
    Beta(2)       -0.17339      0.021143         -8.2008     2.3874e-16
    Variance      0.073721      0.011006          6.6984     2.1066e-11
```

`EstMdl` is a new `regARIMA` model containing the estimates. The estimates are close to their true values.

Use `EstMdl` to forecast a 30-period horizon.

```
[yF,yMSE] = forecast(EstMdl,numperiods,Y0=y(estidx), ...
    X0=Pred(estidx,:),XF=Pred(fhidx,:));
```

`yF` is a 30-by-1 vector of forecasted responses and `yMSE` is a 30-by-1 vector of corresponding forecast MSEs. To initialize the model for forecasting, `forecast` infers required presample unconditional disturbances from the specified presample response and predictor data.

Visually compare the forecasts to the holdout data using a plot.

```
figure
plot(y,Color=[.7,.7,.7]);
hold on
plot(fhidx,yF,"b",LineWidth=2);
plot(fhidx,yF + 1.96*sqrt(yMSE),"r:",LineWidth=2);
plot(fhidx,yF - 1.96*sqrt(yMSE),"r:",LineWidth=2);
h = gca;
ph = patch([repmat(T-numperiods+1,1,2) repmat(T,1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend("Observed","Forecast","95% forecast interval", ...
    Location="best");
title("30-Period Forecasts and 95% Forecast Intervals")
axis tight
hold off
```

Many observations in the holdout sample fall beyond the 95% forecast intervals. Two reasons for this are:

• The predictors are randomly generated in this example. `estimate` treats the predictors as fixed. The 95% forecast intervals based on the estimates from `estimate` do not account for the variability in the predictors.

• By sheer chance, the estimation period seems less volatile than the forecast period. `estimate` uses the less volatile estimation period data to estimate the parameters. Therefore, forecast intervals based on the estimates should not be expected to cover observations that have an underlying innovations process with larger variability.
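
One way to quantify how often holdout observations fall outside the intervals is to compute the empirical coverage, that is, the fraction of observations inside the approximate 95% bounds. The sketch below uses small, made-up vectors (`yFdemo`, `yMSEdemo`, and `yObsdemo` are hypothetical names and values) so that it runs on its own; in the example above, the corresponding quantities would be `yF`, `yMSE`, and `y(fhidx)`.

```matlab
% Hypothetical forecasts, forecast MSEs, and holdout observations
yFdemo   = [0.1; 0.0; -0.1; 0.2];
yMSEdemo = [0.04; 0.04; 0.09; 0.09];
yObsdemo = [0.3; 0.5; -0.2; 1.0];

% Approximate 95% forecast interval bounds
lower = yFdemo - 1.96*sqrt(yMSEdemo);
upper = yFdemo + 1.96*sqrt(yMSEdemo);

% Fraction of holdout observations that fall inside the intervals
inside = (yObsdemo >= lower) & (yObsdemo <= upper);
coverage = mean(inside)
```

A coverage well below 0.95 on a holdout sample suggests that the intervals understate the forecast uncertainty, for the reasons listed above.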

Forecast stationary, log GDP using a regression model with ARMA(1,1) errors, including CPI as a predictor.

Fit a regression model with ARMA(1,1) errors by regressing the US gross domestic product (GDP) growth rate onto consumer price index (CPI) quarterly changes. Forecast the model into a 2-year (8-quarter) horizon. Supply a timetable of data and specify the series for the fit.

Load the US macroeconomic data set. Compute the series of GDP quarterly growth rates and CPI quarterly changes.

```
load Data_USEconModel
DTT = price2ret(DataTimeTable,DataVariables="GDP");
DTT.GDPRate = 100*DTT.GDP;
DTT.CPIDel = diff(DataTimeTable.CPIAUCSL);
T = height(DTT)
```
```
T = 248
```
```
figure
tiledlayout(2,1)
nexttile
plot(DTT.Time,DTT.GDPRate)
title("GDP Rate")
ylabel("Percent Growth")
nexttile
plot(DTT.Time,DTT.CPIDel)
title("Index")
```

The series appear stationary, albeit heteroscedastic.

Prepare Timetable for Estimation

When you plan to supply a timetable, you must ensure it has all the following characteristics:

• The selected response variable is numeric and does not contain any missing values.

• The timestamps in the `Time` variable are regular, and they are ascending or descending.

Remove all missing values from the timetable.

```
DTT = rmmissing(DTT);
T_DTT = height(DTT)
```
```
T_DTT = 248
```

Because each sample time has an observation for all variables, `rmmissing` does not remove any observations.

Determine whether the sampling timestamps have a regular frequency and are sorted.

`areTimestampsRegular = isregular(DTT,"quarters")`
```
areTimestampsRegular = logical
   0
```
`areTimestampsSorted = issorted(DTT.Time)`
```
areTimestampsSorted = logical
   1
```

`areTimestampsRegular = 0` indicates that the timestamps of `DTT` are irregular. `areTimestampsSorted = 1` indicates that the timestamps are sorted. Macroeconomic series in this example are timestamped at the end of the month. This quality induces an irregularly measured series.

Remedy the time irregularity by shifting all dates to the first day of the quarter.

```
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
areTimestampsRegular = isregular(DTT,"quarters")
```
```
areTimestampsRegular = logical
   1
```

`DTT` is regular.
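
The same remedy can be tried on a small, self-contained timetable. The following sketch is illustrative only: the timetable `TT`, its variable `Rate`, and the data values are hypothetical. It stamps four quarterly observations on the last day of each quarter, checks regularity, shifts the timestamps to the first day of each quarter, and then checks again.

```matlab
% Hypothetical quarterly series stamped on the last day of each quarter
Time = datetime([2000; 2000; 2000; 2001],[6; 9; 12; 3],[30; 30; 31; 31]);
TT = timetable(Time,[1.2; 0.8; 1.5; 1.1],VariableNames="Rate");

% Regularity with respect to a calendar-quarter time step, before the shift
regularBefore = isregular(TT,"quarters");

% Shift every timestamp to the first day of its quarter and check again
TT.Time = dateshift(TT.Time,"start","quarter");
regularAfter = isregular(TT,"quarters");

disp([regularBefore regularAfter])
```

End-of-quarter dates alternate between the 30th and 31st day of the month, so they are not separated by a constant calendar step, whereas first-of-quarter dates are.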

Create Model Template for Estimation

Suppose that a regression model of the GDP rate onto CPI quarterly changes, with ARMA(1,1) errors, is appropriate.

Create a model template for a regression model with ARMA(1,1) errors. Specify the response variable name.

```Mdl = regARIMA(1,0,1); Mdl.SeriesName = "GDPRate";```

`Mdl` is a partially specified `regARIMA` object.

Partition Data

Partition the data set into estimation and forecast samples.

```
fh = 8;
DTTES = DTT(1:(T_DTT-fh),:);
DTTFS = DTT((T_DTT-fh+1):end,:);
```

Fit Model to Data

Fit a regression model with ARMA(1,1) errors to the estimation sample. Supply the estimation sample timetable, which contains the GDP rate and CPI quarterly change series, and specify the predictor variable name.

`EstMdl = estimate(Mdl,DTTES,PredictorVariables="CPIDel");`
```
    Regression with ARMA(1,1) Error Model (Gaussian Distribution):

                   Value       StandardError    TStatistic      PValue
                 __________    _____________    __________    __________

    Intercept      0.016489      0.0017307        9.5272      1.6152e-21
    AR{1}           0.57835       0.096952        5.9653      2.4415e-09
    MA{1}          -0.15125        0.11658       -1.2974         0.19449
    Beta(1)       0.0025095      0.0014147        1.7738        0.076089
    Variance     0.00011319     7.5405e-06         15.01      6.2792e-51
```

`EstMdl` is a fully specified, estimated `regARIMA` object. By default, `estimate` backcasts for the required `Mdl.P = 1` presample regression model residual and sets the required `Mdl.Q = 1` presample error model residual to 0.

Forecast Estimated Model

Forecast the GDP rate over an 8-quarter horizon. Use the estimation sample as a presample for the forecast.

```
Tbl = forecast(EstMdl,fh,Presample=DTTES,PresampleResponseVariable="GDPRate", ...
    PresamplePredictorVariables="CPIDel",InSample=DTTFS, ...
    PredictorVariables="CPIDel")
```
```
Tbl=8×7 timetable
    Time     Interval        GDP         GDPRate      CPIDel    GDPRate_Response    GDPRate_MSE    GDPRate_RegressionInnovation
    _____    ________    ___________    __________    ______    ________________    ___________    ____________________________

    Q2-07       91        0.00018278      0.018278     1.675        0.015765        0.00011319             -0.0049278
    Q3-07       91        0.00016916      0.016916     1.359         0.01705        0.00013383               -0.00285
    Q4-07       94        6.1286e-05     0.0061286     3.355         0.02326        0.00014074             -0.0016483
    Q1-08       91        9.3272e-05     0.0093272      1.93        0.020379        0.00014305            -0.00095329
    Q2-08       91        0.00011103      0.011103     3.367        0.024387        0.00014382            -0.00055134
    Q3-08       92        8.9585e-05     0.0089585     1.641        0.020288        0.00014408            -0.00031887
    Q4-08       92       -0.00016145     -0.016145    -7.098      -0.0015075        0.00014417            -0.00018442
    Q1-09       90       -8.6878e-05    -0.0086878     1.137        0.019236         0.0001442            -0.00010666
```

`Tbl` is an 8-by-7 timetable containing the forecasted responses `GDPRate_Response` and their forecast MSEs `GDPRate_MSE`, the forecasted unconditional disturbances `GDPRate_RegressionInnovation`, and all variables in `DTTFS`.

Plot the forecasts and 95% forecast intervals.

```
Tbl.Lower = Tbl.GDPRate_Response - 1.96*sqrt(Tbl.GDPRate_MSE);
Tbl.Upper = Tbl.GDPRate_Response + 1.96*sqrt(Tbl.GDPRate_MSE);

figure
h1 = plot(DTT.Time(end-65:end),DTT.GDPRate(end-65:end), ...
    Color=[.7,.7,.7]);
hold on
h2 = plot(Tbl.Time,Tbl.GDPRate_Response,"b",LineWidth=2);
h3 = plot(Tbl.Time,Tbl.Lower,"r:",LineWidth=2);
plot(DTTFS.Time,Tbl.Upper,"r:",LineWidth=2);
ha = gca;
title("GDP Rate Forecasts and 95% Forecast Intervals")
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)],...
    [ha.YLim fliplr(ha.YLim)],...
    [0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend([h1 h2 h3],["Observed GDP rate" "Forecasted GDP rate", ...
    "95% forecast interval"],Location="best")
axis tight
hold off
```

Fit a regression model with ARIMA(1,1,1) errors by regressing the quarterly log US GDP onto the log CPI. Compute MMSE forecasts of the log GDP series using the estimated model. Supply data in timetables.

Load the US macroeconomic data set. Compute the log GDP series.

```
load Data_USEconModel
DTT = DataTimeTable;
DTT.LogGDP = log(DTT.GDP);
T = height(DTT);
```

Remedy the time irregularity by shifting all dates to the first day of the quarter.

```
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
```

Reserve 2 years (8 quarters) of data at the end of the series to compare against the forecasts.

```
numperiods = 8;
DTTES = DTT(1:(T-numperiods),:);    % Estimation sample
DTTFS = DTT((T-numperiods+1):T,:);  % Forecast horizon
```

Suppose that a regression model of the quarterly log GDP on CPI, with ARIMA(1,1,1) errors, is appropriate.

Create a model template for a regression model with ARIMA(1,1,1) errors. Specify the response variable name.

```Mdl = regARIMA(1,1,1); Mdl.SeriesName = "LogGDP";```

The intercept is not identifiable in a regression model with integrated errors. Fix its value before estimation. One way to do this is to estimate the intercept using simple linear regression. Use the estimation sample.

```coeff = [ones(T-numperiods,1) DTTES.CPIAUCSL]\DTTES.LogGDP; Mdl.Intercept = coeff(1);```

Consider performing a sensitivity analysis by using a grid of intercepts.


Fit a regression model with ARIMA(1,1,1) errors to the estimation sample. Specify the predictor variable name.

`EstMdl = estimate(Mdl,DTTES,PredictorVariables="CPIAUCSL");`
```
    Regression with ARIMA(1,1,1) Error Model (Gaussian Distribution):

                   Value       StandardError    TStatistic      PValue
                 __________    _____________    __________    ___________

    Intercept        5.8303              0           Inf                0
    AR{1}           0.92869       0.028414        32.684      2.6126e-234
    MA{1}          -0.39063       0.057599       -6.7819       1.1858e-11
    Beta(1)       0.0029335      0.0014645        2.0031         0.045166
    Variance     0.00010668     6.9256e-06        15.403        1.554e-53
```

`EstMdl` is a fully specified, estimated `regARIMA` object. By default, `estimate` backcasts for the required `Mdl.P = 2` presample regression model residuals and sets the required `Mdl.Q = 1` presample error model residual to 0.

Infer estimation sample unconditional disturbances to initialize the model for forecasting. Specify the predictor variable name.

`Tbl0 = infer(EstMdl,DTTES,PredictorVariables="CPIAUCSL");`

Forecast the estimated model over an 8-quarter horizon. Use the inferred unconditional disturbances as presample data. Specify the forecast sample predictor data and its variable name, and specify the presample unconditional disturbance variable name.

```
Tbl = forecast(EstMdl,numperiods,Presample=Tbl0, ...
    PresampleRegressionDisturbanceVariable="LogGDP_RegressionResidual", ...
    InSample=DTTFS,PredictorVariables="CPIAUCSL");
```

Plot the forecasted log GDP with approximate 95% forecast intervals. Also, separately plot the unconditional disturbances.

```
Tbl.Lower = Tbl.LogGDP_Response - 1.96*sqrt(Tbl.LogGDP_MSE);
Tbl.Upper = Tbl.LogGDP_Response + 1.96*sqrt(Tbl.LogGDP_MSE);

figure
tiledlayout(2,1)
nexttile
plot(DTT.Time(end-40:end),DTT.LogGDP(end-40:end),Color=[.7,.7,.7])
hold on
h1 = plot(Tbl.Time,[Tbl.Lower Tbl.Upper],"r:",LineWidth=2);
h2 = plot(Tbl.Time,Tbl.LogGDP_Response,"k",LineWidth=2);
h = gca;
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend([h1(1) h2],["95% percentile intervals" "MMSE forecast"], ...
    Location="northwest")
axis tight
grid on
title("Log GDP Forecast Over 2-year Horizon")
hold off
nexttile
plot(DTT.Time,[Tbl0.LogGDP_RegressionResidual; Tbl.LogGDP_RegressionInnovation])
hold on
h = gca;
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
axis tight
grid on
title("Unconditional Disturbances")
hold off
```

The unconditional disturbances ${u}_{t}$ are nonstationary; therefore, the widths of the forecast intervals grow with time.
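
The growth of the interval widths can be illustrated with the standard MMSE forecast error variance of an ARIMA process, $\mathrm{MSE}(h)={\sigma }^{2}\sum _{j=0}^{h-1}{\psi }_{j}^{2}$, where ${\psi }_{j}$ are the coefficients of the infinite-order MA representation of the error model and the predictors are treated as fixed. The following sketch is an assumption-laden illustration in base MATLAB, not a description of how `forecast` is implemented: it copies the AR, MA, and variance estimates from the estimation display above, and the variable names (`phi`, `theta`, `psi`, `mse`) are hypothetical.

```matlab
% Estimates copied from the ARIMA(1,1,1) error model display (treated as given)
phi    = 0.92869;      % AR{1}
theta  = -0.39063;     % MA{1}
sigma2 = 0.00010668;   % innovation variance
h = 8;                 % forecast horizon

% Error model in lag operator form: (1 - phi*L)(1 - L)u_t = (1 + theta*L)e_t
a = conv([1 -phi],[1 -1]);   % AR polynomial including the integration term
b = [1 theta];               % MA polynomial

% Psi weights are the impulse response of the error model
psi = filter(b,a,[1 zeros(1,h-1)]);

% h-step-ahead forecast MSE and approximate 95% interval half-widths
mse = sigma2*cumsum(psi.^2);
halfWidth = 1.96*sqrt(mse)
```

Because the integration term keeps the ${\psi }_{j}$ weights from decaying to zero, the cumulative sum, and therefore the interval width, keeps increasing with the horizon.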

## Input Arguments


Fully specified regression model with ARIMA errors, specified as a `regARIMA` model object created by `regARIMA` or `estimate`.

The properties of `Mdl` cannot contain `NaN` values.

Forecast horizon, or the number of time points in the forecast period, specified as a positive integer.

Data Types: `double`

Since R2023b

Presample data containing presample responses ${y}_{t}$, predictors ${x}_{t}$, unconditional disturbances ${u}_{t}$, or error model innovations ${\epsilon }_{t}$, to initialize the model, specified as a table or timetable with `numprevars` variables and `numpreobs` rows. You can select a response, error model innovation, unconditional disturbance, or multiple predictor variables from `Presample` by using the `PresampleResponseVariable`, `PresampleInnovationVariable`, `PresampleRegressionDisturbanceVariable`, or `PresamplePredictorVariables` name-value argument, respectively.

`numpreobs` is the number of presample observations. `numpaths` is the maximum number of independent presample paths among the specified variables, from which `forecast` initializes the resulting `numpaths` forecasts (see Algorithms).

For all selected variables except predictor variables, each variable contains a single path (`numpreobs`-by-1 vector) or multiple paths (`numpreobs`-by-`numpaths` matrix) of presample response, error model innovation, or unconditional disturbance data.

Each selected predictor variable contains a single path of observations. `forecast` applies all selected predictor variables to each forecasted path.

Each row is a presample observation, and measurements in each row occur simultaneously. The last row contains the latest presample observation. `forecast` uses only the latest required rows. For more details, see Time Base Partitions for Forecasting.

Presample unconditional disturbances ${u}_{t}$ are required to initialize the error model for forecasting. You can specify presample unconditional disturbances in one of the following ways:

• Specify `numpreobs` ≥ `Mdl.P` presample response and predictor data to enable `forecast` to infer presample unconditional disturbances.

• Specify `numpreobs` ≥ `Mdl.P` presample unconditional disturbances without presample error model innovations. `forecast` ignores specified presample response and predictor data.

• Specify `numpreobs` ≥ `Mdl.Q` presample error model innovations without presample unconditional disturbances. `forecast` ignores specified presample response and predictor data.

• Specify `numpreobs` ≥ `max(Mdl.P,Mdl.Q)` presample error model innovations and unconditional disturbances only. `forecast` ignores specified presample response and predictor data.
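
The row counts in this list depend on the `P` and `Q` properties of the model. As a quick check, and assuming a template with ARIMA(1,1,1) errors like the one used in the examples, the following sketch queries those properties; the variable names `numU0`, `numE0`, and `numBoth` are hypothetical.

```matlab
% Template with ARIMA(1,1,1) errors; the degrees are known before estimation
Mdl = regARIMA(1,1,1);

% P is the AR degree plus the degree of integration; Q is the MA degree
numU0 = Mdl.P                 % presample unconditional disturbances needed
numE0 = Mdl.Q                 % presample error model innovations needed
numBoth = max(Mdl.P,Mdl.Q)    % rows needed when supplying both series
```

Checking these values before building `Presample` helps confirm that the table or timetable has enough rows.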

If `Presample` is a timetable, all the following conditions must be true:

• `Presample` must represent a sample with a regular datetime time step (see `isregular`).

• The datetime vector of sample timestamps `Presample.Time` must be ascending or descending.

If `Presample` is a table, the last row contains the latest presample observation.

By default, `forecast` sets all necessary presample unconditional disturbances in one of the following ways:

• If `forecast` cannot infer enough unconditional disturbances from specified presample response and predictor data, `forecast` sets all necessary presample unconditional disturbances to zero.

• If you specify at least `Mdl.P + Mdl.Q` presample unconditional disturbances, `forecast` infers all necessary presample error model innovations from the specified presample unconditional disturbances. Otherwise, `forecast` sets all necessary presample error model innovations to zero.

Since R2023b

Presample unconditional disturbance variable ${u}_{t}$ to select from `Presample` containing presample unconditional disturbance data, specified as one of the following data types:

• String scalar or character vector containing a variable name in `Presample.Properties.VariableNames`

• Variable index (positive integer) to select from `Presample.Properties.VariableNames`

• A logical vector, where `PresampleRegressionDisturbanceVariable(j) = true` selects variable `j` from `Presample.Properties.VariableNames`

The selected variable must be a numeric vector and cannot contain missing values (`NaN`s).

If you specify presample unconditional disturbance data in `Presample`, you must specify `PresampleRegressionDisturbanceVariable`.

Example: `PresampleRegressionDisturbanceVariable="StockRateU0"`

Example: `PresampleRegressionDisturbanceVariable=[false false true false]` or `PresampleRegressionDisturbanceVariable=3` selects the third table variable as the presample unconditional disturbance variable.

Data Types: `double` | `logical` | `char` | `cell` | `string`

Since R2023b

Forecasted (future) predictor data for the model regression component, specified as a table or timetable. `InSample` contains `numvars` variables, including `numpreds` predictor variables ${x}_{t}$.

`forecast` returns the forecasted variables in the output table or timetable `Tbl`, which is commensurate with `InSample`.

Each row corresponds to an observation in the forecast horizon, the first row is the earliest observation, and measurements in each row, among all paths, occur simultaneously. `InSample` must have at least `numperiods` rows to cover the forecast horizon. If you supply more rows than necessary, `forecast` uses only the first `numperiods` rows.

Each selected predictor variable is a numeric vector without missing values (`NaN`s). `forecast` applies the specified predictor variables to all forecasted paths.

If `InSample` is a timetable, the following conditions apply:

• `InSample` must represent a sample with a regular datetime time step (see `isregular`).

• The datetime vector `InSample.Time` must be ascending or descending.

• `Presample` must immediately precede `InSample`, with respect to the sampling frequency.

If `InSample` is a table, the last row contains the latest observation.

By default, `forecast` does not include the regression component in the model, regardless of the value of `Mdl.Beta`.

Since R2023b

Predictor variables ${x}_{t}$ to select from `InSample` containing predictor data for the model regression component in the forecast horizon, specified as one of the following data types:

• String vector or cell vector of character vectors containing `numpreds` variable names in `InSample.Properties.VariableNames`

• A vector of unique indices (positive integers) of variables to select from `InSample.Properties.VariableNames`

• A logical vector, where `PredictorVariables(j) = true ` selects variable `j` from `InSample.Properties.VariableNames`

The selected variables must be numeric vectors and cannot contain missing values (`NaN`s).

By default, `forecast` excludes the regression component, regardless of its presence in `Mdl`.

Example: `PredictorVariables=["M1SL" "TB3MS" "UNRATE"]`

Example: `PredictorVariables=[true false true false]` or `PredictorVariables=[1 3]` selects the first and third table variables to supply the predictor data.

Data Types: `double` | `logical` | `char` | `cell` | `string`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `forecast(Mdl,10,Y0=y0,X0=Pred0,XF=Pred)` specifies the presample response path `y0`, and the presample and forecast sample predictor data `Pred0` and `Pred`, respectively, to forecast a model with a regression component.

Presample response data ${y}_{t}$ to infer presample unconditional disturbances ${u}_{t}$, specified as a `numpreobs`-by-1 numeric column vector or a `numpreobs`-by-`numpaths` numeric matrix. When you supply `Y0`, supply all optional data as numeric arrays, and `forecast` returns results in numeric arrays.

Presample unconditional disturbances ${u}_{t}$ are required to initialize the error model for forecasting. `forecast` infers presample unconditional disturbances from `Y0` and specified presample predictor data `X0`. Therefore, if you specify presample unconditional disturbances `U0`, `forecast` ignores `Y0` and `X0`.

`numpreobs` is the number of presample observations. `numpaths` is the number of independent presample paths, from which `forecast` initializes the resulting `numpaths` forecasts (see Algorithms).

Each row is a presample observation, and measurements in each row occur simultaneously. The last row contains the latest presample observation. `numpreobs` must be at least `Mdl.P` to initialize the model. If `numpreobs` > `Mdl.P`, `forecast` uses only the latest `Mdl.P` rows. For more details, see Time Base Partitions for Forecasting.

Columns of `Y0` correspond to separate, independent presample paths.

• If `Y0` is a column vector, it represents a single path of the response series. `forecast` applies it to each forecasted path. In this case, all forecast paths `Y` derive from the same initial responses.

• If `Y0` is a matrix, each column represents a presample path of the response series. `numpaths` is the maximum among the second dimensions of the specified presample observation matrices `Y0`, `E0`, and `U0`.

By default, `forecast` defers to specified or default presample unconditional disturbances `U0`.

Data Types: `double`

Presample predictor data ${x}_{t}$ used to infer the presample unconditional disturbances ${u}_{t}$, specified as a `numpreobs`-by-`numpreds` numeric matrix. Use `X0` only when you supply the numeric array of presample response data `Y0` and your model contains a regression component. `numpreds` = `numel(Mdl.Beta)`.

Presample unconditional disturbances ${u}_{t}$ are required to initialize the error model for forecasting. `forecast` infers presample unconditional disturbances from `X0` and specified presample response data `Y0`. Therefore, if you specify presample unconditional disturbances `U0`, `forecast` ignores `Y0` and `X0`.

Each row is a presample observation, and measurements in each row occur simultaneously. The last row contains the latest presample observation. `numpreobs` must be at least `Mdl.P` to initialize the model. If `numpreobs` > `Mdl.P`, `forecast` uses only the latest `Mdl.P` rows. For more details, see Time Base Partitions for Forecasting.

Each column is an individual predictor variable. `forecast` applies `X0` to each path; that is, `X0` represents one path of observed predictors.

If you specify `X0` but you do not specify forecasted predictor data `XF`, `forecast` issues an error.

By default, `forecast` drops the regression component from the model when it infers presample unconditional disturbances, regardless of the value of the regression coefficient `Mdl.Beta`.

Data Types: `double`

Presample unconditional disturbance data ${u}_{t}$ to initialize the autoregressive (AR) component of the ARIMA error model, specified as a `numpreobs`-by-1 numeric column vector or a `numpreobs`-by-`numpaths` numeric matrix. When you supply `U0`, supply all optional data as numeric arrays, and `forecast` returns results in numeric arrays.

Each row is a presample observation, and measurements in each row occur simultaneously. The last row contains the latest presample observation. `numpreobs` must be at least `Mdl.P` to initialize the model. If `numpreobs` > `Mdl.P`, `forecast` uses only the latest `Mdl.P` rows. For more details, see Time Base Partitions for Forecasting.

Columns of `U0` correspond to separate, independent presample paths.

• If `U0` is a column vector, it represents a single path of the unconditional disturbance series. `forecast` applies it to each forecasted path. In this case, all forecasted paths derive from the same initial unconditional disturbances.

• If `U0` is a matrix, each column represents a presample path of the unconditional disturbance series. `numpaths` is the maximum among the second dimensions of the specified presample observation matrices `Y0`, `E0`, and `U0`.

By default, if the presample data (`Y0` and `X0`) contains at least `Mdl.P` rows, `forecast` infers `U0` from the presample data. If you do not specify presample data, then all required presample unconditional disturbances are zero.
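
The following sketch illustrates supplying `U0` as a matrix so that each column initializes its own forecast path. The model and the presample values are hypothetical and chosen only so that the result is easy to follow: with AR(1) errors, a zero intercept, and no regression component, the first forecasted disturbance of each path is the AR coefficient times the latest presample disturbance of that path.

```matlab
% Hypothetical model with AR(1) errors and no regression component
Mdl = regARIMA(Intercept=0,AR={0.5},Variance=1);

% Three presample rows (latest observation last) for two independent paths
U0 = [0.2 -0.1;
      0.4  0.3;
      1.0  2.0];

% forecast uses only the latest Mdl.P rows of U0 for each path
[Y,YMSE,U] = forecast(Mdl,3,U0=U0);

size(Y)    % one column of forecasts per presample path
U(1,:)     % first-step disturbance forecasts for the two paths
```

Supplying more rows than `Mdl.P` is harmless, because only the latest required rows are used.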

Data Types: `double`

Presample error model innovation data ${\epsilon }_{t}$ used to initialize the moving average (MA) component of the ARIMA error model, specified as a `numpreobs`-by-1 column vector or `numpreobs`-by-`numpaths` numeric matrix. Use `E0` only when you supply the numeric array of presample response data `Y0`. `forecast` assumes that the presample innovations have a mean of zero.

Each row is a presample observation, and measurements in each row occur simultaneously. The last row contains the latest presample observation. `numpreobs` must be at least `Mdl.Q` to initialize the model. If `numpreobs` is greater than required, `forecast` uses only the latest required rows.

Columns of `E0` correspond to separate, independent presample paths.

• If `E0` is a column vector, it represents a single path of the innovation series. `forecast` applies it to each forecasted path. In this case, all forecasts derive from the same initial error model innovations.

• If `E0` is a matrix, each column represents a presample path of the error model innovation series. `numpaths` is the maximum among the second dimensions of the specified presample observation matrices `Y0`, `E0`, and `U0`.

By default, if `U0` contains at least `Mdl.P` + `Mdl.Q` rows, `forecast` infers `E0` from `U0`. If `U0` has an insufficient number of rows and `forecast` cannot infer sufficient observations of `U0` from the presample data (`Y0` and `X0`), `forecast` sets necessary presample error model innovations to zero.

Data Types: `double`

Since R2023b

Response variable ${y}_{t}$ to select from `Presample` containing the presample response data, specified as one of the following data types:

• String scalar or character vector containing a variable name in `Presample.Properties.VariableNames`

• Variable index (positive integer) to select from `Presample.Properties.VariableNames`

• A logical vector, where `PresampleResponseVariable(j) = true` selects variable `j` from `Presample.Properties.VariableNames`

`forecast` uses specified presample response and predictor data to infer presample unconditional disturbances. If you specify enough presample unconditional disturbances or error model innovations by using `Presample` and `PresampleRegressionDisturbanceVariable` or `PresampleInnovationVariable`, `forecast` ignores `PresamplePredictorVariables` and `PresampleResponseVariable`.

The selected variable must be a numeric vector and cannot contain missing values (`NaN`s).

If you specify presample response data by using the `Presample` name-value argument, you must specify `PresampleResponseVariable`.

Example: `PresampleResponseVariable="StockRate"`

Example: `PresampleResponseVariable=[false false true false]` or `PresampleResponseVariable=3` selects the third table variable as the response variable.

Data Types: `double` | `logical` | `char` | `cell` | `string`

Since R2023b

Presample predictor variables xt to select from `Presample` containing presample predictor data for the regression component in the presample period, specified as one of the following data types:

• String vector or cell vector of character vectors containing `numpreds` variable names in `Presample.Properties.VariableNames`

• A vector of unique indices (positive integers) of variables to select from `Presample.Properties.VariableNames`

• A logical vector, where ```PresamplePredictorVariables(j) = true``` selects variable `j` from `Presample.Properties.VariableNames`

`forecast` uses specified presample response and predictor data to infer presample unconditional disturbances. If you specify enough presample unconditional disturbances or error model innovations by using `Presample` and `PresampleRegressionDisturbanceVariable` or `PresampleInnovationVariable`, `forecast` ignores `PresamplePredictorVariables` and `PresampleResponseVariable`.

The selected variables must be numeric vectors and cannot contain missing values (`NaN`s).

If you specify presample predictor data, you must also specify in-sample predictor data by using the `InSample` and `PredictorVariables` name-value arguments.

By default, `forecast` excludes the regression component, regardless of its presence in `Mdl`.

Example: ```PresamplePredictorVariables=["M1SL" "TB3MS" "UNRATE"]```

Example: `PresamplePredictorVariables=[true false true false]` or `PresamplePredictorVariables=[1 3]` selects the first and third table variables to supply the predictor data.

Data Types: `double` | `logical` | `char` | `cell` | `string`

Since R2023b

Error model innovation variable εt to select from `Presample` containing presample error model innovation data, specified as one of the following data types:

• String scalar or character vector containing a variable name in `Presample.Properties.VariableNames`

• Variable index (positive integer) to select from `Presample.Properties.VariableNames`

• A logical vector, where ```PresampleInnovationVariable(j) = true``` selects variable `j` from `Presample.Properties.VariableNames`

The selected variable must be a numeric matrix and cannot contain missing values (`NaN`s).

If you specify presample error model innovation data in `Presample`, you must specify `PresampleInnovationVariable`.

Example: `PresampleInnovationVariable="StockRateDist0"`

Example: `PresampleInnovationVariable=[false false true false]` or `PresampleInnovationVariable=3` selects the third table variable as the presample error model innovation variable.

Data Types: `double` | `logical` | `char` | `cell` | `string`

Forecasted (or future) predictor data, specified as a numeric matrix with `numpreds` columns. `XF` represents the evolution of specified presample predictor data `X0` forecasted into the future (the forecast period). Use `XF` only when you supply the numeric arrays of presample response and predictor data `Y0` and `X0`, respectively.

Rows of `XF` correspond to time points in the future; `XF(t,:)` contains the `t`-period-ahead predictor forecasts. `XF` must have at least `numperiods` rows. If the number of rows exceeds `numperiods`, `forecast` uses only the first (earliest) `numperiods` forecasts. For more details, see Time Base Partitions for Forecasting.

Columns of `XF` are separate time series variables, and they correspond to the columns of `X0` and `Mdl.Beta`.

`forecast` treats `XF` as a fixed (nonstochastic) matrix.

By default, the `forecast` function generates forecasts from `Mdl` without a regression component, regardless of the value of the regression coefficient `Mdl.Beta`.

Note

• `NaN` values in `X0`, `Y0`, `U0`, `E0`, and `XF` indicate missing values. `forecast` removes missing values from specified data by list-wise deletion.

• For the presample, `forecast` horizontally concatenates the possibly jagged arrays `X0`, `Y0`, `U0`, and `E0` with respect to the last rows, and then it removes any row of the concatenated matrix containing at least one `NaN`.

• For in-sample data, `forecast` removes any row of `XF` containing at least one `NaN`.

This type of data reduction reduces the effective sample size and can create an irregular time series.

• For numeric data inputs, `forecast` assumes that you synchronize the presample data such that the latest observations occur simultaneously.

• `forecast` issues an error when any table or timetable input contains missing values.

• Set presample response and predictor data to the same response and predictor data as used in the estimation, simulation, or inference of `Mdl`. This assignment ensures correct inference of the required presample unconditional disturbances.

• To include a regression component in the response forecast, you must specify the forecasted predictor data. You can specify forecasted predictor data without also specifying presample predictor data, but `forecast` issues an error when you specify presample predictor data without also specifying forecasted predictor data.
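The list-wise deletion described in the note above can be sketched in plain MATLAB. This is only an illustration of the documented behavior, not the internal code of `forecast`. The variable names `n`, `Z`, and `keep` are hypothetical, and the sketch assumes that aligning jagged arrays "with respect to the last rows" is equivalent to keeping the latest rows that all arrays share.

```matlab
% Presample arrays of different lengths (jagged); the last rows are the latest
Y0 = [1; 2; 3; NaN; 5];           % five presample responses, one missing
X0 = [10 20; 11 21; 12 22];       % three presample predictor observations

% Align the arrays on their last rows (assumption: keep the common latest rows)
n = min(size(Y0,1),size(X0,1));
Z = [Y0(end-n+1:end,:) X0(end-n+1:end,:)];

% Remove every row that contains at least one NaN
keep = ~any(isnan(Z),2);
Z = Z(keep,:);

disp(Z)   % the remaining rows are no longer consecutive in time
```

In this sketch the middle row is dropped, so the two surviving observations are two periods apart, which is the kind of irregular series the note warns about.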

## Output Arguments

MMSE forecasted responses yt, returned as a `numperiods`-by-1 column vector or a `numperiods`-by-`numpaths` numeric matrix. `Y` represents a continuation of `Y0` (`Y(1,:)` occurs in the time point immediately after `Y0(end,:)`). `forecast` returns `Y` by default and when you supply optional presample data in numeric arrays.

`Y(t,:)` contains the `t`-period-ahead forecasts, or the forecast of all paths for time point `t` in the forecast period.

`forecast` determines `numpaths` from the number of columns in the presample data sets `Y0`, `E0`, and `U0`. For details, see Algorithms. If each presample data set has one column, `Y` is a column vector.

Data Types: `double`

MSE of the forecasted responses `Y` (forecast error variances), returned as a `numperiods`-by-1 column vector or a `numperiods`-by-`numpaths` numeric matrix. `forecast` returns `YMSE` by default and when you supply optional presample data in numeric arrays.

`YMSE(t,:)` contains the forecast error variances of all paths for time point `t` in the forecast period.

`forecast` determines `numpaths` from the number of columns in the presample data sets `Y0`, `E0`, and `U0`. For details, see Algorithms. If you do not specify any presample data sets, or if each data set is a column vector, `YMSE` is a column vector.

The square roots of `YMSE` are the standard errors of the forecasts `Y`.
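As a minimal sketch of how the standard errors can be used, the following code builds approximate 95% forecast intervals from `Y` and `YMSE`. The model coefficients, the simulated predictor matrix, and the names `se`, `lower`, and `upper` are illustrative assumptions; the 1.96 multiplier assumes Gaussian innovations.

```matlab
% Fully specified regression model with AR(2) errors (illustrative values)
Mdl = regARIMA('Intercept',0,'AR',{0.5 -0.3},'Beta',[2; -1],'Variance',1);

rng(1);                      % for reproducibility
numperiods = 10;
XF = randn(numperiods,2);    % hypothetical future predictor data

[Y,YMSE] = forecast(Mdl,numperiods,'XF',XF);

se = sqrt(YMSE);             % standard errors of the forecasts
lower = Y - 1.96*se;         % approximate 95% interval, assuming normality
upper = Y + 1.96*se;
```

Because no presample data is supplied here, `forecast` relies on its default presample values.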

Data Types: `double`

MMSE forecasts of ARIMA error model unconditional disturbances, returned as a `numperiods`-by-1 column vector or a `numperiods`-by-`numpaths` numeric matrix. `U` represents a continuation of `U0` (`U(1,:)` occurs in the time point immediately after `U0(end,:)`). `forecast` returns `U` by default and when you supply optional presample data in numeric arrays.

`U(t,:)` contains the `t`-period-ahead forecasted unconditional disturbances, or the conditional mean forecast of the error model over all paths for time point `t` in the forecast period.

`forecast` determines `numpaths` from the number of columns in the presample data sets `Y0`, `E0`, and `U0`. For details, see Algorithms.

Data Types: `double`

Since R2023b

Paths of MMSE forecasts of responses yt, corresponding forecast MSEs, and MMSE forecasts of unconditional disturbances ut, returned as a table or timetable, the same data type as `Presample` or `InSample`. `forecast` returns `Tbl` only when you supply `Presample` or `InSample`.

`Tbl` contains the following variables:

• The forecasted response paths, which are in a `numperiods`-by-`numpaths` numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the input presample paths in `Presample` or preceding the in-sample period in `InSample`. `forecast` names the forecasted response variable `responseName_Response`, where `responseName` is `Mdl.SeriesName`. For example, if `Mdl.SeriesName` is `GDP`, `Tbl` contains a variable for the corresponding forecasted response paths with the name `GDP_Response`.

Each path in `Tbl.responseName_Response` represents the continuation of the corresponding presample response path in `Presample` (`Tbl.responseName_Response(1,:)` occurs in the next time point, with respect to the periodicity of `Presample`, after the last presample response). `Tbl.responseName_Response(j,k)` contains the `j`-period-ahead forecasted response of path `k`.

• The forecast MSE paths, which are in a `numperiods`-by-`numpaths` numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the forecasted responses in `Tbl.responseName_Response`. `forecast` names the forecast MSEs `responseName_MSE`, where `responseName` is `Mdl.SeriesName`. For example, if `Mdl.SeriesName` is `GDP`, `Tbl` contains a variable for the corresponding forecast MSE with the name `GDP_MSE`.

• The forecasted unconditional disturbance paths, which are in a `numperiods`-by-`numpaths` numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths. `forecast` names the forecasted unconditional disturbance variable `responseName_RegressionInnovation`, where `responseName` is `Mdl.SeriesName`. For example, if `Mdl.SeriesName` is `GDP`, `Tbl` contains a variable for the corresponding forecasted unconditional disturbance paths with the name `GDP_RegressionInnovation`.

Each path in `Tbl.responseName_RegressionInnovation` represents a continuation of the presample unconditional disturbance process, either supplied by or inferred from `Presample`, or set by default (`Tbl.responseName_RegressionInnovation(1,:)` occurs in the next time point, with respect to the periodicity of `Presample`, after the last presample unconditional disturbance). `Tbl.responseName_RegressionInnovation(j,k)` contains the `j`-period-ahead forecasted unconditional disturbance of path `k`.

• When you supply `InSample`, `Tbl` contains all variables in `InSample`.

If `Presample` is a timetable, the following conditions hold:

• The row order of `Tbl`, either ascending or descending, matches the row order of `Presample`.

• `Tbl.Time(1)` is the next time after `Presample.Time(end)` relative to the sampling frequency, and `Tbl.Time(2:numobs)` are the following times relative to the sampling frequency.
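The following sketch shows one way to request tabular output, assuming R2023b or later. The model coefficients and the presample variable name `U0` are illustrative assumptions. The code lists the variable names of `Tbl` instead of hard-coding them, because the names depend on `Mdl.SeriesName`.

```matlab
% Regression model with AR(2) errors and no regression component (illustrative)
Mdl = regARIMA(Intercept=0,AR={0.5 -0.3},Variance=1);

% Presample table holding one path of unconditional disturbances
% (the variable name "U0" is a hypothetical choice)
Presample = table([0.2; 0.1],VariableNames="U0");

numperiods = 4;
Tbl = forecast(Mdl,numperiods,Presample=Presample, ...
    PresampleRegressionDisturbanceVariable="U0");

% Inspect the variables that forecast created
disp(Tbl.Properties.VariableNames)
```

Per the description above, the listed names should follow the `responseName_Response`, `responseName_MSE`, and `responseName_RegressionInnovation` pattern.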

### Time Base Partitions for Forecasting

Time base partitions for forecasting are two disjoint, contiguous intervals of the time base; each interval contains time series data for forecasting a dynamic model. The forecast period (forecast horizon) is a `numperiods` length partition at the end of the time base during which `forecast` generates forecasts `Y` from the dynamic model `Mdl`. The presample period is the entire partition occurring before the forecast period. `forecast` can require observed responses `Y0`, regression data `X0`, unconditional disturbances `U0`, or innovations `E0` in the presample period to initialize the dynamic model for forecasting. The model structure determines the types and amounts of required presample observations.

A common practice is to fit a dynamic model to a portion of the data set, then validate the predictability of the model by comparing its forecasts to observed responses. During forecasting, the presample period contains the data to which the model is fit, and the forecast period contains the holdout sample for validation. Suppose that yt is an observed response series; x1,t, x2,t, and x3,t are observed exogenous series; and time t = 1,…,T. Consider forecasting responses from a dynamic model of yt containing a regression component over `numperiods` = K periods. Suppose that the dynamic model is fit to the data in the interval [1,T–K] (for more details, see `estimate`). This figure shows the time base partitions for forecasting.

For example, to generate forecasts `Y` from a regression model with AR(2) errors, `forecast` requires presample unconditional disturbances `U0` and future predictor data `XF`.

• `forecast` infers unconditional disturbances given enough readily available presample responses and predictor data. To initialize an AR(2) error model, `Y0` = ${\left[\begin{array}{cc}{y}_{T-K-1}& {y}_{T-K}\end{array}\right]}^{\prime }$ and `X0` = $\left[\begin{array}{ccc}{x}_{1,T-K-1}& {x}_{2,T-K-1}& {x}_{3,T-K-1}\\ {x}_{1,T-K}& {x}_{2,T-K}& {x}_{3,T-K}\end{array}\right]$.

• To model the regression component in the forecast period, `forecast` requires future exogenous data `XF` = $\left[\begin{array}{ccc}{x}_{1,\left(T-K+1\right):T}& {x}_{2,\left(T-K+1\right):T}& {x}_{3,\left(T-K+1\right):T}\end{array}\right]$.

This figure shows the arrays of required observations for the general case, with corresponding input and output arguments.
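The holdout workflow described in this section can be sketched as follows. The data is simulated, and the data-generating coefficients and the names `TrueMdl`, `EstMdl`, and `rmse` are illustrative assumptions, not part of the original page.

```matlab
rng(0);                          % for reproducibility
T = 100;                         % length of the time base
K = 10;                          % forecast horizon (numperiods)

% Simulate three exogenous series and a response with AR(2) errors
X = randn(T,3);
TrueMdl = regARIMA('Intercept',1,'AR',{0.5 -0.2}, ...
    'Beta',[1; 2; 3],'Variance',0.5);
y = simulate(TrueMdl,T,'X',X);

% Partition the time base: presample [1,T-K], forecast period [T-K+1,T]
Y0 = y(1:T-K);
X0 = X(1:T-K,:);
XF = X(T-K+1:T,:);

% Fit a regression model with AR(2) errors to the presample data
EstMdl = estimate(regARIMA(2,0,0),Y0,'X',X0,'Display','off');

% Forecast the holdout period and compare with the observed responses
[Y,YMSE] = forecast(EstMdl,K,'Y0',Y0,'X0',X0,'XF',XF);
rmse = sqrt(mean((y(T-K+1:T) - Y).^2));
```

Passing the full presample arrays is convenient because `forecast` uses only the latest rows it needs to initialize the error model.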

## Algorithms

• The `forecast` function sets the number of sample paths `numpaths` to the maximum number of columns among the specified presample data sets `E0`, `U0`, and `Y0`.

All specified presample data sets must have either one column or `numpaths` > 1 columns. Otherwise, `forecast` issues an error. For example, if you supply `Y0` and `E0`, and `Y0` has five columns representing five paths, then `E0` can have one column or five columns. If `E0` has one column, `forecast` applies `E0` to each path.

• `forecast` computes the forecasted response MSEs by treating the predictor data matrices as nonstochastic and statistically independent of the model innovations. Therefore, the forecast MSEs reflect the variances associated with the unconditional disturbances of the ARIMA error model alone.

• `forecast` uses presample response and predictor data to infer presample unconditional disturbances. Therefore, if you specify presample unconditional disturbances, `forecast` ignores any specified presample response and predictor data.
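The first two points above can be illustrated with a small sketch. The model values and the presample matrix are illustrative assumptions. The hand computations use the standard AR(2) forecast recursion and the textbook forecast error variance formula (the innovation variance times the cumulative sum of squared ψ-weights), which the outputs of `forecast` would be expected to match for this model.

```matlab
% Regression model with AR(2) errors and no regression component (illustrative)
phi1 = 0.5; phi2 = -0.3; sigma2 = 1;
Mdl = regARIMA('Intercept',0,'AR',{phi1 phi2},'Variance',sigma2);

% Three presample paths of unconditional disturbances; the last row is the latest
U0 = [0.2 -1.0  0.5;
      0.1  0.4 -0.7];

numperiods = 4;
[Y,YMSE,U] = forecast(Mdl,numperiods,'U0',U0);

% numpaths is the largest column count among the presample data sets
numpaths = size(U0,2);
disp(size(Y))                    % numperiods-by-numpaths

% Hand computation of the 1-period-ahead disturbance forecast of each path
u1 = phi1*U0(end,:) + phi2*U0(end-1,:);

% Hand computation of the 1- and 2-period-ahead forecast error variances
mse = sigma2*[1; 1 + phi1^2];
```

Because the predictor data is treated as nonstochastic, the hand-computed variances involve only the error model parameters.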

## Version History

Introduced in R2013b
