# Picking Instrumental Variables for System Identification

This example provides a simple illustration of the meaning and choice of instrumental variables (IVs) for system identification. In the System Identification Toolbox™, the ARX model estimation using the least-squares approach is performed using the `arx` command, while the instrumental variable based technique is used in the following functions:

• `ivx` - where you can provide arbitrary signal x(t) for the instruments.

• `iv4` - same as ivx but the optimal instruments are determined automatically

• `ivar` - AR/ARMA model for time-series data; instruments are chosen automatically

### Introduction

An identification problem typically involves determining the coefficients of a difference- or differential-equation that relates certain measured input u(t) to the measured output y(t). Such an equation is commonly denoted by a model such as a transfer function, a polynomial model, or a state-space model. An example is the ARX model:

`$\mathit{A}\left(\mathit{q}\right)\mathit{y}\left(\mathit{t}\right)=\mathit{B}\left(\mathit{q}\right)\mathit{u}\left(\mathit{t}\right)+\mathit{e}\left(\mathit{t}\right)$`

where $\mathit{A}\left(\mathit{q}\right),\mathit{B}\left(\mathit{q}\right)$ are polynomial composed of linear delay operators, and $\mathit{e}\left(\mathit{t}\right)$ are unmeasured disturbances typically described as a Gaussian white noise of zero mean and a constant variance (an i.i.d. sequence). If we pick $\mathit{A}\left(\mathit{q}\right)=1-{\mathit{a}\text{\hspace{0.17em}}\mathit{q}}^{-1},\text{\hspace{0.17em}}\mathit{B}\left(\mathit{q}\right)={\mathit{b}\text{\hspace{0.17em}}\mathit{q}}^{-1}$, the resulting ARX model is:

$\mathit{y}\left(\mathit{t}\right)-\mathit{a}\text{\hspace{0.17em}}\mathit{y}\left(\mathit{t}-1\right)=\mathit{b}\text{\hspace{0.17em}}\mathit{u}\left(\mathit{t}-1\right)+\mathit{e}\left(\mathit{t}\right)$, or:

$\mathit{y}\left(\mathit{t}\right)=\mathit{a}\text{\hspace{0.17em}}\mathit{y}\left(\mathit{t}-1\right)+\mathit{b}\text{\hspace{0.17em}}\mathit{u}\left(\mathit{t}-1\right)+\mathit{e}\left(\mathit{t}\right)$ ... (1)

Another example is an Output-Error (OE) model, typically, referred to as a discrete-time transfer function:

`$\mathit{y}\left(\mathit{t}\right)=\frac{\mathit{B}\left(\mathit{q}\right)}{\mathit{A}\left(\mathit{q}\right)}\mathit{u}\left(\mathit{t}\right)+\mathit{e}\left(\mathit{t}\right)$`

For $\mathit{A}\left(\mathit{q}\right)=1-{\mathit{a}\text{\hspace{0.17em}}\mathit{q}}^{-1},\text{\hspace{0.17em}}\mathit{B}\left(\mathit{q}\right)={\mathit{b}\text{\hspace{0.17em}}\mathit{q}}^{-1}$, the difference equation corresponding to this structure is:

$\mathit{y}\left(\mathit{t}\right)=\mathit{a}\text{\hspace{0.17em}}\mathit{y}\left(\mathit{t}-1\right)+\mathit{b}\text{\hspace{0.17em}}\mathit{u}\left(\mathit{t}-1\right)+\mathit{e}\left(\mathit{t}\right)-\mathit{a}\text{\hspace{0.17em}}\mathit{e}\left(\mathit{t}-1\right)$ ... (2)

A simple, but practical view of the instrumental variables is that they are auxiliary signals $\mathit{e}\left(\mathit{t}\right)$ (as such different from $\mathit{u}\left(\mathit{t}\right)$ or $\mathit{y}\left(\mathit{t}\right)$). that helps compensate for the effect of disturbances in determining accurate estimates of the model parameters (i.e., the constants $\mathit{a},\mathit{b}$ in the ARX and OE models above). The motivation behind the choice of $\mathit{x}\left(\mathit{t}\right)$ is to correlate out the effect of disturbance $\mathit{e}\left(\mathit{t}\right)$ from the equations. This is best seen when using the correlation based approach to solving stochastic difference equations. For example, multiply both the side of equation (1) with random variable $\mathit{x}\left(\mathit{t}\right)$ and compute expectations. This gives:

`$\mathit{E}\left\{\mathit{x}\left(\mathit{t}\right)\cdot \mathit{y}\left(\mathit{t}\right)\right\}=\mathit{a}\text{\hspace{0.17em}}\mathit{E}\left\{\mathit{x}\left(\mathit{t}\right)\cdot \mathit{y}\left(\mathit{t}-1\right)\right\}+\mathit{b}\text{\hspace{0.17em}}\mathit{E}\left\{\mathit{x}\left(\mathit{t}\right)\cdot \mathit{u}\left(\mathit{t}-1\right)\right\}+\mathit{E}\left\{\mathit{x}\left(\mathit{t}\right)\cdot \mathit{e}\left(\mathit{t}\right)\right\}$`

where $\mathit{E}\left\{\cdot \right\}$is the expectation operator. The expectation can be computed as a sample mean of the quantities over a given time range. For example, if the measurements of $\mathit{u}\left(\mathit{t}\right),\mathit{y}\left(\mathit{t}\right),\mathit{x}\left(\mathit{t}\right)$ are available over a time range of $\mathit{t}=1,2,...,\mathit{N}$seconds, then:

`$\mathit{E}\left\{\mathit{x}\left(\mathit{t}\right)\cdot \mathit{y}\left(\mathit{t}\right)\right\}\approx \frac{1}{\mathit{N}}\sum _{\mathit{t}=1}^{\mathit{N}}\mathit{x}\left(\mathit{t}\right)\mathit{y}\left(\mathit{t}\right)\equiv {\mathit{R}}_{\mathrm{xy}}\left(0\right)$`

where ${\mathit{R}}_{\mathrm{xy}}\left(\tau \right)$ denotes the correlation between the variables $\mathit{x}\left(\mathit{t}\right)$ and $\mathit{y}\left(\mathit{t}-\tau \right)$. The correlation equation then takes the form:

${\mathit{R}}_{\mathrm{xy}}\left(0\right)=\mathit{a}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{xy}}\left(1\right)+\mathit{b}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{xu}}\left(1\right)+{\mathit{R}}_{\mathrm{xe}}\left(0\right)$ ... (3)

Suppose we pick $\mathit{x}\left(\mathit{t}\right)$ to be $\mathit{y}\left(\mathit{t}-1\right)$, that is, the output variables shifted by one sample. Then we get:

`${\mathit{R}}_{\mathrm{yy}}\left(1\right)=\mathit{a}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yy}}\left(0\right)+\mathit{b}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yu}}\left(0\right)+{\mathit{R}}_{\mathrm{ye}}\left(1\right)$`

Since we can typically assume that $\mathit{e}\left(\mathit{t}\right)$ is uncorrelated with $\mathit{y}\left(\mathit{t}-1\right)$, we have ${\mathit{R}}_{\mathrm{ye}}\left(1\right)\approx 0$. Thus we have managed to correlate out the effect of $\mathit{e}\left(\mathit{t}\right)$ in equation (3) by pick the instrumental variable as $\mathit{x}\left(\mathit{t}\right)=\mathit{y}\left(\mathit{t}-1\right)$. Of course, this is not the only possible choice for $\mathit{x}\left(\mathit{t}\right)$. It can be any signal that shows strong correlation with the output $\mathit{y}\left(\mathit{t}\right)$ but is uncorrelated with the disturbance $\mathit{e}\left(\mathit{t}\right)$.

### Identification of ARX Models

Given samples of input and output signals over a time range, we need to solve for the coefficients of the delay operator polynomials $\mathit{A}\left(\mathit{q}\right),\mathit{B}\left(\mathit{q}\right)$. In ARX equation (1) we can minimize the sum of squared errors over a time span, that is, minimize $\mathit{V}=\frac{1}{\mathit{N}}{\sum }_{\mathit{t}}{\mathit{e}}^{2}\left(\mathit{t}\right)$. Since equation (1) is linear in the unknowns, minimization of $\mathit{V}$ can be achieved using linear least squares, typically performed using the backslash ("\") operator in MATLAB. Define the regressor matrix $\Phi$ as a matrix composed of the observations of $\mathit{y}\left(1-1\right),\mathit{u}\left(\mathit{t}-1\right)$, and $\mathit{Y}$ the vector of output observations, that is:

`$\Phi \doteq \left(\begin{array}{cc}{y}_{1}& {u}_{1}\\ {y}_{2}& {u}_{2}\\ ⋮& ⋮\\ {y}_{N-1}& {u}_{N-1}\end{array}\right),\phantom{\rule{0.2777777777777778em}{0ex}}\phantom{\rule{0.2777777777777778em}{0ex}}Y\doteq \left(\begin{array}{c}{y}_{2}\\ {y}_{3}\\ ⋮\\ {y}_{N}\end{array}\right)$`

then we have:

$\Phi \theta =Y$,

where $\theta \doteq \left(\begin{array}{c}a\\ b\end{array}\right)$is the vector of unknowns. Given numeric values for matrices $\Phi ,\mathit{Y}$, we can solve for $\theta$ as $\theta =\Phi \\mathit{Y}$. In most cases, the backslash operator computes the least squares solution $\theta =\left({\Phi }^{\mathit{T}}\Phi \right){\Phi }^{\mathit{T}}\\mathit{Y}$. This is the solution approach used by the `arx` command.

#### Stochastic Difference Equation Based Approach to Estimation

Note that the least-squares solution can be written as:

`$\theta =\left[\frac{1}{N}\sum _{t=1}^{N}\varphi \left(t\right){\varphi }^{T}\left(t\right){\right]}^{-1}\left[\frac{1}{N}\sum _{t=1}^{N}\varphi \left(t\right)y\left(t\right)\right]$`

where $\varphi \left(t\right)=\left(\begin{array}{c}y\left(t-1\right)\\ u\left(t-1\right)\end{array}\right)$ is the vector of model regressors. The two summations can be viewed as correlation matrices:

`${R}_{\varphi \varphi }\doteq \frac{1}{N}\sum _{t=1}^{N}\left(\begin{array}{cc}{y}^{2}\left(t-1\right)& y\left(t-1\right)u\left(t-1\right)\\ u\left(t-1\right)y\left(t-1\right)& {u}^{2}\left(t-1\right)\end{array}\right)=\left(\begin{array}{cc}{R}_{yy}\left(0\right)& {R}_{yu}\left(0\right)\\ {R}_{yu}\left(0\right)& {R}_{uu}\left(0\right)\end{array}\right)$`

`${R}_{\varphi y}\doteq \frac{1}{N}\sum _{t=1}^{N}\left(\begin{array}{c}y\left(t-1\right)y\left(t\right)\\ u\left(t-1\right)y\left(t\right)\end{array}\right)=\left(\begin{array}{c}{R}_{yy}\left(1\right)\\ {R}_{uy}\left(1\right)\end{array}\right)$`

This means that the least-squares solution can also be derived by using the correlation approach, where we pick $\mathit{y}\left(\mathit{t}-1\right)$, and $\mathit{u}\left(\mathit{t}-1\right)$ as instrumental variables. In particular, using ${\mathit{x}}_{1}\left(\mathit{t}\right)=\mathit{y}\left(\mathit{t}-1\right)$ as the first instrumental variable the correlation equation (obtained by multiplying both sides of the equation (1) by $\mathit{y}\left(\mathit{t}-1\right)$ and taking expectation) becomes:

${\mathit{R}}_{\mathrm{yy}}\left(1\right)=\mathit{a}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yy}}\left(0\right)+\mathit{b}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yu}}\left(0\right)$ .... 4(a)

Using ${\mathit{x}}_{2}\left(\mathit{t}\right)=\mathit{u}\left(\mathit{t}-1\right)$ as the instrumental variable gives:

${\mathit{R}}_{\mathrm{yu}}\left(1\right)=\mathit{a}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yu}}\left(0\right)+\mathit{b}\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{uu}}\left(0\right)$ .... 4(b)

Equations 4(a), and 4(b) can be solved simultaneously to obtain the estimates of the unknowns $\mathit{a}$ and $\mathit{b}.$ Thus the standard ARX algorithm amounts to picking the regressors themselves as the instrumental variables.

```% Load input/output data from a SISO system load IODataForIVExample IOData idplot(IOData)``` ```% Estimate an ARX Model of order na = 2, nb = 2, nk = 1. This amounts to using the regressors % y(t-1), y(t-2), u(t-1), u(t-2) na = 2; nb = 2; nk = 1; sysARX = arx(IOData,[na nb nk])```
```sysARX = Discrete-time ARX model: A(z)y(t) = B(z)u(t) + e(t) A(z) = 1 - 0.8281 z^-1 + 0.09117 z^-2 B(z) = 0.8454 z^-1 + 1.233 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: na=2 nb=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using ARX on time domain data "IOData". Fit to estimation data: 53.67% (prediction focus) FPE: 4.55, MSE: 4.371 Model Properties ```

Now estimate a model using the instrumental variable approach. Use the model regressors as the 4 instrumental variables. For the ARX model, the IV solution takes the form:

`$\theta =\left[\frac{1}{N}\sum _{t=1}^{N}x\left(t\right){\varphi }^{T}\left(t\right){\right]}^{-1}\left[\frac{1}{N}\sum _{t=1}^{N}x\left(t\right)y\left(t\right)\right]$`

where $\mathit{x}\left(\mathit{t}\right)$ is the vector of instrumental variables. For the `ivx` command, the instrumental variable information is provided using the third input argument `X`.

You do not need to generate the regressor data manually. You can simply provide a filtered version of the output signals to be used for generating the instrumental variables. The `ivx` command automatically generates the output related instruments by shifting the provided signal according to the `na` value. For example, if `na=2`, then the `ivx` command internally generates X`(t-1), X(t-2)` as instrumental variables. For this example, `X` is same as the measured output - `IOData.y1`.

Note also that the shifted versions of the input signal are automatically added in accordance to the value of `nb` and `nk`. Thus if `nb=2, nk=1`, then the `ivx` command internally generates `u(t-1), u(t-2)` and adds them to the list of instrumental variables.

```X = IOData.y1; % X is the signal from which the first 2 IVs are generated. % Rhe other 2 are automatically added using the input signal IOData.u1. sysIV = ivx(IOData, [na nb nk], X)```
```sysIV = Discrete-time ARX model: A(z)y(t) = B(z)u(t) + e(t) A(z) = 1 - 0.8281 z^-1 + 0.09117 z^-2 B(z) = 0.8454 z^-1 + 1.233 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: na=2 nb=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using IVX on time domain data "IOData". Fit to estimation data: 53.67% (prediction focus) FPE: 4.55, MSE: 4.371 Model Properties ```

As seen by this exercise, `sysIV` matches `sysARX`. Thus using the output signal itself as the instrumental variable generator, we can reproduce the least-squares solution of the `arx` command. We can verify these results by manual computation. Note that $\theta ={\left({\mathit{b}}_{1},{\mathit{b}}_{2},{\mathit{a}}_{1},{\mathit{a}}_{2}\right)}^{\mathit{T}}$ is the vector of unknowns.

```%The GETREG command can be used for quickly generating the regressor data. vars = IOData.Properties.VariableNames; L = linearRegressor(vars,1:2); D = getreg(L,IOData); head(D)```
``` Time u1(t-1) u1(t-2) y1(t-1) y1(t-2) _______ _______ _______ _______ _______ 0.1 sec NaN NaN NaN NaN 0.2 sec 1 NaN 0.39221 NaN 0.3 sec -1 1 0.84255 0.39221 0.4 sec -1 -1 0.3107 0.84255 0.5 sec -1 -1 -2.3174 0.3107 0.6 sec 1 -1 -6.3138 -2.3174 0.7 sec -1 1 -5.4381 -6.3138 0.8 sec -1 -1 -1.5059 -5.4381 ```
```% The NaNs in the first two rows of D correpsond to the unknown initial conditions. Remove them. D = D(3:end,:); % For linear regression, the corresponding response values are: Y = IOData.y1(3:end); % The ARX command computes the parameters by least squares. Theta = D{:,:}\Y; a = [1, -Theta(3:4)']; % the A(q) polynomial b = [0, Theta(1:2)']; % the B(q) polynomial % disp('A(q) values')```
```A(q) values ```
`disp([a; sysARX.a])`
``` 1.0000 -0.8281 0.0912 1.0000 -0.8281 0.0912 ```
```% disp('B(q) values')```
```B(q) values ```
`disp([b; sysARX.b])`
``` 0 0.8454 1.2331 0 0.8454 1.2331 ```

The value of `a` matches `sysARX.a`. Similarly, the values of `b` and `sysARX.b `match. The Examples section below explores other choices of instruments.

#### Using redundant sensor measurement as instrumental variable

Suppose the experimental setup allows multiple measurements of the output signal. For example, an audio signal can be measured by sensors at two nearby locations. This way, both sensors measure the same signal but are corrupted by noise independently of each other. The variable y2 represents the output value measured using an auxiliary sensor.

```load IODataForIVExample y2 t = IOData.Properties.RowTimes; plot(t,IOData.y1, t,y2) legend('Primary measurement', 'Auxiliary measurement')``` Use `y2 `as instrumental variable generating signal.

`sysIV2 = ivx(IOData, [na nb nk], y2)`
```sysIV2 = Discrete-time ARX model: A(z)y(t) = B(z)u(t) + e(t) A(z) = 1 - 1.169 z^-1 + 0.4005 z^-2 B(z) = 0.8912 z^-1 + 0.9849 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: na=2 nb=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using IVX on time domain data "IOData". Fit to estimation data: 50.05% (prediction focus) FPE: 5.287, MSE: 5.08 Model Properties ```

#### Using instruments automatically generated by IV4

The `iv4` command automatically determines optimal instruments to use based on certain assumptions regarding the nature of the system and the diusturbances. You can call `iv4` with the same input arguments as used for the `arx` command.

```% Identify a model of the same order as before using the IV4 command sysOpt = iv4(IOData, [na nb nk])```
```sysOpt = Discrete-time ARX model: A(z)y(t) = B(z)u(t) + e(t) A(z) = 1 - 1.506 z^-1 + 0.7088 z^-2 B(z) = 0.9258 z^-1 + 0.6223 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: na=2 nb=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using IV4 on time domain data "IOData". Fit to estimation data: 40.55% (prediction focus) FPE: 7.491, MSE: 7.197 Model Properties ```

`sysOpt` has worse 1-step prediction accuracy than `sysARX`, or `sysIV`. However it performs better on long-horizon predictions. Similarly, `sysIV2` performs better for longer-term predictions.

```% compare model performance for long-term (infinity horizon) predictions compare(IOData, sysARX, sysIV, sysIV2, sysOpt)``` The results can be validate further on an independent dataset as described in the Model Validation section.

### Identification of Output-Error (OE) Models

The ARX estimation technique provides correct estimates of the model parameters if the disturbances are indeed equation errors, that is, they match the manner in which $\mathit{e}\left(\mathit{t}\right)$ affects the output $\mathit{y}\left(\mathit{t}\right)$. However, in many situations, this is not the case. The simplest example is where the unmeasured disturbances are owing to sensor measurement noise. In this case, the true system takes the form:

`$\mathit{y}\left(\mathit{t}\right)=\frac{\mathit{B}\left(\mathit{q}\right)}{\mathit{A}\left(\mathit{q}\right)}\mathit{u}\left(\mathit{t}\right)+\mathit{e}\left(\mathit{t}\right)$`

that is, the output-error form. In this case, the `arx` command leads to biased estimates of the model parameters. The correct estimator to use in this case is the `oe` function. Here is an example:

`sysOE = oe(IOData, [nb na nk])`
```sysOE = Discrete-time OE model: y(t) = [B(z)/F(z)]u(t) + e(t) B(z) = 0.9447 z^-1 + 0.5256 z^-2 F(z) = 1 - 1.539 z^-1 + 0.7313 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: nb=2 nf=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using OE on time domain data "IOData". Fit to estimation data: 63.5% FPE: 2.786, MSE: 2.712 Model Properties ```

The `oe` command estimates the coefficients of the numerator and the denominator polynomials by minimizing the sum of squares of the errors $\mathit{e}\left(\mathit{t}\right)$, $\mathit{V}=\frac{1}{\mathit{N}}{\sum }_{\mathit{t}}{\mathit{e}}^{2}\left(\mathit{t}\right)$. However, the minimization objective $\mathit{V}$ is non-convex in the model parameters. The minimization uses iterative numerical minimization techniques. This is significantly more complicated than the simple least-squares approach used by the `arx` algorithm. Moreover, best results are typically not guaranteed since the minimization algorithms can get stuck at a local minima of $\mathit{V}$. The results obtained by `oe` can be different depending upon the choice of the initial values of the parameters, the choice of the search method and its configuration.

Can a suitable choice of instrumental variables avoid the estimation bias and deliver the correct results while still using the non-iterative computation method of IV algorithms? Note that the OE model can be written as:

`$\mathit{A}\left(\mathit{q}\right)\mathit{y}\left(\mathit{t}\right)=\mathit{B}\left(\mathit{q}\right)\mathit{u}\left(\mathit{t}\right)+\mathit{A}\left(\mathit{q}\right)\mathit{e}\left(\mathit{t}\right)$`

which has ARMAX structure with $\mathit{C}\left(\mathit{q}\right)=\mathit{A}\left(\mathit{q}\right)$. For `na=nb=2, nk=1`, this model takes the form:

`$\mathit{y}\left(\mathit{t}\right)={\mathit{a}}_{1}\mathit{y}\left(\mathit{t}-1\right)+{\mathit{a}}_{2}\mathit{y}\left(\mathit{t}-2\right)+{\mathit{b}}_{1}\mathit{u}\left(\mathit{t}-1\right)+{\mathit{b}}_{2}\mathit{u}\left(\mathit{t}-2\right)+\mathit{e}\left(\mathit{t}\right)-{\mathit{a}}_{1}\mathit{e}\left(\mathit{t}-1\right)-{\mathit{a}}_{2}\mathit{e}\left(\mathit{t}-2\right)$`

To correlate out the influence of errors at a give time $\mathit{t}$, we can use a IV variable with a lag of at least 3. Another option is to use a variable that is not completely uncorrelated with the errors but then accounting for the effect of the correlations explicitly. These choices are explored next.

#### Using Instruments with Lags Larger than the Model Order

For the current example, the model order is $\mathrm{nx}=\mathrm{max}\left(\mathrm{na},\mathrm{nb}\right)=2$. A candidate IV set is: $\mathit{y}\left(\mathit{t}-3\right),\mathit{y}\left(\mathit{t}-4\right),\mathit{u}\left(\mathit{t}-1\right),\mathit{u}\left(\mathit{t}-2\right)$. This can be achieved by using $\mathit{X}=\mathit{y}\left(\mathit{t}-2\right)$ in the `ivx` command.

`sysIV_OE = ivx(IOData(3:end,:), [na nb nk], IOData.y1(1:end-2))`
```sysIV_OE = Discrete-time ARX model: A(z)y(t) = B(z)u(t) + e(t) A(z) = 1 - 1.577 z^-1 + 0.7681 z^-2 B(z) = 0.9414 z^-1 + 0.6868 z^-2 Sample time: 0.1 seconds Parameterization: Polynomial orders: na=2 nb=2 nk=1 Number of free coefficients: 4 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using IVX on time domain data. Fit to estimation data: 38.18% (prediction focus) FPE: 8.157, MSE: 7.835 Model Properties ```

We can verify this result by manual computations using the correlations matrices.

```% generate regressors data for [y(t-1), ..., y(t-4), u(t-1), u(t-2)] L = linearRegressor(vars,{1:2,1:4}); D = getreg(L,IOData); head(D)```
``` Time u1(t-1) u1(t-2) y1(t-1) y1(t-2) y1(t-3) y1(t-4) _______ _______ _______ _______ _______ _______ _______ 0.1 sec NaN NaN NaN NaN NaN NaN 0.2 sec 1 NaN 0.39221 NaN NaN NaN 0.3 sec -1 1 0.84255 0.39221 NaN NaN 0.4 sec -1 -1 0.3107 0.84255 0.39221 NaN 0.5 sec -1 -1 -2.3174 0.3107 0.84255 0.39221 0.6 sec 1 -1 -6.3138 -2.3174 0.3107 0.84255 0.7 sec -1 1 -5.4381 -6.3138 -2.3174 0.3107 0.8 sec -1 -1 -1.5059 -5.4381 -6.3138 -2.3174 ```
```% remove first 4 rows that contain the effect of initial conditions D = D(5:end,:); Phi = D{:,1:4}; % u(t-1), u(t-2), y(t-1), y(t-2) IV = D{:,[1 2 5 6]}; % u(t-1), u(t-2), y(t-3), y(t-4) Y = IOData.y1(5:end); Theta = (IV'*Phi)\(IV'*Y)```
```Theta = 4×1 0.9414 0.6868 1.5766 -0.7681 ```
`a = [1, -Theta(3:4)'] % A(q), matches sysIV_OE.a`
```a = 1×3 1.0000 -1.5766 0.7681 ```
`b = [0, Theta(1:2)'] % B(q), matches sysIV_OE.b`
```b = 1×3 0 0.9414 0.6868 ```
```% check model performance for inf-horizon prediction compare(IOData, sysARX, sysOE, sysIV_OE, sysOpt)``` `sysIV_OE` does better than `sysARX` but still not as good as `sysOE`. However, `sysOpt`, which was estimated with automatically chosen instruments, matches the performance of `sysOE`.

#### Using Error-Correlated Instrumental Variables

Under the assunption that $\mathit{e}\left(\mathit{t}\right)$, for $\mathit{t}=1,...,\mathit{N}$, is an identically distributed sequence of independent random variables: $\mathit{E}\left\{\mathit{e}\left(\mathit{t}\right)\mathit{e}\left(\mathit{t}\right)\right\}\approx \frac{1}{\mathit{N}}{\sum }_{\mathit{t}=1}^{\mathit{N}}{\mathit{e}}^{2}\left(\mathit{t}\right)={\sigma }^{2}$, where ${\sigma }^{2}$ is typically unknown.

$\mathit{E}\left\{\mathit{e}\left(\mathit{t}\right)\mathit{e}\left(\mathit{t}+\tau \right)\right\}\approx \frac{1}{\mathit{N}}{\sum }_{\mathit{t}=1}^{\mathit{N}-\tau }\mathit{e}\left(\mathit{t}\right)\mathit{e}\left(\mathit{t}+\tau \right)=0$.

Using the IVs that are the same as the model regressors, we get the following correlation equations:

`$\begin{array}{l}{\mathit{R}}_{\mathrm{uu}}\left(0\right){\mathit{b}}_{1}+{\mathit{R}}_{\mathrm{uu}}\left(1\right){\mathit{b}}_{2}+\text{\hspace{0.17em}}{\mathit{R}}_{\mathrm{yu}}\left(0\right){\mathit{a}}_{1}+{\mathit{R}}_{\mathrm{yu}}\left(1\right){\mathit{a}}_{2}={\mathit{R}}_{\mathrm{yu}}\left(1\right)\\ {{\mathit{R}}_{\mathrm{uu}}\left(1\right){\mathit{b}}_{1}+{\mathit{R}}_{\mathrm{uu}}\left(0\right){\mathit{b}}_{2}+\text{\hspace{0.17em}}\mathit{R}}_{\mathrm{yu}}\left(1\right){\mathit{a}}_{1}+{\mathit{R}}_{\mathrm{yu}}\left(0\right){\mathit{a}}_{2}={\mathit{R}}_{\mathrm{yu}}\left(2\right)\\ {\mathit{R}}_{\mathrm{yu}}\left(0\right){\mathit{b}}_{1}+{\mathit{R}}_{\mathrm{yu}}\left(1\right){\mathit{b}}_{2}+\left({\mathit{R}}_{\mathrm{yy}}\left(0\right)-{\sigma }^{2}\right){\mathit{a}}_{1}+{\mathit{R}}_{\mathrm{yy}}\left(1\right){\mathit{a}}_{2}+{\sigma }^{2}{\mathit{a}}_{1}{\mathit{a}}_{2}\text{\hspace{0.17em}}={\mathit{R}}_{\mathrm{yy}}\left(1\right)\\ {{\mathit{R}}_{\mathrm{yu}}\left(1\right){\mathit{b}}_{1}+{\mathit{R}}_{\mathrm{yu}}\left(0\right){\mathit{b}}_{2}+\text{\hspace{0.17em}}\mathit{R}}_{\mathrm{yy}}\left(1\right){\mathit{a}}_{1}+\left({\mathit{R}}_{\mathrm{yy}}\left(0\right)-{\sigma }^{2}\right){\mathit{a}}_{2}={\mathit{R}}_{\mathrm{yy}}\left(2\right)\end{array}$`

The presence of the unknown ${\sigma }^{2}$ in the first two equations means that a linear least-squares solution cannot be used (there are terms involving products of unknowns). This removes the benefit of using instrumental variables; `ivx` cannot overcome the effect of this correlation if using $\mathit{X}=\mathit{y}\left(\mathit{t}\right)$.

### Model Validation

```% Create a validation dataset using the output signal from the auxiliary sensor. vData = IOData; vData.y1 = y2; % compare 1-step ahead prediction performance compare(vData, sysARX, sysIV, sysIV2, sysIV_OE, sysOpt, sysOE, 1)``` ```% compare 10-step ahead prediction performance compare(vData, sysARX, sysIV, sysIV2, sysIV_OE, sysOpt, sysOE, 10)``` ```% compare infinite-horizon performance of the models compare(vData, sysARX, sysIV, sysIV2, sysIV_OE, sysOpt, sysOE)``` 