Main Content

**Class: **GeneralizedLinearMixedModel

Compare generalized linear mixed-effects models

returns
the results of a likelihood
ratio test that compares the generalized linear mixed-effects
models `results`

= compare(`glme`

,`altglme`

)`glme`

and `altglme`

.
To conduct a valid likelihood ratio test, both models must use the
same response vector in the fit, and `glme`

must
be nested in `altglme`

. Always input the smaller
model first, and the larger model second.

`compare`

tests the following null and alternate
hypotheses:

*H*_{0}: Observed response vector is generated by`glme`

.*H*_{1}: Observed response vector is generated by model`altglme`

.

returns
the results of a likelihood ratio test using additional options specified
by one or more `results`

= compare(`glme`

,`altglme`

,`Name,Value`

)`Name,Value`

pair arguments. For example,
you can check if the first input model, `glme`

,
is nested in the second input model, `altglme`

.

`glme`

— Generalized linear mixed-effects model`GeneralizedLinearMixedModel`

objectGeneralized linear mixed-effects model, specified as a `GeneralizedLinearMixedModel`

object.
For properties and methods of this object, see `GeneralizedLinearMixedModel`

.

You can create a `GeneralizedLinearMixedModel`

object
by fitting a generalized linear mixed-effects model to your sample
data using `fitglme`

. To conduct
a valid likelihood ratio test on two models that have response distributions
other than normal, you must fit both models using the `'ApproximateLaplace'`

or `'Laplace'`

fit
method. Models with response distributions other than normal that
are fitted using `'MPL'`

or `'REMPL'`

cannot
be compared using a likelihood ratio test.

`altglme`

— Alternative generalized linear mixed-effects model`GeneralizedLinearMixedModel`

objectAlternative generalized linear mixed-effects model, specified
as a `GeneralizedLinearMixedModel`

object. `altglme`

be
must fit to the same response vector as `glme`

,
but with different model specifications. `glme`

must
be nested in `altglme`

, such that you can obtain `glme`

from `altglme`

by
setting some of the model parameters of `altglme`

to
fixed values such as 0.

You can create a `GeneralizedLinearMixedModel`

object
by fitting a generalized linear mixed-effects model to your sample
data using `fitglme`

. To conduct
a valid likelihood ratio test on two models that have response distributions
other than normal, you must fit both models using the `'ApproximateLaplace'`

or `'Laplace'`

fit
method. Models with response distributions other than normal that
are fitted using `'MPL'`

or `'REMPL'`

cannot
be compared using a likelihood ratio test.

Specify optional
comma-separated pairs of `Name,Value`

arguments. `Name`

is
the argument name and `Value`

is the corresponding value.
`Name`

must appear inside quotes. You can specify several name and value
pair arguments in any order as
`Name1,Value1,...,NameN,ValueN`

.

`'CheckNesting'`

— Indicator to check nesting between two models`true`

(default) | `false`

Indicator to check
nesting between two models, specified as the comma-separated
pair consisting of `'CheckNesting'`

and either `true`

or `false`

.
If `'CheckNesting'`

is `true`

, then `compare`

checks
if the smaller model `glme`

is nested in the larger
model `altglme`

. If the nesting requirements are
not satisfied, then `compare`

returns an error. If `'CheckNesting'`

is `false`

,
then `compare`

does not perform this check.

**Example: **`'CheckNesting',true`

`results`

— Results of likelihood ratio testtable

Results of the likelihood ratio test, returned as a table with
two rows. The first row is for `glme`

, and the
second row is for `altglme`

. The columns of `results`

contain
the following.

Column Name | Description |
---|---|

`Model` | Name of the model |

`DF` | Degrees of freedom |

`AIC` | Akaike information criterion for the model |

`BIC` | Bayesian information criterion for the model |

`LogLik` | Maximized log likelihood for the model |

`LRStat` | Likelihood ratio test statistic for comparing `altglme` and `glme` |

`deltaDF` | `DF` for `altglme` minus `DF` for `glme` |

`pValue` | p-value for the likelihood ratio test |

Load the sample data.

`load mfr`

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

Flag to indicate whether the batch used the new process (

`newprocess`

)Processing time for each batch, in hours (

`time`

)Temperature of the batch, in degrees Celsius (

`temp`

)Categorical variable indicating the supplier of the chemical used in the batch (

`supplier`

)Number of defects in the batch (

`defects`

)

The data also includes `time_dev`

and `temp_dev`

, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a fixed-effects-only model using `newprocess`

, `time_dev`

, `temp_dev`

, and `supplier`

as fixed-effects predictors. Specify the response distribution as Poisson, the link function as log, and the fit method as Laplace. Specify the dummy variable encoding as `'effects'`

, so the dummy variable coefficients sum to 0.

FEglme = fitglme(mfr,'defects ~ 1 + newprocess + time_dev + temp_dev + supplier','Distribution','Poisson','Link','log','FitMethod','Laplace','DummyVarCoding','effects');

Fit a second model that uses the same fixed-effects predictors, response distribution, link function, and fit method. This time, include a random-effects intercept grouped by `factory`

, to account for quality differences that might exist due to factory-specific variations.

The number of defects can be modeled using a Poisson distribution

$${\text{defects}}_{ij}\sim \text{Poisson}({\mu}_{ij})$$

This corresponds to the generalized linear mixed-effects model

$$\mathrm{log}({\mu}_{ij})={\beta}_{0}+{\beta}_{1}{\text{newprocess}}_{ij}+{\beta}_{2}{\text{time}\text{\_}\text{dev}}_{ij}+{\beta}_{3}{\text{temp}\text{\_}\text{dev}}_{ij}+{\beta}_{4}{\text{supplier}\text{\_}\text{C}}_{ij}+{\beta}_{5}{\text{supplier}\text{\_}\text{B}}_{ij}+{b}_{i},$$

where

$${\text{defects}}_{ij}$$ is the number of defects observed in the batch produced by factory $$i$$ during batch $$j$$.

$${\mu}_{ij}$$ is the mean number of defects corresponding to factory $$i$$ (where $$i=1,2,...,20$$) during batch $$j$$ (where $$j=1,2,...,5$$).

$${\text{newprocess}}_{ij}$$, $${\text{time}\text{\_}\text{dev}}_{ij}$$, and $${\text{temp}\text{\_}\text{dev}}_{ij}$$ are the measurements for each variable that correspond to factory $$i$$ during batch $$j$$. For example, $${\text{newprocess}}_{ij}$$ indicates whether the batch produced by factory $$i$$ during batch $$j$$ used the new process.

$${\text{supplier}\text{\_}\text{C}}_{ij}$$ and $${\text{supplier}\text{\_}\text{B}}_{ij}$$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company

`C`

or`B`

, respectively, supplied the process chemicals for the batch produced by factory $$i$$ during batch $$j$$.$${b}_{i}\sim N(0,{\sigma}_{b}^{2})$$ is a random-effects intercept for each factory $$i$$ that accounts for factory-specific variation in quality.

glme = fitglme(mfr,'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)','Distribution','Poisson','Link','log','FitMethod','Laplace','DummyVarCoding','effects');

Compare the two models using a theoretical likelihood ratio test. Specify `'CheckNesting'`

as `true`

, so `compare`

returns a warning if the nesting requirements are not satisfied.

`results = compare(FEglme,glme,'CheckNesting',true)`

results = Theoretical Likelihood Ratio Test Model DF AIC BIC LogLik LRStat deltaDF FEglme 6 431.02 446.65 -209.51 glme 7 416.35 434.58 -201.17 16.672 1 pValue 4.4435e-05

Since `compare`

did not return an error, the nesting requirements are satisfied. The small $$p$$-value indicates that `compare`

rejects the null hypothesis that the observed response vector is generated by the model `FEglme`

, and instead accepts the alternate model `glme`

. The smaller `AIC`

and `BIC`

values for `glme`

also support the conclusion that `glme`

provides a better fitting model for the response.

A *likelihood ratio test* compares
the specifications of two nested models by assessing the significance
of restrictions to an extended model with unrestricted parameters.
Under the null hypothesis *H*_{0},
the likelihood ratio test statistic has an approximate chi-squared
reference distribution with degrees of freedom `deltaDF`

.

When comparing two models, `compare`

computes
the *p*-value for the likelihood ratio test by comparing
the observed likelihood ratio test statistic with this chi-squared
reference distribution. A small *p*-value leads to
a rejection of *H*_{0} in favor
of *H*_{1}, and acceptance of
the alternate model `altglme`

. On the other hand,
a large *p*-value indicates that we cannot reject *H*_{0},
and reflects insufficient evidence to accept the model `altglme`

.

The *p*-values obtained using the likelihood
ratio test can be conservative when testing for the presence or absence
of random-effects terms, and anti-conservative when testing for the
presence or absence of fixed-effects terms. Instead, use the `fixedEffects`

or `coefTest`

methods
to test for fixed effects.

To conduct a valid likelihood ratio test on GLME models, both
models must be fitted using a Laplace or approximate Laplace fit method.
Models fitted using a maximum pseudo likelihood (MPL) or restricted
maximum pseudo likelihood (REMPL) method cannot be compared using
a likelihood ratio test. When comparing models fitted using MPL, the
maximized log likelihood of the pseudodata from the final pseudo likelihood
iteration is used in the likelihood ratio test. If you compare models
with non-normal distributions fitted using MPL, then `compare`

gives
a warning that the likelihood ratio test is using maximized log likelihood
of pseudodata from the final pseudo likelihood iteration. To use the
true maximized log likelihood in the likelihood ratio test, fit both `glme`

and `altglme`

using
approximate Laplace or Laplace prior to model comparison.

To conduct a valid likelihood ratio test, `glme`

must
be nested in `altglme`

. The `'CheckNesting',true`

name-value
pair argument checks the following requirements, and returns an error
if any are not satisfied:

You must fit both models (

`glme`

and`altglme`

) using the`'ApproximateLaplace'`

or`'Laplace'`

fit method. You cannot compare GLME models fitted using`'MPL'`

or`'REMPL'`

using a likelihood ratio test.You must fit both models using the same response vector, response distribution, and link function.

The smaller model (

`glme`

) must be nested within the larger model (`altglme`

), such that you can obtain`glme`

from`altglme`

by setting some of the model parameters of`altglme`

to fixed values such as 0.The maximized log likelihood of the larger model (

`altglme`

) must be greater than or equal to the maximized log likelihood of the smaller model (`glme`

).The weight vectors used to fit

`glme`

and`altglme`

must be identical.The random-effects design matrix of the larger model (

`altglme`

) must contain the random-effects design matrix of the smaller model (`glme`

).The fixed-effects design matrix of the larger model (

`altglme`

) must contain the fixed-effects design matrix of the smaller model (`glme`

).

The *Akaike information criterion* (AIC)
is *AIC* = –2log*L*_{M} +
2(*param*).

log*L*_{M} depends
on the method used to fit the model.

If you use

`'Laplace'`

or`'ApproximateLaplace'`

, then log*L*_{M}is the maximized log likelihood.If you use

`'MPL'`

, then log*L*_{M}is the maximized log likelihood of the pseudo data from the final pseudo likelihood iteration.If you use

`'REMPL'`

, then log*L*_{M}is the maximized restricted log likelihood of the pseudo data from the final pseudo likelihood iteration.

*param* is the total number of parameters estimated
in the model. For most GLME models, *param* is equal
to *nc* + *p* +
1, where *nc* is the total number
of parameters in the random-effects covariance, excluding the residual
variance, and *p* is the number of fixed-effects
coefficients. However, if the dispersion parameter is fixed at 1.0
for binomial or Poisson distributions, then *param* is
equal to (*nc* + *p*).

The *Bayesian information criterion* (BIC)
is *BIC* = –2*log*L*_{M} +
ln(*n _{eff}*)(

log*L*_{M} depends
on the method used to fit the model.

If you use

`'Laplace'`

or`'ApproximateLaplace'`

, then log*L*_{M}is the maximized log likelihood.If you use

`'MPL'`

, then log*L*_{M}is the maximized log likelihood of the pseudo data from the final pseudo likelihood iteration.If you use

`'REMPL'`

, then log*L*_{M}is the maximized restricted log likelihood of the pseudo data from the final pseudo likelihood iteration.

*n _{eff}* is
the effective number of observations.

If you use

`'MPL'`

,`'Laplace'`

, or`'ApproximateLaplace'`

, then*n*=_{eff}*n*, where*n*is the number of observations.If you use

`'REMPL'`

, then*n*=_{eff}*n*–*p*.

*param* is the total number of parameters estimated
in the model. For most GLME models, *param* is equal
to *nc* + *p* +
1, where *nc* is the total number
of parameters in the random-effects covariance, excluding the residual
variance, and *p* is the number of fixed-effects
coefficients. However, if the dispersion parameter is fixed at 1.0
for binomial or Poisson distributions, then *param* is
equal to (*nc* + *p*).

A lower value of deviance indicates a better fit. As the value
of deviance decreases, both AIC and BIC tend to decrease. Both AIC
and BIC also include penalty terms based on the number of parameters
estimated, *p*. So, when the number of parameters
increase, the values of AIC and BIC tend to increase as well. When
comparing different models, the model with the lowest AIC or BIC value
is considered as the best fitting model.

For models fitted using `'MPL'`

and `'REMPL'`

,
AIC and BIC are based on the log likelihood (or restricted log likelihood)
of pseudo data from the final pseudo likelihood iteration. Therefore,
a direct comparison of AIC and BIC values between models fitted using `'MPL'`

and `'REMPL'`

is
not appropriate.

You have a modified version of this example. Do you want to open this example with your edits?

You clicked a link that corresponds to this MATLAB command:

Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.

Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .

Select web siteYou can also select a web site from the following list:

Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.

- América Latina (Español)
- Canada (English)
- United States (English)

- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)

- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)