coefTest

Hypothesis test on fixed and random effects of generalized linear mixed-effects model

Description

pVal = coefTest(glme) returns the p-value of an F-test of the null hypothesis that all fixed-effects coefficients of the generalized linear mixed-effects model glme, except for the intercept, are equal to 0.

pVal = coefTest(glme,H) returns the p-value of an F-test using a specified contrast matrix, H. The null hypothesis is H0: Hβ = 0, where β is the fixed-effects vector.

pVal = coefTest(glme,H,C) returns the p-value for an F-test using the hypothesized value, C. The null hypothesis is H0: Hβ = C, where β is the fixed-effects vector.

pVal = coefTest(glme,H,C,Name,Value) returns the p-value for an F-test on the fixed- and/or random-effects coefficients of the generalized linear mixed-effects model glme, with additional options specified by one or more name-value pair arguments. For example, you can specify the method to compute the approximate denominator degrees of freedom for the F-test.

[pVal,F,DF1,DF2] = coefTest(___) also returns the F-statistic, F, and its numerator and denominator degrees of freedom, DF1 and DF2, respectively, using any of the previous syntaxes.
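To illustrate how the default syntax relates to the contrast-matrix syntax, the following sketch fits the model from the Examples section and compares the two. It assumes that the sample data load into a table named mfr through load mfr, and that the intercept is the first fixed-effects coefficient; neither detail is stated in this section.

```matlab
% Assumption: mfr.mat (the sample data described under Examples) is on the path.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

% Default syntax: tests that all non-intercept fixed-effects coefficients are 0.
[pDefault,FDefault,DF1Default,DF2Default] = coefTest(glme);

% Explicit contrast matrix, assuming the intercept occupies column 1.
p = numel(fixedEffects(glme));     % number of fixed-effects coefficients
H = [zeros(p-1,1), eye(p-1)];      % one row per non-intercept coefficient
[pExplicit,FExplicit,DF1Explicit] = coefTest(glme,H);
```

If the assumption about the intercept column holds, pDefault and pExplicit should agree up to rounding, and DF1Explicit should equal p-1.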

Input Arguments

glme

Generalized linear mixed-effects model, specified as a GeneralizedLinearMixedModel object. For properties and methods of this object, see GeneralizedLinearMixedModel.

H

Fixed-effects contrasts, specified as an m-by-p matrix, where p is the number of fixed-effects coefficients in glme. Each row of H represents one contrast. The columns of H (left to right) correspond to the rows of the p-by-1 fixed-effects vector β (top to bottom), whose estimate is returned by the fixedEffects method.

Data Types: single | double

C

Hypothesized value for testing the null hypothesis Hβ = C, specified as an m-by-1 vector, where m is the number of rows in H. Here, β is the vector of fixed effects whose estimate is returned by fixedEffects.

Data Types: single | double
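As a minimal sketch of a test against a nonzero hypothesized value, the code below tests whether the newprocess coefficient equals an illustrative value. The value C = -0.2 is hypothetical, the use of load mfr to obtain the table mfr is an assumption, and the position of newprocess in column 2 is inferred from the model equation in the Examples section.

```matlab
% Assumption: mfr.mat (the sample data described under Examples) is on the path.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

H = [0,1,0,0,0,0];   % selects the newprocess coefficient (assumed to be column 2)
C = -0.2;            % illustrative hypothesized value, not taken from the source

% Test H0: H*beta = C.
[pVal,F,DF1,DF2] = coefTest(glme,H,C);

% Passing C = 0 states the same null hypothesis as the two-argument syntax.
pZero   = coefTest(glme,H,0);
pTwoArg = coefTest(glme,H);
```

Because H has a single row, C is a scalar here. For an H with m rows, C would need m elements.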

Name-Value Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

DFMethod

Method for computing the approximate denominator degrees of freedom, specified as the comma-separated pair consisting of 'DFMethod' and one of the following.

Value | Description
'residual' | The degrees of freedom value is assumed to be constant and equal to n - p, where n is the number of observations and p is the number of fixed effects.
'none' | The degrees of freedom is set to infinity.

Example: 'DFMethod','none'
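The following sketch compares the two 'DFMethod' options on the same contrast. It assumes the sample data are available as the table mfr through load mfr and reuses the model from the Examples section; the variable names are illustrative.

```matlab
% Assumption: mfr.mat (the sample data described under Examples) is on the path.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

H = [0,0,0,0,1,-1];   % supplier C versus supplier B, as in the Examples section

% Residual method: denominator degrees of freedom n - p.
[pRes,~,~,DF2Res] = coefTest(glme,H,0,'DFMethod','residual');

% No finite degrees of freedom: denominator degrees of freedom set to infinity.
[pNone,~,~,DF2None] = coefTest(glme,H,0,'DFMethod','none');

n = glme.NumObservations;          % number of observations
p = numel(fixedEffects(glme));     % number of fixed-effects coefficients
```

Under the descriptions in the table above, DF2Res should equal n - p and DF2None should be Inf.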

REContrast

Random-effects contrasts, specified as the comma-separated pair consisting of 'REContrast' and an m-by-q matrix, K, where q is the number of random-effects parameters in glme. The columns of the matrix (left to right) correspond to the rows of the q-by-1 random-effects vector B (top to bottom), whose estimate is returned by the randomEffects method. With this argument, the null hypothesis is H0: Hβ + KB = C.

Data Types: single | double
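As a hedged sketch of a test that involves only random effects, the code below contrasts the first two elements of the random-effects vector. It assumes the sample data load as the table mfr through load mfr, and it takes the ordering of the random effects from the output of randomEffects; which factories those two elements represent is not stated in this section.

```matlab
% Assumption: mfr.mat (the sample data described under Examples) is on the path.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

B = randomEffects(glme);                  % q-by-1 vector of estimated random effects
q = numel(B);

H = zeros(1,numel(fixedEffects(glme)));   % no fixed-effects terms in this contrast
K = zeros(1,q);
K(1) = 1;                                 % first random effect ...
K(2) = -1;                                % ... minus the second random effect

% Test H0: H*beta + K*B = 0.
[pVal,F,DF1,DF2] = coefTest(glme,H,0,'REContrast',K);
```

The number of columns in K must match the length of B, which is why q is taken from randomEffects rather than hard-coded.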

Output Arguments

pVal

p-value for the F-test on the fixed- and/or random-effects coefficients of the generalized linear mixed-effects model glme, returned as a scalar value.

When fitting a GLME model using fitglme and one of the maximum likelihood fit methods ('Laplace' or 'ApproximateLaplace'), coefTest uses an approximation of the conditional mean squared error of prediction (CMSEP) of the estimated linear combination of fixed- and random-effects to compute p-values. This accounts for the uncertainty in the fixed-effects estimates, but not for the uncertainty in the covariance parameter estimates. For tests on fixed effects only, if you specify the 'CovarianceMethod' name-value pair argument in fitglme as 'JointHessian', then coefTest accounts for the uncertainty in the estimation of covariance parameters.

When fitting a GLME model using fitglme and one of the pseudo-likelihood fit methods ('MPL' or 'REMPL'), coefTest bases the inference on the fitted linear mixed-effects model from the final pseudo-likelihood iteration.

F

F-statistic, returned as a scalar value.

DF1

Numerator degrees of freedom for the F-statistic F, returned as a scalar value.

• If you test the null hypothesis H0: Hβ = 0 or H0: Hβ = C, then DF1 is equal to the number of linearly independent rows in H.

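• If you test the null hypothesis H0: Hβ + KB = C, where K is the random-effects contrast matrix specified by 'REContrast' and B is the random-effects vector, then DF1 is equal to the number of linearly independent rows in [H,K].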
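The number of linearly independent rows of a matrix is its rank, so the rule above can be checked before running a test. The sketch below uses a hypothetical contrast matrix with one redundant row; it does not call coefTest.

```matlab
% Hypothetical 3-by-6 contrast matrix. Row 2 is twice row 1, so it adds no
% independent contrast.
H = [0 0 0 0 1 -1;
     0 0 0 0 2 -2;
     0 1 0 0 0  0];

% Number of linearly independent rows, which is the value the rule gives for DF1.
expectedDF1 = rank(H);
```

Here expectedDF1 is 2, not 3, because the second row is a multiple of the first.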

DF2

Denominator degrees of freedom for the F-statistic F, returned as a scalar value. The value of DF2 depends on the option specified by the 'DFMethod' name-value pair argument.

Examples

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

• Flag to indicate whether the batch used the new process (newprocess)

• Processing time for each batch, in hours (time)

• Temperature of the batch, in degrees Celsius (temp)

• Categorical variable indicating the supplier (A, B, or C) of the chemical used in the batch (supplier)

• Number of defects in the batch (defects)

The data also includes time_dev and temp_dev, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using newprocess, time_dev, temp_dev, and supplier as fixed-effects predictors. Include a random-effects intercept grouped by factory, to account for quality differences that might exist due to factory-specific variations. The response variable defects has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as 'effects', so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution

${\text{defects}}_{ij}\sim \text{Poisson}\left({\mu }_{ij}\right)$

This corresponds to the generalized linear mixed-effects model

$\log\left({\mu }_{ij}\right)={\beta }_{0}+{\beta }_{1}{\text{newprocess}}_{ij}+{\beta }_{2}{\text{time\_dev}}_{ij}+{\beta }_{3}{\text{temp\_dev}}_{ij}+{\beta }_{4}{\text{supplier\_C}}_{ij}+{\beta }_{5}{\text{supplier\_B}}_{ij}+{b}_{i},$

where

• ${\text{defects}}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.

• ${\mu }_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i=1,2,...,20$) during batch $j$ (where $j=1,2,...,5$).

• ${\text{newprocess}}_{ij}$, ${\text{time\_dev}}_{ij}$, and ${\text{temp\_dev}}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, ${\text{newprocess}}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.

• ${\text{supplier\_C}}_{ij}$ and ${\text{supplier\_B}}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company C or B, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.

• ${b}_{i}\sim N\left(0,{\sigma }_{b}^{2}\right)$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.

load mfr   % assumption: the sample data file mfr.mat defines the table mfr
glme = fitglme(mfr,'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log','FitMethod','Laplace','DummyVarCoding','effects');

Test if there is any significant difference between supplier C and supplier B.

H = [0,0,0,0,1,-1];

[pVal,F,DF1,DF2] = coefTest(glme,H)
pVal = 0.2793
F = 1.1842
DF1 = 1
DF2 = 94

The large $p$-value indicates that there is no significant difference between supplier C and supplier B at the 5% significance level. Here, coefTest also returns the $F$-statistic, the numerator degrees of freedom, and the approximate denominator degrees of freedom.
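The reported p-value can be checked against the other three outputs, because the p-value of an F-test is the upper-tail probability of the F distribution at the observed statistic. The sketch below uses the displayed (rounded) values; it is a consistency check, not a description of how coefTest computes the result internally.

```matlab
% Values copied from the displayed output above (rounded to four decimals).
F   = 1.1842;
DF1 = 1;
DF2 = 94;

% Upper-tail probability of the F distribution with DF1 and DF2 degrees of freedom.
pCheck = fcdf(F,DF1,DF2,'upper');
```

Up to rounding of the displayed F-statistic, pCheck should be close to the reported value of 0.2793.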

Test if there is any significant difference between supplier A and supplier B.

If you specify the 'DummyVarCoding' name-value pair argument as 'effects' when fitting the model using fitglme, then

${\beta }_{A}+{\beta }_{B}+{\beta }_{C}=0,$

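where ${\beta }_{A}$, ${\beta }_{B}$, and ${\beta }_{C}$ correspond to suppliers A, B, and C, respectively. ${\beta }_{A}$ is the effect of A minus the average effect of A, B, and C. It does not appear among the fitted coefficients, but the constraint gives ${\beta }_{A}=-{\beta }_{B}-{\beta }_{C}$. To determine the contrast matrix corresponding to a test between supplier A and supplier B, substitute this expression for ${\beta }_{A}$: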

${\beta }_{B}-{\beta }_{A}={\beta }_{B}-\left(-{\beta }_{B}-{\beta }_{C}\right)=2{\beta }_{B}+{\beta }_{C}.$

From the output of disp(glme), column 5 of the contrast matrix corresponds to ${\beta }_{C}$, and column 6 corresponds to ${\beta }_{B}$. Therefore, the contrast matrix for this test is specified as H = [0,0,0,0,1,2].

H = [0,0,0,0,1,2];

[pVal,F,DF1,DF2] = coefTest(glme,H)
pVal = 0.6177
F = 0.2508
DF1 = 1
DF2 = 94

The large $p$-value indicates that there is no significant difference between supplier A and supplier B at the 5% significance level.
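The same substitution gives the contrast for supplier A versus supplier C: ${\beta }_{C}-{\beta }_{A}={\beta }_{C}-\left(-{\beta }_{B}-{\beta }_{C}\right)=2{\beta }_{C}+{\beta }_{B}$, which corresponds to H = [0,0,0,0,2,1]. The sketch below builds that row by coefficient name instead of by column position. The names 'supplier_C' and 'supplier_B' are assumed from the model equation, and load mfr is assumed to provide the table mfr.

```matlab
% Assumption: mfr.mat (the sample data described above) is on the path.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

names = glme.CoefficientNames;          % names of the fixed-effects coefficients
H = zeros(1,numel(names));
H(strcmp(names,'supplier_C')) = 2;      % coefficient name assumed
H(strcmp(names,'supplier_B')) = 1;      % coefficient name assumed

% Test H0: 2*beta_C + beta_B = 0, that is, beta_C = beta_A.
[pVal,F,DF1,DF2] = coefTest(glme,H);
```

Looking up columns by name guards against building a contrast for the wrong coefficient if the predictor order in the formula changes.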

References

Booth, J.G., and J.P. Hobert. “Standard Errors of Prediction in Generalized Linear Mixed Models.” Journal of the American Statistical Association, Vol. 93, 1998, pp. 262–272.