designMatrix

Fixed- and random-effects design matrices

Description

example

D = designMatrix(glme) or D = designMatrix(glme,'Fixed') returns the fixed-effects design matrix for the generalized linear mixed-effects model glme.

example

D = designMatrix(glme,'Random') returns the random-effects design matrix for the generalized linear mixed-effects model glme.

Dsub = designMatrix(glme,'Random',gnumbers) returns a subset of the random-effects design matrix for the generalized linear mixed-effects model glme that corresponds to the grouping variables indicated by gnumbers.

[Dsub,gnames] = designMatrix(glme,'Random',gnumbers) also returns the grouping variable names that correspond to gnumbers.

Input Arguments

expand all

Generalized linear mixed-effects model, specified as a GeneralizedLinearMixedModel object. For properties and methods of this object, see GeneralizedLinearMixedModel.

Grouping variable numbers, specified as an array of integer values containing elements in the range [1,R], where R is the length of the cell array that contains the grouping variables for the generalized linear mixed-effects model glme.

For example, you can specify the grouping variables g1, g3, and gr as [1,3,r].

Data Types: single | double

Output Arguments

expand all

Design matrix of a generalized linear mixed-effects model glme returned as one of the following:

• Fixed-effects design matrix — n-by-p matrix consisting of the fixed-effects design matrix of glme, where n is the number of observations and p is the number of fixed-effects terms. The order of fixed-effects terms in D matches the order of terms in the CoefficientNames property of the GeneralizedLinearMixedModel object glme.

• Random-effects design matrix — n-by-k matrix, consisting of the random-effects design matrix of glme. Here, k is equal to length(B), where B is the random-effects coefficients vector of generalized linear mixed-effects model glme. The random-effects design matrix is returned as a sparse matrix. For more information, see Sparse Matrices.

If glme has R grouping variables g1, g2, ..., gR, with levels m1, m2, ..., mR, respectively, and if q1, q2, ..., qR are the lengths of the random-effects vectors that are associated with g1, g2, ..., gR, respectively, then B is a column vector of length q1*m1 + q2*m2 + ... + qR*mR.

B is made by concatenating the empirical Bayes predictors of random effects vectors corresponding to each level of each grouping variable as [g1level1; g1level2; ...; g1levelm1; g2level1; g2level2; ...; g2levelm2; ...; gRlevel1; gRlevel2; ...; gRlevelmR]'.

Data Types: single | double

Submatrix of random-effects design matrix that corresponds to the grouping variables specified by gnumbers, returned as an n-by-k matrix, where k is length of the column vector Bsub.

Bsub contains the concatenated empirical Bayes predictors of random-effects vectors, corresponding to each level of the grouping variables, specified by gnumbers.

If, for example, gnumbers is [1,3,r], this corresponds to the grouping variables g1, g3, and gr. Then, Bsub contains the empirical Bayes predictors of random-effects vectors corresponding to each level of the grouping variables g1, g3, and gr, such as

[g1level1; g1level2; ...; g1levelm1; g3level1; g3level2; ...; g3levelm3; grlevel1; grlevel2; ...; grlevelmr]'.

Thus, Dsub*Bsub represents the contribution of all random effects corresponding to grouping variables g1, g3, and gr to the response of glme.

If gnumbers is empty, then Dsub is the full random-effects design matrix.

Data Types: single | double

Names of grouping variables corresponding to the integers in gnumbers if the design type is 'Random', returned as a k-by-1 cell array. If the design type is 'Fixed', then gnames is an empty matrix [].

Data Types: cell

Examples

expand all

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

• Flag to indicate whether the batch used the new process (newprocess)

• Processing time for each batch, in hours (time)

• Temperature of the batch, in degrees Celsius (temp)

• Categorical variable indicating the supplier (A, B, or C) of the chemical used in the batch (supplier)

• Number of defects in the batch (defects)

The data also includes time_dev and temp_dev, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using newprocess, time_dev, temp_dev, and supplier as fixed-effects predictors. Include a random-effects term for intercept grouped by factory, to account for quality differences that might exist due to factory-specific variations. The response variable defects has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as 'effects', so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution

${\text{defects}}_{ij}\sim \text{Poisson}\left({\mu }_{ij}\right).$

This corresponds to the generalized linear mixed-effects model

$\mathrm{log}\left({\mu }_{ij}\right)={\beta }_{0}+{\beta }_{1}{\text{newprocess}}_{ij}+{\beta }_{2}{\text{time}\text{_}\text{dev}}_{ij}+{\beta }_{3}{\text{temp}\text{_}\text{dev}}_{ij}+{\beta }_{4}{\text{supplier}\text{_}\text{C}}_{ij}+{\beta }_{5}{\text{supplier}\text{_}\text{B}}_{ij}+{b}_{i},$

where

• ${\text{defects}}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.

• ${\mu }_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i=1,2,...,20$) during batch $j$ (where $j=1,2,...,5$).

• ${\text{newprocess}}_{ij}$, ${\text{time}\text{_}\text{dev}}_{ij}$, and ${\text{temp}\text{_}\text{dev}}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, ${\text{newprocess}}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.

• ${\text{supplier}\text{_}\text{C}}_{ij}$ and ${\text{supplier}\text{_}\text{B}}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company C or B, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.

• ${b}_{i}\sim N\left(0,{\sigma }_{b}^{2}\right)$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.

glme = fitglme(mfr,'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)','Distribution','Poisson','Link','log','FitMethod','Laplace','DummyVarCoding','effects');

Extract the fixed-effects design matrix and display rows 1 through 10.

Dfe = designMatrix(glme,'Fixed');
disp(Dfe(1:10,:))
1.0000         0    0.1834    0.2259    1.0000         0
1.0000         0    0.3035    0.0725         0    1.0000
1.0000         0    0.0717    0.1630    1.0000         0
1.0000         0    0.1069    0.0809   -1.0000   -1.0000
1.0000         0    0.0241    0.0319    1.0000         0
1.0000         0    0.1214    0.1114         0    1.0000
1.0000         0    0.0033    0.0553    1.0000         0
1.0000         0    0.2350    0.0616    1.0000         0
1.0000         0    0.0488    0.0177         0    1.0000
1.0000         0    0.1148    0.0105    1.0000         0

Column 1 of the fixed-effects design matrix Dfe contains the constant term. Column 2, 3, and 4 contain the newprocess, time_dev, and temp_dev terms, respectively. Columns 5 and 6 contain dummy variables for supplier_C and supplier_B, respectively.

Extract the random-effects design matrix and display rows 1 through 10.

Dre = designMatrix(glme,'Random');
disp(Dre(1:10,:))
(1,1)        1
(2,1)        1
(3,1)        1
(4,1)        1
(5,1)        1
(6,2)        1
(7,2)        1
(8,2)        1
(9,2)        1
(10,2)        1

Convert the sparse matrix Dre to a full matrix and display rows 1 through 10.

full(Dre(1:10,:))
ans = 10×20

1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

Each column corresponds to a level of the grouping variable factory.