Linear Mixed Models

Published

February 26, 2026

Solving the Mixed Model Equations

A mixed model (also called a mixed-effects model or mixed error-component model) is a statistical model containing both fixed effects and random effects.

First, let’s load our data

library(dplyr)  # provides mutate()
data <- read.csv("../data/example_1.csv") |>
  mutate(gen = as.factor(gen), block = as.factor(block))
head(data)
  block gen yield
1     1  g1   7.4
2     2  g1   6.5
3     3  g1   5.6
4     1  g2   9.8
5     2  g2   6.8
6     3  g2   6.2
n <- 12      # total number of observations
n_blks <- 3  # number of blocks
n_gens <- 4  # number of genotypes

The most common way to obtain BLUEs and BLUPs is to solve Henderson's mixed model equations. To do so, we first need to estimate the variances of the random components. Later on, we will show how to do this using Restricted Maximum Likelihood (REML), but for now we will use the method of moments. So, let's assume the following model

\[\mathbf{y} = \mathbf{1}_n\mu + \mathbf{Xb} + \mathbf{Zg} + \mathbf{e}, \text{ where } \mathbf{g} \sim N(0, \underbrace{\sigma^2_\mathbf{g}\mathbf{I}}_{\mathbf{G}}) \text{ and } \mathbf{e} \sim N(0, \underbrace{\sigma^2_\mathbf{e}\mathbf{I}}_{\mathbf{R}})\]

which can be also expressed as:

\[\mathbf{y = X\beta + Zu + e} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \boldsymbol{\beta} + \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}\mathbf{u} + \mathbf{e} \]

And \(\mathbf{G}\), \(\mathbf{R}\), and \(\mathbf{V}\) would be:

\[\mathbf{G} = \sigma^2_\mathbf{g}\mathbf{I} = \sigma^2_\mathbf{g} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]

\[\mathbf{R} = \sigma^2_\mathbf{e}\mathbf{I} = \sigma^2_\mathbf{e} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}\]

\[\mathbf{V} = Var(\mathbf{y}) = Var(\mathbf{X}\boldsymbol{\beta} + \mathbf{Zu + e}) = Var(\mathbf{Zu}) + Var(\mathbf{e}) = \mathbf{Z}\,Var(\mathbf{u})\,\mathbf{Z}^T + \mathbf{R} = \mathbf{ZGZ^T + R}\]

Therefore

\[\mathbf{y} \sim N(\boldsymbol{X\beta}, \mathbf{ZGZ^T + R})\]

Estimating variance components

With the method of moments, \(\sigma^2_\mathbf{e}\) is estimated by the residual mean square (MSE) of the corresponding fixed-effects ANOVA, and \(\sigma^2_\mathbf{g}\) follows from the expected mean square of the genotype term:

\[\sigma^2_\mathbf{g} = \frac{MS_{gen} - \sigma^2_\mathbf{e}}{n_{blocks}}\]

mod <- lm(formula = yield ~ 1 + block + gen, data = data)
aov_table <- as.data.frame(anova(mod))
aov_table
          Df Sum Sq Mean Sq F value      Pr(>F)
block      2   9.78    4.89  12.225 0.007650536
gen        3   6.63    2.21   5.525 0.036730328
Residuals  6   2.40    0.40      NA          NA
mse_g <- aov_table["gen", "Mean Sq"]        # genotype mean square
var_e <- aov_table["Residuals", "Mean Sq"]  # sigma^2_e = MSE
var_g <- (mse_g - var_e) / n_blks           # method-of-moments sigma^2_g
var_g
[1] 0.6033333
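As a quick cross-check (assuming the lme4 package is installed, which is not used elsewhere in this post), we can fit the same model with REML; for a balanced design like this one, the REML estimates agree with the method-of-moments ones.

```r
# Cross-check with REML (lme4 assumed available); for balanced data
# the REML variance estimates match the method of moments
library(lme4)
mod_reml <- lmer(yield ~ 1 + block + (1 | gen), data = data)
as.data.frame(VarCorr(mod_reml))[, c("grp", "vcov")]
# gen ~ 0.6033, Residual ~ 0.40
```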

Design matrices

X <- model.matrix(yield ~ 1 + block, data)
Z <- model.matrix(yield ~ -1 + gen, data)
y <- matrix(data[, "yield"])
print(X)
   (Intercept) block2 block3
1            1      0      0
2            1      1      0
3            1      0      1
4            1      0      0
5            1      1      0
6            1      0      1
7            1      0      0
8            1      1      0
9            1      0      1
10           1      0      0
11           1      1      0
12           1      0      1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$block
[1] "contr.treatment"
print(y)
      [,1]
 [1,]  7.4
 [2,]  6.5
 [3,]  5.6
 [4,]  9.8
 [5,]  6.8
 [6,]  6.2
 [7,]  7.3
 [8,]  6.1
 [9,]  6.4
[10,]  9.5
[11,]  8.0
[12,]  7.4
G <- diag(x = var_g, nrow = n_gens)  # G = sigma^2_g * I (4 x 4)
R <- diag(x = var_e, nrow = n)       # R = sigma^2_e * I (12 x 12)
V <- Z %*% G %*% t(Z) + R            # V = ZGZ' + R
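It is worth sanity-checking the structure of V: each diagonal entry should be \(\sigma^2_g + \sigma^2_e\), two plots of the same genotype should share covariance \(\sigma^2_g\), and plots of different genotypes should be uncorrelated. A minimal check, reusing the objects just built:

```r
# Sanity check on V = ZGZ' + R
stopifnot(all.equal(unname(diag(V)), rep(var_g + var_e, n)))
stopifnot(all.equal(V[1, 2], var_g))  # observations 1 and 2 are both g1
stopifnot(V[1, 4] == 0)               # observations 1 (g1) and 4 (g2)
```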

Henderson’s equations (Inverting \(C\))

Now we have all we need to solve:

\[ \underbrace{\begin{bmatrix} \mathbf{X^T R^{-1} X} & \mathbf{X^T R^{-1} Z} \\ \mathbf{Z^T R^{-1} X} & \mathbf{Z^T R^{-1} Z + G^{-1}} \end{bmatrix}}_{\boldsymbol{C}} \begin{bmatrix} \hat\beta\\ \hat u \end{bmatrix} = \underbrace{\begin{bmatrix} \mathbf{X^T R^{-1} y} \\ \mathbf{Z^T R^{-1} y} \end{bmatrix}}_{\boldsymbol{RHS}} \]

And therefore

\[ \begin{bmatrix} \hat\beta\\ \hat u \end{bmatrix} = C^{-1} RHS, \text{ where } C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \]

# Mixed Model Equations
R_inv <- chol2inv(chol(R))  # invert R via its Cholesky factor
C11 <- t(X) %*% R_inv %*% X
C12 <- t(X) %*% R_inv %*% Z
C21 <- t(Z) %*% R_inv %*% X
C22 <- t(Z) %*% R_inv %*% Z + solve(G)

# Coefficient matrix (LHS)
C <- as.matrix(
  rbind(
    cbind(C11, C12),
    cbind(C21, C22)
  )
)
dim(C)
[1] 7 7
# RHS
rhs <- rbind(
  t(X) %*% solve(R) %*% y,
  t(Z) %*% solve(R) %*% y
)

# Solution
C_inv <- chol2inv(chol(C))
rownames(C_inv) <- colnames(C_inv) <- rownames(C)
print(round(C_inv, 4))
            (Intercept) block2 block3   geng1   geng2   geng3   geng4
(Intercept)      0.2508   -0.1   -0.1 -0.1508 -0.1508 -0.1508 -0.1508
block2          -0.1000    0.2    0.1  0.0000  0.0000  0.0000  0.0000
block3          -0.1000    0.1    0.2  0.0000  0.0000  0.0000  0.0000
geng1           -0.1508    0.0    0.0  0.2327  0.1235  0.1235  0.1235
geng2           -0.1508    0.0    0.0  0.1235  0.2327  0.1235  0.1235
geng3           -0.1508    0.0    0.0  0.1235  0.1235  0.2327  0.1235
geng4           -0.1508    0.0    0.0  0.1235  0.1235  0.1235  0.2327
ans <- C_inv %*% rhs
ans
                  [,1]
(Intercept)  8.5000000
block2      -1.6500000
block3      -2.1000000
geng1       -0.6142534
geng2        0.2866516
geng3       -0.5323529
geng4        0.8599548

Alternative solution (Inverting \(V\))

BLUEs and BLUPs can also be calculated separately as follows:

\[ \boldsymbol{\hat{\beta}} = \mathbf{(X^T V^{-1} X)^{-1} X^T V^{-1} y} \]

These are the generalized (GLS) forms. Let's show how \(\boldsymbol{\hat \beta}\) would be calculated using \(\mathbf{V} = \sigma^2_e\mathbf{I}\):

\[ \boldsymbol{\hat{\beta}} = \left(\mathbf{X}^T \tfrac{1}{\sigma^2_e} \mathbf{X}\right)^{-1} \mathbf{X}^T \tfrac{1}{\sigma^2_e} \mathbf{y} = \sigma^2_e(\mathbf{X}^T \mathbf{X})^{-1} \frac{\mathbf{X}^T \mathbf{y}}{\sigma^2_e} = \mathbf{(X^T X)^{-1}X^T y} \]
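That collapsed form is just ordinary least squares. As a quick illustration (reusing the X and y built above), in this balanced example it happens to reproduce the same block BLUEs as the full GLS solution:

```r
# With V proportional to the identity, the GLS estimator collapses to OLS
beta_ols <- solve(t(X) %*% X) %*% t(X) %*% y
round(beta_ols, 2)
#             [,1]
# (Intercept)  8.50
# block2      -1.65
# block3      -2.10
```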

Correspondingly, for the BLUPs:

\[ \mathbf{\hat u} = \mathbf{GZ^T V^{-1}(y-X \boldsymbol{\hat{\beta})}} \]

dim(V)
[1] 12 12
V_inv <- solve(V)

# BLUEs
betas <- solve(t(X) %*% V_inv %*% X) %*% t(X) %*% V_inv %*% y
# BLUPs
u <- G %*% t(Z) %*% V_inv %*% (y - X %*% betas)
rownames(u) <- colnames(Z)
# Solution
m <- rbind(betas, u)
rownames(m) <- c(rownames(betas), rownames(u))
m 
                  [,1]
(Intercept)  8.5000000
block2      -1.6500000
block3      -2.1000000
geng1       -0.6142534
geng2        0.2866516
geng3       -0.5323529
geng4        0.8599548
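Both approaches agree. A helpful way to read these BLUPs: in this balanced design, each one is the corresponding genotype mean deviation shrunken toward zero by the factor \(\sigma^2_g / (\sigma^2_g + \sigma^2_e/n_{blocks})\), which we can verify directly (reusing objects defined above):

```r
# BLUPs as shrunken genotype effects (balanced design only)
shrink <- var_g / (var_g + var_e / n_blks)
gen_dev <- tapply(data$yield, data$gen, mean) - mean(data$yield)
round(shrink * gen_dev, 4)
#      g1      g2      g3      g4
# -0.6143  0.2867 -0.5324  0.8600
```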