Bayesian Hierarchical Linear Regression

Learning Objectives
  • Learn how to fit regression model with correlated data with brms

1. Hierarchical models

  • Often models have more complex structures than the ones described by likelihood and priors and simple relationships between data and parameters

  • In these models data and parameters participate in relationships of hierarchical fashion

    • e.g., Correlated and clustered data structures, such as measures within patients, patients within clinics etc.
  • Bayesian models deal with hierarchical models in a natural and intuitive way (Bürkner 2017)

  • Source of correlation in clustered or longitudinal data

    • Between-individual heterogeneity on (i) mean (intercept) and (ii) velocity / rate of change (slope).
    • Within-individual/cluster homogeneity: repeated measures on the same unit tend to be more alike than measures from different units.
Three layers of a Bayesian hierarchical model

1.1 Bayesian Hierarchical Models

  1. Data Layer \(Y \mid \theta\): the likelihood for observed data \(Y\) given model parameters \(\theta\).

  2. Adaptive / Structural Priors \(\theta \mid \alpha, \beta\): the model for \(\theta\) governed by hyperparameters \(\alpha\) and \(\beta\).

  3. Hyper Priors \(\alpha\) and \(\beta\): top-level priors on the hyperparameters.

1.1 Example 1 - Clustered data: developmental toxicity study

  • Example: Developmental toxicity study of ethylene glycol (EG) (Price et al. 1985)

  • The data are from a development toxicity study of ethylene glycol (EG).

  • Ethylene glycol is a high-volume industrial chemical used in many applications. EG is a high-volume industrial chemical.

  • In a study conducted through the National Toxicology Program (NTP), EG was administered at doses of 0, 750, 1500, or 3000 mg/kg/day to 94 pregnant mice (dams).

  • Following sacrifice, fetal weight and evidence of malformations were recorded.

  • Data structure:

    • Each row represents one fetus (mouse).
    • Fetuses are nested within litters ??? a classic two-level clustered data set.
Code
ntp <- read.table("data/ntp-data.txt", header = TRUE)
colnames(ntp) <- c("Litter", "Dose", "Weight", "Malformation")
ntp <- ntp %>%
  group_by(Litter) %>%
  mutate(Litid = row_number()) %>%
  mutate(FetalID = paste0(Litter, ".", Litid)) %>%
  select(-Litid)
  • Summary: number of litters and fetuses per dose group.
Code
cluster <- ntp %>%
  group_by(Dose) %>%
  summarize(nlitter = n_distinct(Litter), nfetal = n_distinct(FetalID))

cluster %>% datatable(
  rownames = FALSE,
  options = list(columnDefs = list(list(className = 'dt-center', targets = 0:2)))
)
Code
clustersize <- ntp %>% group_by(Litter) %>% summarise(csize = n_distinct(FetalID))

clustersize %>%
  ggplot(aes(x = Litter, y = csize)) +
  geom_bar(stat = "identity", fill = "steelblue", color = "white") +
  ylab("Cluster size (fetuses per litter)") +
  theme_bw()

  • Visualise fetal weight by litter and dose.
Code
ntp %>%
  ggplot(aes(x = Litter, y = Weight)) +
  geom_boxplot(aes(group = Litter), color = "steelblue") +
  facet_wrap(~Dose) +
  theme_bw() +
  ggtitle("Fetal weight by litter, stratified by dose")

Code
ntp <- ntp %>%
  mutate(Dose = factor(Dose, levels = c("0", "750", "1500", "3000")))
  • There is clear between-litter variability in fetal weight within each dose group, and weights decrease with increasing dose.
  • Suppose we are interested to assess the effect of dose on fetal weight.
  • The data on fetal weight from this experiment are clustered, with observations on the fetuses nested within litters.

1.3 Normal hierarchical model with random intercept (model 1)

Model specification

  • Each fetus \(i\) in litter \(j\) has weight \(Y_{ij}\).
  • The mean weight depends on dose (a fixed effect) and a litter-specific intercept (a random effect):

\[Y_{ij} \mid \beta_{0j}, \beta_{1}, \sigma_y \sim N(\beta_{0j} + \beta_{1}\,\text{dose}_{ij},\; \sigma_y^2)\]

\[\beta_{0j} \mid \beta_0, \sigma_0 \sim N(\beta_0, \sigma_0^2)\]

\[\beta_0 \sim N(0, 10), \quad \beta_1 \sim N(0, 10), \quad \sigma_y \sim \text{HalfCauchy}(0,10), \quad \sigma_0 \sim \text{HalfCauchy}(0,10)\]

  • \(\beta_{0j}\) is the litter-specific intercept: baseline weight for litter \(j\).
  • \(\beta_1\) is the shared dose effect across all litters.
  • \(\sigma_0\) captures between-litter variability in baseline weight.
  • \(\sigma_y\) captures within-litter residual variability.

Prior specification

Before fitting any model, we should state our priors explicitly and visualise them.

  • Intercept \(\beta_0\): expected weight at dose 0. We use a weakly informative \(N(0, 10)\).
  • Dose slopes \(\beta_1, \ldots\): expected weight differences relative to dose 0. We use \(N(0, 10)\).
  • Residual SD \(\sigma_y\): \(\text{HalfCauchy}(0, 10)\) - allows substantial variability but is regularising.
  • Between-litter SD \(\sigma_0\): \(\text{HalfCauchy}(0, 10)\) - same rationale.
Code
x_b   <- seq(-40, 40, length.out = 500)
x_sig <- seq(0, 50, length.out = 500)

p_intercept <- tibble(x = x_b, d = dnorm(x, 0, 10)) %>%
  ggplot(aes(x, d)) + geom_line(color = "steelblue", linewidth = 1) +
  labs(title = "Prior: Intercept ~ N(0, 10)", x = expression(beta[0]), y = "Density") +
  theme_bw()

p_slope <- tibble(x = x_b, d = dnorm(x, 0, 10)) %>%
  ggplot(aes(x, d)) + geom_line(color = "darkorange", linewidth = 1) +
  labs(title = "Prior: Slopes ~ N(0, 10)", x = expression(beta), y = "Density") +
  theme_bw()

p_sigma <- tibble(x = x_sig,
                  d = dcauchy(x, 0, 10) * 2) %>%  # half-Cauchy
  ggplot(aes(x, d)) + geom_line(color = "firebrick", linewidth = 1) +
  labs(title = "Prior: SD ~ HalfCauchy(0, 10)", x = expression(sigma), y = "Density") +
  theme_bw()

ggarrange(p_intercept, p_slope, p_sigma, nrow = 1)

Why weakly informative priors?

A prior is weakly informative when it rules out physically implausible values (e.g., negative standard deviations) without strongly constraining the posterior. This lets data dominate while providing some regularisation to aid sampling.


Code
fit2 <- brm(data = ntp,
            family= gaussian(),
            Weight ~ Dose + (1|Litter),  
            prior = c(prior(normal(0, 10), class = Intercept),
                      prior(normal(0, 10), class = b),
                      prior(cauchy(0, 10), class = sd),
                      prior(cauchy(0, 10), class = sigma)),
            iter = 10000,
            warmup = 8000,
            chains = 4,
            cores = 4, 
            seed = 123, 
            silent = 2,
            refresh = 0)
# saveRDS(fit2, "data/chp4_h_example2")

Posterior results

Code
fit2 <- readRDS("data/chp4_h_example2")
summary(fit2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Weight ~ Dose + (1 | Litter) 
   Data: ntp (Number of observations: 1027) 
  Draws: 4 chains, each with iter = 10000; warmup = 8000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Litter (Number of levels: 94) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.09      0.01     0.07     0.10 1.00     1327     2096

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.97      0.02     0.94     1.01 1.00      934     1686
Dose750      -0.09      0.03    -0.14    -0.04 1.00     1059     1878
Dose1500     -0.19      0.03    -0.25    -0.14 1.00     1017     1607
Dose3000     -0.26      0.03    -0.31    -0.21 1.00      997     2009

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.07      0.00     0.07     0.08 1.00     7140     5025

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Code
# checking traceplot
fit2 %>%
  plot(combo = c("hist", "trace"), widths = c(1, 1.5),
       theme = theme_bw(base_size = 10))

Code
conditional_effects(fit2)

Interpretation

Fixed effects

Term Posterior mean 95% CrI Interpretation
Intercept 0.97 0.94 - 1.01 Expected weight at dose 0, averaged over all litters.
Dose 750 -0.09 -0.14 - -0.04 Average weight decrease moving from dose 0 to 750 mg/kg/day.
Dose 1500 -0.19 -0.25 - -0.14 Larger decrease at 1500 mg/kg/day.
Dose 3000 -0.26 -0.31 - -0.21 Largest average decrease at 3000 mg/kg/day.

All dose coefficients are negative and their credible intervals exclude 0, providing strong evidence that higher doses lower fetal weight in a dose-dependent fashion.

Random effects

Parameter Posterior SD 95% CrI Interpretation
sd(Intercept) 0.09 0.07 - 0.10 Litters differ in baseline weight by roughly \(\pm\) 0.09 units around the grand mean.
\(\sigma\) (residual) 0.07 0.07 - 0.08 After accounting for dose and litter, individual fetuses vary by \(\pm\) 0.07 units.
  • Only baseline (intercept) varies; slope variation was not included in this model.
  • Within-litter (residual) variance, \(\sigma = 0.07 (0.07 - 0.08)\), after accounting for dose and litter baseline, individual vary by about 0.07 units.

Intraclass Correlation Coefficient (ICC)

The ICC measures what fraction of the total outcome variability is attributable to between-litter differences.

Adjusted ICC (ignoring fixed effects):

\[\rho_{\text{adj}} = \frac{\sigma_0^2}{\sigma_0^2 + \sigma_y^2}\]

Conditional ICC (accounting for fixed effect variability):

\[\rho_c = \frac{\sigma_0^2}{\sigma_0^2 + \sigma_f^2 + \sigma_y^2}, \qquad \sigma_f^2 = \text{Var}\!\left(\sum_m \hat\beta_m x_{mij}\right)\]

  • The adjusted ICC represents the proportion of variance of the outcome that are due to between-level (e.g., between-group, between-person) differences

    • Does not consider fixed effect! In this case, adjusted ICC does not feature variation caused by dose.
  • reference: (Nakagawa, Johnson, and Schielzeth 2017) doi: 10.1098/rsif.2017.0213

  • We can use the posterior draws of the two SD to compute the posterior for the adjusted ICC:

Code
# Extract posterior draws of tau and sigma
sd_m2      <- VarCorr(fit2, summary = FALSE)
draws_tau  <- sd_m2$Litter$sd[, "Intercept"]   # between-litter SD
draws_sigma <- sd_m2$residual__$sd[, 1]          # within-litter SD

# Adjusted ICC
draws_icc  <- draws_tau^2 / (draws_tau^2 + draws_sigma^2)
cat("Adjusted ICC - posterior median [95% CrI]:\n")
Adjusted ICC - posterior median [95% CrI]:
Code
round(quantile(draws_icc, c(0.5, 0.025, 0.975)), 3)
  50%  2.5% 97.5% 
0.562 0.478 0.645 
Code
p_icc1 <- ggplot(tibble(icc = draws_icc), aes(x = icc)) +
  geom_density(fill = "steelblue", alpha = 0.4) +
  labs(title = "Posterior of adjusted ICC",
       x = expression(rho[adj]), y = "Density") +
  theme_bw()

# Conditional ICC
post_par  <- colMeans(as_draws_df(fit2)[, c("b_Dose750", "b_Dose1500", "b_Dose3000")])
dummy     <- with(ntp, outer(Dose, levels(Dose), `==`) * 1)[, -1]
sigma_f2  <- var(dummy %*% post_par)
draws_icc_cond <- draws_tau^2 / (draws_tau^2 + sigma_f2 + draws_sigma^2)

cat("\nConditional ICC - posterior median [95% CrI]:\n")

Conditional ICC - posterior median [95% CrI]:
Code
round(quantile(draws_icc_cond, c(0.5, 0.025, 0.975)), 3)
  50%  2.5% 97.5% 
0.314 0.250 0.391 
Code
p_icc2 <- ggplot(tibble(icc = draws_icc_cond), aes(x = icc)) +
  geom_density(fill = "darkorange", alpha = 0.4) +
  labs(title = "Posterior of conditional ICC",
       x = expression(rho[c]), y = "Density") +
  theme_bw()

ggarrange(p_icc1, p_icc2, nrow = 1)

ICC interpretation
  • Adjusted ICC = 0.56 [0.48, 0.65]: about 56% of the total variation in fetal weight is due to differences between litters, before accounting for dose. This is high, confirming strong clustering.
  • Conditional ICC = 0.31 [0.25, 0.39]: once dose is accounted for, between-litter differences explain about 31% of remaining variance. The dose fixed effect “explains away” some of the apparent litter-level clustering.

An ICC above 0.1 generally justifies using a hierarchical model rather than treating observations as independent.

1.4 Normal hierarchical model with varying intercept and slope (model 2)

Model specification

Does the effect of dose also vary by litter? Maybe some litters are more sensitive to EG than others. We add a random slope on dose:

\[Y_{ij} \mid \beta_{0j}, \beta_{1j}, \sigma_y \sim N(\beta_{0j} + \beta_{1j}\,\text{dose}_{ij},\; \sigma_y^2)\]

The litter-specific intercept and slope are modelled jointly as bivariate normal:

\[\begin{pmatrix}\beta_{0j}\\\beta_{1j}\end{pmatrix} \sim N\!\left(\begin{pmatrix}\beta_0\\\beta_1\end{pmatrix},\; \begin{pmatrix}\sigma_0^2 & \rho\,\sigma_0\sigma_1\\\rho\,\sigma_0\sigma_1 & \sigma_1^2\end{pmatrix}\right)\]

  • \(\sigma_1\) captures how much the dose effect varies across litters.
  • \(\rho\) is the correlation between a litter’s baseline weight and its sensitivity to dose.
  • The correlation matrix \(\mathbf{R}\) receives an LKJ prior (Lewandowski, Kurowicka, and Joe 2009).
Code
dlkjcorr2 <- function(rho, eta = 1, log = FALSE) {
  out <- (eta - 1) * log(1 - rho^2) -
    1/2 * log(pi) - lgamma(eta) + lgamma(eta + 1/2)
  if (!log) out <- exp(out)
  out
}

ggplot(tibble(rho = c(-1, 1)), aes(x = rho)) +
  stat_function(fun = dlkjcorr2, args = list(eta = 0.5), aes(col = "0.5"), n = 501) +
  stat_function(fun = dlkjcorr2, args = list(eta = 1),   aes(col = "1"),   n = 501) +
  stat_function(fun = dlkjcorr2, args = list(eta = 2),   aes(col = "2"),   n = 501) +
  stat_function(fun = dlkjcorr2, args = list(eta = 10),  aes(col = "10"),  n = 501) +
  labs(col = expression(eta), x = expression(rho), y = "Density",
       title = "LKJ prior on correlation between random intercept and slope") +
  theme_bw()

LKJ prior for different values of the shape parameter
Code
fit3 <- brm(data = ntp,
            family= gaussian(),
            Weight ~ Dose + (Dose|Litter),  
            prior = c(prior(normal(0, 10), class = Intercept),
                      prior(normal(0, 10), class = b),
                      prior(cauchy(0, 10), class = sd),
                      prior(cauchy(0, 10), class = sigma),
                      prior(lkj(2), class = cor)),
            iter = 11000,
            warmup = 10000,
            chains = 3,
            cores = 4, 
            control = list(adapt_delta = .9), 
            seed = 123, 
            silent = 2,
            refresh = 0)
# saveRDS(fit3, "data/chp4_h_example3")

Posterior results

Code
fit3 <- readRDS("data/chp4_h_example3")
summary(fit3)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Weight ~ Dose + (Dose | Litter) 
   Data: ntp (Number of observations: 1027) 
  Draws: 3 chains, each with iter = 11000; warmup = 10000; thin = 1;
         total post-warmup draws = 3000

Multilevel Hyperparameters:
~Litter (Number of levels: 94) 
                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept)               0.07      0.01     0.05     0.09 1.01      385
sd(Dose750)                 0.05      0.03     0.00     0.13 1.01      300
sd(Dose1500)                0.07      0.04     0.00     0.15 1.01      334
sd(Dose3000)                0.08      0.04     0.01     0.16 1.01      254
cor(Intercept,Dose750)     -0.09      0.37    -0.73     0.67 1.00      635
cor(Intercept,Dose1500)    -0.02      0.35    -0.68     0.67 1.00      485
cor(Dose750,Dose1500)       0.01      0.38    -0.68     0.74 1.00      935
cor(Intercept,Dose3000)     0.02      0.36    -0.67     0.71 1.00      500
cor(Dose750,Dose3000)       0.01      0.38    -0.70     0.72 1.00      713
cor(Dose1500,Dose3000)      0.02      0.38    -0.70     0.72 1.01      804
                        Tail_ESS
sd(Intercept)               1063
sd(Dose750)                  671
sd(Dose1500)                 797
sd(Dose3000)                 388
cor(Intercept,Dose750)       785
cor(Intercept,Dose1500)      572
cor(Dose750,Dose1500)       1527
cor(Intercept,Dose3000)      799
cor(Dose750,Dose3000)       1173
cor(Dose1500,Dose3000)      1491

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.97      0.01     0.95     1.00 1.00      842     1310
Dose750      -0.09      0.02    -0.14    -0.05 1.00     1276     1748
Dose1500     -0.19      0.03    -0.25    -0.14 1.00     1387     1757
Dose3000     -0.26      0.03    -0.32    -0.21 1.00     1500     1921

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.07      0.00     0.07     0.08 1.00     3616     2306

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Code
fit3 %>% plot( combo = c("hist", "trace"), widths = c(1, 1.5),
      theme = theme_bw(base_size = 10))

Code
conditional_effects(fit3)

Interpretation

  • Dose effects (fixed): similar to Model 1 - higher dose is associated with lower fetal weight at every dose level, with all credible intervals excluding 0.
  • sd(Intercept) = 0.07: litters still differ considerably in baseline weight.
  • sd(Dose 750) = 0.05: the effect of 750 mg/kg varies across litters by roughly �0.05 units - some litters are more sensitive to dosing than others, though this variation is modest.
  • cor(Intercept, Dose750) = -0.09: litters with higher baseline weights tend to show slightly more weight reduction at 750 mg/kg, but this correlation is uncertain (wide CrI).

1.5 Model comparison

Code
loo2 <- loo(fit2)
loo3 <- loo(fit3)
loo_compare(loo2, loo3)
     elpd_diff se_diff
fit3  0.0       0.0   
fit2 -0.3       1.1   
Interpreting LOO comparison
  • The elpd difference (expected log predictive density) compares out-of-sample predictive accuracy.
  • A difference < 4 in elpd (in absolute terms) relative to its standard error suggests the two models predict new data equally well.
  • Here, the random-slope model (fit3) does not substantially outperform the random-intercept model (fit2). We favour fit2 for parsimony.

Practical decision rule:

  • If the elpd difference is large relative to its SE (|elpd_diff / se_diff| > 2), choose the better-predicting model.
  • Otherwise, the simpler model is preferred.

1.6 Modelling variance: which factors explain excess variability?

  • So far we have assumed residual variance \(\sigma_y\) is constant. But what if some dose groups are more variable than others?

  • We can extend the model to let \(\sigma_y\) depend on covariates using brms’s distributional regression (bf() with sigma ~).

This approach is sometimes called a location-scale model:

\[Y_{ij} \sim N(\mu_{ij},\; \sigma_{ij}^2)\]

\[\mu_{ij} = \beta_{0j} + \beta_1\,\text{dose}_{ij}\]

\[\log(\sigma_{ij}) = \gamma_0 + \gamma_1\,\text{dose}_{ij}\]

The second sub-model lets us ask: does higher dose lead to more or less variable fetal weights?

Code
fit_ls <- brm(
  data   = ntp,
  family = gaussian(),
  bf(Weight ~ Dose + (1 | Litter),
     sigma ~ Dose),      # model the log-SD as a function of dose
  prior = c(
    prior(normal(0, 10),  class = Intercept),
    prior(normal(0, 10),  class = b),
    prior(cauchy(0, 10),  class = sd),
    prior(normal(0, 1),   class = b, dpar = sigma),   # prior on log-sigma coefficients
    prior(normal(0, 2),   class = Intercept, dpar = sigma)
  ),
  iter    = 10000,
  warmup  = 8000,
  chains  = 4,
  cores   = 4,
  seed    = 456,
  silent  = 2,
  refresh = 0
)
# saveRDS(fit_ls, "data/chp4_h_example_ls")
Code
fit_ls <- readRDS("data/chp4_h_example_ls")
summary(fit_ls)
 Family: gaussian 
  Links: mu = identity; sigma = log 
Formula: Weight ~ Dose + (1 | Litter) 
         sigma ~ Dose
   Data: ntp (Number of observations: 1027) 
  Draws: 4 chains, each with iter = 10000; warmup = 8000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~Litter (Number of levels: 94) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.08      0.01     0.07     0.10 1.00     1542     2210

Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           0.97      0.02     0.94     1.01 1.01      959     1672
sigma_Intercept    -2.57      0.04    -2.65    -2.48 1.00     5436     5270
Dose750            -0.09      0.03    -0.14    -0.04 1.01      897     1143
Dose1500           -0.19      0.03    -0.25    -0.14 1.00     1059     1959
Dose3000           -0.26      0.03    -0.31    -0.21 1.00     1127     2003
sigma_Dose750      -0.10      0.06    -0.23     0.02 1.00     6100     5767
sigma_Dose1500     -0.03      0.07    -0.15     0.10 1.00     5460     5876
sigma_Dose3000      0.04      0.07    -0.09     0.17 1.00     5954     5700

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Interpreting the variance sub-model

The sigma_ coefficients in the output are on the log scale. For example:

  • sigma_Intercept: log of the residual SD at dose 0.
  • sigma_Dose750: the change in log-SD when moving from dose 0 to 750 mg/kg/day. A negative value means it decreases the residual variability.

To convert to the original SD scale: \(\hat\sigma_{\text{dose}} = \exp(\gamma_0 + \gamma_1 \cdot \text{dose})\).

If credible intervals for sigma_Dose coefficients include 0, there is insufficient evidence that variability differs by dose group. If they exclude 0, it implies that dose is associated with differential within-litter variability in fetal weight - a scientifically meaningful heteroscedasticity finding.

Code
# Compare constant-variance vs variance-modelled versions
loo2_c <- loo(fit2)
loo_ls <- loo(fit_ls)
loo_compare(loo2_c, loo_ls)
       elpd_diff se_diff
fit2    0.0       0.0   
fit_ls -3.6       3.3   

A posterior predictive check.

Code
pp_check(fit2, ndraws = 50) +
  labs(title = "Posterior predictive check - random intercept model",
       subtitle = "Dark line = observed data; light lines = simulated replications") +
  theme_bw()


2. Longitudinal data: sleep deprivation study

3.1 Background and data

  • Dataset: sleepstudy from the lme4 package.

  • Study: 18 subjects were allowed only 3 hours of sleep per night. Their average reaction time (ms) on a psychomotor vigilance task was measured on Days 0-9.

  • Data structure:

    • Each row is one observation.
    • Observations are nested within subjects (repeated measures -> longitudinal data).
    • We expect reaction times to increase (worsen) with more sleep deprivation, but subjects may differ both in their starting reaction time and in how quickly they deteriorate.
Code
data(sleepstudy, package = "lme4")

# Quick look
sleepstudy %>%
  select(Subject, Days, Reaction) %>%
  slice_head(n = 10) %>%
  knitr::kable(caption = "First 10 rows of sleepstudy data")
First 10 rows of sleepstudy data
Subject Days Reaction
308 0 249.5600
308 1 258.7047
308 2 250.8006
308 3 321.4398
308 4 356.8519
308 5 414.6901
308 6 382.2038
308 7 290.1486
308 8 430.5853
308 9 466.3535
  • Descriptive summary
Code
sleepstudy %>%
  group_by() %>%
  summarise(
    `Reaction time (ms)` = paste0(round(mean(Reaction), 1), " (", round(sd(Reaction), 1), ")"),
    `Days of deprivation` = paste0(round(mean(Days), 1), " (", round(sd(Days), 1), ")")
  ) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Mean (SD)") %>%
  knitr::kable(caption = "Descriptive summary of sleepstudy")
Descriptive summary of sleepstudy
Variable Mean (SD)
Reaction time (ms) 298.5 (56.3)
Days of deprivation 4.5 (2.9)
  • Visualize individual trajectories
Code
ggplot(sleepstudy, aes(x = Days, y = Reaction, group = Subject, color = Subject)) +
  geom_line(alpha = 0.7) +
  geom_smooth(aes(group = 1), method = "lm", color = "black", linewidth = 1.2,
              se = FALSE) +
  scale_x_continuous(breaks = 0:9) +
  labs(title = "Reaction time trajectories across sleep-deprived days",
       subtitle = "Each coloured line = one subject; black = overall linear trend",
       x = "Days of sleep deprivation",
       y = "Reaction time (ms)") +
  theme_bw() +
  theme(legend.position = "none")

Note

Two important patterns: (1) reaction times increase with days of deprivation on average; (2) subjects differ substantially both in their baseline reaction time (Day 0) and in the rate at which they deteriorate - motivating both random intercepts and random slopes.


2.2 Prior specification

Code
x_mu  <- seq(100, 600, length.out = 500)
x_b   <- seq(-50, 100, length.out = 500)
x_sig <- seq(0, 150, length.out = 500)

p1 <- tibble(x = x_mu, d = dnorm(x, 300, 50)) %>%
  ggplot(aes(x, d)) + geom_line(color = "steelblue", linewidth = 1) +
  labs(title = "Intercept ~ N(300, 50)", x = "ms", y = "Density") + theme_bw()

p2 <- tibble(x = x_b, d = dnorm(x, 10, 20)) %>%
  ggplot(aes(x, d)) + geom_line(color = "darkorange", linewidth = 1) +
  labs(title = "Days slope ~ N(10, 20)", x = "ms / day", y = "Density") + theme_bw()

p3 <- tibble(x = x_sig, d = 2 * dcauchy(x, 0, 30)) %>%
  ggplot(aes(x, d)) + geom_line(color = "firebrick", linewidth = 1) +
  labs(title = "SD ~ HalfCauchy(0, 30)", x = "ms", y = "Density") + theme_bw()

ggarrange(p1, p2, p3, nrow = 1)

Note
  • Intercept ~ N(300, 50): centred on 300 ms (plausible baseline reaction time), with SD = 50 ms giving a reasonable range of 150-450 ms.
  • Days slope ~ N(10, 20): centred slightly positive (we expect worsening), but wide enough to allow a null or even improving effect.
  • SDs ~ HalfCauchy(0, 30): allows wide between-subject and residual variability while regularising against very large values.

2.3 Model 1 - Random intercept (longitudinal)

Model specification

\[\text{Reaction}_{ij} \mid \beta_{0i}, \beta_1, \sigma \sim N(\beta_{0i} + \beta_1\,\text{Days}_{ij},\; \sigma^2)\]

\[\beta_{0i} \mid \beta_0, \sigma_0 \sim N(\beta_0, \sigma_0^2)\]

  • \(\beta_{0i}\): subject-specific intercept (baseline reaction time for subject \(i\)).
  • \(\beta_1\): shared daily rate of worsening across all subjects.
  • \(\sigma_0\): between-subject SD in baseline reaction time.
  • \(\sigma\): within-subject (residual) SD after accounting for the linear day effect.
Code
fit_sleep_ri <- brm(
  data   = sleepstudy,
  family = gaussian(),
  Reaction ~ Days + (1 | Subject),
  prior = c(
    prior(normal(300, 50),  class = Intercept),
    prior(normal(10, 20),   class = b),
    prior(cauchy(0, 30),    class = sd),
    prior(cauchy(0, 30),    class = sigma)
  ),
  iter    = 6000,
  warmup  = 3000,
  chains  = 4,
  cores   = 4,
  seed    = 123,
  silent  = 2,
  refresh = 0
)
# saveRDS(fit_sleep_ri, "data/sleep_ri")
Code
fit_sleep_ri <- readRDS("data/sleep_ri")
summary(fit_sleep_ri)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Reaction ~ Days + (1 | Subject) 
   Data: sleepstudy (Number of observations: 180) 
  Draws: 4 chains, each with iter = 6000; warmup = 3000; thin = 1;
         total post-warmup draws = 12000

Multilevel Hyperparameters:
~Subject (Number of levels: 18) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    38.41      7.35    26.74    55.34 1.00     2015     3548

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept   251.23     10.13   231.08   270.73 1.00     1981     3304
Days         10.46      0.81     8.89    12.06 1.00    13333     7534

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma    31.15      1.74    27.97    34.80 1.00    11709     8969

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interpretation

Parameter Posterior mean 95% CrI Interpretation
Intercept ~251 ms 231 - 270 ms Grand-mean baseline reaction time at Day 0, averaged across subjects.
Days ~11 ms/day 9 - 12 ms/day Each additional day of sleep deprivation is associated with ~10 ms longer reaction time on average.
sd(Intercept) ~38 ms 27 - 55 ms Subjects differ in baseline reaction time by \(\pm\) 38 ms around the grand mean.
sigma ~31 ms 27 - 35 ms Within-subject residual variability after accounting for the day effect and subject intercept.

ICC

Code
sd_ri   <- VarCorr(fit_sleep_ri, summary = FALSE)
tau_ri  <- sd_ri$Subject$sd[, "Intercept"]
sig_ri  <- sd_ri$residual__$sd[, 1]

icc_sleep <- tau_ri^2 / (tau_ri^2 + sig_ri^2)
cat("Sleep ICC (adjusted) - posterior median [95% CrI]:\n")
Sleep ICC (adjusted) - posterior median [95% CrI]:
Code
round(quantile(icc_sleep, c(0.5, 0.025, 0.975)), 3)
  50%  2.5% 97.5% 
0.593 0.416 0.767 
Code
ggplot(tibble(icc = icc_sleep), aes(x = icc)) +
  geom_density(fill = "steelblue", alpha = 0.4) +
  labs(title = "Posterior of adjusted ICC (longitudinal random intercept model)",
       x = expression(rho[adj]), y = "Density") +
  theme_bw()

Note

The ICC here tells us what proportion of total reaction-time variability is due to stable between-subject differences in baseline speed (as opposed to day-to-day within-subject fluctuations). An ICC around 0.4-0.5 is common in behavioural studies.


2.4 Model 2 - Random intercept and slope (longitudinal)

Model specification

We now allow each subject’s rate of deterioration to vary:

\[\text{Reaction}_{ij} \sim N(\beta_{0i} + \beta_{1i}\,\text{Days}_{ij},\; \sigma^2)\]

\[\begin{pmatrix}\beta_{0i}\\\beta_{1i}\end{pmatrix} \sim N\!\left(\begin{pmatrix}\beta_0\\\beta_1\end{pmatrix},\; \begin{pmatrix}\sigma_0^2 & \rho\,\sigma_0\sigma_1\\\rho\,\sigma_0\sigma_1 & \sigma_1^2\end{pmatrix}\right)\]

  • \(\sigma_1\) captures between-subject variability in the slope (rate of deterioration).
  • \(\rho\) captures the correlation between a subject’s baseline speed and their rate of change.
Code
fit_sleep_rs <- brm(
  data   = sleepstudy,
  family = gaussian(),
  Reaction ~ Days + (Days | Subject),
  prior = c(
    prior(normal(300, 50),  class = Intercept),
    prior(normal(10, 20),   class = b),
    prior(cauchy(0, 30),    class = sd),
    prior(cauchy(0, 30),    class = sigma),
    prior(lkj(2),           class = cor)
  ),
  iter    = 8000,
  warmup  = 4000,
  chains  = 4,
  cores   = 4,
  seed    = 123,
  silent  = 2,
  refresh = 0
)
# saveRDS(fit_sleep_rs, "data/sleep_rs")
Code
fit_sleep_rs <- readRDS("data/sleep_rs")
summary(fit_sleep_rs)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: Reaction ~ Days + (Days | Subject) 
   Data: sleepstudy (Number of observations: 180) 
  Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~Subject (Number of levels: 18) 
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)          25.84      6.43    15.32    40.31 1.00     6568     9916
sd(Days)                6.54      1.51     4.16    10.09 1.00     5971     8172
cor(Intercept,Days)     0.08      0.27    -0.43     0.60 1.00     4171     7079

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept   251.41      7.21   236.91   265.61 1.00     7392     9861
Days         10.48      1.68     7.17    13.74 1.00     5283     7896

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma    25.87      1.56    23.04    29.12 1.00    14752    10801

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Code
# conditional_effects(fit_sleep_rs)

Interpretation

Parameter Posterior mean 95% CrI Interpretation
Intercept ~251 ms 237 - 266 ms Grand-mean baseline reaction time at Day 0.
Days ~10 ms/day 7 - 14 ms/day Average increase in reaction time per day.
sd(Intercept) ~26 ms 15 - 40 ms Between-subject variability in baseline speed.
sd(Days) ~7 ms/day 4 - 10 ms/day Subjects differ in their daily deterioration rate by \(\pm\) 7 ms/day.
cor(Intercept, Days) ~0.08 -0.43 - 0.60 Weak positive correlation: subjects with slower baselines tend to deteriorate slightly faster, but the evidence is weak (wide CrI).
sigma ~25 ms 21 - 30 ms Residual within-subject variability after accounting for individual trajectories.

2.7 Model comparison

Code
loo_ri <- loo(fit_sleep_ri)
loo_rs <- loo(fit_sleep_rs)
loo_compare(loo_ri, loo_rs)
             elpd_diff se_diff
fit_sleep_rs   0.0       0.0  
fit_sleep_ri -22.6      12.3  

Code
# Posterior predictive check
pp_check(fit_sleep_rs, ndraws = 50) +
  labs(title = "Posterior predictive check - random slope model (sleep study)") +
  theme_bw()


3. Bayesian meta-analysis with a continuous outcome

3.1 What is meta-analysis?

Meta-analysis is a statistical method that combines results from multiple independent studies to obtain a more precise, overall estimate of an effect. A Bayesian random-effects meta-analysis is a natural application of hierarchical modelling:

  • Each study is a “cluster”.
  • The true effect size in each study is drawn from a common distribution.
  • We estimate both the pooled effect (\(\mu\)) and the between-study heterogeneity (\(\tau\)).

This is exactly the same structure as the random intercept model we used for clustered data!

Bayesian random-effects meta-analysis model

Let \(y_k\) be the observed effect estimate from study \(k\) (\(k = 1, \ldots, K\)) with known standard error \(s_k\):

\[y_k \mid \theta_k \sim N(\theta_k,\; s_k^2) \qquad \text{(data layer)}\]

\[\theta_k \mid \mu, \tau \sim N(\mu,\; \tau^2) \qquad \text{(adaptive prior / study-level model)}\]

\[\mu \sim N(0, 1), \quad \tau \sim \text{HalfNormal}(0, 0.5) \qquad \text{(hyperpriors)}\]

  • \(\mu\): the overall pooled effect.
  • \(\tau\): the between-study standard deviation - how much true effects vary across studies.
  • \(\theta_k\): the true (latent) effect in study \(k\), partially pooled toward \(\mu\).

3.2 Dataset: length of hospital stay across sites

We use dat.normand1999 from the metafor package. This dataset is freely available in R.

  • Study question: What is the mean length of hospital stay (days) for schizophrenia patients across different sites/hospitals?
  • Data: 9 sites, each reporting a sample mean and standard deviation for length of stay.
  • Outcome: continuous (days in hospital) - ideal for a normal-likelihood meta-analysis.
Code
library(metafor)
data(dat.normand1999)

# dat.normand1999 has two groups:
# group 1 (specialised care): n1i, m1i, sd1i
# group 2 (routine care):     n2i, m2i, sd2i

meta_dat <- dat.normand1999 %>%
  mutate(
    # pooled SE of the mean difference
    sei   = sqrt(sd1i^2 / n1i + sd2i^2 / n2i),
    # mean difference (specialised - routine)
    yi    = m1i - m2i,
    site  = paste0("Site ", 1:n())
  )

meta_dat %>%
  select(site, n1i, m1i, sd1i, n2i, m2i, sd2i) %>%
  knitr::kable(
    col.names = c("Site", "n (special)", "Mean (special)", "SD (special)",
                  "n (routine)", "Mean (routine)", "SD (routine)"),
    caption   = "dat.normand1999: length of stay by care type",
    digits    = 2
  )
dat.normand1999: length of stay by care type
Site n (special) Mean (special) SD (special) n (routine) Mean (routine) SD (routine)
Site 1 155 55 47 156 75 64
Site 2 31 27 7 32 29 4
Site 3 75 64 17 71 119 29
Site 4 18 66 20 18 137 48
Site 5 8 14 8 13 18 11
Site 6 57 19 7 52 18 4
Site 7 34 52 45 33 41 34
Site 8 110 21 16 183 31 27
Site 9 60 30 27 52 23 20
  • Forest plot of the raw study estimates
Code
meta_dat %>%
  ggplot(aes(x = yi, y = reorder(site, yi),
             xmin = yi - 1.96 * sei,
             xmax = yi + 1.96 * sei)) +
  geom_pointrange(color = "steelblue") +
  geom_vline(xintercept = 0, linetype = "dashed", color = "firebrick") +
  labs(title   = "Forest plot: mean difference in length of stay (specialised vs routine)",
       subtitle = "Negative = shorter stay under specialised care",
       x = "Mean difference in days",
       y = NULL) +
  theme_bw()

Note

There is considerable heterogeneity across sites - some sites have much longer stays than others. This motivates using a random-effects model rather than assuming a single common effect.


3.3 Prior specification (meta-analysis)

For a meta-analysis on the mean of a continuous clinical outcome:

  • \(\mu\) (overall pooled mean): We expect hospital stays of roughly 10-50 days based on clinical knowledge. A \(N(20, 15)\) prior is weakly informative.
  • \(\tau\) (between-site SD): Should be on the same scale as the outcome (days). A \(\text{HalfNormal}(0, 10)\) prior allows moderate to large heterogeneity while regularising against extreme values.
Code
p_mu <- tibble(x = seq(-20, 80, length.out = 500),
               d = dnorm(x, 20, 15)) %>%
  ggplot(aes(x, d)) + geom_line(color = "steelblue", linewidth = 1) +
  labs(title = "Prior: mu ~ N(20, 15)", x = "Mean stay (days)", y = "Density") +
  theme_bw()

p_tau <- tibble(x = seq(0, 40, length.out = 500),
                d = 2 * dnorm(x, 0, 10)) %>%
  ggplot(aes(x, d)) + geom_line(color = "firebrick", linewidth = 1) +
  labs(title = "Prior: tau ~ HalfNormal(0, 10)", x = "Between-site SD (days)", y = "Density") +
  theme_bw()

ggarrange(p_mu, p_tau, nrow = 1)


3.4 Fitting the Bayesian random-effects meta-analysis

In brms, we treat each site’s observed mean as the outcome and its standard error as a known measurement error - a natural hierarchical model.

Code
fit_meta <- brm(
  data   = meta_dat,
  family = gaussian(),
  # yi is the observed mean; sei is the known SE
  # (1 | study) gives study-specific random effects = true theta_k
  yi | se(sei) ~ 1 + (1 | study),
  prior = c(
    prior(normal(20, 15),  class = Intercept),   # prior on mu
    prior(normal(0, 10),   class = sd)            # prior on tau (half-normal via positive constraint)
  ),
  iter    = 10000,
  warmup  = 5000,
  chains  = 4,
  cores   = 4,
  seed    = 123,
  silent  = 2,
  refresh = 0
)
saveRDS(fit_meta, "data/fit_meta_normand")
Code
fit_meta <- readRDS("data/fit_meta_normand")
summary(fit_meta)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: yi | se(sei) ~ 1 + (1 | study) 
   Data: meta_dat (Number of observations: 9) 
  Draws: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~study (Number of levels: 9) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    21.66      4.30    14.44    31.21 1.00     5050     8215

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -7.84      6.98   -20.96     6.46 1.00     3482     5072

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.00      0.00     0.00     0.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
  • Intercept = -7.84 days [-20.96, 6.46]: the pooled mean difference in length of stay (specialised minus routine care). The credible interval includes 0, so there is no strong evidence of a difference on average across sites.

  • sd(Intercept) = 21.66 [14.44, 31.21]: very large between-site heterogeneity - sites differ enormously in their observed mean differences. This wide \(\tau\) is driven by the raw data having very large SDs (some sites have stays of 4 days, others 40+ days).


3.5 Prior vs posterior (meta-analysis)

Code
post_meta <- as_draws_df(fit_meta)

p_mu_pp <- ggplot() +
  stat_function(fun = dnorm, args = list(mean = 20, sd = 15),
                aes(color = "Prior"), linewidth = 1, xlim = c(-20, 80)) +
  geom_density(data = post_meta, aes(x = b_Intercept, color = "Posterior"),
               linewidth = 1) +
  scale_color_manual(values = c(Prior = "grey60", Posterior = "steelblue")) +
  labs(title = "Pooled mean (mu)", x = "Days", y = "Density", color = NULL) +
  theme_bw()

p_tau_pp <- ggplot() +
  geom_density(data = post_meta, aes(x = sd_study__Intercept, color = "Posterior"),
               linewidth = 1) +
  stat_function(fun = function(x) 2 * dnorm(x, 0, 10),
                aes(color = "Prior"), linewidth = 1, xlim = c(0, 40)) +
  scale_color_manual(values = c(Prior = "grey60", Posterior = "firebrick")) +
  labs(title = "Between-site heterogeneity (tau)", x = "Days", y = "Density", color = NULL) +
  theme_bw()

ggarrange(p_mu_pp, p_tau_pp, nrow = 1, common.legend = TRUE, legend = "bottom")


3.6 Posterior results and interpretation

Code
# Site-specific shrunken estimates
ranef_meta <- ranef(fit_meta)$study[,,"Intercept"]
fixef_mu   <- fixef(fit_meta)["Intercept", "Estimate"]

site_estimates <- tibble(
  study        = rownames(ranef_meta),
  theta_hat    = fixef_mu + ranef_meta[, "Estimate"],
  lo95         = fixef_mu + ranef_meta[, "Q2.5"],
  hi95         = fixef_mu + ranef_meta[, "Q97.5"],
  observed_yi  = meta_dat$yi
)

site_estimates
# A tibble: 9 × 5
  study theta_hat   lo95   hi95 observed_yi
  <chr>     <dbl>  <dbl>  <dbl>       <int>
1 1       -19.0   -37.0   -1.89         -20
2 2        -2.02  -16.7   11.4           -2
3 3       -53.3   -69.7  -38.4          -55
4 4       -54.9   -81.9  -30.2          -71
5 5        -4.17  -20.1   10.8           -4
6 6         0.997 -13.5   14.2            1
7 7         7.64  -13.1   27.8           11
8 8        -9.97  -24.9    3.81         -10
9 9         6.38   -9.62  21.6            7
Code
ggplot(site_estimates,
       aes(y = reorder(study, theta_hat))) +
  geom_pointrange(aes(x = theta_hat, xmin = lo95, xmax = hi95,
                      color = "Posterior (shrunk)")) +
  geom_point(aes(x = observed_yi, color = "Observed"), shape = 4, size = 3) +
  geom_vline(xintercept = fixef_mu, linetype = "dashed") +
  scale_color_manual(values = c("Posterior (shrunk)" = "steelblue",
                                "Observed"           = "firebrick")) +
  labs(title    = "Bayesian meta-analysis: shrinkage toward pooled estimate",
       subtitle = "Dashed line = posterior mean of mu; X = raw observed mean difference",
       x = "Mean difference in length of stay (days)", y = NULL, color = NULL) +
  theme_bw()

\(I^2\)-equivalent: We can compute a Bayesian analogue of \(I^2\) (proportion of variability due to heterogeneity)

Code
# Posterior draws for tau and a typical within-study variance
sd_meta    <- VarCorr(fit_meta, summary = FALSE)
draws_tau  <- sd_meta$study$sd[, "Intercept"]

# Median within-study variance (sigma^2 from data: average sei^2)
typical_sigma2 <- mean(meta_dat$sei^2)

# Bayesian I^2 analogue
draws_I2 <- draws_tau^2 / (draws_tau^2 + typical_sigma2)
round(quantile(draws_I2, c(0.5, 0.025, 0.975)), 3)
  50%  2.5% 97.5% 
0.921 0.844 0.962 
Code
ggplot(tibble(I2 = draws_I2 * 100), aes(x = I2)) +
  geom_density(fill = "steelblue", alpha = 0.4) +
  labs(title = "Posterior of I� (between-site heterogeneity proportion)",
       x = "I� (%)", y = "Density") +
  theme_bw()

Interpreting \(I^2\)

\(I^2\) represents the proportion of total variability across studies that is due to genuine between-study (here between-site) differences, rather than sampling error. Common benchmarks: - \(I^2 < 25\%\): low heterogeneity - \(25\% \leq I^2 < 75\%\): moderate heterogeneity - \(I^2 \geq 75\%\): high heterogeneity

If the posterior \(I^2\) is high (e.g., > 75%), the random-effects model is strongly justified, and the pooled estimate should be interpreted cautiously - there may be important moderators explaining why sites differ.


3.7 Model diagnostics

Code
pp_check(fit_meta, ndraws = 50) +
  labs(title = "Posterior predictive check - meta-analysis model") +
  theme_bw()


References

Bürkner, Paul-Christian. 2017. “Advanced Bayesian Multilevel Modeling with the r Package Brms.” arXiv Preprint arXiv:1705.11123.
Lewandowski, Daniel, Dorota Kurowicka, and Harry Joe. 2009. “Generating Random Correlation Matrices Based on Vines and Extended Onion Method.” Journal of Multivariate Analysis 100 (9): 1989–2001.
Nakagawa, Shinichi, Paul CD Johnson, and Holger Schielzeth. 2017. “The Coefficient of Determination r 2 and Intra-Class Correlation Coefficient from Generalized Linear Mixed-Effects Models Revisited and Expanded.” Journal of the Royal Society Interface 14 (134): 20170213.
Price, Catherine J, Carole A Kimmel, Rochelle W Tyl, and Melissa C Marr. 1985. “The Developmental Toxicity of Ethylene Glycol in Rats and Mice.” Toxicology and Applied Pharmacology 81 (1): 113–27.