Power Analysis for Mixed Models in R
Everything so far has assumed observations are independent. That breaks down the moment your data has a natural grouping: students nested in classrooms, patients nested in clinics, measurements nested in subjects. Ignoring that structure inflates your effective sample size and overstates power. A [[concepts/mixed-effects|mixed effects]] model fixes this by adding a random intercept that absorbs between-group variation.
What's new on this rung
family = "lme"in the constructor selects the mixed-effects estimator — fit by maximum likelihood, so the summary header reportsestimator: MLE;- the random-intercept term goes directly in the formula as
(1|classroom); set_clustertells the engine about the grouping structure — the cluster variable name, its ICC, and the number of groups.
Everything else — find_power, target_test, verbose = FALSE, summary() —
works exactly as before.
The model
We study a classroom experiment: does a new teaching_method lift score,
controlling for prior_gpa? Students are nested in 30 classrooms.
family = "lme"selects the mixed estimator;(1|classroom)in the formula adds a random intercept for classroom;model$set_cluster("classroom", ICC = 0.15, n_clusters = 30)supplies the two numbers that define the grouping: the ICC (intraclass correlation — the fraction of total variance that lies between classrooms; 0.15 means 15% of score variation is between classrooms rather than between students) and n_clusters (how many distinct classrooms exist);- effects follow the same effect-size benchmarks as OLS — 0.10 / 0.25 / 0.40 for small / medium / large.
Mixed models default to 800 simulations rather than the 1,600 used for OLS,
because each mixed-model fit is more expensive. Confidence intervals are
correspondingly a little wider; increase n_sims explicitly if you need tighter
Monte-Carlo uncertainty.
How much power at n = 300?
library(mcpower)
model <- MCPower$new("score ~ teaching_method + prior_gpa + (1|classroom)", family = "lme")
model$set_variable_type("teaching_method=binary")
model$set_effects("teaching_method=0.4, prior_gpa=0.18")
model$set_cluster("classroom", ICC = 0.15, n_clusters = 30)
result <- model$find_power(sample_size = 300, target_test = "teaching_method, prior_gpa", verbose = FALSE)
print(summary(result))
================================================== MCPower · Power Analysis ================================================== formula: score ~ teaching_method + prior_gpa + (1|classroom) estimator: MLE N=300 sims=800 α=0.05 target=80% effects: teaching_method=0.40, prior_gpa=0.18 Per-test power ─────────────────────────────────── Test Power Target ─────────────────────────────────── teaching_method 90.9% 80% prior_gpa 85.8% 80% ─────────────────────────────────── Power & 95% CI ─────────────────────────────────────────── Test Power CI 95% ─────────────────────────────────────────── teaching_method 90.9% [88.7%, 92.7%] prior_gpa 85.8% [83.2%, 88.0%] ─────────────────────────────────────────── 95% CIs are Monte-Carlo (Wilson), n_sims=800. Joint significance distribution ──────────────────────── k Exactly At least ──────────────────────── 0 1.6% 100% 1 20.1% 98.4% 2 78.2% 78.2% ──────────────────────── Estimator details tau_estimate: nan boundary_hits: 0 joint_uncorrected_rate: 0.9812 joint_corrected_rate: 0.9812 singular_fit_rate: 0 boundary_rate_per_component: [0.0] Plots: plot(result) to view, plot(result, 'chart.png') to save.
Both targets are comfortably above 80%: teaching_method lands at 90.9% and
prior_gpa at 85.8%. The joint significance distribution shows a 78.2% chance
of detecting both effects in the same study — close to the 80% target, so N = 300
is a reasonable budget when both effects matter.
The estimator details block reports boundary_hits: 0 — no simulation hit a
variance boundary during fitting. Boundary hits show up more often in smaller
samples, where mixed-model fitting is numerically demanding, so seeing none here is
reassuring at this sample size.
ICC values are rarely known precisely in advance. Run a few scenarios with
ICC = 0.05, ICC = 0.15, and ICC = 0.30 to see how sensitive your power
estimate is to that assumption — the same way you would stress-test effect sizes.
Going further
Repeated measures
A repeated-measures (within-subjects) design — each participant measured across many trials — is a clustered design with the participant as the cluster. The catch is the sample size: n counts measurements, not participants, so n = participants × trials-per-participant. Sixty participants on 100 trials each is sample_size = 6000 with n_clusters = 60 — passing sample_size = 60 would model one measurement per person:
# 60 participants × 100 trials = 6000 measurements
model <- MCPower$new("rt ~ condition + (1|participant)", family = "lme")
model$set_cluster("participant", ICC = 0.30, n_clusters = 60)
result <- model$find_power(sample_size = 6000, target_test = "condition", verbose = FALSE)
Extra trials sharpen a within-participant predictor (one rotated trial by trial, estimated against the within-person residual) but do nothing for a between-participant one (group, age — constant within a person). Mark between-participant predictors as cluster-level so the engine uses the participant count as their effective sample size:
# `group` is between-participant; `condition` stays within-participant
model <- MCPower$new("rt ~ condition + group + (1|participant)", family = "lme")
model$set_cluster("participant", ICC = 0.30, n_clusters = 60,
cluster_level_vars = c("group"))
See Repeated measures for the within- vs between-participant power discussion.
Cluster-level predictors
If your primary predictor is assigned at the group level — a treatment applied to all students in a classroom, a policy rolled out site-wide — declare it in set_cluster so the engine uses n_clusters (not total N) as the effective sample size:
model$set_cluster("site", ICC = 0.15, n_clusters = 30,
cluster_level_vars = c("treatment"))
Halving n_clusters at fixed total N roughly doubles the SE of a cluster-level predictor while barely affecting within-cluster predictors. See Cluster-level predictors for the design-effect discussion.
Crossed and nested groupings
For crossed designs (e.g. participants × stimuli) or nested designs (classrooms within schools), name every grouping factor in the formula and call set_cluster once per factor:
# Crossed: 24 subjects × 12 items — formula "rt ~ frequency + (1|subject) + (1|item)"
model$set_cluster("subject", ICC = 0.20, n_clusters = 24)
model$set_cluster("item", ICC = 0.15, n_clusters = 12)
For crossed groupings, N must be a multiple of the design atom (n_subjects × n_items); for nested, call set_cluster once for the outer factor and once for the "school:classroom" child with n_per_parent. See Multiple groupings.
Random slopes
A random intercept lets each group start at a different baseline; a random slope lets the effect itself vary across groups. Declare it in the formula as (1 + x|group) and give it a variance in set_cluster — the extra variability widens the SE of the average effect, so treating a varying slope as fixed overstates power:
model <- MCPower$new("rt ~ dose + (1 + dose|subject)", family = "lme")
model$set_cluster("subject", ICC = 0.20, n_clusters = 30,
random_slopes = list(list(predictor = "dose", variance = 0.15)))
Random-slope fits hit variance boundaries more often than intercept-only ones, so watch the boundary-hit rate in summary(). See Random slopes.
Clustered logistic (GLMM)
A binary outcome with clustering is a logistic GLMM — combine family = "logit" with a (1|group) term, where the ICC is read on the log-odds (latent) scale. This composes with the cluster-level trick above: a treatment randomised at the clinic level in a binary-outcome trial is just family = "logit" plus cluster_level_vars:
model <- MCPower$new("recovered ~ treatment + baseline_severity + (1|clinic)", family = "logit")
model$set_baseline_probability(0.3)
model$set_cluster("clinic", ICC = 0.10, n_clusters = 40, cluster_level_vars = c("treatment"))
See Clustered logistic (GLMM).
Scenario stress-testing for mixed models
Mixed models support three scenario knobs: random_effect_dist (normal / heavy_tailed / right_skewed), random_effect_df, and icc_noise_sd. Use them to probe how sensitive power is to non-Gaussian random effects or an uncertain ICC:
model$set_scenario_configs(list(
re_stress = list(random_effect_dist = "heavy_tailed",
random_effect_df = 5, icc_noise_sd = 0.07)
))
model$find_power(sample_size = 600, target_test = "teaching_method",
scenarios = c("optimistic", "re_stress"))
See Mixed-model scenario knobs.
next → ANOVA & post-hoc