MCPower — Validating the MLE (Mixed-Effects) Solver
What this report shows
MCPower generates a clustered dataset from a formula, then fits it back to recover the fixed-effect coefficients. This report checks the fitting half: given one fixed dataset, does MCPower’s own mixed-model solver return the same numbers a trusted, independent R fit returns?
For each formula we take the exact bytes MCPower saved (the same design matrix, outcome, and cluster ids, byte-for-byte), fit them two ways —
- R (the reference):
lme4::lmerwith REML on the design matrix through the origin plus a random intercept per cluster — the textbook restricted-maximum- likelihood mixed-model fit. - MCPower: the engine’s own
load_data()path, which runs the same REML profile solver (Brent on the variance component) the power simulation uses.
— and compare the fitted fixed-effect coefficients, the test statistics,
and the decision threshold. The statistic is the Wald z on the
fixed effect (fixef / sqrt(diag(vcov))), not lmerTest’s
Satterthwaite t — MCPower’s own test definition is the Wald z. Because
both fit the identical bytes, sampling noise cancels: any difference is
a solver discrepancy.
The chosen cases sit at ICC 0.1–0.2 with 20–30 clusters, well clear of
the τ̂ ≈ 0 boundary. If a case ever collapsed to that boundary,
MCPower’s OLS-fallback fit would diverge from lmer’s REML fit and the
statistic band would FAIL vs R — surfacing the boundary indirectly.
None of the cases below are expected to hit it.
How the check works
- Provenance. Re-generate the dataset from the committed seed and confirm its content hash matches the saved file — proof the bytes haven’t drifted.
- B (R) vs C (MCPower) on those bytes: fixed-effect coefficients,
the Wald z statistics (β̂/se), and the critical value
qnorm(1 − α/2)≈ 1.96. - C vs A (a readable sanity overlay): MCPower’s recovered coefficients next to the formula’s true values.
The thresholds
| Quantity | Allowed difference | Why |
|---|---|---|
| Coefficient (β) | 10^{-4} relative | REML profile optimisation — agrees with lmer inside this band |
| Statistic (z) | 10^{-4} relative | derived from the same β and variance |
| Critical value | 10^{-8} absolute | engine’s own normal/χ² quantile vs R’s qnorm ≈1.6e-9 |
These are the B↔C gates from tolerances.R. MLE is a REML profile
optimisation (Brent on the variance component) — the loosest of the
three fits against the iterative estimate_rel_iter band, since two
independent optimisers (MCPower’s and lmer’s) need only agree to that
band, not to machine precision. The per-formula tables below show each
formula’s actual margin; a case sitting near the band edge is called out
in the summary.
| Formula | n | B↔C | Converged | Reproduces |
|---|---|---|---|---|
| y ~ x1 + (1|grp) | 600 | PASS | yes | yes |
| y ~ x1 + (1|grp) | 600 | PASS | yes | yes |
| y ~ x1 + x2 + (1|grp) | 750 | PASS | yes | yes |
| y ~ x1 + x2 + (1|grp) | 750 | PASS | yes | yes |
| y ~ x1*x2 + (1|grp) | 750 | PASS | yes | yes |
| y ~ x1*x2 + (1|grp) | 900 | PASS | yes | yes |
| y ~ x1 + g + (1|grp) | 750 | PASS | yes | yes |
| y ~ x1 + g + (1|grp) | 900 | PASS | yes | yes |
All MLE formulas pass: MCPower’s REML solver matches
lme4::lmerwell inside the iterative band on every saved dataset, and every file reproduces from its seed.
png 2 
Each point is one fitted quantity on one saved dataset; the red line is
the gate. Two independent REML optimisers (MCPower’s and lme4’s) agree
inside the iterative band on every quantity.
y = 0.5·x1 + per-grp random intercept (ICC 0.20) + noise
R formula y ~ x1 + (1|grp) · linear mixed model (random intercept) ·
n=600, seed=2137. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | 0.078383 | NA | NA | 3.0e-08 | — | PASS |
| x1 | 0.5 | 0.404979 | 10.34034 | 1.959964 | 8.2e-07 | 1.6e-09 | PASS |
y = 0.3·x1 + per-grp random intercept (ICC 0.30) + noise
R formula y ~ x1 + (1|grp) · linear mixed model (random intercept) ·
n=600, seed=2138. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | 0.131583 | NA | NA | 3.6e-09 | — | PASS |
| x1 | 0.3 | 0.347749 | 8.595397 | 1.959964 | 4.3e-07 | 1.6e-09 | PASS |
y = 0.5·x1 + 0.3·x2 + per-grp random intercept (ICC 0.20) + noise
R formula y ~ x1 + x2 + (1|grp) · linear mixed model (random
intercept) · n=750, seed=2137. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | -0.007542 | NA | NA | 5.6e-07 | — | PASS |
| x1 | 0.5 | 0.440708 | 12.927637 | 1.959964 | 7.6e-07 | 1.6e-09 | PASS |
| x2 | 0.3 | 0.341816 | 9.183627 | 1.959964 | 9.0e-07 | 1.6e-09 | PASS |
y = 0.3·x1 + 0.5·x2 + per-grp random intercept (ICC 0.30) + noise
R formula y ~ x1 + x2 + (1|grp) · linear mixed model (random
intercept) · n=750, seed=2138. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | 0.102675 | NA | NA | 1.9e-09 | — | PASS |
| x1 | 0.3 | 0.357704 | 9.980624 | 1.959964 | 1.1e-07 | 1.6e-09 | PASS |
| x2 | 0.5 | 0.497794 | 13.989819 | 1.959964 | 9.8e-08 | 1.6e-09 | PASS |
y = 0.5·x1 + 0.3·x2 + 0.3·x1:x2 + per-grp random intercept (ICC 0.20) + noise
R formula y ~ x1*x2 + (1|grp) · linear mixed model (random intercept)
· n=750, seed=2137. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | -0.007002 | NA | NA | 3.6e-07 | — | PASS |
| x1 | 0.5 | 0.447848 | 12.953272 | 1.959964 | 8.3e-07 | 1.6e-09 | PASS |
| x2 | 0.3 | 0.341794 | 9.187226 | 1.959964 | 9.3e-07 | 1.6e-09 | PASS |
| x1:x2 | 0.3 | 0.254123 | 6.738949 | 1.959964 | 5.7e-07 | 1.6e-09 | PASS |
y = 0.4·x1 + 0.3·x2 + 0.2·x1:x2 + per-grp random intercept (ICC 0.30) + noise
R formula y ~ x1*x2 + (1|grp) · linear mixed model (random intercept)
· n=900, seed=2138. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | 0.202794 | NA | NA | 7.4e-10 | — | PASS |
| x1 | 0.4 | 0.428858 | 12.586038 | 1.959964 | 8.1e-08 | 1.6e-09 | PASS |
| x2 | 0.3 | 0.289511 | 8.587036 | 1.959964 | 1.6e-07 | 1.6e-09 | PASS |
| x1:x2 | 0.2 | 0.251104 | 7.673815 | 1.959964 | 1.7e-07 | 1.6e-09 | PASS |
y = 0.3·x1 + 0.3·g[2] + per-grp random intercept (ICC 0.20) + noise
R formula y ~ x1 + g + (1|grp) · linear mixed model (random intercept)
· n=750, seed=2137. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | -0.042469 | NA | NA | 6.6e-08 | — | PASS |
| x1 | 0.3 | 0.241373 | 7.079191 | 1.959964 | 8.8e-07 | 1.6e-09 | PASS |
| g[2] | 0.3 | 0.375982 | 5.275995 | 1.959964 | 6.6e-07 | 1.6e-09 | PASS |
y = 0.4·x1 + 0.5·g[2] + 0.8·g[3] + per-grp random intercept (ICC 0.30) + noise
R formula y ~ x1 + g + (1|grp) · linear mixed model (random intercept)
· n=900, seed=2138. File reproduces from seed: yes.
| Term | True (formula) | β (R = MCPower) | t / z | crit | rel. diff (β, stat) | crit abs. diff | Verdict |
|---|---|---|---|---|---|---|---|
| intercept | 0.0 | 0.123802 | NA | NA | 6.8e-09 | — | PASS |
| x1 | 0.4 | 0.435261 | 12.890829 | 1.959964 | 5.2e-08 | 1.6e-09 | PASS |
| g[2] | 0.5 | 0.612951 | 2.210123 | 1.959964 | 3.2e-06 | 1.6e-09 | PASS |
| g[3] | 0.8 | 1.000041 | 3.147536 | 1.959964 | 3.2e-06 | 1.6e-09 | PASS |
| item | value |
|---|---|
| generated | 2026-06-21 |
| R | R version 4.5.3 (2026-03-11) |
| mcpower | 1.0.0 |
| MLE solving bands (beta/stat rel · crit abs) | 0.0001 rel / 0.0001 rel / 1e-08 abs |
M2 — crossed & nested groupings (general lmm path)
A↔B: lme4 recovers the spec’s β and variance components from MCPower’s
own draws (the spec is the oracle; K independent draws, Monte-Carlo
bands). B↔C: MCPower and lme4 fit the SAME saved bytes; β̂ at
estimate_rel_iter with the optimizer-vs-optimizer abs floors (stat
1e-4 / β̂ 1e-5), variance components at vc_rel/vc_abs. Wald z on both
sides — never lmerTest t.
lmm_crossed_a: PASS (25/25 lme4-clean draws)
lmm_crossed_b: PASS (24/25 lme4-clean draws)
lmm_nested_a: PASS (25/25 lme4-clean draws)
lmm_nested_b: PASS (25/25 lme4-clean draws)
lmm_crossed_nested_a: PASS (25/25 lme4-clean draws)
lmm_crossed_a: A<->B PASS (K=200)
lmm_crossed_b: A<->B PASS (K=200)
lmm_nested_a: A<->B PASS (K=200)
lmm_nested_b: A<->B PASS (K=200)
lmm_crossed_nested_a: A<->B PASS (K=200)
Reproduce: Rscript mcpower/validation/data_generation.r then
rmarkdown::render("mcpower/validation/validation_MLE_solving.rmd", output_dir = "mcpower/web/documentation/validation").
M3 — random slopes (general lmm path: standalone, multi-slope, composition)
B↔C: MCPower and lme4 fit the SAME saved bytes; β̂ at estimate_rel_iter
with the optimizer-vs-optimizer abs floors (stat 1e-4 / β̂ 1e-5); every
RE variance at slope_var_rel/vc_abs; the full RE correlation vector
(re_corr) at slope_corr_abs. A↔B: lme4 recovers the spec’s β, the
per-component variances, and the set correlations from MCPower’s own
draws over K draws (the spec is the oracle). Per-component boundary: a
τ→0 component must drive its boundary rate up while the others stay low.
Wald z on both sides — never lmerTest t.
lmm_slope_a: PASS (25/25 lme4-clean draws)
lmm_slope_b: PASS (25/25 lme4-clean draws)
lmm_multislope: PASS (25/25 lme4-clean draws)
lmm_slope_crossed: PASS (24/25 lme4-clean draws)
lmm_slope_a: A↔B PASS (K=300)
lmm_slope_b: A↔B PASS (K=300)
lmm_multislope: A↔B PASS (K=300)
lmm_slope_crossed: A↔B PASS (K=300)
## Power Analysis — MLE N=480 sims=800 α=0.05 target=80%
## formula: y ~ x1 + (1|grp)
##
## ───────────────────────────────────
## Test Power Target
## ───────────────────────────────────
## x1 100% 80%
## ───────────────────────────────────
## Power Analysis — MLE N=480 sims=800 α=0.05 target=80%
## formula: y ~ x1 + (1|grp)
##
## ───────────────────────────────────
## Test Power Target
## ───────────────────────────────────
## x1 95.0% 80%
## ───────────────────────────────────
## Slope τ²=0: singular_fit_rate = 0.626 (expect high)
## Slope τ²=0.10: singular_fit_rate = 0.010 (expect low)
M4 — clustered logistic GLMM (intercept, slope, multi-slope, composition, Laplace-bias)
B↔C: MCPower-GLMM and lme4::glmer(family = binomial) fit the SAME
generated bytes; β̂ at GLMM_TOL$beta_rel / beta_abs_floor; RE
variances at GLMM_TOL$tau_rel; RE correlations at GLMM_TOL$corr_abs.
Wald z (β̂/se) on both sides. Draws where glmer reports convergence
warnings or a singular fit are skipped (same policy as M3: two
optimizers on the same bytes need not share the optimum at a boundary).
B↔C is a documented XFAIL, not a hard gate. MCPower-GLMM diverges from
glmer’s default summary on two known, deferred axes (full rationale in the workspace docdocs/ideas-features/mixed-model-significance-reference.md): (z) MCPower’s Wald SE is the RX/PLS Schur (= glmer vcov(use.hessian = FALSE)), ~3% anticonservative vs glmer’s defaultuse.hessian = TRUE— a deliberate, documented, deferred choice (that doc’s §“Second axis”), not a regression; (β̂/τ̂²/ρ̂) MCPower-BOBYQA vs glmer-Laplace agree on the same bytes only to ~2–3× the LMM-inheritedGLMM_TOLbands, plus glmer’s few-cluster convergence noise. The cell logs the measured gaps and renders green, tripping only a generous gross-regression backstop; the engine’s own brute-force-Laplace unit tests are the fine fit-regression guard.
A↔B: glmer recovers the spec’s β and variance components from
MCPower’s draws over K draws (the spec is the oracle; Monte-Carlo
bands).
Section 8.4 asserts τ²→0 collapse: using a near-boundary config
(ICC=0.05, 10 clusters × 5 obs) where the non-negative variance
estimator pins at τ̂²=0 in ~55–60% of draws, MCPower-GLMM must agree with
plain glm() on those boundary-pinned draws within
GLMM_TOL$collapse_beta / collapse_stat. Non-pinned draws are
excluded from the tight gate (a fitted RE legitimately makes the GLMM
diverge from glm).
Section 8.5 is the Laplace-bias cell: MCPower-GLMM (Laplace) vs
glmer(nAGQ = 7); the gap must stay within
GLMM_TOL$laplace_bias_beta_abs.
glmm_intercept: XFAIL — 60 z (SE convention, 2.6% max) + 13 β̂/τ̂²/ρ̂ (optimizer band: β̂ 1.7e-03, τ̂² 2.8e-03, ρ̂ 0.0e+00) (60/60 glmer-clean draws)
glmm_slope: XFAIL — 40 z (SE convention, 8.6% max) + 91 β̂/τ̂²/ρ̂ (optimizer band: β̂ 2.9e-03, τ̂² 4.0e-03, ρ̂ 5.1e-03) (40/60 glmer-clean draws)
glmm_multislope: XFAIL — 30 z (SE convention, 7.8% max) + 62 β̂/τ̂²/ρ̂ (optimizer band: β̂ 2.6e-03, τ̂² 2.9e-03, ρ̂ 6.1e-03) (15/60 glmer-clean draws)
glmm_slope_crossed: XFAIL — 37 z (SE convention, 9.5% max) + 127 β̂/τ̂²/ρ̂ (optimizer band: β̂ 3.6e-03, τ̂² 6.6e-03, ρ̂ 2.6e-02) (37/60 glmer-clean draws)
glmm_intercept: A↔B PASS (K=200)
glmm_slope: A↔B PASS (K=200)
glmm_multislope: A↔B PASS (K=200)
glmm_slope_crossed: A↔B PASS (K=200)
## **τ²→0 collapse (ICC=0.05, 10×5)**: PASS — 14/25 draws pinned; worst |Δβ̂| 7.92e-08, |Δz| 3.39e-04 (25 draws, τ̂² mean 0.309)
## **Laplace-bias cell**: max |Δβ̂_x1| = 0.0136 (limit 0.0500), mean = 0.0067 — within limit