What this report shows

MCPower generates a clustered dataset from a formula, then fits it back to recover the fixed-effect coefficients. This report checks the fitting half: given one fixed dataset, does MCPower’s own mixed-model solver return the same numbers a trusted, independent R fit returns?

For each formula we take the exact bytes MCPower saved (the same design matrix, outcome, and cluster ids, byte-for-byte), fit them two ways —

R (the reference): lme4::lmer with REML on the design matrix through the origin plus a random intercept per cluster — the textbook restricted-maximum- likelihood mixed-model fit.
MCPower: the engine’s own load_data() path, which runs the same REML profile solver (Brent on the variance component) the power simulation uses.

— and compare the fitted fixed-effect coefficients, the test statistics, and the decision threshold. The statistic is the Wald z on the fixed effect (fixef / sqrt(diag(vcov))), not lmerTest’s Satterthwaite t — MCPower’s own test definition is the Wald z. Because both fit the identical bytes, sampling noise cancels: any difference is a solver discrepancy.

The chosen cases sit at ICC 0.1–0.2 with 20–30 clusters, well clear of the τ̂ ≈ 0 boundary. If a case ever collapsed to that boundary, MCPower’s OLS-fallback fit would diverge from lmer’s REML fit and the statistic band would FAIL vs R — surfacing the boundary indirectly. None of the cases below are expected to hit it.

How the check works

Provenance. Re-generate the dataset from the committed seed and confirm its content hash matches the saved file — proof the bytes haven’t drifted.
B (R) vs C (MCPower) on those bytes: fixed-effect coefficients, the Wald z statistics (β̂/se), and the critical value qnorm(1 − α/2) ≈ 1.96.
C vs A (a readable sanity overlay): MCPower’s recovered coefficients next to the formula’s true values.

The thresholds

Quantity	Allowed difference	Why
Coefficient (β)	10^{-4} relative	REML profile optimisation — agrees with `lmer` inside this band
Statistic (z)	10^{-4} relative	derived from the same β and variance
Critical value	10^{-8} absolute	engine’s own normal/χ² quantile vs R’s `qnorm` ≈1.6e-9

These are the B↔C gates from tolerances.R. MLE is a REML profile optimisation (Brent on the variance component) — the loosest of the three fits against the iterative estimate_rel_iter band, since two independent optimisers (MCPower’s and lmer’s) need only agree to that band, not to machine precision. The per-formula tables below show each formula’s actual margin; a case sitting near the band edge is called out in the summary.

Formula	n	B↔C	Converged	Reproduces
y ~ x1 + (1\|grp)	600	PASS	yes	yes
y ~ x1 + (1\|grp)	600	PASS	yes	yes
y ~ x1 + x2 + (1\|grp)	750	PASS	yes	yes
y ~ x1 + x2 + (1\|grp)	750	PASS	yes	yes
y ~ x1*x2 + (1\|grp)	750	PASS	yes	yes
y ~ x1*x2 + (1\|grp)	900	PASS	yes	yes
y ~ x1 + g + (1\|grp)	750	PASS	yes	yes
y ~ x1 + g + (1\|grp)	900	PASS	yes	yes

All MLE formulas pass: MCPower’s REML solver matches lme4::lmer well inside the iterative band on every saved dataset, and every file reproduces from its seed.

png 2 B↔C relative agreement of every coefficient and statistic across
the base MLE formulas, against the tolerance
gate.

Each point is one fitted quantity on one saved dataset; the red line is the gate. Two independent REML optimisers (MCPower’s and lme4’s) agree inside the iterative band on every quantity.

y = 0.5·x1 + per-grp random intercept (ICC 0.20) + noise

R formula y ~ x1 + (1|grp) · linear mixed model (random intercept) · n=600, seed=2137. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	0.078383	NA	NA	3.0e-08	—	PASS
x1	0.5	0.404979	10.34034	1.959964	8.2e-07	1.6e-09	PASS

y = 0.3·x1 + per-grp random intercept (ICC 0.30) + noise

R formula y ~ x1 + (1|grp) · linear mixed model (random intercept) · n=600, seed=2138. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	0.131583	NA	NA	3.6e-09	—	PASS
x1	0.3	0.347749	8.595397	1.959964	4.3e-07	1.6e-09	PASS

y = 0.5·x1 + 0.3·x2 + per-grp random intercept (ICC 0.20) + noise

R formula y ~ x1 + x2 + (1|grp) · linear mixed model (random intercept) · n=750, seed=2137. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	-0.007542	NA	NA	5.6e-07	—	PASS
x1	0.5	0.440708	12.927637	1.959964	7.6e-07	1.6e-09	PASS
x2	0.3	0.341816	9.183627	1.959964	9.0e-07	1.6e-09	PASS

y = 0.3·x1 + 0.5·x2 + per-grp random intercept (ICC 0.30) + noise

R formula y ~ x1 + x2 + (1|grp) · linear mixed model (random intercept) · n=750, seed=2138. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	0.102675	NA	NA	1.9e-09	—	PASS
x1	0.3	0.357704	9.980624	1.959964	1.1e-07	1.6e-09	PASS
x2	0.5	0.497794	13.989819	1.959964	9.8e-08	1.6e-09	PASS

y = 0.5·x1 + 0.3·x2 + 0.3·x1:x2 + per-grp random intercept (ICC 0.20) + noise

R formula y ~ x1*x2 + (1|grp) · linear mixed model (random intercept) · n=750, seed=2137. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	-0.007002	NA	NA	3.6e-07	—	PASS
x1	0.5	0.447848	12.953272	1.959964	8.3e-07	1.6e-09	PASS
x2	0.3	0.341794	9.187226	1.959964	9.3e-07	1.6e-09	PASS
x1:x2	0.3	0.254123	6.738949	1.959964	5.7e-07	1.6e-09	PASS

y = 0.4·x1 + 0.3·x2 + 0.2·x1:x2 + per-grp random intercept (ICC 0.30) + noise

R formula y ~ x1*x2 + (1|grp) · linear mixed model (random intercept) · n=900, seed=2138. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	0.202794	NA	NA	7.4e-10	—	PASS
x1	0.4	0.428858	12.586038	1.959964	8.1e-08	1.6e-09	PASS
x2	0.3	0.289511	8.587036	1.959964	1.6e-07	1.6e-09	PASS
x1:x2	0.2	0.251104	7.673815	1.959964	1.7e-07	1.6e-09	PASS

y = 0.3·x1 + 0.3·g[2] + per-grp random intercept (ICC 0.20) + noise

R formula y ~ x1 + g + (1|grp) · linear mixed model (random intercept) · n=750, seed=2137. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	-0.042469	NA	NA	6.6e-08	—	PASS
x1	0.3	0.241373	7.079191	1.959964	8.8e-07	1.6e-09	PASS
g[2]	0.3	0.375982	5.275995	1.959964	6.6e-07	1.6e-09	PASS

y = 0.4·x1 + 0.5·g[2] + 0.8·g[3] + per-grp random intercept (ICC 0.30) + noise

R formula y ~ x1 + g + (1|grp) · linear mixed model (random intercept) · n=900, seed=2138. File reproduces from seed: yes.

Term	True (formula)	β (R = MCPower)	t / z	crit	rel. diff (β, stat)	crit abs. diff	Verdict
intercept	0.0	0.123802	NA	NA	6.8e-09	—	PASS
x1	0.4	0.435261	12.890829	1.959964	5.2e-08	1.6e-09	PASS
g[2]	0.5	0.612951	2.210123	1.959964	3.2e-06	1.6e-09	PASS
g[3]	0.8	1.000041	3.147536	1.959964	3.2e-06	1.6e-09	PASS

item	value
generated	2026-06-21
R	R version 4.5.3 (2026-03-11)
mcpower	1.0.0
MLE solving bands (beta/stat rel · crit abs)	0.0001 rel / 0.0001 rel / 1e-08 abs

M2 — crossed & nested groupings (general lmm path)

A↔B: lme4 recovers the spec’s β and variance components from MCPower’s own draws (the spec is the oracle; K independent draws, Monte-Carlo bands). B↔C: MCPower and lme4 fit the SAME saved bytes; β̂ at estimate_rel_iter with the optimizer-vs-optimizer abs floors (stat 1e-4 / β̂ 1e-5), variance components at vc_rel/vc_abs. Wald z on both sides — never lmerTest t.

lmm_crossed_a: PASS (25/25 lme4-clean draws)

lmm_crossed_b: PASS (24/25 lme4-clean draws)

lmm_nested_a: PASS (25/25 lme4-clean draws)

lmm_nested_b: PASS (25/25 lme4-clean draws)

lmm_crossed_nested_a: PASS (25/25 lme4-clean draws)

lmm_crossed_a: A<->B PASS (K=200)

lmm_crossed_b: A<->B PASS (K=200)

lmm_nested_a: A<->B PASS (K=200)

lmm_nested_b: A<->B PASS (K=200)

lmm_crossed_nested_a: A<->B PASS (K=200)

Reproduce: Rscript mcpower/validation/data_generation.r then rmarkdown::render("mcpower/validation/validation_MLE_solving.rmd", output_dir = "mcpower/web/documentation/validation").

M3 — random slopes (general lmm path: standalone, multi-slope, composition)

B↔C: MCPower and lme4 fit the SAME saved bytes; β̂ at estimate_rel_iter with the optimizer-vs-optimizer abs floors (stat 1e-4 / β̂ 1e-5); every RE variance at slope_var_rel/vc_abs; the full RE correlation vector (re_corr) at slope_corr_abs. A↔B: lme4 recovers the spec’s β, the per-component variances, and the set correlations from MCPower’s own draws over K draws (the spec is the oracle). Per-component boundary: a τ→0 component must drive its boundary rate up while the others stay low. Wald z on both sides — never lmerTest t.

lmm_slope_a: PASS (25/25 lme4-clean draws)

lmm_slope_b: PASS (25/25 lme4-clean draws)

lmm_multislope: PASS (25/25 lme4-clean draws)

lmm_slope_crossed: PASS (24/25 lme4-clean draws)

lmm_slope_a: A↔B PASS (K=300)

lmm_slope_b: A↔B PASS (K=300)

lmm_multislope: A↔B PASS (K=300)

lmm_slope_crossed: A↔B PASS (K=300)

## Power Analysis — MLE  N=480  sims=800  α=0.05  target=80%
## formula: y ~ x1 + (1|grp)
## 
## ───────────────────────────────────
## Test                 Power   Target
## ───────────────────────────────────
## x1                    100%      80%
## ───────────────────────────────────

## Power Analysis — MLE  N=480  sims=800  α=0.05  target=80%
## formula: y ~ x1 + (1|grp)
## 
## ───────────────────────────────────
## Test                 Power   Target
## ───────────────────────────────────
## x1                   95.0%      80%
## ───────────────────────────────────

## Slope τ²=0: singular_fit_rate = 0.626 (expect high)

## Slope τ²=0.10: singular_fit_rate = 0.010 (expect low)

M4 — clustered logistic GLMM (intercept, slope, multi-slope, composition, Laplace-bias)

B↔C: MCPower-GLMM and lme4::glmer(family = binomial) fit the SAME generated bytes; β̂ at GLMM_TOL$beta_rel / beta_abs_floor; RE variances at GLMM_TOL$tau_rel; RE correlations at GLMM_TOL$corr_abs. Wald z (β̂/se) on both sides. Draws where glmer reports convergence warnings or a singular fit are skipped (same policy as M3: two optimizers on the same bytes need not share the optimum at a boundary).

B↔C is a documented XFAIL, not a hard gate. MCPower-GLMM diverges from glmer’s default summary on two known, deferred axes (full rationale in the workspace doc docs/ideas-features/mixed-model-significance-reference.md): (z) MCPower’s Wald SE is the RX/PLS Schur (= glmer vcov(use.hessian = FALSE)), ~3% anticonservative vs glmer’s default use.hessian = TRUE — a deliberate, documented, deferred choice (that doc’s §“Second axis”), not a regression; (β̂/τ̂²/ρ̂) MCPower-BOBYQA vs glmer-Laplace agree on the same bytes only to ~2–3× the LMM-inherited GLMM_TOL bands, plus glmer’s few-cluster convergence noise. The cell logs the measured gaps and renders green, tripping only a generous gross-regression backstop; the engine’s own brute-force-Laplace unit tests are the fine fit-regression guard.

A↔B: glmer recovers the spec’s β and variance components from MCPower’s draws over K draws (the spec is the oracle; Monte-Carlo bands).

Section 8.4 asserts τ²→0 collapse: using a near-boundary config (ICC=0.05, 10 clusters × 5 obs) where the non-negative variance estimator pins at τ̂²=0 in ~55–60% of draws, MCPower-GLMM must agree with plain glm() on those boundary-pinned draws within GLMM_TOL$collapse_beta / collapse_stat. Non-pinned draws are excluded from the tight gate (a fitted RE legitimately makes the GLMM diverge from glm).

Section 8.5 is the Laplace-bias cell: MCPower-GLMM (Laplace) vs glmer(nAGQ = 7); the gap must stay within GLMM_TOL$laplace_bias_beta_abs.

glmm_intercept: XFAIL — 60 z (SE convention, 2.6% max) + 13 β̂/τ̂²/ρ̂ (optimizer band: β̂ 1.7e-03, τ̂² 2.8e-03, ρ̂ 0.0e+00) (60/60 glmer-clean draws)

glmm_slope: XFAIL — 40 z (SE convention, 8.6% max) + 91 β̂/τ̂²/ρ̂ (optimizer band: β̂ 2.9e-03, τ̂² 4.0e-03, ρ̂ 5.1e-03) (40/60 glmer-clean draws)

glmm_multislope: XFAIL — 30 z (SE convention, 7.8% max) + 62 β̂/τ̂²/ρ̂ (optimizer band: β̂ 2.6e-03, τ̂² 2.9e-03, ρ̂ 6.1e-03) (15/60 glmer-clean draws)

glmm_slope_crossed: XFAIL — 37 z (SE convention, 9.5% max) + 127 β̂/τ̂²/ρ̂ (optimizer band: β̂ 3.6e-03, τ̂² 6.6e-03, ρ̂ 2.6e-02) (37/60 glmer-clean draws)

glmm_intercept: A↔B PASS (K=200)

glmm_slope: A↔B PASS (K=200)

glmm_multislope: A↔B PASS (K=200)

glmm_slope_crossed: A↔B PASS (K=200)

## **τ²→0 collapse (ICC=0.05, 10×5)**: PASS — 14/25 draws pinned; worst |Δβ̂| 7.92e-08, |Δz| 3.39e-04 (25 draws, τ̂² mean 0.309)

## **Laplace-bias cell**: max |Δβ̂_x1| = 0.0136 (limit 0.0500), mean = 0.0067 — within limit