What this report validates

get_effects_from_data recovers standardized effect sizes from uploaded pilot data by fitting the model the data came from. This report is the acceptance gate for that recovery convention: for each estimator we specify an effect of size s, let MCPower simulate a dataset from it, then recover the effect from that dataset — and check the recovery lands where the convention says it should.

The estimator is dispatched off the model family, and the recovery scale differs by estimator:

OLS (continuous outcome) z-scores the outcome, so it recovers the standardized regression coefficient — s / sqrt(Σs² + 1) for independent continuous predictors with the engine’s default N(0, 1) residual. The shrinkage is the residual variance the standardization divides through; it is small for the small/medium effects benchmarked here and is the same approximation the shipped OLS recovery has always carried.
GLM (logistic) and MLE (mixed) fit the native outcome (raw 0/1 for logit, the raw response for mixed), so they recover the coefficient directly: expected recovery is s itself. MLE recovery is fixed-effects-only and reads the grouping column from the uploaded data.

Scope: continuous main effects only — the settled convention. Factor-dummy and interaction scaling under non-continuous outcomes is a separate, still-open item and is intentionally not gated here.

How the check works

Simulate. Generate a dataset from each case’s data-generating process at n = 4000 (LME scales the cluster count to match), via the engine’s own create_data() — the same DGP the power simulation uses.
Recover. Rebuild the raw predictor frame (plus the outcome, plus the grouping column for LME), upload_data() it, and call get_effects_from_data().
Average. Repeat over 20 draws (seeds 2137…2156) and take the mean — single-draw noise is ~1/sqrt(n), so averaging isolates the convention from sampling scatter (the A↔B harness pattern).
Gate. The mean recovered effect must land within 0.02 (absolute) of the convention-predicted value.

The threshold

Quantity	Allowed difference	Why
Mean recovered − expected	0.02 absolute	K-draw mean vs the convention-predicted value; worst measured margin 0.007 → ~2.8× headroom

This is the GETEFFECTS_TOL gate from tolerances.R. A wrong estimator, a sign flip, a missing/inverted standardization, or a 2× scale would move the mean by 0.1–0.5 — far outside this band.

Case	Estimator	Formula	n	K	Verdict
ols_simple_a	OLS (continuous)	y ~ x1	4000	20	PASS
ols_two_a	OLS (continuous)	y ~ x1 + x2	4000	20	PASS
ols_corr_a	OLS (continuous)	y ~ x1 + x2	4000	20	PASS
glm_simple_a	GLM (logistic)	y ~ x1	4000	20	PASS
glm_two_b	GLM (logistic)	y ~ x1 + x2	4000	20	PASS
lme_simple_a	MLE (mixed)	y ~ x1 + (1\|grp)	3990	20	PASS
lme_two_a	MLE (mixed)	y ~ x1 + x2 + (1\|grp)	3990	20	PASS

All estimators round-trip: every specified continuous main effect is recovered to within the gate of the convention-predicted value — OLS at the shrunk standardized scale, GLM and MLE at the native coefficient.

png 2 Mean recovered effect vs the convention-predicted value for
every term, coloured by estimator.

Solid line = exact recovery (identity); dotted lines = the ±0.02 absolute gate. Every term lands on the identity line, well inside the band.

ols_simple_a · OLS (continuous)

R formula y ~ x1 · n=4000 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.25	0.24254	0.24393	0.0014	PASS

ols_two_a · OLS (continuous)

R formula y ~ x1 + x2 · n=4000 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.25	0.24140	0.24286	0.00145	PASS
x2	0.10	0.09656	0.09490	0.00166	PASS

ols_corr_a · OLS (continuous)

R formula y ~ x1 + x2 · n=4000 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.25	0.23864	0.24102	0.00238	PASS
x2	0.10	0.09545	0.09343	0.00202	PASS

glm_simple_a · GLM (logistic)

R formula y ~ x1 · n=4000 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.5	0.5	0.48882	0.01118	PASS

glm_two_b · GLM (logistic)

R formula y ~ x1 + x2 · n=4000 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.8	0.8	0.79534	0.00467	PASS
x2	0.5	0.5	0.50462	0.00462	PASS

lme_simple_a · MLE (mixed)

R formula y ~ x1 + (1|grp) · n=3990 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.5	0.5	0.50154	0.00154	PASS

lme_two_a · MLE (mixed)

R formula y ~ x1 + x2 + (1|grp) · n=3990 · K=20 draws (seeds 2137–2156).

Term	Specified (s)	Expected	Mean recovered	\|err\|	Verdict
x1	0.5	0.5	0.50152	0.00152	PASS
x2	0.3	0.3	0.29892	0.00107	PASS

item	value
generated	2026-06-21
R	R version 4.5.3 (2026-03-11)
mcpower	1.0.0
round-trip gate (mean abs)	0.02 abs
n / K / seed0	4000 / 20 / 2137

Reproduce: rmarkdown::render("mcpower/validation/validation_get_effects.rmd", output_dir = "mcpower/web/documentation/validation").