MCPower Validation — How Correctness Is Proved

How do you know a simulation-based power tool is correct? You can't check it against a textbook formula — the whole point of MCPower is to handle designs that have no clean closed form. So MCPower proves correctness a different way: by splitting the pipeline in two and verifying each half against a trusted reference.

There are two levels to this. The quick, intuitive check is to run the same design through other established simulation tools (the dedicated R packages simglm and simr, plus a plain hand-written loop) and confirm that MCPower lands on the same power; see the Agreement with other tools report. It does. But that comparison is itself Monte Carlo: both sides carry their own sampling noise, so it can only show that the answers are close, never identical. The disciplined checks below go further. By fixing the dataset they remove the sampling noise entirely, then verify that MCPower's numbers are near-identical to what the reference solvers (R's lm and glm, and lme4) compute on that exact data, and that the generated data reproduces the requested specification exactly.
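To make the sampling-noise point concrete, here is a minimal Python sketch; it is not part of MCPower. It treats a simulated power estimate as a binomial proportion, so its Monte Carlo standard error is sqrt(p(1 - p) / n_sims). The function names and the three-standard-error tolerance are illustrative assumptions.

```python
import math


def mc_standard_error(power: float, n_sims: int) -> float:
    """Binomial standard error of a power estimate based on n_sims simulations."""
    return math.sqrt(power * (1.0 - power) / n_sims)


def compatible(power_a: float, power_b: float,
               n_sims_a: int, n_sims_b: int, z: float = 3.0) -> bool:
    """True if two independent power estimates differ by at most z combined
    standard errors (z = 3.0 is an arbitrary illustrative tolerance)."""
    combined_se = math.sqrt(
        mc_standard_error(power_a, n_sims_a) ** 2
        + mc_standard_error(power_b, n_sims_b) ** 2
    )
    return abs(power_a - power_b) <= z * combined_se
```

For example, a power of 0.80 estimated from 1,600 simulations has a standard error of 0.01, so two correct tools can still differ by a couple of percentage points on the same design.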

What "right" means

A simulation power tool is correct if both halves of its pipeline are correct:

The data generator is honest. The synthetic datasets it builds really embody the design you requested — the right effect sizes, distributions, factor proportions, and clustering.
The solver is faithful. Given a fixed dataset, it computes the same coefficients, standard errors, test statistics, and thresholds that a trusted reference — standard R — computes on that exact same dataset.

If both hold, the full generate → fit → count pipeline is correct by composition. There is no need for an end-to-end "true power" oracle, because each link in the chain has been checked against something we already trust.
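The generate → fit → count pipeline can be sketched in a few lines of plain Python. This is a hand-written illustration, not MCPower's implementation. It assumes a single standard-normal predictor, unit-variance normal errors, and a normal approximation to the t critical value (reasonable only for moderately large n). The name `simulate_power` is hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_power(beta: float, n: int, n_sims: int,
                   alpha: float = 0.05, seed: int = 0) -> float:
    """Estimate power to detect slope `beta` in y = beta * x + noise."""
    rng = random.Random(seed)
    # Assumption: normal critical value used in place of the exact t quantile.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(n_sims):
        # Generate: one synthetic dataset that embodies the requested effect.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
        # Fit: simple linear regression by least squares.
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = my - slope * mx
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
        se = math.sqrt(rss / (n - 2) / sxx)
        # Count: did this dataset reject the null hypothesis of zero slope?
        if abs(slope / se) > z_crit:
            hits += 1
    return hits / n_sims
```

The generator half is the two lines that build `x` and `y`, and the solver half is everything from the means to the test statistic. Validating those two halves separately is what makes the final count trustworthy.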

How each half is checked

Data generation. We generate data from a formula with known true coefficients, then analyse it in R and confirm that the recovered moments, effect sizes, and intraclass correlations match what was requested. Here the specification is the oracle — the data must reproduce the design that defined it.
Solving. We fit the exact same dataset twice: once with standard R (lm, glm, and lme4::lmer with REML) and once with MCPower's engine. The coefficients, standard errors, test statistics, and thresholds must agree closely. Because both solvers see identical data, there is no sampling noise to hide behind — any disagreement is a real difference in the solver, not luck of the draw.
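Both checks can be illustrated with a small pure-Python stand-in. The sketch below is not MCPower's engine and does not call R. It generates one dataset from known coefficients, then fits that same dataset with two independently written least-squares routines. All function names are hypothetical, and the model (one predictor, unit-variance normal errors) is an assumption chosen for brevity.

```python
import math
import random


def make_dataset(n: int, intercept: float, slope: float, seed: int):
    """Generate (x, y) from y = intercept + slope * x + standard-normal noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


def fit_centered(x, y):
    """Least squares via centered sums of squares and cross-products."""
    n = len(x)
    mx, my = math.fsum(x) / n, math.fsum(y) / n
    sxx = math.fsum((xi - mx) ** 2 for xi in x)
    sxy = math.fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_normal_equations(x, y):
    """Least squares by solving the 2x2 normal equations with Cramer's rule."""
    n = len(x)
    sx, sy = math.fsum(x), math.fsum(y)
    sxx = math.fsum(xi * xi for xi in x)
    sxy = math.fsum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    intercept = (sy * sxx - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


x, y = make_dataset(n=20000, intercept=1.0, slope=0.3, seed=42)
a1, b1 = fit_centered(x, y)
a2, b2 = fit_normal_equations(x, y)

# Generator check: the specification is the oracle (loose statistical tolerance).
generator_ok = abs(a1 - 1.0) < 0.05 and abs(b1 - 0.3) < 0.05
# Solver check: identical data, so only floating-point error is allowed.
solvers_agree = abs(a1 - a2) < 1e-9 and abs(b1 - b2) < 1e-9
```

Note the two very different tolerances. The generator check compares an estimate against the requested value and must allow for sampling error, while the solver check compares two fits of identical data and can demand agreement to many decimal places.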

Coverage

The checks span the families MCPower supports — single and multiple continuous predictors, interactions, factors, logistic regression (GLM), and mixed / multilevel models (LME) — at both small and larger effect sizes, so the evidence covers the range of designs you're likely to analyse.

The reports

Each report below is a detailed, formula-by-formula evidence document. Open the one that matches what you want to confirm:

Report	What it shows
Data generation	the data generator — generated data matches the requested design
OLS solving	the OLS solver against R's `lm` on identical data
GLM solving	the logistic / GLM solver against R's `glm`
LME solving	the mixed-effects (LME) solver against `lme4::lmer` (REML)
Effect recovery	end-to-end round-trip: specify a design → simulate it → recover the effects
Required N	the model-based required-N estimate — the default search grid against a dense 100,000-simulation ground truth
Scenario perturbations	heterogeneity, heteroskedasticity, correlation noise, and distribution swaps — each knob validated in isolation and in combination
Uploaded data	the upload path — data simulated from a user-provided frame reproduces its moments and correlations across draws
Agreement with other tools	an informal cross-check — MCPower's power lands on the same answer as the dedicated R tools (`simglm`, `simr`) and an independent DIY loop

How to read them

These are evidence documents, not dashboards. Each opens with a plain "what this shows" section, then walks through the designs one formula at a time. Skim the at-a-glance results if you just want reassurance, or drill into a specific design if you want to see the numbers behind it.