Sample-size calculators ask what n you need for an assumed effect. Recruitment-constrained research usually runs the other way: you know roughly how many patients you can enroll. This tool inverts the calculation. Given your feasible n, it reports the smallest true difference your study could reliably detect, so you can ask the honest question: is that difference still clinically meaningful?
Patients per group: per group, not total. A two-arm study of 240 patients is n = 120 per group.
Standard deviation of the outcome (optional): from a pilot or the literature. Leave blank to read the answer in SD units instead.
With 120 per group (240 analyzed, 1:1), the smallest true difference this study can reliably detect at 80% power is 0.36 SD (α = 0.05, two-sided). Is that smaller than the smallest difference that would change practice? If not, the study cannot answer the question at this size.
| n per group | total analyzed | detectable at 80% power | detectable at 90% power |
|---|---|---|---|
| 60 | 120 | 0.51 SD | 0.59 SD |
| 90 | 180 | 0.42 SD | 0.48 SD |
| 120 — your n | 240 | 0.36 SD | 0.42 SD |
| 180 | 360 | 0.30 SD | 0.34 SD |
| 240 | 480 | 0.26 SD | 0.30 SD |
| 360 | 720 | 0.21 SD | 0.24 SD |
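Assumes a two-group comparison of means with equal allocation (normal approximation). This is the inverse of the standard two-sample formula: $\Delta_{\min} = (z_{1-\alpha/2} + z_{\beta})\,\sigma\,\sqrt{2/n}$, where n is the number per group, $\sigma$ is the outcome SD, and $z_{\beta}$ is the standard normal quantile for the target power $1-\beta$ (0.84 at 80% power, 1.28 at 90%). A smaller detectable difference means a more sensitive study.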
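As a rough check, the table can be reproduced with Python's standard library alone. This is a minimal sketch of the formula as stated, not the calculator's actual code; the function name `min_detectable_effect` and its parameter names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, power=0.80, alpha=0.05, sd=1.0):
    """Smallest true difference detectable at the given power.

    Two-sided test, two groups of equal size, normal approximation.
    With sd=1.0 the result is in SD units.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)            # about 0.84 for 80% power
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Rebuild the rows of the table: n per group, total, 80% power, 90% power.
for n in (60, 90, 120, 180, 240, 360):
    at_80 = min_detectable_effect(n, power=0.80)
    at_90 = min_detectable_effect(n, power=0.90)
    print(f"{n:>4} {2 * n:>4}  {at_80:.2f} SD  {at_90:.2f} SD")
```

Rounded to two decimals, each printed row should agree with the corresponding row of the table.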
Anchor on the minimal clinically important difference — the smallest effect that would change practice — and check whether your feasible n can detect it. Working backwards from a convenient n to whatever effect it can detect, and then calling that effect the target, is effect-size fishing; with patients enrolled in a study that cannot answer its question, it is an ethics problem, not just a statistical one. If the detectable effect at your feasible n is larger than the difference that matters, the honest options are a larger or multi-site study, a more sensitive endpoint, or a redesigned question — not a smaller target.
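The comparison described above can be sketched in a few lines: compute the detectable effect at the feasible n, compare it with the minimal clinically important difference (MCID), and, if the study falls short, solve the same formula for the n the MCID would need. The MCID of 0.30 SD is a made-up value for illustration, and both function names are assumptions rather than part of this tool.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, power=0.80, alpha=0.05, sd=1.0):
    """Detectable difference for a two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * sd * sqrt(2 / n_per_group)


def n_per_group_for(mcid, power=0.80, alpha=0.05, sd=1.0):
    """Same formula solved for n: smallest per-group n whose detectable effect is <= mcid."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_sum * sd / mcid) ** 2)


feasible_n = 120   # per group, as in the example above
mcid = 0.30        # HYPOTHETICAL MCID in SD units; yours comes from clinical judgment

mde = min_detectable_effect(feasible_n)
if mde <= mcid:
    print(f"Detectable effect {mde:.2f} SD is within the MCID of {mcid:.2f} SD.")
else:
    needed = n_per_group_for(mcid)
    print(f"Detectable effect {mde:.2f} SD exceeds the MCID of {mcid:.2f} SD; "
          f"about {needed} per group would be needed.")
```

The MCID is fixed first and the sample size follows from it. The sketch never adjusts the target to fit the feasible n.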
The calculation is the inverse of the standard two-sample means formula, computed exactly as shown, $\Delta_{\min} = (z_{1-\alpha/2} + z_{\beta})\,\sigma\,\sqrt{2/n}$, assuming a two-group comparison with equal allocation. Binary outcomes, unequal allocation, clustering, expected dropout, and survival endpoints change the arithmetic; those are part of the full design plan, or a conversation with your biostatistician.
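For readers who want to see the arithmetic, here is the formula worked through for the example above (n = 120 per group, two-sided α = 0.05, 80% power), with σ set to 1 so the answer is in SD units. The z values are rounded to three decimals.

```latex
\begin{aligned}
\Delta_{\min} &= (z_{1-\alpha/2} + z_{\beta}) \cdot \sigma \cdot \sqrt{2/n} \\
              &= (1.960 + 0.842) \cdot 1 \cdot \sqrt{2/120} \\
              &\approx 2.802 \times 0.1291 \\
              &\approx 0.36 \ \text{SD}
\end{aligned}
```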
What is a minimum detectable effect?
The smallest true difference between groups that your study would have a specified chance (the power, usually 80% or 90%) of declaring statistically significant. If the difference that would actually change practice is smaller than your minimum detectable effect, the study as sized cannot answer the question — it is underpowered for the effect that matters.
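To make the definition concrete, the sketch below computes planning power for a hypothesized true difference under the same normal approximation. It ignores the negligible chance of a significant result in the wrong direction. The function name `power_at` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def power_at(true_diff, n_per_group, alpha=0.05, sd=1.0):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation, equal allocation. The tiny probability of
    rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    standard_error = sd * sqrt(2 / n_per_group)   # SE of the difference in means
    return z.cdf(true_diff / standard_error - z.inv_cdf(1 - alpha / 2))


# Planning power at 120 per group for a few hypothesized true differences (SD units).
for diff in (0.25, 0.36, 0.50):
    print(f"true difference {diff:.2f} SD -> power {power_at(diff, 120):.0%}")
```

At 120 per group, a true difference of 0.36 SD gives roughly 80% power, while a true difference of 0.25 SD gives just under 50%. That gap is what "underpowered for the effect that matters" means in practice.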
Why does the calculator ask for patients per group, not in total?
Because the formula works per group, and conflating the two is a classic planning error. A two-arm study that can enroll 240 patients in total has n = 120 per group under 1:1 allocation. The table shows both numbers so the distinction stays visible.
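The cost of that mix-up is easy to quantify. In this sketch (the function name is an assumption), entering the total where the per-group n belongs makes the study look more sensitive than it is, by a factor of √2.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, power=0.80, alpha=0.05, sd=1.0):
    """Detectable difference for a two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * sd * sqrt(2 / n_per_group)


total_enrolled = 240
n_per_group = total_enrolled // 2    # 1:1 allocation gives 120 per group

correct = min_detectable_effect(n_per_group)       # per-group n, as the formula expects
mistaken = min_detectable_effect(total_enrolled)   # total entered by mistake

print(f"correct:  {correct:.2f} SD")
print(f"mistaken: {mistaken:.2f} SD (too optimistic by a factor of {correct / mistaken:.2f})")
```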
What if I don't know the standard deviation of my outcome?
Leave it blank and read the answer in standard-deviation units (a standardized difference — 0.2 SD is conventionally small, 0.5 SD moderate, 0.8 SD large). When you do supply an SD from a pilot or the literature, treat pilot SDs as optimistic: they are estimated from small samples, so consider a conservative (larger) value.
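Converting from SD units to the outcome's own units is a single multiplication, which also shows how much a more conservative SD moves the answer. Both SD values below are invented for illustration (say, points on a symptom scale), and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, power=0.80, alpha=0.05, sd=1.0):
    """Detectable difference; in SD units when sd=1.0, otherwise in the outcome's units."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * sd * sqrt(2 / n_per_group)


pilot_sd = 10.0          # HYPOTHETICAL SD from a small pilot
conservative_sd = 12.0   # HYPOTHETICAL larger value, in case the pilot SD is too low

standardized = min_detectable_effect(120)                     # SD units
raw_pilot = min_detectable_effect(120, sd=pilot_sd)           # outcome units
raw_conservative = min_detectable_effect(120, sd=conservative_sd)

print(f"{standardized:.2f} SD")
print(f"{raw_pilot:.1f} points with the pilot SD")
print(f"{raw_conservative:.1f} points with the conservative SD")
```

Because the detectable difference scales linearly with the SD, a 20% larger SD means a 20% larger detectable difference at the same n.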
Can I use this to compute the power of a study I already ran?
No — and that is deliberate. "Observed power" computed from your results adds no information beyond the p-value and is widely considered a statistical error. Power and detectable-effect calculations are planning tools: use them before enrollment, and anchor the effect on the minimal clinically important difference, not on whatever effect makes your feasible n look adequate.
This calculator answers one planning question. The $10 RigorMD design plan answers the rest from a plain-prose description of your study: the right statistical test for your design, the variables and data you need to collect, a clearly stated hypothesis with the power to answer it, and a draft IRB statistical-methods page. Where an input only you can supply is missing — an effect size, a variance, an event rate — the plan names the gap; it never invents the number.