Sample size is one of the most controllable factors in research quality. Too small, and you waste resources collecting inconclusive data. Too large, and you overspend on a question that a smaller study could have answered. Getting sample size right before data collection begins is one of the most important statistical decisions a researcher makes.
Why Sample Size Matters
An undersized study lacks the statistical power to detect real effects: it wastes resources and produces false negatives that can delay important discoveries. In clinical trials, an underpowered study may fail to detect a genuinely effective treatment, causing it to be abandoned prematurely. Conversely, an oversized study is unnecessarily expensive and, in clinical research, exposes more participants to experimental conditions than needed, which raises ethical concerns under the principle of equipoise. For surveys and opinion polls, a sample that is too small produces margins of error so wide that the results are hard to interpret: a poll of n = 30 at 95% confidence has a margin of about ±18 percentage points (assuming the worst case p = 0.5), making even large political swings statistically indistinguishable. Proper sample-size calculation balances statistical precision, power, and practical constraints. The calculation must be done before data collection begins, because unplanned adjustments made after looking at the data, such as recruiting until a result reaches significance, introduce bias. Pre-registration of sample-size calculations is now widely expected by journals, funders, and trial registries, making this step both a methodological requirement and an ethical obligation in funded research.
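The poll figure can be checked with a short Python sketch. It assumes the usual normal-approximation margin of error for a proportion, E = z·sqrt(p(1−p)/n), with the worst-case p = 0.5; the function name `margin_of_error` and the extra sample sizes in the loop are illustrative choices, not part of the original text.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n: int, confidence: float = 0.95, p: float = 0.5) -> float:
    """Half-width of the normal-approximation interval for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # n = 30 is the poll from the text; the other sizes are for comparison only.
    for n in (30, 100, 1000):
        print(f"n = {n:>5}: margin = ±{margin_of_error(n):.1%}")
```

The normal approximation is rough at n = 30, so the result there is best read as an order-of-magnitude check rather than an exact interval.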
The Diminishing Returns of Larger Samples
The margin of error is inversely proportional to the square root of the sample size, so quadrupling n only halves the margin. Going from n = 100 to n = 400 halves the margin of error, and going from n = 400 to n = 1,600 halves it again, but the cost quadruples each time. Precision improvements therefore become progressively more expensive as samples grow. For the standard proportion formula n = z²p(1−p)/E², halving E requires 4× the respondents. At 95% confidence (z = 1.96) with the worst-case p = 0.5 and a ±5% margin: n ≈ 384. At ±2.5% (twice the precision): n ≈ 1,537. At ±1% (five times the precision): n ≈ 9,604, or 25 times the ±5% sample. The practical implication is that there is usually a sweet spot, often around n = 400–1,500 for surveys, beyond which further precision gains cost more than they are worth. Understanding this curve helps researchers set realistic, cost-effective targets rather than chasing theoretical perfection.
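As a minimal sketch, the proportion formula n = z²p(1−p)/E² can be evaluated directly in Python. The function name `required_sample_size` is hypothetical, and the defaults (95% confidence, p = 0.5) are assumptions chosen to match the figures quoted above.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin: float, confidence: float = 0.95, p: float = 0.5) -> float:
    """Unrounded n = z^2 * p * (1 - p) / E^2 for estimating a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z**2 * p * (1 - p) / margin**2


if __name__ == "__main__":
    for margin in (0.05, 0.025, 0.01):
        n = required_sample_size(margin)
        print(f"±{margin:.1%}: n ≈ {n:.1f} (round up to {ceil(n)})")
```

The function returns the unrounded value so the rounding choice stays visible: rounding to the nearest integer reproduces 384 for a ±5% margin, while rounding up, the conservative convention, gives 385.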
Power Analysis — The Often-Forgotten Step
Many researchers focus solely on confidence and margin of error, ignoring statistical power. Power (1 − β) is the probability of detecting a real effect of a given size. A study can report a 95% CI and still be underpowered if the effect being measured is small relative to the precision the sample provides. Power analysis works in both directions: given a fixed budget (sample size), what is the smallest effect you can reliably detect? Or, given a target effect size (from prior literature or practical importance), how many subjects do you need for 80% power? Cohen's d standardizes effect sizes so they can be compared across studies: d = 0.2 (small), 0.5 (medium), 0.8 (large). With n = 50 per group and a two-sided test at α = 0.05, a two-group comparison has only about 17% power to detect a small effect (d = 0.2) but about 98% power to detect a large one (d = 0.8). When power is ignored, underpowered studies are published with null results that are misread as ruling out real effects, which contributes to the replication crisis in psychology and medicine.
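Both directions of power analysis can be sketched in Python with only the standard library. This assumes a two-sided, two-sample comparison with equal group sizes and uses the normal approximation to the test statistic. An exact t-based calculation, as produced by dedicated power software, gives slightly lower power and slightly larger sample sizes. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test for Cohen's d."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the standardized test statistic under the alternative.
    shift = d * sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate subjects per group needed to reach the target power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(
            f"d = {d}: power at n = 50 per group ≈ {power_two_sample(d, 50):.2f}, "
            f"n per group for 80% power ≈ {n_per_group_for_power(d)}"
        )
```

Running the loop for n = 50 per group shows how steeply power falls as the effect size shrinks, and how quickly the required sample grows as d approaches the small-effect range.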
Finite Populations and the Census Myth
A common misconception is that you need to survey a large fraction of a population to get reliable results. In reality, for populations above about 10,000, the required sample size barely changes because the finite population correction factor approaches 1. The precision of a survey is determined almost entirely by the absolute sample size, not by the fraction sampled. A poll of n = 1,068 achieves ±3% margin at 95% confidence whether the population is 100,000 or 330 million. The finite population correction n_adj = nN/(n + N − 1), where n is the uncorrected sample size and N is the population size, only matters when sampling more than 5–10% of the total population, for example when surveying 100 out of 500 employees. For a small population (N = 500), the ±5% requirement at 95% confidence falls from 384 to n_adj ≈ 218, a substantial reduction. Understanding when to apply this correction prevents both over-sampling (wasting resources) and under-sampling (insufficient precision) in contexts like quality control, small-community surveys, or animal population surveys.
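The correction n_adj = nN/(n + N − 1) is easy to tabulate. The sketch below is only an illustration: the function name `finite_population_adjust` is hypothetical, and the starting value of 384 is the uncorrected ±5%, 95%-confidence sample size quoted earlier in the article.

```python
from math import ceil


def finite_population_adjust(n: float, population: int) -> float:
    """Apply n_adj = n * N / (n + N - 1) to an uncorrected sample size n."""
    return n * population / (n + population - 1)


if __name__ == "__main__":
    n0 = 384  # uncorrected size for ±5% at 95% confidence
    for population in (500, 10_000, 100_000, 330_000_000):
        n_adj = finite_population_adjust(n0, population)
        print(f"N = {population:>11,}: n_adj ≈ {ceil(n_adj)}")
```

Comparing the rows shows where the correction matters: the adjusted size is far below 384 for N = 500 and nearly identical to it once N reaches the hundreds of thousands.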