Watch any distribution converge to a bell curve as you add samples
The Central Limit Theorem (CLT) is one of the most remarkable results in all of mathematics. It says that if you take any distribution with finite variance -- no matter how skewed, lumpy, or strange -- and repeatedly draw samples of size n, then the distribution of sample means approaches a normal (bell) curve as n grows. You do not need to know the shape of the original distribution; the averaging process itself creates normality.
This explains why the normal distribution appears so often in nature: heights, measurement errors, and test scores are all shaped by the sum of many small, independent random effects. Below, three interactive demos let you watch the CLT in action and build deep intuition for why it works.
Rolling a single die gives a perfectly uniform distribution -- each outcome from 1 to 6 is equally likely. But when you sum multiple dice, something magical happens: the histogram of sums begins to look like a bell curve. Try increasing the number of dice from 1 to 10 and rolling thousands of times to watch the transformation.
Roll N dice repeatedly and watch the histogram of sums. With 1 die the distribution is uniform; as N increases, the sum converges to a bell curve (the Central Limit Theorem). The cyan curve shows the theoretical normal approximation.
Key insight: Even with just 3 or 4 dice, the histogram is already recognizably bell-shaped. The CLT is not just a theoretical limit -- convergence happens surprisingly fast for symmetric distributions like the uniform.
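If you want to replicate the dice demo offline, here is a minimal text-mode sketch using only the standard library. The function name `roll_sums` and the histogram scaling are my own choices, not part of the interactive demo.

```python
import random
from collections import Counter

def roll_sums(num_dice, num_rolls, seed=0):
    """Roll `num_dice` fair six-sided dice `num_rolls` times;
    return a Counter mapping each observed sum to its frequency."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_rolls):
        counts[sum(rng.randint(1, 6) for _ in range(num_dice))] += 1
    return counts

# Text histogram: with 4 dice, expect a peak near the mean (14)
# with a roughly symmetric, bell-shaped falloff on either side.
counts = roll_sums(num_dice=4, num_rolls=20_000)
for total in sorted(counts):
    print(f"{total:2d} {'#' * (counts[total] // 100)}")
```

Changing `num_dice` to 1 reproduces the flat uniform histogram; even `num_dice=3` already shows the bell shape described above.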
The CLT works for any source distribution with finite variance. Choose from exponential, uniform, or bimodal distributions and watch how the distribution of sample means evolves as you increase the sample size n. The left panel shows the raw source distribution; the right panel shows the distribution of sample means with a theoretical normal overlay.
Left panel shows the source distribution (which can be highly non-normal). Right panel shows the distribution of sample means. As sample size n increases, the right panel converges to a normal distribution regardless of the source shape.
Key insight: The standard deviation of the sample mean shrinks as 1/√n. This means doubling your sample size does not halve the spread -- you need to quadruple it. The theoretical mean of the sample means equals the population mean, while the standard error equals σ/√n.
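The 1/√n shrinkage is easy to verify numerically. The sketch below (helper names are mine) draws repeated samples from an Exponential(1) source, where the population mean and σ are both 1, so the standard error of the mean should come out close to 1/√n:

```python
import random
import statistics

def sample_means(n, num_samples, seed=0):
    """Draw `num_samples` samples of size n from an Exponential(1)
    source and return the mean of each sample."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(num_samples)]

# For Exponential(1): population mean = 1 and sigma = 1,
# so the spread of the sample means should be about 1/sqrt(n).
for n in (4, 16, 64):
    means = sample_means(n, num_samples=5000)
    print(f"n={n:3d}  mean of means ~ {statistics.fmean(means):.3f}  "
          f"sd ~ {statistics.stdev(means):.3f}  (theory: {n ** -0.5:.3f})")
```

Note how going from n = 16 to n = 64 (quadrupling) halves the spread, exactly as the key insight predicts.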
Not all distributions converge to normal at the same speed. Symmetric distributions like the uniform converge very quickly (n = 3 or 4 is often enough). Skewed distributions like the exponential need larger n. This 2×2 grid lets you drag a single slider and compare all four simultaneously.
Drag the slider to see how fast each distribution converges to normal. Symmetric distributions (uniform) converge quickly, while skewed distributions (exponential) and discrete distributions (Bernoulli) take longer. All four are approximately normal by n ≈ 30.
Key insight: The Berry-Esseen theorem quantifies this: the rate of convergence is controlled by the standardized third absolute moment E|X − μ|³/σ³ of the source distribution, which is closely related to its skewness. More symmetric distributions have smaller third moments and converge faster. By n ≈ 30, even highly skewed distributions are approximately normal.
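One way to see this convergence-speed difference numerically is to measure the skewness of the distribution of sample means as n grows; for a source with skewness γ, the sample mean's skewness shrinks like γ/√n. The sketch below (function names and the specific source distributions are my own choices, mirroring the grid demo) compares a uniform, an exponential, and a Bernoulli(0.2) source:

```python
import random

def skewness(xs):
    """Standardized third central moment g1 = m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def mean_skew(draw, n, num_samples, seed=0):
    """Estimate the skewness of the distribution of sample means
    of size n, where `draw(rng)` yields one value from the source."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n
             for _ in range(num_samples)]
    return skewness(means)

sources = {
    "uniform":     lambda rng: rng.random(),          # symmetric, skew 0
    "exponential": lambda rng: rng.expovariate(1.0),  # skew 2
    "bernoulli":   lambda rng: 1.0 if rng.random() < 0.2 else 0.0,  # skew 1.5
}
for name, draw in sources.items():
    print(name, [round(mean_skew(draw, n, 20_000), 2) for n in (1, 10, 30)])
```

The uniform source starts at skewness ≈ 0 and stays there, while the exponential starts near 2 and decays toward 0 as n reaches 30 -- the same ordering the 2×2 grid shows visually.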