Shannon's mathematical theory of communication. Through 28 interactive demonstrations covering entropy, mutual information, source coding, channel capacity, error correction, and the rate-distortion frontier, see the math that makes every modem, codec, and compressor possible.
See also: Probability for the random variables behind every entropy formula, Algorithms for information-theoretic complexity bounds, and Machine Learning for KL divergence and cross-entropy as loss functions.
H(X) = −Σ p log p. The fundamental measure of uncertainty in a probability distribution.
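A minimal sketch of the formula in Python (the function and example distributions are illustrative, not taken from the demo):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p log2 p, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit
print(entropy([0.9, 0.1]))  # biased coin: ~0.469 bits, less uncertain
```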
How much knowing one variable tells you about another. The chain rule for entropy.
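A small sketch of how mutual information and the chain rule fall out of joint and marginal entropies (the toy joint distribution below is invented for illustration):

```python
import math
from collections import Counter

def H(counts):
    """Entropy in bits of a distribution given as unnormalized counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

# Toy joint sample of (X, Y): two mostly-agreeing binary variables.
pairs = [(0, 0)] * 4 + [(0, 1)] * 1 + [(1, 0)] * 1 + [(1, 1)] * 4
H_xy = H(Counter(pairs).values())
H_x = H(Counter(x for x, _ in pairs).values())
H_y = H(Counter(y for _, y in pairs).values())

print(H_x + H_y - H_xy)  # mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)
print(H_xy - H_x)        # chain rule: H(Y|X) = H(X,Y) - H(X)
```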
Shannon's first theorem: entropy is the optimal compression bound. Huffman codes come within one bit of it per symbol.
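A compact Huffman construction sketched with Python's heapq; it shows the average code length landing between H and H + 1 (the symbols and weights are made up for the example):

```python
import heapq, math

def huffman_code(freqs):
    """Build a prefix code by repeatedly merging the two lightest subtrees."""
    heap = [[w, [sym, ""]] for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]   # left branch gets a 0
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]   # right branch gets a 1
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

freqs = {"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.05}
code = huffman_code(freqs)
avg = sum(freqs[s] * len(code[s]) for s in freqs)
H = -sum(p * math.log2(p) for p in freqs.values())
print(code)
print(avg, H)  # average length sits within one bit of the entropy
```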
Long sequences from a source are almost certainly typical — and almost equally probable.
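A quick numerical check of the idea, assuming a Bernoulli source (the parameters are chosen arbitrarily): the per-symbol log-probability of one long sample concentrates at the entropy.

```python
import math, random

random.seed(0)
p, n = 0.3, 100_000
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# One long i.i.d. Bernoulli(p) sequence and its empirical log-probability.
seq = [1 if random.random() < p else 0 for _ in range(n)]
log_prob = sum(math.log2(p) if s else math.log2(1 - p) for s in seq)

print(-log_prob / n, H)  # the sample's -log p / n is almost exactly H
```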
Shannon's noisy channel coding theorem: the maximum reliable rate of information through noise.
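For the binary symmetric channel the theorem's limit has a closed form, C = 1 − H(p); a minimal sketch:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p: C = 1 - H(p)."""
    return 1.0 - h2(p)

for p in (0.0, 0.05, 0.11, 0.5):
    print(p, bsc_capacity(p))  # capacity falls to zero as the noise approaches 1/2
```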
Hamming codes, decoding spheres, and Reed-Solomon. Recovery from corruption.
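A sketch of the Hamming(7,4) code: four data bits, three parity bits, and a syndrome that points at any single flipped bit (the bit layout follows the textbook positions 1..7):

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit codeword with parity bits at positions 1, 2, 4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Recompute the parities; the syndrome is the 1-based position of a single error."""
    c = c[:]
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c[pos - 1] ^= 1  # flip the corrupted bit back
    return c

word = hamming74_encode([1, 0, 1, 1])
noisy = word[:]
noisy[5] ^= 1                             # corrupt one bit
print(hamming74_correct(noisy) == word)   # True: the error is repaired
```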
Differential entropy h(X) = −∫ f log f. Why the Gaussian, as the maximum-entropy density for a given variance, is nature's default distribution.
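A small comparison, assuming the closed form h(X) = ½ log₂(2πeσ²) for the Gaussian: among densities with equal variance, the Gaussian has the largest differential entropy.

```python
import math

def gaussian_h(sigma2):
    """Differential entropy of N(0, sigma^2) in bits: 0.5 * log2(2*pi*e*sigma^2)."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma2)

def uniform_h(sigma2):
    """Differential entropy of a uniform density with the same variance."""
    width = math.sqrt(12 * sigma2)   # Var(U[0, w]) = w^2 / 12
    return math.log2(width)

# For equal variance, the Gaussian strictly beats the uniform in differential entropy.
print(gaussian_h(1.0), uniform_h(1.0))  # ~2.05 vs ~1.79 bits
```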
The cost in extra bits of coding for the wrong distribution. Cross-entropy and the bridge to ML.
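A minimal sketch of the two quantities and how they relate (the distributions are made up):

```python
import math

def kl(p, q):
    """D(p||q) = sum p log2(p/q): extra bits paid for coding p with a code designed for q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum p log2 q, the quantity used as a loss in ML."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

p = [0.7, 0.2, 0.1]  # true distribution
q = [0.5, 0.3, 0.2]  # model distribution
print(kl(p, q))                                     # the penalty in extra bits
print(cross_entropy(p, q), entropy(p) + kl(p, q))   # H(p,q) = H(p) + D(p||q)
```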
Lossy compression: the best rate achievable at a given distortion. The math behind JPEG and MP3.
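For a Gaussian source under squared-error distortion the frontier has the closed form R(D) = ½ log₂(σ²/D); a minimal sketch of the trade-off:

```python
import math

def gaussian_rate_distortion(sigma2, D):
    """R(D) = 0.5 * log2(sigma^2 / D) bits per sample, dropping to 0 once D >= sigma^2."""
    return 0.0 if D >= sigma2 else 0.5 * math.log2(sigma2 / D)

# Each halving of the allowed distortion costs an extra half bit per sample.
for D in (1.0, 0.5, 0.25, 0.125):
    print(D, gaussian_rate_distortion(1.0, D))
```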