Every machine learning idea is a mathematical object. Through 29 interactive demonstrations across loss landscapes, eigendecompositions, kernels, automatic differentiation, Bayesian inference, information theory, equivariance, and manifolds, see the math that quietly runs every model.
See also: Linear Algebra for eigenvectors and SVD, Optimization for the foundations of gradient methods, and Probability for the priors and likelihoods behind Bayesian methods.
A loss function is a surface — training is rolling down it. Compare gradient descent, momentum, and Newton's method on real landscapes.
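A minimal sketch of the comparison, assuming only NumPy and a toy quadratic bowl rather than the demo's landscapes:

```python
import numpy as np

# Toy ill-conditioned quadratic f(x) = 0.5 * x^T A x; its gradient is A @ x.
A = np.array([[10.0, 0.0], [0.0, 1.0]])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])

# Gradient descent: small steps straight downhill.
x = x0.copy()
for _ in range(50):
    x = x - 0.05 * grad(x)
print("gradient descent:", x)

# Heavy-ball momentum: accumulate velocity to damp zig-zagging.
x, v = x0.copy(), np.zeros(2)
for _ in range(50):
    v = 0.9 * v - 0.05 * grad(x)
    x = x + v
print("momentum:", x)

# Newton's method: rescale by the Hessian (here just A); one step solves a quadratic.
x = x0 - np.linalg.solve(A, grad(x0))
print("newton:", x)
```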
Least squares is orthogonal projection onto the column space of the design matrix. Regularization is projection with a soft constraint.
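A minimal sketch of the projection view, assuming only NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                # design matrix
y = rng.normal(size=50)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta                            # projection of y onto col(X)
print(np.allclose(X.T @ (y - y_hat), 0))    # residual is orthogonal to col(X): True

lam = 1.0                                   # ridge: solve (X^T X + lam I) beta = X^T y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta))   # the penalty shrinks beta: True
```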
Principal components are eigenvectors of the covariance matrix — equivalently, the SVD of the centered data.
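A minimal sketch showing the two routes agree, assuming only NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)                      # center the data first

# Route 1: eigenvectors of the covariance matrix, sorted by decreasing eigenvalue.
cov = Xc.T @ Xc / (len(Xc) - 1)
_, eigvecs = np.linalg.eigh(cov)
pcs_eig = eigvecs[:, ::-1]

# Route 2: right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs_svd = Vt.T

# Same components, up to a sign flip per direction.
print(np.allclose(np.abs(pcs_eig), np.abs(pcs_svd)))
```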
Positive-definite kernels are inner products in implicit Hilbert spaces. The kernel trick makes nonlinear linear.
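A minimal sketch of the kernel trick for a degree-2 polynomial kernel, assuming only NumPy; `phi` here is the explicit feature map the kernel never has to build:

```python
import numpy as np

def phi(x):
    # Explicit feature map for k(x, y) = (x . y + 1)^2 on R^2.
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 * x1, x2 * x2,
                     np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = (x @ y + 1.0) ** 2            # one kernel evaluation
k_explicit = phi(x) @ phi(y)               # inner product in the lifted space
print(np.isclose(k_implicit, k_explicit))  # True
```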
Reverse-mode automatic differentiation on a computational DAG. The chain rule, organized for efficiency.
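A minimal sketch of reverse-mode AD on a tiny DAG; the `Node` class and operator names are illustrative, not any particular library's API:

```python
import math

class Node:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents     # pairs of (parent node, local derivative)
        self.grad = 0.0

    def backward(self, seed=1.0):
        # Chain rule: push the incoming adjoint to every parent,
        # scaled by the local derivative recorded at construction time.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

def mul(a, b): return Node(a.value * b.value, [(a, b.value), (b, a.value)])
def add(a, b): return Node(a.value + b.value, [(a, 1.0), (b, 1.0)])
def sin(a):    return Node(math.sin(a.value), [(a, math.cos(a.value))])

x, y = Node(2.0), Node(3.0)
f = add(mul(x, y), sin(x))       # f(x, y) = x*y + sin(x)
f.backward()
print(x.grad, y.grad)            # df/dx = y + cos(x), df/dy = x
```

A real implementation traverses the graph in reverse topological order so each node is visited once; this naive recursion revisits shared nodes.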
Posterior equals likelihood times prior, normalized. Watch beliefs update as evidence arrives.
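A minimal sketch of conjugate updating for coin flips; the prior is Beta(1, 1) and the data are made up for illustration:

```python
# Beta-Bernoulli: posterior is proportional to likelihood times prior,
# and a Beta prior stays a Beta after every observation.
alpha, beta = 1.0, 1.0                 # Beta(1, 1): uniform belief about the coin's bias
data = [1, 1, 0, 1, 0, 1, 1, 1]        # observed flips, 1 = heads

for flip in data:
    alpha += flip                      # each head bumps alpha
    beta += 1 - flip                   # each tail bumps beta
    print(f"after {flip}: Beta({alpha:.0f}, {beta:.0f}), "
          f"mean {alpha / (alpha + beta):.3f}")
```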
Entropy, KL divergence, and cross-entropy — why log-loss is the natural objective for classification.
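A minimal sketch, assuming only NumPy, of the identity that ties the three together, H(p, q) = H(p) + KL(p ‖ q):

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])          # "true" label distribution
q = np.array([0.5, 0.3, 0.2])          # model's predicted distribution

entropy = -np.sum(p * np.log(p))
cross_entropy = -np.sum(p * np.log(q))
kl = np.sum(p * np.log(p / q))

# Cross-entropy = entropy + KL, so minimizing log-loss minimizes KL(p || q).
print(np.isclose(cross_entropy, entropy + kl))   # True
```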
Convolution as group action. CNNs are equivariant maps — representation theory inside every neural network.
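A minimal sketch of the equivariance claim in one dimension, assuming only NumPy: circular convolution commutes with circular shifts.

```python
import numpy as np

def circ_conv(signal, kernel):
    # Plain circular (wrap-around) convolution of a 1-D signal.
    n = len(signal)
    return np.array([sum(kernel[k] * signal[(i - k) % n] for k in range(len(kernel)))
                     for i in range(n)])

x = np.array([1.0, 2.0, 3.0, 4.0, 0.0, -1.0])
w = np.array([0.5, 0.3, 0.2])
shift = lambda s, t: np.roll(s, t)

# Shifting then convolving equals convolving then shifting: equivariance.
print(np.allclose(circ_conv(shift(x, 2), w), shift(circ_conv(x, w), 2)))   # True
```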
High-dimensional data lives on low-dimensional manifolds. t-SNE, UMAP, and autoencoders unfold them.
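A minimal sketch of the underlying picture, assuming only NumPy: a "Swiss roll" of points is 3-D on paper but generated by a single intrinsic coordinate, and ambient nearest neighbors stay close in that coordinate, which is the structure these methods try to unfold.

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(3.0, 9.0, size=500)                  # 1-D intrinsic coordinate
X = np.column_stack([t * np.cos(t),                  # roll it up into 3-D
                     rng.uniform(0.0, 1.0, size=500),
                     t * np.sin(t)])

i = 250
dists = np.linalg.norm(X - X[i], axis=1)
nearest = np.argsort(dists)[1:6]                     # 5 nearest ambient neighbors
print(np.abs(t[nearest] - t[i]))                     # all small: neighbors share t
```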