PCA = Eigenvectors of Covariance

Principal components are eigenvectors of the covariance matrix — equivalently, the SVD of the centered data.

Principal Component Analysis sounds like an algorithm. It is really a sentence in linear algebra: the principal components of a dataset are the eigenvectors of its covariance matrix. Center the data so its mean is zero, form Σ = (1/n) Xᵀ X, and diagonalize. The eigenvectors point along the directions of greatest variance. The eigenvalues are those variances. That is the entire method.
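In code, that recipe is only a few lines of NumPy. A minimal sketch, assuming X is an n × d array of samples (the function name and the toy data are illustrative):

```python
import numpy as np

def pca_eig(X, k):
    """PCA by diagonalizing the covariance of the centered data.
    Returns the top-k principal axes (as columns) and the variance along each."""
    Xc = X - X.mean(axis=0)                  # center: subtract the column means
    cov = (Xc.T @ Xc) / len(Xc)              # Σ = (1/n) Xᵀ X, a d × d matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric Σ: real eigenvalues, orthonormal eigenvectors
    order = np.argsort(eigvals)[::-1]        # sort directions by variance, largest first
    return eigvecs[:, order[:k]], eigvals[order[:k]]

# toy usage: a correlated 2D cloud
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
axes, variances = pca_eig(X, k=2)
print(axes, variances)   # principal directions and their variances
```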

Equivalently — and this is the form that scales — the principal components are the right singular vectors of the centered data matrix X. The Singular Value Decomposition X = U Σ_s Vᵀ (writing Σ_s for the diagonal matrix of singular values, to keep it distinct from the covariance Σ) packages everything: the columns of V are the principal directions, and the squared singular values σₖ² are n times the variances. Among all rank-k approximations of X, the truncation to the top k singular components minimizes the Frobenius error — the Eckart–Young theorem makes this optimality precise. And among all linear projections onto a k-dimensional subspace, PCA preserves the maximum amount of variance.
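The equivalence is easy to check numerically. A small sketch on random toy data (nothing here is specific to the demos below):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)
n = len(Xc)

# route 1: eigendecomposition of the covariance
eigvals, eigvecs = np.linalg.eigh((Xc.T @ Xc) / n)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]      # descending order

# route 2: SVD of the centered data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(eigvals, s**2 / n))                   # variances are σₖ² / n
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))       # same axes, up to sign
```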

Interactive: Covariance & its Eigenvectors

Drag points to reshape a 2D cloud. The covariance matrix and its eigenvalues update live; the principal axes are drawn as arrows extending ±2σ along each direction.
[Widget readout, example state: covariance Σ = (1/n) Xᵀ X = [[13.95, 7.41], [7.41, 4.53]]; eigenvalues (variance per PC) λ₁ = 18.025 (97.5% of variance), λ₂ = 0.461 (2.5%); PC1 angle: 28.8°.]

The teal arrow is the first principal component — the direction of maximum variance. It is the unit eigenvector of Σ with the larger eigenvalue. The violet arrow is PC2, always orthogonal because Σ is symmetric. Arrow length is ±2σ along that axis. Drag a point to deform the cloud and watch how Σ, its eigenvalues, and the principal axes update in real time.
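If you want to reproduce the readout yourself, the computation behind the widget is short. A sketch, assuming points is an (n, 2) array (the function name is made up for illustration):

```python
import numpy as np

def principal_axes_2d(points):
    """Covariance, eigenvalues, and PC1 angle for a 2D cloud, as in the
    readout above; each arrow's half-length is 2·sqrt(eigenvalue)."""
    centered = points - points.mean(axis=0)
    cov = (centered.T @ centered) / len(centered)        # 2 × 2 covariance Σ
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # put PC1 first
    pc1_angle = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0]))
    return cov, eigvals, pc1_angle
```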

Interactive: SVD on an Image

An image is just a matrix. Sliding k changes how many singular components you keep — the rank-k SVD is the best low-rank reconstruction in the Frobenius norm. Watch the image sharpen as k grows.

SVD on a centered data matrix is PCA. Here the "data" is a 32×32 grayscale image. Keeping only the top k singular values gives the best rank-k approximation in the Frobenius norm — the Eckart–Young theorem.

[Widget readout, example state: original rank 32 vs. reconstruction rank 4; Frobenius error ‖A − A_k‖ / ‖A‖ = 10.4%; variance kept Σᵢ₌₁ᵏ σᵢ² / Σ σᵢ² = 98.9%; storage ratio k(m+n+1) / mn = 0.25×; the singular value spectrum is shown below as a scree plot.]

The scree plot shows how singular values decay. Real-world images and datasets are typically low-rank in this sense — a handful of components captures most of the variance. PCA exploits exactly this fact: project onto the top-k right singular vectors and you preserve the maximum possible variance for any k-dimensional linear projection.
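A sketch of the same computation for any grayscale image stored as a matrix A (the helper name and toy data are illustrative); it reproduces the three readouts and the scree curve:

```python
import numpy as np

def rank_k_approx(A, k):
    """Best rank-k approximation in the Frobenius norm (Eckart–Young),
    plus the three readouts shown above."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    rel_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)   # ‖A − A_k‖ / ‖A‖
    variance_kept = (s[:k]**2).sum() / (s**2).sum()           # Σᵢ₌₁ᵏ σᵢ² / Σ σᵢ²
    m, n = A.shape
    storage_ratio = k * (m + n + 1) / (m * n)                  # k(m+n+1) / mn
    return A_k, rel_error, variance_kept, storage_ratio, s     # s is the scree curve

# toy usage: a random 32×32 "image"
A = np.random.default_rng(2).random((32, 32))
A4, err, kept, ratio, spectrum = rank_k_approx(A, k=4)
print(f"error {err:.1%}, variance kept {kept:.1%}, storage {ratio:.2f}x")
```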

Interactive: Project a 3D Cloud to 2D

An elongated 3D point cloud with its three principal axes drawn as colored arrows. Toggle the projection and the cloud collapses onto the plane spanned by PC1 and PC2 — the optimal 2D linear summary of the data.

A 3D point cloud shaped like an elongated ellipsoid. The three colored arrows are the principal axes — eigenvectors of the 3×3 covariance matrix, sorted by eigenvalue. Drag to orbit. Toggle the projection to collapse the cloud onto the PC1-PC2 plane.

[Widget readout: top-2 PCs preserve 98.3% of variance; PC1: λ = 15.276 (85.4% of variance), PC2: λ = 2.303 (12.9%), PC3: λ = 0.299 (1.7%).]

Optimal linear dimensionality reduction means: of all projections of R³ onto a 2-dimensional subspace, the projection onto the PC1-PC2 plane loses the least variance. The third eigenvalue is exactly the variance you throw away when you collapse onto that plane.
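The projection itself is one matrix multiply. A sketch with a synthetic ellipsoidal cloud (the shape and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 3)) * np.array([4.0, 1.5, 0.5])   # elongated 3D cloud
Xc = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
variances = s**2 / len(Xc)            # eigenvalues of the 3 × 3 covariance
Y = Xc @ Vt[:2].T                     # n × 2 coordinates in the PC1-PC2 plane

kept = variances[:2].sum() / variances.sum()
print(f"top-2 PCs preserve {kept:.1%} of the variance")
print(f"variance thrown away = λ₃ = {variances[2]:.3f}")
```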

The math objects

  • Centered data matrix X: n rows of d-dimensional samples, with the column mean subtracted. Centering matters — it's why PCA captures variance rather than absolute position.
  • Covariance Σ = (1/n) Xᵀ X: a d × d symmetric positive semidefinite matrix. The diagonal entries are per-feature variances, the off-diagonals measure how features co-vary.
  • Eigendecomposition Σ = V Λ Vᵀ: by the spectral theorem, Σ is diagonalizable in an orthonormal basis. The columns of V are the principal axes; the diagonal entries of Λ are the variances along them.
  • SVD X = U Σ_s Vᵀ: the same V appears here. The connection is algebraic: Σ = (1/n) V Σ_s² Vᵀ, so eigenvalues of Σ are σₖ² / n.
  • Eckart–Young: truncating to the top k singular components gives the best rank-k approximation in any unitarily invariant norm — Frobenius and spectral norm included.
  • Whitening: mapping data with V Λ^(-1/2) Vᵀ leaves a cloud with identity covariance — useful as preprocessing for many downstream methods (a short sketch follows this list).
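As promised above, a minimal whitening sketch using the V Λ^(-1/2) Vᵀ map (the small eps is a numerical guard against zero eigenvalues, not part of the math):

```python
import numpy as np

def whiten(X, eps=1e-8):
    """Map centered data with V Λ^(-1/2) Vᵀ so the result has identity covariance."""
    Xc = X - X.mean(axis=0)
    cov = (Xc.T @ Xc) / len(Xc)
    eigvals, V = np.linalg.eigh(cov)
    W = V @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ V.T   # V Λ^(-1/2) Vᵀ
    return Xc @ W

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))   # correlated cloud
Xw = whiten(X)
print(np.round((Xw.T @ Xw) / len(Xw), 3))                 # ≈ identity matrix
```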

Key takeaways

  • Principal components = eigenvectors of Σ = (1/n) Xᵀ X = right singular vectors of X. Three names, one geometry.
  • Each eigenvalue is the variance along that PC. Their sum is the total variance (the trace of Σ).
  • PCA is the optimal linear dimensionality reduction: of all rank-k linear projections, the top-k SVD truncation preserves the most variance.
  • The SVD form is preferred numerically: never form Xᵀ X explicitly — that squares the condition number (see the sketch below).
  • PCA is linear. Real data often lies on curved manifolds; that's where kernel PCA, autoencoders, and t-SNE pick up the story.
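
The condition-number point is easy to see numerically: the condition number of Xᵀ X is the square of the condition number of X. A tiny sketch with a deliberately ill-conditioned matrix (the scaling is contrived to make the effect obvious):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 4)) @ np.diag([1.0, 1.0, 1.0, 1e-4])   # one nearly-flat direction

print(np.linalg.cond(X))        # on the order of 1e4
print(np.linalg.cond(X.T @ X))  # on the order of 1e8: the condition number is squared
```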