Principal components are eigenvectors of the covariance matrix, or equivalently the right singular vectors of the centered data.
Principal Component Analysis sounds like an algorithm. It is really a sentence in linear algebra: the principal components of a dataset are the eigenvectors of its covariance matrix. Center the data so its mean is zero, form Σ = (1/n) Xᵀ X, and diagonalize. The eigenvectors point along the directions of greatest variance. The eigenvalues are those variances. That is the entire method.
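As a concrete sketch of the covariance route (illustrative names and toy data, not code from this page):

```python
import numpy as np

# A minimal sketch of PCA via the covariance matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.2])   # 500 observations, 3 features

Xc = X - X.mean(axis=0)                      # center: every column now has mean zero
cov = Xc.T @ Xc / len(Xc)                    # Σ = (1/n) Xᵀ X
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh, because Σ is symmetric
order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
# Columns of eigvecs are the principal directions; eigvals are the variances along them.
```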
Equivalently, and this is the form that scales, the principal components are the right singular vectors of the centered data matrix X. The Singular Value Decomposition X = U S Vᵀ packages everything: the columns of V are the principal directions, and each squared singular value σₖ² is n times the variance along the corresponding direction, since Xᵀ X = V S² Vᵀ. Among all rank-k approximations of X, the truncation U_k S_k V_kᵀ minimizes the Frobenius error; the Eckart–Young theorem makes this optimality precise. Projecting onto the top k principal directions preserves the maximum amount of variance of any linear projection onto a k-dimensional subspace (and that subspace is unique whenever the k-th and (k+1)-th eigenvalues differ).
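A quick check that the SVD route agrees, continuing the sketch above: the right singular vectors match the eigenvectors up to sign, and σₖ²/n reproduces the eigenvalues.

```python
# Same data, via the SVD of the centered matrix (continues the sketch above).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(s**2 / len(Xc), eigvals))   # True: σₖ² / n = λₖ
for k in range(3):
    # Each right singular vector agrees with the k-th eigenvector up to a sign flip.
    print(np.allclose(Vt[k], eigvecs[:, k]) or np.allclose(Vt[k], -eigvecs[:, k]))
```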
The teal arrow is the first principal component, the direction of maximum variance: the unit eigenvector of Σ with the larger eigenvalue. The violet arrow is PC2, always orthogonal to it because Σ is symmetric. Each arrow spans ±2 standard deviations (2√λ) along its axis. Drag a point to deform the cloud and watch Σ, its eigenvalues, and the principal axes update in real time.
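The figure's own code isn't shown here, but a plausible sketch of the arrow computation, with hypothetical names, looks like this:

```python
# Hypothetical sketch of the 2D figure's arrows (not the actual widget code).
import numpy as np

rng = np.random.default_rng(1)
points = rng.multivariate_normal(mean=[0, 0], cov=[[4.0, 1.5], [1.5, 1.0]], size=300)

Xc = points - points.mean(axis=0)
cov = Xc.T @ Xc / len(Xc)
eigvals, eigvecs = np.linalg.eigh(cov)                 # ascending eigenvalues

pc1_arrow = eigvecs[:, 1] * 2 * np.sqrt(eigvals[1])    # teal: ±2 std devs along PC1
pc2_arrow = eigvecs[:, 0] * 2 * np.sqrt(eigvals[0])    # violet: ±2 std devs along PC2
# Recompute cov, eigvals, and the arrows after every drag to keep the figure live.
```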
SVD on a centered data matrix is PCA. Here the "data" is a 32×32 grayscale image. Keeping only the top k singular values gives the best rank-k approximation in the Frobenius norm — the Eckart–Young theorem.
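A minimal sketch of that truncation, assuming a NumPy array `img` stands in for the 32×32 image:

```python
import numpy as np

img = np.random.default_rng(2).random((32, 32))   # placeholder for the grayscale image

U, s, Vt = np.linalg.svd(img, full_matrices=False)

def rank_k(k):
    """Best rank-k approximation in the Frobenius norm (Eckart–Young)."""
    return (U[:, :k] * s[:k]) @ Vt[:k]

# The Frobenius error of the truncation equals the norm of the dropped singular values.
k = 8
err = np.linalg.norm(img - rank_k(k))
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))   # True
```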
The scree plot shows how the singular values decay. Real-world images and datasets are typically close to low-rank in this sense, with a handful of components capturing most of the variance. PCA exploits exactly that: project onto the top k right singular vectors and you preserve the maximum possible variance of any k-dimensional linear projection.
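Continuing the image sketch above, the scree plot's content is just the decay of `s`; a hypothetical 95% cutoff shows how few components that usually takes:

```python
# Fraction of the squared Frobenius norm captured by the first k singular values.
# For a centered data matrix, this is exactly the explained-variance ratio.
energy = s**2 / np.sum(s**2)
cumulative = np.cumsum(energy)
print(np.argmax(cumulative >= 0.95) + 1)   # smallest k capturing 95% of the total
```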
A 3D point cloud shaped like an elongated ellipsoid. The three colored arrows are the principal axes — eigenvectors of the 3×3 covariance matrix, sorted by eigenvalue. Drag to orbit. Toggle the projection to collapse the cloud onto the PC1-PC2 plane.
Optimal linear dimensionality reduction means: of all orthogonal projections of R³ onto a two-dimensional subspace, the projection onto the PC1-PC2 plane loses the least variance. The third eigenvalue is exactly the variance you throw away when you collapse onto that plane.
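A short sketch of that collapse on synthetic data (the toy ellipsoid and names are illustrative): project onto the top two right singular vectors, and the variance you lose is exactly the third eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(3)
cloud = rng.normal(size=(1000, 3)) * np.array([5.0, 2.0, 0.5])   # elongated ellipsoid
Xc = cloud - cloud.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
eigvals = s**2 / len(Xc)                      # variances along PC1, PC2, PC3

projected = Xc @ Vt[:2].T                     # coordinates in the PC1-PC2 plane
kept = projected.var(axis=0).sum()            # variance preserved by the projection
lost = Xc.var(axis=0).sum() - kept            # variance discarded
print(np.isclose(lost, eigvals[2]))           # True: the collapse discards exactly λ₃
```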