High-dimensional data lives on low-dimensional manifolds. t-SNE, UMAP, and autoencoders unfold them.
Take any photograph of a face. Encoded as raw pixels it is a vector in a space of tens of thousands of dimensions — yet only an unimaginably tiny fraction of that space contains anything that would ever look like a face. The set of plausible faces forms a thin, curved surface — a manifold — sitting inside the ambient pixel space. The same is true of speech waveforms, of molecules, of natural images, of essentially every kind of real data we want a machine learning model to handle.
That is the manifold hypothesis: real-world high-dimensional data is concentrated near a much lower-dimensional manifold. The intrinsic dimension is far smaller than the ambient dimension — and that is the only reason machine learning works at all. Every dimensionality-reduction algorithm is, in effect, an attempt to recover the manifold. PCA finds the best linear approximation. Isomap, t-SNE, UMAP, and autoencoders recover curved manifolds.
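The gap between intrinsic and ambient dimension shows up directly in the PCA spectrum. A minimal sketch, assuming NumPy, on a toy dataset (a 2D plane randomly embedded in 10D plus small noise; every parameter here is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 500 points with 2 intrinsic coordinates, linearly embedded in 10D.
latent = rng.normal(size=(500, 2))             # intrinsic 2D coordinates
basis = rng.normal(size=(2, 10))               # random linear embedding into 10D
X = latent @ basis + 0.01 * rng.normal(size=(500, 10))  # small ambient noise

# PCA via SVD of the centered data: the singular-value spectrum
# reveals how many directions actually carry variance.
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
var = s**2 / (s**2).sum()
print(var.round(3))  # the first two components carry essentially all the variance
```

For data on a *linear* manifold like this plane, PCA alone recovers the intrinsic dimension; the rest of the article is about what happens when the manifold is curved.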
See also: Topological Data Analysis for the persistent homology of point clouds, and Differential Geometry for what a manifold actually is — tangent spaces, geodesics, and curvature.
A 600-point cloud sampled near a 2D manifold rolled up in 3D. Drag to rotate; the colors mark the intrinsic spiral coordinate.
The intrinsic dimension of this cloud is two, even though it sits in three ambient dimensions. Switch to Unfold and animate: the geodesic embedding flattens the roll into its true 2D parameter plane. Switch to PCA and watch the linear method fail — it can only project the roll onto a flat plane, not uncurl it.
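The Unfold-versus-PCA contrast can be reproduced in a few lines. A sketch assuming scikit-learn's `make_swiss_roll` and `Isomap` (a geodesic embedding in the same spirit as the widget's Unfold view); the neighbor count and noise level are illustrative, not tuned:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# 600 points near a 2D manifold rolled up in 3D; t is the spiral coordinate.
X, t = make_swiss_roll(n_samples=600, noise=0.05, random_state=0)

# Isomap approximates geodesic distances along the k-NN graph, then embeds them.
iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
# PCA can only pick a flat 2D view of the 3D cloud.
pca = PCA(n_components=2).fit_transform(X)

# How well does each embedding recover the intrinsic coordinate t?
# (Typically near 1 for Isomap, much weaker for PCA.)
def best_corr(emb, t):
    return max(abs(np.corrcoef(emb[:, i], t)[0, 1]) for i in range(2))

print(f"Isomap corr with t: {best_corr(iso, t):.2f}")
print(f"PCA corr with t:    {best_corr(pca, t):.2f}")
```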
5 Gaussian blobs in 10 dimensions, 160 points total. Same data, three different 2D embeddings. PCA is a one-shot linear projection; t-SNE and UMAP iterate a neighborhood-preserving objective.
linear · variance-maximizing
nonlinear · neighborhood-preserving
nonlinear · local + global structure
PCA is a linear map: it can translate, rotate, and project, but never bend. When the cluster centers in 10D do not lie near any single 2D plane, the projection must pile some of them on top of each other. Neighborhood methods like t-SNE and UMAP instead pull the k-nearest-neighbor graph apart in 2D: same-cluster points attract, far-apart points repel. The clusters separate even when no linear projection could have done it.
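A rough version of this comparison, assuming scikit-learn (UMAP lives in the separate `umap-learn` package, so t-SNE stands in for the neighborhood methods here); the blob layout and perplexity are illustrative choices:

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

# 5 Gaussian blobs in 10 dimensions, 160 points total, as in the widget.
X, y = make_blobs(n_samples=160, n_features=10, centers=5,
                  cluster_std=1.0, random_state=0)

# One-shot linear projection vs. an iterated neighborhood-preserving objective.
pca2 = PCA(n_components=2).fit_transform(X)
tsne2 = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Silhouette score in 2D: how cleanly the known clusters separate in each
# embedding. t-SNE usually scores high; the PCA projection may merge clusters.
print(f"PCA silhouette:   {silhouette_score(pca2, y):.2f}")
print(f"t-SNE silhouette: {silhouette_score(tsne2, y):.2f}")
```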
A noisy 1D curve drawn in 3D. The teal line is the manifold a 1D-bottleneck autoencoder learned. Slide the latent z to walk along it; hover any noisy training point to see where the encoder would project it.
The bottleneck forces the network to discover a single coordinate that explains the data. Decoding that coordinate traces out the learned manifold. The training cloud is noisy — the autoencoder strips the noise off and recovers the curve underneath.
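A minimal sketch of the same idea, approximating a 1D-bottleneck autoencoder with scikit-learn's `MLPRegressor` trained to reconstruct its own input (the width-1 middle layer is the bottleneck); the cubic curve, noise level, and layer sizes are all hypothetical choices:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=400)
curve = np.stack([t, t**2, t**3], axis=1)        # clean 1D curve in 3D
X = curve + 0.05 * rng.normal(size=curve.shape)  # noisy training cloud

# The (32, 1, 32) hidden layers put a single unit in the middle: the network
# must squeeze every 3D point through one coordinate and decode it back.
ae = MLPRegressor(hidden_layer_sizes=(32, 1, 32), activation="tanh",
                  max_iter=5000, random_state=0)
ae.fit(X, X)  # target = input: learn to reconstruct through the bottleneck

# Compare reconstructions against the clean curve the noise was added to.
recon = ae.predict(X)
print(f"reconstruction MSE vs clean curve: {np.mean((recon - curve)**2):.4f}")
```

Decoding a sweep of bottleneck values (as the widget's latent-z slider does) would trace out the learned 1D manifold.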