Master the visual representations of persistence: barcodes, persistence diagrams, and how to interpret them.
The output of persistent homology is a collection of birth-death pairs. Two standard visualizations make this data interpretable: barcodes and persistence diagrams. Both encode the same information but highlight different aspects.
These visualizations are the "fingerprint" of a dataset's topology — they capture the multi-scale structure in a way that can be compared, analyzed, and even used as features for machine learning.
In a barcode, each horizontal bar represents one feature. The left endpoint is the birth time; the right endpoint is the death time. Long bars indicate significant features; short bars are typically noise. Bars extending to infinity represent essential features, which never die within the filtration.
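To make the barcode layout concrete, here is a minimal text-only sketch that draws one bar per birth-death pair. The function name `render_barcode`, the `width` parameter, and the sample intervals are all hypothetical, and the sketch assumes birth times are non-negative.

```python
import math

def render_barcode(pairs, width=20):
    """Draw one text bar per (birth, death) pair, longest bars first."""
    # Scale the axis to the largest finite endpoint in the data.
    finite_ends = [x for pair in pairs for x in pair if math.isfinite(x)]
    scale = max(finite_ends, default=0.0) or 1.0
    # Sort by persistence so the most significant features come first.
    ordered = sorted(pairs, key=lambda p: p[1] - p[0], reverse=True)
    lines = []
    for birth, death in ordered:
        start = round(birth / scale * width)
        if math.isinf(death):
            # Essential feature: the bar runs off the right edge.
            bar = "=" * (width - start) + ">"
        else:
            end = round(death / scale * width)
            bar = "=" * max(end - start, 1)
        lines.append(" " * start + bar)
    return lines

# Hypothetical intervals, invented for illustration only.
pairs = [(0.0, math.inf), (0.0, 0.1), (0.2, 1.0), (0.5, 0.55)]
for line in render_barcode(pairs):
    print(line)
```

The arrow marks a bar that extends to infinity. Sorting by length is a presentation choice; many tools order bars by birth time instead.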
In a persistence diagram, each point (b, d) represents a feature born at time b and dying at time d. Because d ≥ b, every point lies on or above the diagonal d = b, and its persistence d − b is its vertical distance above that line. Points near the diagonal have low persistence (noise); points far from the diagonal are significant.
In the persistence diagram, the diagonal line d = b represents features with zero persistence — they're born and immediately die. While no actual features lie exactly on the diagonal, points near the diagonal have very short lifespans and are typically considered noise.
Thresholding on persistence is a common way to filter noise: keep only the features whose persistence d − b exceeds a chosen threshold. Features above the threshold are treated as "topological signal"; those below are attributed to noise from sampling or measurement error.
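The thresholding step can be sketched in a few lines. The function name `filter_by_persistence` and the sample pairs below are hypothetical, and the threshold value is arbitrary; in practice it has to be chosen for the dataset at hand.

```python
def filter_by_persistence(pairs, threshold):
    """Keep (birth, death) pairs whose persistence exceeds the threshold."""
    return [(b, d) for b, d in pairs if d - b > threshold]

# Hypothetical pairs, invented for illustration only.
pairs = [(0.0, 0.1), (0.2, 1.0), (0.5, 0.55), (0.0, float("inf"))]
signal = filter_by_persistence(pairs, threshold=0.3)
print(signal)  # [(0.2, 1.0), (0.0, inf)]
```

Essential features have infinite persistence, so they survive any finite threshold.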
One of the most important properties of persistent homology is stability: small changes in the input lead to small changes in the output. Formally, if two functions f and g differ by at most ε at every point (that is, in the sup norm), their persistence diagrams differ by at most ε in the bottleneck distance.
This means persistent homology is robust to noise and sampling variations — critical for real-world data analysis applications.
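Written as a formula, the stability statement reads as follows. The notation is an assumption of this sketch: Dgm(f) denotes the persistence diagram of f, d_B is the bottleneck distance, and the result is usually stated for suitably well-behaved ("tame") functions on the same space.

```latex
d_B\bigl(\mathrm{Dgm}(f),\, \mathrm{Dgm}(g)\bigr) \;\le\; \lVert f - g \rVert_\infty
```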
To compare two persistence diagrams, we need a notion of distance. The two standard choices are:
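Bottleneck distance: Consider every matching between the points of the two diagrams, where any point may also be matched to the diagonal. Each matching is scored by the largest distance between a matched pair, and the bottleneck distance is the smallest score over all matchings.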
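For very small diagrams the definition can be evaluated directly by trying every matching. The sketch below is a brute-force illustration rather than a practical algorithm. It assumes finite points only and measures distances in the L∞ norm, a common convention. The names `cost` and `bottleneck_bruteforce` are hypothetical.

```python
from itertools import permutations

DIAG = None  # placeholder meaning "matched to the diagonal"

def cost(p, q):
    """L-infinity cost of matching p to q, where DIAG is the diagonal."""
    if p is DIAG and q is DIAG:
        return 0.0
    if p is DIAG:
        return (q[1] - q[0]) / 2  # distance from q to its nearest diagonal point
    if q is DIAG:
        return (p[1] - p[0]) / 2
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def bottleneck_bruteforce(dgm_a, dgm_b):
    """Bottleneck distance by exhaustive search (tiny diagrams only)."""
    # Pad each diagram with one diagonal slot per point of the other,
    # so every point has the option of matching to the diagonal.
    left = list(dgm_a) + [DIAG] * len(dgm_b)
    right = list(dgm_b) + [DIAG] * len(dgm_a)
    best = float("inf")
    for perm in permutations(range(len(right))):
        # Score a matching by its single worst pair.
        worst = max((cost(left[i], right[j]) for i, j in enumerate(perm)),
                    default=0.0)
        best = min(best, worst)
    return best

# Hypothetical diagrams: one shared long-lived feature plus one noisy point.
dgm_a = [(0.0, 1.0), (0.4, 0.5)]
dgm_b = [(0.0, 1.1)]
print(bottleneck_bruteforce(dgm_a, dgm_b))  # approximately 0.1
```

Here the two long-lived points are matched to each other at cost 0.1 and the short-lived point goes to the diagonal at cost 0.05, so the worst pair determines the result. The search is factorial in the number of points, so real data calls for a dedicated library.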
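Wasserstein distance: Like the bottleneck distance, but instead of taking the largest matched distance it sums the p-th powers of the distances over all matched pairs and takes the p-th root of the smallest such sum. This makes it more sensitive to the overall distribution of features, including the many small ones.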
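The same exhaustive search illustrates the p-Wasserstein distance by replacing the maximum with a sum of p-th powers. As before, this is a sketch for tiny diagrams with finite points. The L∞ ground distance is an assumption (some references use the Euclidean distance instead), and the names `cost` and `wasserstein_bruteforce` are hypothetical.

```python
from itertools import permutations

DIAG = None  # placeholder meaning "matched to the diagonal"

def cost(p, q):
    """L-infinity cost of matching p to q, where DIAG is the diagonal."""
    if p is DIAG and q is DIAG:
        return 0.0
    if p is DIAG:
        return (q[1] - q[0]) / 2
    if q is DIAG:
        return (p[1] - p[0]) / 2
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def wasserstein_bruteforce(dgm_a, dgm_b, p=2):
    """p-Wasserstein distance by exhaustive search (tiny diagrams only)."""
    # Pad with diagonal slots so unmatched points can go to the diagonal.
    left = list(dgm_a) + [DIAG] * len(dgm_b)
    right = list(dgm_b) + [DIAG] * len(dgm_a)
    best = float("inf")
    for perm in permutations(range(len(right))):
        # Score a matching by the sum of p-th powers of all pair costs.
        total = sum(cost(left[i], right[j]) ** p for i, j in enumerate(perm))
        best = min(best, total)
    return best ** (1 / p)

# Hypothetical diagrams, invented for illustration only.
dgm_a = [(0.0, 1.0), (0.4, 0.5)]
dgm_b = [(0.0, 1.1)]
print(wasserstein_bruteforce(dgm_a, dgm_b, p=2))  # approximately 0.1118
```

Unlike the bottleneck distance, the short-lived point matched to the diagonal contributes to the total here, which is why this distance reacts to the overall distribution of features and not just the single worst match.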