Barcodes & Persistence Diagrams

Learn the two standard visual representations of persistence, barcodes and persistence diagrams, and how to interpret them.

Visualizing Persistence

The output of persistent homology is a collection of birth-death pairs. Two standard visualizations make this data interpretable: barcodes and persistence diagrams. Both encode the same information but highlight different aspects.

These visualizations are the "fingerprint" of a dataset's topology — they capture the multi-scale structure in a way that can be compared, analyzed, and even used as features for machine learning.

Persistence Barcode

Each horizontal bar represents a feature. The left endpoint is the birth time; the right endpoint is the death time. Long bars indicate significant features; short bars are typically noise. Bars extending to infinity represent essential features, which never die in the filtration.
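To make the bar picture concrete, here is a minimal sketch that draws a text barcode from a list of birth-death pairs. The function name render_barcode, the width parameter, and the example pairs are all hypothetical and chosen only for illustration; a real workflow would normally use a plotting library instead.

```python
import math

def render_barcode(pairs, width=40, max_scale=None):
    """Return one text line per (birth, death) pair, sorted by birth time."""
    finite_deaths = [d for _, d in pairs if math.isfinite(d)]
    births = [b for b, _ in pairs]
    if max_scale is None:
        # Largest finite filtration value in the data (fallback 1.0 if empty/zero).
        max_scale = max(finite_deaths + births, default=1.0) or 1.0
    lines = []
    for birth, death in sorted(pairs):
        start = round(birth / max_scale * width)
        if math.isinf(death):
            end, tip = width, ">"   # essential feature: bar runs off the right edge
        else:
            end, tip = round(death / max_scale * width), ""
        bar = " " * start + "=" * max(end - start, 1) + tip
        lines.append(f"[{birth:.2f}, {death:.2f})  {bar}")
    return lines

# Hypothetical birth-death pairs, not taken from any real dataset.
pairs = [(0.0, math.inf), (0.0, 0.3), (0.0, 0.35), (0.5, 1.8), (0.9, 1.0)]
lines = render_barcode(pairs)
for line in lines:
    print(line)
```

In the printed output, the long bar for (0.5, 1.8) and the open-ended bar marked with ">" stand out immediately, while the short bars are the candidates for noise.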

Persistence Diagram

Each point (b, d) represents a feature born at time b and dying at time d. Because a feature cannot die before it is born (d ≥ b), every point lies on or above the diagonal. Points near the diagonal have low persistence (noise); points far from the diagonal are significant. The diagonal line is birth = death (features that exist for zero time).

Barcodes vs Diagrams

Barcodes

Easy to see at which scale features exist
Natural for time-series/filtration view
Length directly shows persistence
Can get cluttered with many features

Diagrams

Compact representation for many features
Easy to see persistence (distance from diagonal)
Natural for stability theorems
Can define distances between diagrams

The Diagonal and Noise

In the persistence diagram, the diagonal line d = b represents features with zero persistence — they're born and immediately die. While no actual features lie exactly on the diagonal, points near the diagonal have very short lifespans and are typically considered noise.

persistence = d - b = vertical distance from diagonal

The persistence threshold is a common way to filter noise: only keep features with persistence above some threshold. Features above the threshold are considered "topological signal"; those below are noise from sampling or measurement error.
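The thresholding rule can be sketched in a few lines. The helper names persistence and split_by_threshold, the example pairs, and the threshold value 0.5 are assumptions made for illustration; choosing a good threshold depends on the data.

```python
import math

def persistence(pair):
    """Lifetime of a feature: death minus birth (infinite for essential features)."""
    birth, death = pair
    return death - birth

def split_by_threshold(pairs, threshold):
    """Separate pairs into (signal, noise) by comparing persistence to threshold."""
    signal = [p for p in pairs if persistence(p) > threshold]
    noise = [p for p in pairs if persistence(p) <= threshold]
    return signal, noise

# Hypothetical birth-death pairs, not taken from any real dataset.
pairs = [(0.0, math.inf), (0.0, 0.3), (0.0, 0.35), (0.5, 1.8), (0.9, 1.0)]
signal, noise = split_by_threshold(pairs, threshold=0.5)
print("signal:", signal)
print("noise: ", noise)
```

Essential features have infinite persistence, so they pass any finite threshold.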

Stability Theorem

One of the most important properties of persistent homology is stability: small changes in the input lead to small changes in the output. Formally, if two functions f and g on the same space differ by at most ε at every point (that is, ||f - g||_∞ ≤ ε), then under mild tameness assumptions their persistence diagrams are within ε of each other in the bottleneck distance.

d_B(Dgm(f), Dgm(g)) ≤ ||f - g||_∞

Bottleneck distance ≤ L∞ distance between functions

This means persistent homology is robust to noise and sampling variations — critical for real-world data analysis applications.
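A small numerical experiment can illustrate stability. The sketch below is an illustrative setup, not a general implementation: it computes 0-dimensional sublevel-set persistence of a function sampled on a line (using union-find and the elder rule), perturbs the samples by at most ε, and checks one consequence of the theorem. If the bottleneck distance is at most ε, each matched point moves by at most ε in birth and in death, so the largest finite persistence can change by at most 2ε. The sample values and the perturbation are hypothetical.

```python
import math

def sublevel_persistence_0d(values):
    """0-dimensional sublevel-set persistence of a function sampled on a line."""
    n = len(values)
    parent = [None] * n   # None means the vertex has not entered the filtration
    birth = {}            # component root -> value at which the component was born

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the current value.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < values[i]:   # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((min(values), math.inf))      # the oldest component never dies
    return pairs

def max_finite_persistence(pairs):
    return max((d - b for b, d in pairs if math.isfinite(d)), default=0.0)

# Hypothetical samples of f and a small perturbation of them.
f = [0.0, 2.0, 0.5, 3.0, 1.0, 2.5, 0.2]
noise = [0.1, -0.1, 0.05, -0.1, 0.1, 0.0, -0.05]
g = [fi + ni for fi, ni in zip(f, noise)]
eps = max(abs(fi - gi) for fi, gi in zip(f, g))   # ||f - g||_inf on the samples

dgm_f = sublevel_persistence_0d(f)
dgm_g = sublevel_persistence_0d(g)
gap = abs(max_finite_persistence(dgm_f) - max_finite_persistence(dgm_g))
print("Dgm(f):", sorted(dgm_f))
print("Dgm(g):", sorted(dgm_g))
print("eps:", eps, "change in max persistence:", gap)
```

This only checks a consequence of the bound on one example; verifying the inequality itself requires computing the bottleneck distance between the two diagrams.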

Comparing Diagrams

To compare two persistence diagrams, we need a notion of distance. The two standard choices are:

Bottleneck Distance (d_B)

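Consider all matchings between the points of the two diagrams, where any point may also be matched to the diagonal. The cost of a matching is the largest distance between any matched pair (usually measured in the L∞ norm). The bottleneck distance is the smallest such cost over all matchings.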

Wasserstein Distance (W_p)

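Uses the same kind of matching as the bottleneck distance, but instead of the single largest distance it sums the p-th powers of the distances over all matched pairs (including points matched to the diagonal), minimizes that sum over all matchings, and takes the p-th root. This makes it more sensitive to the overall distribution of features.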
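For very small diagrams, both distances can be computed by brute force, which makes the definitions easy to check by hand. The sketch below assumes finite points only and measures distances between points in the L∞ norm (conventions for the Wasserstein ground metric vary between libraries). The function names and example diagrams are hypothetical, and the running time grows factorially, so this is for illustration rather than practical use.

```python
from itertools import permutations
import math

DIAGONAL = None   # placeholder slot meaning "matched to the diagonal"

def cost(a, b):
    """L-infinity cost of matching a to b, where either may be the diagonal."""
    if a is DIAGONAL and b is DIAGONAL:
        return 0.0
    if a is DIAGONAL:
        return (b[1] - b[0]) / 2   # distance from b to its nearest diagonal point
    if b is DIAGONAL:
        return (a[1] - a[0]) / 2
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def matching_distance(dgm_a, dgm_b, p=math.inf):
    """Bottleneck distance for p = inf, otherwise the p-Wasserstein distance."""
    # Pad each diagram with enough diagonal slots so every point can be unmatched.
    left = list(dgm_a) + [DIAGONAL] * len(dgm_b)
    right = list(dgm_b) + [DIAGONAL] * len(dgm_a)
    best = math.inf
    for perm in permutations(range(len(right))):
        costs = [cost(left[i], right[j]) for i, j in enumerate(perm)]
        if p == math.inf:
            total = max(costs, default=0.0)
        else:
            total = sum(c ** p for c in costs) ** (1 / p)
        best = min(best, total)
    return best

# Hypothetical diagrams: one shared long-lived feature, one extra short-lived one.
dgm_a = [(0.0, 2.0), (1.0, 1.25)]
dgm_b = [(0.25, 2.0)]
print("bottleneck:", matching_distance(dgm_a, dgm_b))
print("W_1:       ", matching_distance(dgm_a, dgm_b, p=1))
print("W_2:       ", matching_distance(dgm_a, dgm_b, p=2))
```

In this example the best matching pairs the two long-lived points (cost 0.25) and sends the short-lived point to the diagonal (cost 0.125). The bottleneck distance reports only the larger of the two costs, while W_1 adds them.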

Key Takeaways

Barcodes — horizontal bars showing [birth, death] intervals
Persistence diagrams — scatter plot of (birth, death) points
Vertical distance from diagonal = persistence, the usual measure of significance
Stability theorem — small input changes → small diagram changes
Bottleneck/Wasserstein — distances between diagrams