Combinatorics of Trees

Explore Catalan numbers, generating functions, and the surprising connections between binary trees, Dyck paths, and parenthesizations

Counting Trees and Structures

How many binary trees have n nodes? How many ways can we parenthesize an expression? These questions lead to the beautiful Catalan numbers, one of the most ubiquitous sequences in combinatorics.

Even more remarkably, generating functions let us solve counting problems algebraically. Instead of counting directly, we encode counts as coefficients of a power series, then manipulate the series to extract formulas automatically.

In this section, you'll discover the bijections between trees, paths, and parentheses, see generating functions in action, and explore how memoization transforms exponential recursion into efficient algorithms.

Demo 1: Catalan Structures and Bijections

The nth Catalan number Cn counts binary trees with n internal nodes, Dyck paths of length 2n, and balanced parenthesizations with n pairs. These are all in bijection — select any structure and see its counterparts!

n = 3
C3 =5
5 distinct structures in each representation

Binary Tree

Dyck Path

///\\\

Balanced Parentheses

(
(
(
)
)
)
((()))

The Bijection

These three structures are in bijection — there's a one-to-one correspondence between them. Each binary tree corresponds to exactly one Dyck path and one parenthesization. The Catalan number Cn counts all of them!

  • Tree traversal → Dyck path (up on left edge, down on right)
  • Dyck path → Parentheses ('/' = '(', '\' = ')')
  • Parentheses → Tree (nested structure)

Demo 2: Generating Functions

The generating function C(x) = Σ Cnxn satisfies the equation C(x) = 1 + x·C(x)². Solving this quadratic algebraically derives the closed form for Catalan numbers — no counting required!

The Generating Function Equation

C(x) = 1 + x · C(x)2

A binary tree is either empty (1) or a root with two subtrees (x · C(x)²)

Step 1: Rearrange as Quadratic

C(x) = 1 + x·C(x)²
x·C(x)² - C(x) + 1 = 0

This is a quadratic in C(x)!

Step 2: Apply Quadratic Formula

C(x) = (1 ± √(1 - 4x)) / (2x)

We need C(0) = 1, so we take the minus sign (the plus would give ∞ at x=0)

Step 3: Closed Form

C(x) = (1 - √(1 - 4x)) / (2x)

Step 4: Taylor Expand to Get Coefficients

Cn = coefficient of xn in C(x)
Cn = C(2n, n) / (n + 1)

The binomial series of √(1-4x) gives us the Catalan numbers!

Coefficients (Catalan Numbers)

n012345678
Cn112514421324291430

C(x) as Power Series

C(x) = 1 + 1x + 2x^2 + 5x^3 + 14x^4 + 42x^5 + 132x^6 + 429x^7 + 1430x^8 + ...

Verify: C(x)² Convolution

The coefficient of xn in C(x)² is Σi=0n Ci·Cn-i

x0 in C²: 1·1 = 1(→ C1 = 1 ✓)
x1 in C²: 1·1 + 1·1 = 2(→ C2 = 2 ✓)
x2 in C²: 1·2 + 1·1 + 2·1 = 5(→ C3 = 5 ✓)
x3 in C²: 1·5 + 1·2 + 2·1 + 5·1 = 14(→ C4 = 14 ✓)
x4 in C²: 1·14 + 1·5 + 2·2 + 5·1 + 14·1 = 42(→ C5 = 42 ✓)

The Power of Generating Functions

Instead of counting directly, we encode counts as coefficients of a formal power series. Algebraic manipulation (solving the quadratic) automatically derives the formula! This technique handles complex recurrences that would be hard to solve directly.

Demo 3: Memoization as DAG

Many counting recurrences (Fibonacci, Catalan, binomial coefficients) have overlapping subproblems. Memoization transforms exponential recursive trees into polynomial DAGs. Watch the computation unfold and count the actual work done.

F(n) = F(n-1) + F(n-2), F(0)=0, F(1)=1
Step 0 / 7
fib(1)fib(0)fib(2)fib(3)fib(4)fib(5)fib(6)
Unique Subproblems
7
Dependencies
10
Final Value
?

Memoization = DAG, Not Tree

Without memoization, recursive calls form a tree with exponential nodes. With memoization, we only compute each unique subproblem once. The call structure becomes a DAG (Directed Acyclic Graph), reducing exponential time to polynomial time.

Demo 4: Set Partitions and Bell Numbers

The Bell number Bn counts ways to partition a set of n elements into non-empty subsets. The Stirling numbers S(n,k) refine this count by number of blocks. These appear in clustering and database optimization.

Set: {1, 2, 3, 4}
B4 =15
= total number of ways to partition 4 elements

Stirling Numbers of the Second Kind S(n, k)

S(n, k) = number of ways to partition n elements into exactly k non-empty blocks

n\k01234
010000
101000
201100
301310
401761
Row sums = Bell numbers: 1, 1, 2, 5, 15

All Partitions of {1, ..., 4}

k = 1 blocks(S(4, 1) = 1)
{1,2,3,4}
k = 2 blocks(S(4, 2) = 7)
{1,2,3} {4}
{1,2,4} {3}
{1,2} {3,4}
{1,3,4} {2}
{1,3} {2,4}
{1,4} {2,3}
{1} {2,3,4}
k = 3 blocks(S(4, 3) = 6)
{1,2} {3} {4}
{1,3} {2} {4}
{1} {2,3} {4}
{1,4} {2} {3}
{1} {2,4} {3}
{1} {2} {3,4}
k = 4 blocks(S(4, 4) = 1)
{1} {2} {3} {4}

Bell Triangle (First 5 rows)

1
12
235
571015
1520273752

First element of each row = Bell number Bn

Combinatorial Explosion

Bell numbers grow super-exponentially: B1=1, B5=52, B10=115,975. This rapid growth appears in clustering problems, database query optimization, and anywhere we need to consider all possible groupings.

Combinatorics Toolkit Acquired!

You now have powerful tools for counting structures:

  • Catalan numbers count trees, paths, parenthesizations, and more
  • Generating functions solve recurrences algebraically
  • Memoization exploits overlapping subproblems for efficiency
  • Bell and Stirling numbers count set partitions

Next: We'll analyze divide-and-conquer algorithms with the Master Theorem, understanding exactly when T(n) = aT(n/b) + f(n) gives O(n log n) versus O(n²).