Advanced Tree Structures

Study B-trees for disk optimization, persistent trees for version control, and van Emde Boas trees for integer keys

Beyond Binary: Specialized Tree Structures

Binary search trees are fundamental, but specialized applications demand specialized structures. B-trees minimize disk accesses, persistent trees preserve history efficiently, finger trees provide fast access to both ends, and van Emde Boas trees achieve O(log log U) for integers.

Each structure optimizes for a specific access pattern: disk I/O, immutability, double-ended queues, or integer predecessor queries. Understanding when to use each is the mark of an advanced algorithm designer.

Demo 1: B-Trees & Disk Access

B-trees store many keys per node, minimizing the height and thus disk accesses. A B-tree of order m has at most m children per node. With m=100 and millions of keys, we need only 2-3 disk reads for any search. This powers databases like PostgreSQL and file systems like NTFS.

10 | 203 | 5 | 712 | 15 | 1825 | 30
Height
2
Keys
10
Order
4
Insert Disk I/O
23

Search Demo

Disk Access Comparison

With 10 keys and order 4:

Binary Search Tree
4 disk reads
Height: log₂(10) ≈ 4
B-Tree (order 4)
2 disk reads
Height: log4(10) ≈ 2
50% fewer disk accesses with B-tree!
In-order traversal:
[3, 5, 7, 10, 12, 15, 18, 20, 25, 30]

Demo 2: Persistent Trees

Persistent (immutable) data structures preserve all previous versions. Using path copying, we create new nodes only along the path to the modified node, sharing unchanged subtrees. This gives O(log n) space per update instead of O(n) for full copies.

Version 7: Insert 8
5324768
Values: [2, 3, 4, 5, 6, 7, 8]

Version History

Space Efficiency

Actual nodes:17
With full copies:28
Space savings:39%

Path copying shares unchanged subtrees between versions.

Compare Versions

vs
Shared nodes:0
Unique to v0:0
Unique to v7:7

How Path Copying Works

When modifying a node, we create a new path from the root to that node. All nodes not on this path are shared with the previous version. This gives O(log n) space per update instead of O(n) for a full copy.

Demo 3: Finger Trees

Finger trees provide O(1) amortized access to both ends, O(log n) concatenation and splitting. They achieve this by maintaining "fingers" (direct pointers) to the ends, with a recursive middle structure. Perfect for implementing sequences and priority queues.

Tree Structure (Deep):
prefix:
1
|
middle (Empty)
|
suffix:
2
3
4
5
Size
5
Left (Front)
1
Right (Back)
5
Type
Deep
Sequence (left to right):
1
2
3
4
5
Recent Operations:
Built initial tree [1,2,3,4,5]

Finger Tree Complexities

Operation
Time
Description
pushLeft
O(1)
Add to front (amortized)
pushRight
O(1)
Add to back (amortized)
popLeft
O(1)
Remove from front (amortized)
popRight
O(1)
Remove from back (amortized)
peekLeft
O(1)
View front element
peekRight
O(1)
View back element
concat
O(log n)
Concatenate two trees
split
O(log n)
Split at position
index
O(log n)
Access by index (with annotation)

Why O(1) Amortized?

Finger trees maintain fingers (direct pointers) to both ends. The prefix and suffix hold 1-4 elements each. When full, we push a group to the recursive middle structure. The amortized cost spreads out the rare expensive operations across many cheap ones.

Demo 4: Van Emde Boas Trees

vEB trees achieve O(log log U) for insert, delete, search, successor, and predecessor on integers in [0, U-1]. They work by recursively splitting the universe into √U clusters of size √U. For U = 232, this means at most 5 recursions!

Elements in vEB tree (universe [0, 15]):
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Values: {2, 3, 5, 7, 11, 13}

Why O(log log U)?

vEB trees recursively split the universe into √U clusters of size √U. Each recursion halves the exponent, giving depth log log U.

For U = 16:
U=16
U=4
U=2
log U = 4, log log U = 2

Data Structure Comparison (U = 16)

StructureInsertSearchSuccessor
Sorted ArrayO(n)O(log n)O(log n)
BST / AVL / RBO(log n)O(log n)O(log n)
Hash TableO(1)O(1)O(n)
Bit VectorO(1)O(1)O(U) = O(16)
van Emde BoasO(log log U) = O(2)O(log log U) = O(2)O(log log U) = O(2)

When to Use vEB Trees

vEB trees excel when you need fast successor/predecessor queries on integers from a bounded universe. They're used in network routers (IP lookup), scheduling algorithms, and computational geometry. The trade-off is O(U) space.

Advanced Trees Mastered!

You now understand specialized tree structures for different use cases:

  • B-trees: Wide nodes minimize disk accesses — O(logm n)
  • Persistent trees: Path copying preserves history in O(log n) space
  • Finger trees: O(1) ends, O(log n) split/concat
  • Van Emde Boas: O(log log U) by recursive universe splitting

Next: Head to the Algorithms Playground to experiment freely with all structures, then test your knowledge with a comprehensive quiz!