Red-Black Trees
Robert Sedgewick
Princeton University
Textbooks on algorithms
Worth revisiting?
Outline: Introduction, 2-3-4 Trees, LLRB Trees, Deletion, Analysis
Red-black trees
are now found throughout our computational infrastructure
Typical:
Digression:
Red-black trees are found in popular culture??
Mystery: black door?
Mystery: red door?
An explanation ?
Primary goals
Red-black trees (Guibas-Sedgewick, 1978)
• reduce code complexity
• minimize or eliminate space overhead
• unify balanced tree algorithms
• single top-down pass (for concurrent algorithms)
• find version amenable to average-case analysis
Current implementations
• maintenance
• migration
• space not so important (??)
• guaranteed performance
• support full suite of operations
Worth revisiting ?
Perfect balance. Every path from root to leaf has same length.
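[Figure: a 2-3-4 tree with root K R; second level C E, M O, W; leaves A, D, F G J, L, N, Q, S V, Y Z. The middle link of the root leads to the keys between K and R.]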
Search in a 2-3-4 Tree
Compare node keys against search key to guide search.
Search.
• Compare search key against keys in node.
• Find interval containing search key.
• Follow associated link (recursively).
[Figure: searching for L in the tree with root K R. L is between K and R, then smaller than M, and is found in the leaf L.]
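The three search steps can be sketched in Java as follows. This is a minimal illustration, not code from the slides: the `Node234` type, its list-based layout, and the `example()` builder are assumptions, and the example tree is reconstructed from the node keys listed above (root K R; second level C E, M O, W).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: 1 to 3 sorted keys, and either no children
// (a leaf) or exactly keys.size() + 1 children.
class Node234 {
    final List<Character> keys = new ArrayList<>();
    final List<Node234> children = new ArrayList<>();

    Node234(String ks, Node234... kids) {
        for (char c : ks.toCharArray()) keys.add(c);
        for (Node234 k : kids) children.add(k);
    }
}

class Search234 {
    // Returns true if key occurs in the subtree rooted at x.
    static boolean contains(Node234 x, char key) {
        if (x == null) return false;
        int i = 0;
        // Find the interval containing the search key.
        while (i < x.keys.size() && key > x.keys.get(i)) i++;
        if (i < x.keys.size() && key == x.keys.get(i)) return true;
        if (x.children.isEmpty()) return false;   // reached the bottom
        return contains(x.children.get(i), key);  // follow the associated link
    }

    // The example tree, as reconstructed from the keys on the slide.
    static Node234 example() {
        Node234 left = new Node234("CE",
            new Node234("A"), new Node234("D"), new Node234("FGJ"));
        Node234 mid = new Node234("MO",
            new Node234("L"), new Node234("N"), new Node234("Q"));
        Node234 right = new Node234("W",
            new Node234("SV"), new Node234("YZ"));
        return new Node234("KR", left, mid, right);
    }
}
```

With this sketch, `Search234.contains(Search234.example(), 'L')` follows the middle link of the root and then the leftmost link of M O, matching the path described above.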
Insertion in a 2-3-4 Tree
Add new keys at the bottom of the tree.
Insert.
• Search to bottom for key.
• 2-node at bottom: convert to a 3-node.
Ex: Insert B
[Figure: B is smaller than K and smaller than C, so the search ends at the 2-node A, where B is not found. B fits there: A becomes the 3-node A B.]
Insertion in a 2-3-4 Tree
Add new keys at the bottom of the tree.
Insert.
• Search to bottom for key.
• 3-node at bottom: convert to a 4-node.
Ex: Insert X
[Figure: X is larger than R and larger than W, so the search ends at the 3-node Y Z, where X is not found. X fits there: Y Z becomes the 4-node X Y Z.]
Insertion in a 2-3-4 Tree
Add new keys at the bottom of the tree.
Insert.
• Search to bottom for key.
• 2-node at bottom: convert to a 3-node.
• 3-node at bottom: convert to a 4-node.
• 4-node at bottom: no room for new key.
Ex: Insert H
[Figure: H is smaller than K and larger than E, so the search ends at the 4-node F G J, where H is not found and there is no room for H.]
Splitting 4-nodes in a 2-3-4 tree
is an effective way to make room for insertions
[Figure: parent C E with children A B, D, and the 4-node F G J. Move the middle key G to the parent, giving C E G, and split the remainder into the two 2-nodes F and J.]
Split 4-nodes on the way down the tree, using local transformations
that work anywhere in the tree
Consequences:
• 4-node below a 4-node case never happens
• Bottom node reached is always a 2-node or a 3-node
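Splitting a 4-node below a 2-node
is a local transformation that works anywhere in the tree
[Figure: the 2-node D, with subtree A-C and the 4-node child K Q W, becomes the 3-node D Q with children K and W.]
[Figure: below a 3-node, D H with subtrees A-C and E-G and the 4-node child K Q W becomes the 4-node D H Q with children K and W.]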
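Growth of a 2-3-4 tree
[Figure: inserting A, S, E, R, C, D, I into an initially empty tree. A, S, and E fill the root 4-node A E S. Inserting R splits that 4-node into root E with children A and S (tree grows up one level) and then inserts R, giving leaves A and R S. Inserting C, D, and I gives leaves A C D and I R S.]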
Growth of a 2-3-4 tree (continued)
happens upwards from the bottom
[Figure: inserting N, B, X. Inserting N splits the 4-node I R S, giving root E R with leaves A C D, I, S, and then inserts N, giving I N. Inserting B splits the 4-node A C D, giving root C E R with leaves A, D, I N, S, and then inserts B, giving A B. Inserting X splits the root 4-node C E R into root E with children C and R (tree grows up one level) and then inserts X, giving leaves A B, D, I N, S X.]
Balance in 2-3-4 trees
Key property: All paths from root to leaf are the same length
Tree height.
• Worst case: lg N [all 2-nodes]
• Best case: log4 N = 1/2 lg N [all 4-nodes]
• Between 10 and 20 for 1 million nodes.
• Between 15 and 30 for 1 billion nodes.
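The quoted ranges follow from the two formulas above. A small arithmetic check, with the class and method names chosen only for this illustration:

```java
// Height bounds for a 2-3-4 tree with n keys, using the slide's formulas:
// worst case lg n (all 2-nodes), best case (1/2) lg n (all 4-nodes).
class Height234 {
    static double lg(double n) {
        return Math.log(n) / Math.log(2);
    }

    static double worstCase(double n) {
        return lg(n);
    }

    static double bestCase(double n) {
        return 0.5 * lg(n);
    }

    public static void main(String[] args) {
        for (double n : new double[] {1e6, 1e9}) {
            System.out.printf("n = %.0f: between %d and %d%n",
                n, Math.round(bestCase(n)), Math.round(worstCase(n)));
        }
    }
}
```

Since lg of one million is just under 20 and lg of one billion is just under 30, rounding gives the 10 to 20 and 15 to 30 ranges.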
Bottom line: Could do it, but stay tuned for an easier way.
Red-black trees (Guibas-Sedgewick, 1978)
1. Represent 2-3-4 tree as a BST.
2. Use "internal" red edges for 3- and 4- nodes.
[Figure: red-edge representations of a 3-node (leaning left or right) and of a 4-node.]
Key Properties
• elementary BST search works
• easy to maintain a correspondence with 2-3-4 trees
(and several other types of balanced trees)
[Figure: a 2-3-4 tree on the keys A, B, C, D, E, F, G, J and a corresponding red-black BST.]
[Figure: representations of a 3-node and a 4-node.]
Key Properties
• elementary BST search works
• easy-to-maintain 1-1 correspondence with 2-3-4 trees
• trees therefore have perfect black-link balance
[Figure: a 2-3-4 tree on the keys A, B, C, D, E, F, G, J and the corresponding red-black BST.]
Left-leaning red-black trees
1. Represent 2-3-4 tree as a BST.
2. Use "internal" red edges for 3- and 4- nodes.
3. Require that 3-nodes be left-leaning.
[Figure: the left-leaning representation of a 3-node and the representation of a 4-node.]
Disallowed
• right-leaning 3-node representation
single-rotation trees
allow all of these
Java data structure for red-black trees
adds one bit for color to elementary BST data structure
return h;
}
Note: effectively travels down the tree and then up the tree.
• simplifies correctness proof
• simplifies code for balanced BST implementations
• could remove recursion to get stack-based single-pass algorithm
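A minimal sketch of the data structure and of the recursive insert that travels down the tree and then back up. The `String` keys, `int` values, and the class name `BstSketch` are assumptions made to keep the example self-contained; the slides use generic `Key` and `Value` types.

```java
// Elementary BST with one extra bit per node for the link color.
class BstSketch {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        int val;
        Node left, right;
        boolean color;   // color of the link from the parent to this node

        Node(String key, int val, boolean color) {
            this.key = key;
            this.val = val;
            this.color = color;
        }
    }

    // Null links count as black.
    static boolean isRed(Node x) {
        return x != null && x.color == RED;
    }

    // Elementary BST search: the color bit is ignored.
    static Integer get(Node x, String key) {
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp == 0) return x.val;
            x = (cmp < 0) ? x.left : x.right;
        }
        return null;
    }

    // Recursive insert: the calls go down the tree, and each return
    // statement re-links the (possibly changed) subtree on the way up.
    static Node insert(Node h, String key, int val) {
        if (h == null) return new Node(key, val, RED);
        int cmp = key.compareTo(h.key);
        if (cmp == 0) h.val = val;
        else if (cmp < 0) h.left = insert(h.left, key, val);
        else h.right = insert(h.right, key, val);
        return h;   // balancing code would go just before this line
    }
}
```

Because every recursive call returns the root of its subtree, a balancing step placed before `return h` can replace that root without any parent pointers.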
Balanced tree code
is based on local transformations known as rotations
[Figure: rotate left and rotate right between two adjacent nodes whose three subtrees hold the keys A-E, G-P, and R-Z.]
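The two rotations can be sketched as below. The `Node` type with a bare `String` key and the class name `Rotations` are assumptions for illustration; the color handling (the new top node takes the old top node's color, and the old top node becomes red) is the usual convention for red-black BST rotations.

```java
class Rotations {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        Node left, right;
        boolean color = BLACK;

        Node(String key) {
            this.key = key;
        }
    }

    // Make h.right the new root of this subtree; h becomes its left child.
    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;    // the middle subtree changes parent
        x.left = h;
        x.color = h.color;   // the link from above keeps its color
        h.color = RED;       // the rotated link is red
        return x;
    }

    // Mirror image: make h.left the new root of this subtree.
    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }
}
```

Both methods return the new subtree root, so a caller writes `h = rotateLeft(h);`. The in-order sequence of keys is unchanged by either rotation.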
Insert a new node at the bottom in a LLRB tree
follows directly from 1-1 correspondence with 2-3-4 trees
1. Add new node as usual, with red link to glue it to node above
2. Rotate if necessary to get correct 3-node or 4-node representation
[Figure: insertion cases that are corrected with rotate left, rotate right, or both.]
Splitting a 4-node
is accomplished with a color flip
Key points:
• preserves perfect black-link balance
• passes a RED link up the tree
• reduces problem to inserting (that link) into parent
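A sketch of the color flip, assuming a hypothetical `Node` type with a boolean color field (the class name `ColorFlip` is also an assumption). Flipping the three colors turns a 4-node (black node with two red children) into two 2-nodes hanging from a red link that is passed up to the parent.

```java
class ColorFlip {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        Node left, right;
        boolean color;

        Node(boolean color) {
            this.color = color;
        }
    }

    // Invert the colors of h and of both its children.
    // Assumes h has two non-null children.
    static Node colorFlip(Node h) {
        h.color = !h.color;
        h.left.color = !h.left.color;
        h.right.color = !h.right.color;
        return h;
    }
}
```

No links move, so the number of black links on every path through `h` is unchanged: one black link at `h` is replaced by one black link to each child.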
Splitting a 4-node in a LLRB tree
follows directly from 1-1 correspondence with 2-3-4 trees
[Figure: the cases for splitting a 4-node. Each begins with a color flip; some are followed by a rotate left or a rotate right.]
Inserting and splitting nodes in LLRB trees
are easier when rotates are done on the way up the tree.
Search as usual
2. Split a 4-node.
if (isRed(h.left) && isRed(h.right))
colorFlip(h);
4. Balance a 4-node.
if (isRed(h.left) && isRed(h.left.left))
h = rotateRight(h);
Insert implementation for LLRB trees
is a few lines of code added to elementary BST insert
    if (isRed(h.right))            // fix right-leaning reds on the way up
        h = rotateLeft(h);
    return h;
}
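Since only the tail of the method survives above, here is a hedged, self-contained sketch of a top-down 2-3-4 LLRB insert assembled from the operations named on the preceding slides: color flip on the way down, rotations on the way up. Integer keys, the class name `Llrb234`, and the checking helpers `blackHeight` and `size` are assumptions for illustration.

```java
class Llrb234 {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        int key;
        Node left, right;
        boolean color = RED;   // a new node is glued to its parent by a red link

        Node(int key) {
            this.key = key;
        }
    }

    Node root;

    static boolean isRed(Node x) {
        return x != null && x.color == RED;
    }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static void colorFlip(Node h) {
        h.color = !h.color;
        h.left.color = !h.left.color;
        h.right.color = !h.right.color;
    }

    void put(int key) {
        root = insert(root, key);
        root.color = BLACK;
    }

    static Node insert(Node h, int key) {
        if (h == null) return new Node(key);
        if (isRed(h.left) && isRed(h.right)) colorFlip(h);   // split 4-nodes on the way down
        if (key < h.key) h.left = insert(h.left, key);
        else if (key > h.key) h.right = insert(h.right, key);
        if (isRed(h.right)) h = rotateLeft(h);               // fix right-leaning reds on the way up
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);   // balance a 4-node
        return h;
    }

    // Black links on every path to a null link, or -1 if paths disagree.
    static int blackHeight(Node x) {
        if (x == null) return 0;
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || l != r) return -1;
        return l + (isRed(x) ? 0 : 1);
    }

    static int size(Node x) {
        return x == null ? 0 : 1 + size(x.left) + size(x.right);
    }
}
```

A non-negative result from `blackHeight(root)` after a sequence of `put` calls is a quick way to confirm the perfect black-link balance that the correspondence with 2-3-4 trees promises.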
LLRB (top-down 2-3-4) insert movie
A surprise
Q. What happens if we move color flip to the end?
A. It becomes an implementation of 2-3 trees (!)
    return h;
}
Insert implementation for 2-3 trees (!)
is a few lines of code added to elementary BST insert
    if (isRed(h.right))
        h = rotateLeft(h);         // fix right-leaning reds on the way up
    return h;
}
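A hedged sketch of the 2-3 variant: the same operations, with the color flip moved from the way down to the end of the method. Integer keys, the class name `Llrb23`, and the helper `hasRedRight` are assumptions for illustration.

```java
class Llrb23 {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        int key;
        Node left, right;
        boolean color = RED;

        Node(int key) {
            this.key = key;
        }
    }

    Node root;

    static boolean isRed(Node x) {
        return x != null && x.color == RED;
    }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static void colorFlip(Node h) {
        h.color = !h.color;
        h.left.color = !h.left.color;
        h.right.color = !h.right.color;
    }

    void put(int key) {
        root = insert(root, key);
        root.color = BLACK;
    }

    static Node insert(Node h, int key) {
        if (h == null) return new Node(key);
        if (key < h.key) h.left = insert(h.left, key);
        else if (key > h.key) h.right = insert(h.right, key);
        if (isRed(h.right)) h = rotateLeft(h);               // fix right-leaning reds on the way up
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left) && isRed(h.right)) colorFlip(h);   // color flip moved to the end
        return h;
    }

    // True if any node in the subtree has a red right link.
    static boolean hasRedRight(Node x) {
        if (x == null) return false;
        return isRed(x.right) || hasRedRight(x.left) || hasRedRight(x.right);
    }

    static int size(Node x) {
        return x == null ? 0 : 1 + size(x.left) + size(x.right);
    }
}
```

Because every temporary 4-node is split before the method returns, no node is left with two red children, and `hasRedRight(root)` stays false: only 2-nodes and left-leaning 3-nodes remain.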
LLRB (bottom-up 2-3) insert movie
Why revisit red-black trees?
Which do you prefer?
private Node insert(Node x, Key key, Value val, boolean sw)
{
    if (x == null)
        return new Node(key, val, RED);
    int cmp = key.compareTo(x.key);
[Figure: code listings compared side by side, with labels "TreeMap.java: adapted from CLR by experienced professional programmers (2004)", "1978", "2008", and the note "wrong scale!".]
[Figure: h.right.right turns RED.]
deleteMax() implementation for LLRB trees
is otherwise a few lines of code
[Figure: deleteMax() example 1, steps 1 to 7, on the tree with root H, second level D L, third level B F J N, and leaves A C E G I K M O. After "remove maximum" the leaves are A C E G I K M. One step is labeled "(nothing to fix!)".]
deleteMax() example 2
[Figure: steps 1 to 6 on the tree with root H, second level D L, third level B F J N, and leaves A C E G I K M. The phases are labeled "push reds down", "remove maximum", and "fix right-leaning reds on the way up".]
LLRB deleteMax() movie
Warmup 2: delete the minimum
is similar but slightly different (since trees lean left).
[Figure: borrow from sibling. A color flip at h; h.left.left turns RED.]
deleteMin() implementation for LLRB trees
is a few lines of code
[Figure: deleteMin() example, steps 1 to 8, removing the minimum from the tree with root H, second level D L, third level B F J N, and leaves A C E G I K M O. After "remove minimum" the bottom level is C E G I K M O.]
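The code for this slide did not survive extraction. As a stand-in, here is a sketch of delete-the-minimum for the 2-3 flavor of LLRB trees in the widely published textbook form: push a red link down the left spine, remove the bottom node, and fix up on the way back. Integer keys and the names `LlrbDeleteMin`, `moveRedLeft`, and `fixUp` as written here are assumptions and may differ in detail from the slide.

```java
class LlrbDeleteMin {
    static final boolean RED = true, BLACK = false;

    static class Node {
        int key; Node left, right; boolean color = RED;
        Node(int key) { this.key = key; }
    }

    Node root;

    static boolean isRed(Node x) { return x != null && x.color == RED; }

    static Node rotateLeft(Node h) {
        Node x = h.right; h.right = x.left; x.left = h;
        x.color = h.color; h.color = RED; return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left; h.left = x.right; x.right = h;
        x.color = h.color; h.color = RED; return x;
    }

    static void colorFlip(Node h) {
        h.color = !h.color; h.left.color = !h.left.color; h.right.color = !h.right.color;
    }

    // Restore the left-leaning 2-3 invariants at h on the way up.
    static Node fixUp(Node h) {
        if (isRed(h.right) && !isRed(h.left)) h = rotateLeft(h);
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left) && isRed(h.right)) colorFlip(h);
        return h;
    }

    void put(int key) { root = insert(root, key); root.color = BLACK; }

    static Node insert(Node h, int key) {
        if (h == null) return new Node(key);
        if (key < h.key) h.left = insert(h.left, key);
        else if (key > h.key) h.right = insert(h.right, key);
        return fixUp(h);
    }

    // Assuming h is red and h.left, h.left.left are black,
    // make h.left or one of its children red (borrow from sibling).
    static Node moveRedLeft(Node h) {
        colorFlip(h);
        if (isRed(h.right.left)) {
            h.right = rotateRight(h.right);
            h = rotateLeft(h);
            colorFlip(h);
        }
        return h;
    }

    void deleteMin() {
        if (root == null) return;
        if (!isRed(root.left) && !isRed(root.right)) root.color = RED;
        root = deleteMin(root);
        if (root != null) root.color = BLACK;
    }

    static Node deleteMin(Node h) {
        if (h.left == null) return null;                 // remove the bottom node
        if (!isRed(h.left) && !isRed(h.left.left)) h = moveRedLeft(h);
        h.left = deleteMin(h.left);
        return fixUp(h);
    }

    int min() { Node x = root; while (x.left != null) x = x.left; return x.key; }

    static int size(Node x) { return x == null ? 0 : 1 + size(x.left) + size(x.right); }

    static int blackHeight(Node x) {
        if (x == null) return 0;
        int l = blackHeight(x.left), r = blackHeight(x.right);
        if (l < 0 || l != r) return -1;
        return l + (isRed(x) ? 0 : 1);
    }
}
```

The invariant on the way down is that the current node is not a 2-node, so the node finally removed at the bottom is always attached by a red link and black-link balance is preserved.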
LLRB deleteMin() movie
Deleting an arbitrary node
involves the same general strategy.
Difficulty:
• Far too many cases!
• LLRB representation dramatically reduces the number of cases.
A standard trick:
[Figure: to delete D from the tree with root H, second level D L, third level B F J N, replace D with its successor E, then delete the successor with deleteMin(right child of D). At the bottom: flip colors, delete node.]
Deleting an arbitrary node at the bottom
can be implemented with the same helper methods
used for deleteMin() and deleteMax().
[Figure: before and after trees for a deletion at the bottom of an LLRB tree.]
delete() implementation for LLRB trees
private Node delete(Node h, Key key)
{
    int cmp = key.compareTo(h.key);
    if (cmp < 0)                                   // LEFT
    {
        if (!isRed(h.left) && !isRed(h.left.left))
            h = moveRedLeft(h);                    // push red left if necessary
        h.left = delete(h.left, key);              // move down (left)
    }
    else                                           // RIGHT or EQUAL
    {
        if (isRed(h.left)) h = leanRight(h);       // rotate to push red right
[Figure: code listings labeled insert, delete, helper.]
accomplishes the same result with less than 1/5 the code
Worst-case analysis
follows immediately from 2-3-4 tree correspondence
[Figure: an 8-node tree with each node labeled by its depth: 0; 1, 1; 2, 2, 2, 2; 3.]
N: 8
internal path length: 0 + 1 + 1 + 2 + 2 + 2 + 2 + 3 = 13
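The sum above can be checked with a few lines of Java. The sketch assumes the 8-node shape is the complete binary tree implied by the listed depths, in which the i-th node in level order (counting from 1) has depth floor(lg i); the class and method names are illustrative.

```java
class PathLength {
    // Depth of the i-th node (1-based, level order) in a complete binary tree.
    static int depth(int i) {
        return 31 - Integer.numberOfLeadingZeros(i);   // floor(lg i) for i >= 1
    }

    // Sum of the depths of all n nodes.
    static int internalPathLength(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += depth(i);
        return sum;
    }

    // Average number of links followed to reach a node.
    static double averagePathLength(int n) {
        return (double) internalPathLength(n) / n;
    }

    public static void main(String[] args) {
        System.out.println(internalPathLength(8));   // 13
        System.out.println(averagePathLength(8));    // 1.625
    }
}
```

Dividing the internal path length by N gives the average path length, the quantity examined in the experiments that follow.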
Experimental evidence
[Tufte plot: N from 100 to 50,000 on the horizontal axis, with the reference curve lg N − 1.5 and the tick 14 on the vertical axis.]
Average-case analysis of balanced trees
deserves another look!
Main questions:
Is average path length in tree built from random keys ~ c lg N ?
If so, is c = 1 ?
Experimental evidence
can suggest and confirm hypotheses
Average path length in (top-down) 2-3-4 tree built from random keys
[Tufte plot: N from 100 to 50,000 on the horizontal axis, with the reference curve lg N − 1.5 and the tick 14 on the vertical axis.]
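Left subtree size in left-leaning 2-3 trees
Exact distributions
[Figure: exact distributions for small trees. Values shown: 12, 12 (N = 4); 72, 48 (N = 5); 1152, 864, 2160, 864 (N = 7).]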
Left subtree size in left-leaning 2-3 trees
Limiting distribution?
[Tufte plots: panels labeled 64; 100 and 500; and 350 and 400 with smooth factor 10.]
An exercise in the analysis of algorithms
Find a proof!
[Tufte plot: N from 100 to 50,000 on the horizontal axis, with the reference curve lg N − 1.5 and the tick 14 on the vertical axis.]
Addendum:
Observations
Observation 1
The percentage of red nodes in a 2-3 tree is
between 25 and 25.5%
[Plot: N from 100 to 50,000; labeled values 25 and 25.38168.]
Observation 2
The height of a 2-3 tree is ~2 ln N (!!!)
[Plot: N from 100 to 50,000; reference curve lg N − 1.5; labeled values 14, 21.66990, and 22.]
Very surprising because the average path length in an elementary BST is also ~2 ln N ≈ 1.386 lg N
Observation 3
The percentage of red nodes on each path in a 2-3 tree
rises to about 25%, then drops by 2 when the root splits
Observation 4
In aggregate, the observed number of red links per path
log-alternates between periods of steady growth and
not-so-steady decrease (because root-split times vary widely)
[Plot: axis labels 0 and 500.]