ECE750_F2008_Algorithms4

© Attribution Non-Commercial (BY-NC)

ECE750 Lecture 4: Trees, Tree Iterators,

Treaps, Unique Representation, Persistence,

Red-Black Trees, Tries

Todd Veldhuizen

tveldhui@acm.org

Electrical & Computer Engineering

University of Waterloo

Canada

Oct. 5, 2007


Trees

sorted array in Θ(log n) time. However, inserting or
removing an item from the array took Θ(n) time in the
worst case.

also Θ(log n) insert and remove.

of a search space:

...


Binary Trees


Binary Trees

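three common orders for doing this:

Preorder: visit the node, then the left subtree, then the right subtree, e.g., [E, C, B, A, D, G, F, H, I].

Inorder: visit the left subtree, then the node, then the right subtree, e.g., [A, B, C, D, E, F, G, H, I].

Postorder: visit the left subtree, then the right subtree, then the node, e.g., [A, B, D, C, F, I, H, G, E].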
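The three example sequences above can be reproduced with a short sketch. The tree shape used here (E at the root, C and G as its children) is inferred from the listed sequences, and the names `CharNode` and `TraversalDemo` are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type for illustration only.
class CharNode {
    final char key;
    final CharNode left, right;

    CharNode(char key, CharNode left, CharNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class TraversalDemo {
    // Preorder: node, then left subtree, then right subtree.
    static void preorder(CharNode t, List<Character> out) {
        if (t == null) return;
        out.add(t.key);
        preorder(t.left, out);
        preorder(t.right, out);
    }

    // Inorder: left subtree, then node, then right subtree.
    static void inorder(CharNode t, List<Character> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.key);
        inorder(t.right, out);
    }

    // Postorder: left subtree, then right subtree, then node.
    static void postorder(CharNode t, List<Character> out) {
        if (t == null) return;
        postorder(t.left, out);
        postorder(t.right, out);
        out.add(t.key);
    }

    static CharNode leaf(char c) {
        return new CharNode(c, null, null);
    }

    // A tree consistent with the three example sequences (assumed shape).
    static CharNode sample() {
        CharNode b = new CharNode('B', leaf('A'), null);
        CharNode c = new CharNode('C', b, leaf('D'));
        CharNode h = new CharNode('H', null, leaf('I'));
        CharNode g = new CharNode('G', leaf('F'), h);
        return new CharNode('E', c, g);
    }

    public static void main(String[] args) {
        List<Character> pre = new ArrayList<>();
        List<Character> ino = new ArrayList<>();
        List<Character> post = new ArrayList<>();
        preorder(sample(), pre);
        inorder(sample(), ino);
        postorder(sample(), post);
        System.out.println(pre);  // [E, C, B, A, D, G, F, H, I]
        System.out.println(ino);  // [A, B, C, D, E, F, G, H, I]
        System.out.println(post); // [A, B, D, C, F, I, H, G, E]
    }
}
```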


Binary Trees

Terminology:

C is the parent of B, D;


Binary Search Trees

any tree node x,

1. If x.left ≠ null, then x.left.data ≤ x.data;

2. If x.right ≠ null, then x.data ≤ x.right.data.

start at the root and recursively:

1. Visit the left subtree;

2. Visit the node;

3. Visit the right subtree.

i.e. an inorder traversal.


Binary Search Trees

array.

boolean contains(int z) {
    if (z == data)
        return true;
    else if ((z < data) && (left != null))
        return left.contains(z);
    else if ((z > data) && (right != null))
        return right.contains(z);
    return false;
}


Binary Search Trees

the height of the tree.

from the root to a leaf.

h is the height of the tree.


Binary Search Trees

$2^h - 1$ values.

tree of height at most $1 + \log_2(n)$:

array A[0..n − 1].

Make the root A[0]; make its right subtree A[1], ...


Binary Search Trees

the tree balanced, i.e., have all leaves roughly the same

distance from the root.

dynamically.

probability of having a badly balanced tree → 0 as n → ∞.

Example: treaps

balance in the worst case.

red-black trees;

AVL trees;

2-3 trees;

B-trees;

splay trees.


Enumeration of Binary Search Trees

on n keys is given by the Catalan numbers:

$$C_n = \binom{2n}{n} \frac{1}{n+1} \sim \frac{4^n}{\sqrt{\pi}\, n^{3/2}}$$

(Sequence A000108 in the Online Encyclopedia of

Integer Sequences.)

 n    C_n
 1    1
 2    2
 3    5
 4    14
 5    42
 6    132
 7    429
 8    1430
 9    4862
10    16796
11    58786
12    208012
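As a quick check of the table, the Catalan numbers can be computed exactly with the recurrence $C_{k+1} = C_k \cdot 2(2k+1)/(k+2)$, which follows from the binomial formula. This is a minimal sketch; the class name `Catalan` is hypothetical.

```java
import java.math.BigInteger;

class Catalan {
    // C_n = binom(2n, n) / (n + 1), computed via
    // C_0 = 1 and C_{k+1} = C_k * 2(2k + 1) / (k + 2); each division is exact.
    static BigInteger catalan(int n) {
        BigInteger c = BigInteger.ONE;
        for (int k = 0; k < n; k++) {
            c = c.multiply(BigInteger.valueOf(2L * (2L * k + 1)))
                 .divide(BigInteger.valueOf(k + 2));
        }
        return c;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 12; n++) {
            System.out.println(n + " " + catalan(n));
        }
    }
}
```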


Binary Search Trees

balance, is the following:

void insert(int z) {
    if (z == data) return;
    else if (z < data) {
        if (left == null)
            left = new Tree(z);
        else
            left.insert(z);
    }
    else if (z > data) {
        if (right == null)
            right = new Tree(z);
        else
            right.insert(z);
    }
}
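To see why insertion without rebalancing can go wrong, here is a self-contained sketch of the insert above with a height measurement added. The class name `IntBst` and the helpers `height` and `build` are assumptions made for this example; height is counted in nodes, so a tree of height h holds at most 2^h − 1 keys.

```java
// Hypothetical class name; mirrors the Tree class used on the slides.
class IntBst {
    final int data;
    IntBst left, right;

    IntBst(int data) {
        this.data = data;
    }

    // Plain insertion with no attempt to maintain balance.
    void insert(int z) {
        if (z < data) {
            if (left == null) left = new IntBst(z);
            else left.insert(z);
        } else if (z > data) {
            if (right == null) right = new IntBst(z);
            else right.insert(z);
        }
        // z == data: the key is already present, nothing to do.
    }

    boolean contains(int z) {
        if (z == data) return true;
        IntBst next = (z < data) ? left : right;
        return next != null && next.contains(z);
    }

    // Number of nodes on the longest root-to-leaf path.
    int height() {
        int l = (left == null) ? 0 : left.height();
        int r = (right == null) ? 0 : right.height();
        return 1 + Math.max(l, r);
    }

    // Inserts the keys in the given order; keys must be non-empty.
    static IntBst build(int[] keys) {
        IntBst root = new IntBst(keys[0]);
        for (int i = 1; i < keys.length; i++) root.insert(keys[i]);
        return root;
    }

    public static void main(String[] args) {
        // Sorted input degenerates into a chain: prints 7.
        System.out.println(build(new int[]{1, 2, 3, 4, 5, 6, 7}).height());
        // A favourable insertion order gives a perfectly balanced tree: prints 3.
        System.out.println(build(new int[]{4, 2, 6, 1, 3, 5, 7}).height());
    }
}
```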


Binary Search Trees

Theorem

The expected height of a binary search tree constructed by

inserting a sequence of n random values is c log n with

c ≈ 4.311.

order.

random variable giving the height of a tree after the

insertion of n keys chosen uniformly at random, then

$$\Pr(H \ge n) \le \frac{E[H]}{n} = \frac{c \log n}{n} = O\!\left(\frac{\log n}{n}\right)$$

i.e., the probability of a tree having height linear in n

converges to zero. So, badly balanced trees are very

unlikely for large n.


Binary Search Trees

Result of 100 random insertions.


Implementing maps and sets

since it can efficiently insert, remove, and check
whether a key is in the tree.

Map[K, V] that maps keys to values (e.g., a telephone
book.) Simply include a value field in each tree node:

class TreeNode {
    K key;
    V value;
    TreeNode left;
    TreeNode right;
}


Iterator for a binary search tree I

order. This is just an inorder traversal, which can be

implemented recursively:

class BSTNode {
    ...
    void traverseInorder() {
        if (left != null)
            left.traverseInorder();
        // do something with this key
        if (right != null)
            right.traverseInorder();
    }
}


Iterator for a binary search tree II

traversal will start by following left pointers until a leaf

is reached. We'll call this a "fathom" operation.

that is held on the program stack (the sequence of tree

nodes along the path from root to the current node)

with an explicit stack object.


Iterator for a binary search tree I

import java.util.Iterator;
import java.util.Stack;

class BSTIterator implements Iterator {
    Stack stack = new Stack();

    public BSTIterator(BSTNode t) {
        if (t != null)
            fathom(t);
    }

    public boolean hasNext() {
        return !stack.isEmpty();
    }

    public Object next() {
        BSTNode t = (BSTNode) stack.pop();
        if (t.right_child != null)
            fathom(t.right_child);
        return t;
    }

    // Push t and every node reached by following left pointers from it.
    void fathom(BSTNode t) {
        do {
            stack.push(t);
            t = t.left_child;
        } while (t != null);
    }
}
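The explicit-stack iterator can be tried end to end with the sketch below. It is a generic-typed variant written for this example, not the slides' exact class: `KeyNode`, `InorderIterator`, and `IteratorDemo` are hypothetical names, and `ArrayDeque` stands in for the stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical node type for illustration only.
class KeyNode {
    final int key;
    final KeyNode left, right;

    KeyNode(int key, KeyNode left, KeyNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class InorderIterator implements Iterator<Integer> {
    // Holds the path of nodes whose key has not been returned yet.
    private final Deque<KeyNode> stack = new ArrayDeque<>();

    InorderIterator(KeyNode root) {
        fathom(root);
    }

    // Follow left pointers from t, pushing every node on the way down.
    private void fathom(KeyNode t) {
        while (t != null) {
            stack.push(t);
            t = t.left;
        }
    }

    @Override
    public boolean hasNext() {
        return !stack.isEmpty();
    }

    @Override
    public Integer next() {
        if (stack.isEmpty()) throw new NoSuchElementException();
        KeyNode t = stack.pop();
        fathom(t.right); // the successor is the leftmost node of the right subtree
        return t.key;
    }
}

class IteratorDemo {
    // A small balanced search tree on the keys 1..7.
    static KeyNode sample() {
        KeyNode l = new KeyNode(2, new KeyNode(1, null, null), new KeyNode(3, null, null));
        KeyNode r = new KeyNode(6, new KeyNode(5, null, null), new KeyNode(7, null, null));
        return new KeyNode(4, l, r);
    }

    public static void main(String[] args) {
        Iterator<Integer> it = new InorderIterator(sample());
        while (it.hasNext()) System.out.print(it.next() + " "); // 1 2 3 4 5 6 7
        System.out.println();
    }
}
```

Each node is pushed and popped exactly once, so a full traversal takes time linear in the number of nodes, and the stack never holds more nodes than the height of the tree.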


Rotations I

subtrees shorter and others deeper.

      D                              B
     / \     rotate right at D      / \
    B   E    ----------------->    A   D
   / \       <-----------------       / \
  A   C      rotate left at B        C   E

of keys in the tree remains the same.

Any two binary search trees on the same set of keys can

be transformed into one another by a sequence of

rotations.


Rotations II

each n there is a sequence of rotations that produces
every possible binary tree without any duplicates, and
eventually returns the tree to its initial configuration.
(i.e., the rotation graph $G_n$, where vertices are trees and
edges are rotations, contains a Hamiltonian cycle [16].)

tree.


Rotations

Tree rotateRight() {
    if (left == null)
        throw new RuntimeException("Cannot rotate here");
    Tree A = left.left;
    Tree B = left;
    Tree C = left.right;
    Tree D = this;
    Tree E = right;
    return new Tree(B.data, A, new Tree(D.data, C, E));
}
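A small self-contained sketch can confirm that a rotation changes the shape but not the inorder sequence of keys, and that rotating left undoes rotating right. The class `RotTree`, its `rotateLeft` and `inorder` methods, and the sample tree are assumptions added for this illustration.

```java
// Hypothetical immutable tree used only for this example.
class RotTree {
    final String data;
    final RotTree left, right;

    RotTree(String data, RotTree left, RotTree right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // The left child becomes the new root of this subtree.
    RotTree rotateRight() {
        if (left == null) throw new IllegalStateException("Cannot rotate here");
        return new RotTree(left.data, left.left, new RotTree(data, left.right, right));
    }

    // Mirror image: the right child becomes the new root of this subtree.
    RotTree rotateLeft() {
        if (right == null) throw new IllegalStateException("Cannot rotate here");
        return new RotTree(right.data, new RotTree(data, left, right.left), right.right);
    }

    String inorder() {
        return (left == null ? "" : left.inorder()) + data
             + (right == null ? "" : right.inorder());
    }
}

class RotationDemo {
    static RotTree leaf(String s) {
        return new RotTree(s, null, null);
    }

    // D at the root, B (with children A and C) on the left, E on the right.
    static RotTree sample() {
        return new RotTree("D", new RotTree("B", leaf("A"), leaf("C")), leaf("E"));
    }

    public static void main(String[] args) {
        RotTree t = sample();
        RotTree r = t.rotateRight();
        System.out.println(t.inorder() + " root " + t.data); // ABCDE root D
        System.out.println(r.inorder() + " root " + r.data); // ABCDE root B
        System.out.println(r.rotateLeft().data);             // D
    }
}
```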


Rotation example

a, then left at c:

[Figure: the tree before the rotations, after the first rotation, and after the second rotation.]


Part I

Balancing strategies for binary search trees


Binary Search Trees

required to find or insert an element is O(h).

expected height of c log n with c ≈ 4.311.

introduce the ideas of unique representation and

persistence.


Red-black trees

with a deterministic balancing strategy.

path to a leaf is no more than twice the length of the

shortest path.

height at most $2 \log_2(n + 1)$, which implies

search, min, max in O(log n) worst-case time.

worst-case time.

Guibas and Sedgewick [14]. Other sources: [7, 19].


Red-Black Trees: Invariants

Balance invariants:

1. No red node has a red child.

2. Every path in a subtree contains the same number of

black nodes.

labelled with a 0, and a right subchild with a 1.
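The two balance invariants can be checked mechanically. The sketch below is an illustration under stated assumptions: `null` plays the role of a leaf, and `RbNode`, `RedBlackCheck`, and `check` are hypothetical names, not part of any library.

```java
// Hypothetical node type: a colour flag plus two children.
class RbNode {
    final boolean red;
    final RbNode left, right;

    RbNode(boolean red, RbNode left, RbNode right) {
        this.red = red;
        this.left = left;
        this.right = right;
    }
}

class RedBlackCheck {
    static boolean isRed(RbNode t) {
        return t != null && t.red;
    }

    // Returns the number of black nodes on every path from t down to a null
    // leaf (counting t itself when it is black), or -1 if an invariant fails.
    static int check(RbNode t) {
        if (t == null) return 0;
        // Invariant 1: no red node has a red child.
        if (t.red && (isRed(t.left) || isRed(t.right))) return -1;
        int l = check(t.left);
        int r = check(t.right);
        // Invariant 2: every path contains the same number of black nodes.
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (t.red ? 0 : 1);
    }

    public static void main(String[] args) {
        RbNode ok = new RbNode(false, new RbNode(true, null, null), new RbNode(true, null, null));
        RbNode redRed = new RbNode(true, new RbNode(true, null, null), null);
        RbNode lopsided = new RbNode(false, new RbNode(false, null, null), null);
        System.out.println(check(ok));       // 1
        System.out.println(check(redRed));   // -1
        System.out.println(check(lopsided)); // -1
    }
}
```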


Red-Black Trees


Red-Black Trees: Balance I

Let bh(x) be the number of black nodes along any path

from a node x to a leaf, excluding the leaf.

Lemma

The number of internal nodes in the subtree rooted at x is
at least $2^{bh(x)} - 1$.

Proof.


Red-Black Trees: Balance II

By induction on height:

1. Base case: If x has height 0, then x is a leaf, and
bh(x) = 0; the number of internal (non-leaf)
descendants of x is $0 = 2^{bh(x)} - 1$.

2. Induction step: assume the hypothesis is true for height
≤ h. Consider a node of height h + 1. From invariant
(2), the children have black height either bh(x) − 1 (if
the child is black) or bh(x) (if the child is red). By
induction hypothesis, each child subtree has at least
$2^{bh(x)-1} - 1$ internal nodes. The total number of
internal nodes in the subtree rooted at x is therefore at least

$$(2^{bh(x)-1} - 1) + 1 + (2^{bh(x)-1} - 1) = 2^{bh(x)} - 1.$$


Red-Black Trees: Balance

Theorem

A red-black tree with n internal nodes has height at most
$2 \log_2(n + 1)$.

Proof.

Let h be the tree height. From invariant 1 (a red node must
have both children black), the black-height of the root must
be ≥ h/2. Applying Lemma 0.2, the number of internal
nodes n of the tree satisfies $n \ge 2^{h/2} - 1$. Rearranging,
$h \le 2 \log_2(n + 1)$.


Red-Black Trees: Balance

operations are performed.

rotations and recolourings are needed to restore them.

1. Insert the new key as a red node, using the usual binary

tree insert.

2. Perform restructurings and recolourings along the path

from the newly added leaf to the root to restore

invariants.

3. Root is always coloured black.


Red-Black Trees: Balance

cases becomes


Red-Black Trees: Example

bit involved.


Treaps

Treaps (binary TRee + hEAP)

where |(A \ B) ∪ (B \ A)| is the size of the difference between
the sets

Aragon [22]. Additional references: [5, 27, 26].


Treaps: Basics

assigning each key a random integer, or using an

appropriate hash function

tree traversal visits keys in order).


Treaps: the binary tree part

total order ⟨K, ≤⟩

ascending order.

        d
      /   \
     b     h
    / \   / \
   a   c f   i


Treaps: the heap part

order structure ⟨P, ≤⟩

max heap.)

         23
       /    \
     11      14
    /  \    /  \
   7    1  6    13


Treap = Tree + Heap

key, and p ∈ P is a priority.

               (d,23)
              /      \
        (b,11)        (h,14)
        /    \        /    \
    (a,7)  (c,1)  (f,6)   (i,13)


Treap ordering

Ordering invariants:

             (k2, p2)
            /        \
     (k1, p1)        (k3, p3)

Key order:       k1 ≤ k2 ≤ k3

Priority order:  p2 ≥ p1 and p2 ≥ p3

Every node has a higher priority than its descendents.


Balance of treaps

randomly.

insertion order results in a tree of expected height

c log n, with c ≈ 4.311.

exactly the same structure as a binary search tree

created by inserting keys in descending order of priority

c ≈ 4.311.

as its priority. (Especially if the hash function is chosen

randomly). We will cover hash functions soon.


Insertion into treaps

red-black trees.

1. Insert the (k, p) pair as for a binary search tree, by key

alone: the new node will be placed somewhere at the

bottom of the tree.

2. Perform rotations along the path from the new leaf to

the root to restore invariants:

priority, rotate left at x.

priority, rotate right at x.
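The two insertion steps above can be sketched as a recursive insert that rotates on the way back up the path. This is a minimal mutable version written for illustration; `TreapNode`, `TreapDemo`, and the `show` helper are hypothetical names, and the sample pairs are the (key, priority) pairs used in the running example of these slides.

```java
// Hypothetical node type: key, priority, and mutable child links.
class TreapNode {
    final char key;
    final int priority;
    TreapNode left, right;

    TreapNode(char key, int priority) {
        this.key = key;
        this.priority = priority;
    }
}

class TreapDemo {
    static TreapNode rotateRight(TreapNode x) {
        TreapNode l = x.left;
        x.left = l.right;
        l.right = x;
        return l;
    }

    static TreapNode rotateLeft(TreapNode x) {
        TreapNode r = x.right;
        x.right = r.left;
        r.left = x;
        return r;
    }

    // Step 1: insert by key as in a binary search tree.
    // Step 2: while unwinding, rotate wherever a child outranks its parent.
    static TreapNode insert(TreapNode t, char key, int priority) {
        if (t == null) return new TreapNode(key, priority);
        if (key < t.key) {
            t.left = insert(t.left, key, priority);
            if (t.left.priority > t.priority) t = rotateRight(t);
        } else if (key > t.key) {
            t.right = insert(t.right, key, priority);
            if (t.right.priority > t.priority) t = rotateLeft(t);
        }
        return t;
    }

    // Parenthesised inorder rendering; "." marks an empty subtree.
    static String show(TreapNode t) {
        if (t == null) return ".";
        return "(" + show(t.left) + t.key + show(t.right) + ")";
    }

    static TreapNode example() {
        char[] keys = {'i', 'c', 'd', 'b', 'h', 'a', 'f', 'e'};
        int[] pri = {13, 1, 23, 11, 14, 7, 6, 19};
        TreapNode root = null;
        for (int i = 0; i < keys.length; i++) root = insert(root, keys[i], pri[i]);
        return root;
    }

    public static void main(String[] args) {
        // d at the root, e as its right child, h below e with children f and i.
        System.out.println(show(example())); // (((.a.)b(.c.))d(.e((.f.)h(.i.))))
    }
}
```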


Insertion into treaps

Example: the treap below has just had (e, 19) inserted

as a new leaf. Rotations have not yet been performed.

               (d,23)
              /      \
        (b,11)        (h,14)
        /    \        /    \
    (a,7)  (c,1)  (f,6)   (i,13)
                  /
             (e,19)

at f.


Insertion into treaps

               (d,23)
              /      \
        (b,11)        (h,14)
        /    \        /    \
    (a,7)  (c,1)  (e,19)  (i,13)
                      \
                     (f,6)

at h.


Insertion into treaps

               (d,23)
              /      \
        (b,11)        (e,19)
        /    \             \
    (a,7)  (c,1)          (h,14)
                          /    \
                      (f,6)   (i,13)


Unique representation I

unique, then the treap has the unique representation

property: the shape of the treap depends only on the

keys and priorities, not the history of the order in which

keys were inserted/deleted.

by a unique configuration of the data structure

[1, 24, 23]. It is not possible to have two

representations for the same set.

depending on order of inserts, deletes, etc. the tree can

have different forms for the same set of keys.

There are $C_n \sim 4^n n^{-3/2} \pi^{-1/2}$ ways to place n
keys in a binary search tree (Catalan numbers), e.g.
$C_{20} = 6564120420$.


Unique representation II

represented search trees are known to require Ω(√n)
worst-case time for insert, delete, search [23].

O(log n) average-case time for insert, delete, search

represented data structure, you can do equality testing

in O(1) time by comparing pointers.


Unique Representation property of treaps

the unique representation property: given a set of (k, p)

pairs, there is only one way to build the tree.

(k, p) pair that can be the root: the one with the

highest priority.

The left subtree of the root will contain all keys < k,

and the right subtree of the root will contain all keys

> k.

Of the keys < k, the one with the highest priority must

occupy the left child of the root. This then splits

constructing the left subtree into two subproblems.

etc.


Unique Representation property of treaps

(i, 13), (c, 1), (d, 23), (b, 11), (h, 14), (a, 7), (f, 6),

unique choice of root: (d, 23) (the key with the highest
priority)

                         (d, 23)
                        /       \
{(c, 1), (b, 11), (a, 7)}       {(i, 13), (h, 14), (f, 6)}

element: (b, 11). And so forth.

               (d, 23)
              /       \
       (b, 11)         {(i, 13), (h, 14), (f, 6)}
       /     \
  (a, 7)   (c, 1)
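The recursive argument (highest priority becomes the root, smaller keys go left, larger keys go right) translates directly into code. The sketch below builds the shape as a parenthesised string and shows that the order of the input pairs does not matter; `KeyPriority`, `UniqueTreap`, and `build` are hypothetical names, and priorities are assumed to be distinct.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical (key, priority) pair.
class KeyPriority {
    final char key;
    final int priority;

    KeyPriority(char key, int priority) {
        this.key = key;
        this.priority = priority;
    }
}

class UniqueTreap {
    // Returns the treap shape as a string; "." marks an empty subtree.
    static String build(List<KeyPriority> pairs) {
        if (pairs.isEmpty()) return ".";
        // The pair with the highest priority is the only possible root.
        KeyPriority root = pairs.get(0);
        for (KeyPriority p : pairs) {
            if (p.priority > root.priority) root = p;
        }
        // Keys below the root key form the left subtree, keys above it the right.
        List<KeyPriority> smaller = new ArrayList<>();
        List<KeyPriority> larger = new ArrayList<>();
        for (KeyPriority p : pairs) {
            if (p.key < root.key) smaller.add(p);
            else if (p.key > root.key) larger.add(p);
        }
        return "(" + build(smaller) + root.key + build(larger) + ")";
    }

    // The seven pairs from the example, in the order listed there.
    static List<KeyPriority> example() {
        char[] keys = {'i', 'c', 'd', 'b', 'h', 'a', 'f'};
        int[] pri = {13, 1, 23, 11, 14, 7, 6};
        List<KeyPriority> out = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) out.add(new KeyPriority(keys[i], pri[i]));
        return out;
    }

    public static void main(String[] args) {
        List<KeyPriority> forward = example();
        List<KeyPriority> backward = example();
        Collections.reverse(backward);
        System.out.println(build(forward));  // (((.a.)b(.c.))d((.f.)h(.i.)))
        System.out.println(build(backward)); // same shape
    }
}
```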


Part II

Persistence


Persistent Data Structures

Literature: [10, 11, 6, 9]

data structure, but cannot derive new versions from

them (read-only access to a linear past.)

of the data structure: versions can fork.

in-degree can be made fully persistent with amortized

O(1) space and time overhead, and worst case O(1)

overhead for access [10]

the data structure, and later reconcile these branches


The Version Graph

The version graph shows how versions of a data structure

are derived from one another.

another


Version graph

versions, each derived from the previous version.

acyclic graph)

      X
     / \
   Y1   Y2
     \ /
      W


Applications of persistent data structures

versions of the database, for audit or historical analysis.

of a data structure without the need for locking; if the

data structure is confluently persistent, the processors

can periodically merge their versions.

store state, etc. simultaneously at many parts of the

program.

finding the polygon enclosing a point which uses

persistent data structures [21].


Purely Functional Data Structures

Literature: [19]

the data structure once it is created. (One implication:

no cyclic data structures.)

persistent: we can always hold onto pointers to old

versions of the data structure.


Contrast: BST insert in a nonfunctional style

void insert(int z) {
    if (z == data) return;
    else if (z < data) {
        if (left == null)
            left = new Tree(z);
        else
            left.insert(z);
    }
    else if (z > data) {
        if (right == null)
            right = new Tree(z);
        else
            right.insert(z);
    }
}


Contrast: BST insert in a pure functional style

No assignments to elements of the node; changes are made by creating

new nodes. The insert operation returns a new root pointer. The

previous root pointer still points to the original version of the tree.

Tree insert(int z) {
    if (z == data)
        return this;
    else if (z < data) {
        if (left == null)
            return new Tree(data, new Tree(z), right);
        else
            return new Tree(data, left.insert(z), right);
    }
    else {  // z > data
        if (right == null)
            return new Tree(data, left, new Tree(z));
        else
            return new Tree(data, left, right.insert(z));
    }
}
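The effect of the functional style can be observed directly: the old root still describes the old set, and untouched subtrees are shared between versions. A minimal sketch, with `PTree` and `PersistenceDemo` as hypothetical names:

```java
// Hypothetical immutable search tree; all fields are final.
class PTree {
    final int data;
    final PTree left, right;

    PTree(int data) {
        this(data, null, null);
    }

    PTree(int data, PTree left, PTree right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // Returns the root of a new version; this version is left untouched.
    PTree insert(int z) {
        if (z == data) return this;
        if (z < data) {
            return new PTree(data, left == null ? new PTree(z) : left.insert(z), right);
        }
        return new PTree(data, left, right == null ? new PTree(z) : right.insert(z));
    }

    boolean contains(int z) {
        if (z == data) return true;
        PTree next = (z < data) ? left : right;
        return next != null && next.contains(z);
    }
}

class PersistenceDemo {
    public static void main(String[] args) {
        PTree v1 = new PTree(5).insert(3).insert(8);
        PTree v2 = v1.insert(9);
        System.out.println(v1.contains(9));     // false: version 1 is unchanged
        System.out.println(v2.contains(9));     // true
        System.out.println(v2.left == v1.left); // true: the left subtree is shared
    }
}
```

Only the nodes on the path from the root to the insertion point are copied, so each insert allocates a number of nodes proportional to the height of the tree.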


Fully persistent treaps

versions) by implementing them in a purely functional

style. Insertion requires duplicating at most a sequence

of nodes from the root to a leaf: an O(log n) space

overhead. The remaining parts of the tree are shared.

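[Figure: Version 1 and Version 2 of the treap, each with its own root node (d,23). Version 2 contains (b,11), (a,7), (c,1), (e,19), (h,14), (f,6), (i,13); the unchanged parts of the tree are shared between the two versions.]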


Unique representation + persistence

constructed, we can look it up in a cache to see if we

have constructed that tree before. If so, we return a

pointer to the original version.

the same keys just by comparing their pointers.


Treap: Example

Treap A1 = R.insert("f");   // Insert the key f into R
Treap A2 = A1.insert("u");  // Insert the key u
Treap B1 = R.insert("u");   // Insert the key u into R
Treap B2 = B1.insert("f");  // Insert the key f
assert(A2 == B2);


Part III

Tries


Strings

alphabet Σ. We will often use Σ = {0, 1}: binary
strings.

We write Σ* for the finite strings composed of
characters from Σ (see note 1).

If w, v ∈ Σ*, we write wv to
mean the concatenation of w and v.

(Σ*, ·, ε) is a monoid: a set (Σ*) together with an
associative binary operator (·) and an identity element (ε). For
any strings u, v, w ∈ Σ*,

u · (v · w) = (u · v) · w

ε · v = v · ε = v

1

Infinite strings are very useful also: if we write a real number
x ∈ [0, 1] as a binary number, e.g. 0.101100101000···, this is a
representation of x by an infinite string from $\Sigma^\omega$.


Tries

binary tree with 0 (for left) and 1 (for right):

        o
     0 / \ 1
      x   o
       0 / \ 1
        y   z

of left/right branches to take from the root. E.g., 10

gives y, 11 gives z.

P = {0, 10, 11}

nodes is: P

string indicating the path starting and ending at the

root.


Tries

The set P

of any other string. Otherwise, there would be a path

to a leaf passing through another leaf!

The set P

is prefix-closed: if wv ∈ P, then w ∈ P

also. i.e., P

.

2

2

We can define

as an operator by A

{w : wv ∈ A}.

is a

closure operator. A useful fact: every closure operator has as its range a

complete lattice, where meet and join are given by (X Y)

= X

and (X Y)

= (X

binary trees by strings,

induces a lattice of binary trees.


Tries

or P

or P

, we can

reproduce the tree.

3

the corresponding tree can be built by simply adding

the paths one-by-one to an initially empty tree:

[Figure: the tree being built up by adding the paths one at a time to an initially empty tree, with edges labelled 0 and 1.]

3

Formally we can say there is a bijection (a 1-1 correspondence)

between binary trees and prefix-closed (resp. prefix-free) sets.


Tries

of strings as paths of the tree is called a trie. (The
term comes from reTRIEval; pronounced either "tree"
or "try" depending on taste. Tries were invented by de
la Briandais, and independently by Fredkin [13].)

Dictionary⟨K, V⟩, i.e., maintaining a map
f : K ⇀ V by associating each k ∈ K with a path
through the trie to a node where f(k) is stored.

4

compression, sorting, SAT solving, routing, natural

language processing, very large databases (VLDBs),

data mining, etc.

with caching and sharing of subtrees.

4

The notation K ⇀ V indicates a partial function from K to V: a
function that might not be defined for some keys.


Trie example: word list

saxophone, tried, saxifrage, squeak, try, squeak,

squeaky, squeakily, squeakier.

there. (Can use 1 bit on internal nodes to indicate

whether a key terminates there.)

there, rather than including a possibly long chain of

nodes with single children.

are storing are V = {0, 1}. The function the trie
represents is a map χ : K → {0, 1} where χ is the
characteristic function of the set: χ(k) = 1 if and only
if k is in the set.

little BST at each node with up to 26 elements in it (a

ternary search trie [3])
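A word-set trie along these lines can be sketched in a few lines. This is an illustration under assumptions, not the slides' implementation: each node keeps its children in a `java.util.TreeMap` (a balanced search tree standing in for the small per-node BST), one boolean marks whether a key terminates at the node, and the class name `WordTrie` is hypothetical.

```java
import java.util.TreeMap;

class WordTrie {
    // Children indexed by the next character of the key.
    private final TreeMap<Character, WordTrie> children = new TreeMap<>();
    // The one-bit flag: true if some key ends exactly at this node.
    private boolean terminal;

    void insert(String word) {
        WordTrie t = this;
        for (char c : word.toCharArray()) {
            t = t.children.computeIfAbsent(c, k -> new WordTrie());
        }
        t.terminal = true;
    }

    // Characteristic function of the set: true exactly for inserted words.
    boolean contains(String word) {
        WordTrie t = this;
        for (char c : word.toCharArray()) {
            t = t.children.get(c);
            if (t == null) return false;
        }
        return t.terminal;
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        for (String w : new String[]{"trie", "tried", "try", "squeak", "squeaky"}) {
            trie.insert(w);
        }
        System.out.println(trie.contains("trie"));    // true
        System.out.println(trie.contains("tri"));     // false: a prefix, not a key
        System.out.println(trie.contains("squeaky")); // true
    }
}
```

Lookup cost depends on the length of the word rather than on the number of words stored. This sketch does not collapse chains of single-child nodes.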


Trie example: wordlist

[Figure: trie for the word list, branching on successive letters. The keys stored are larch, squeak, squeaky, squeakier, squeakily, saxifrage, saxophone, try, trie, and tried.]


Trie example: coding

string of bits to decode.

that maps binary codewords to plaintext. The incoming

transmission is then just a sequence of codewords that

we will replace, one by one, with their corresponding

plaintext.

5

only at the leaves, is an example of a uniquely
decodable code: there is only one way an encoded
message can be decoded. Specifically, such codes are
called prefix codes or instantaneous codes.

5

This strategy is asymptotically optimal (achieves a bitrate H + ε
for any ε > 0) for stationary ergodic random processes, with an
appropriate choice of codebook.


Trie example: coding

to sequences of three letters, giving the most frequent

words shorter codes:

Three-letter combination    Codeword
the                         000
and                         001
for                         010
are                         011
but                         100
not                         1010
you                         1011
all                         1100
...                         ...
etc                         11101101
...                         ...
qxw                         1111011001101001


Trie example: coding

[Figure: binary trie for the codebook, with edges labelled 0 and 1 and the three-letter combinations the, and, for, are, but, not, you, all at the leaves.]


Trie example: decoding

bits. When a leaf is reached, output the word there,

and return to the root.

100   1010   010   1011   1100
but   not    for   you    all

as ASCII text (24 bits per 3-letter sequence).

frequently-occurring strings; if a string occurs with

probability $p_i$, one wants the codeword to have length
about $-\log_2 p_i$.

constructed optimally using a greedy algorithm.
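The decoding walk described above (follow bits from the root, emit the word at a leaf, return to the root) can be sketched as follows. Only the eight codewords shown explicitly in the table are loaded; `CodeTrie`, `add`, `decode`, and `codebook` are hypothetical names for this example.

```java
class CodeTrie {
    CodeTrie zero, one; // children for bit 0 and bit 1
    String word;        // plaintext stored at a leaf, null at internal nodes

    // Adds one codeword by creating the nodes along its bit path.
    void add(String codeword, String plaintext) {
        CodeTrie t = this;
        for (char b : codeword.toCharArray()) {
            if (b == '0') {
                if (t.zero == null) t.zero = new CodeTrie();
                t = t.zero;
            } else {
                if (t.one == null) t.one = new CodeTrie();
                t = t.one;
            }
        }
        t.word = plaintext;
    }

    // Walks the trie bit by bit; at each leaf, emits the word and restarts at the root.
    static String decode(CodeTrie root, String bits) {
        StringBuilder out = new StringBuilder();
        CodeTrie t = root;
        for (char b : bits.toCharArray()) {
            t = (b == '0') ? t.zero : t.one;
            if (t == null) throw new IllegalArgumentException("bits do not match any codeword");
            if (t.word != null) {
                out.append(t.word).append(' ');
                t = root;
            }
        }
        return out.toString().trim();
    }

    // The part of the codebook that appears explicitly in the table.
    static CodeTrie codebook() {
        CodeTrie root = new CodeTrie();
        root.add("000", "the");
        root.add("001", "and");
        root.add("010", "for");
        root.add("011", "are");
        root.add("100", "but");
        root.add("1010", "not");
        root.add("1011", "you");
        root.add("1100", "all");
        return root;
    }

    public static void main(String[] args) {
        // 100 1010 010 1011 1100
        System.out.println(decode(codebook(), "100101001010111100")); // but not for you all
    }
}
```

Because no codeword is a prefix of another, the decoder never needs to look ahead: the first leaf reached is always the right one.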


Tries: Kraft's inequality

of codewords in a prefix code (equivalently, leaf depths
in a binary tree.)

Theorem (Kraft)

Let $(d_1, d_2, \ldots)$ be a sequence of code lengths of a code.
There is a prefix code with code lengths $d_1, d_2, \ldots$
(equivalently, a binary tree with leaves at depth $d_1, d_2, \ldots$) if
and only if

$$\sum_{i=1}^{n} 2^{-d_i} \le 1 \qquad (1)$$


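Tries: Kraft's inequality I

Kraft's inequality: $\frac{1}{8} + \frac{1}{8} + \frac{1}{4} + \frac{1}{4} = \frac{3}{4}$. Possible trie
realization:

[Figure: a binary trie with edges labelled 0 and 1, with two leaves at depth 3 and two leaves at depth 2.]

violate Kraft's inequality: sum is $\frac{9}{8}$.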

every internal node has two children.
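Both directions of the theorem can be explored numerically. The sketch below computes the Kraft sum for a list of code lengths and, when the sum is at most 1, assigns codewords in order of increasing length (a standard canonical construction, not one taken from these slides). The class `KraftDemo` and its method names are hypothetical, and the lengths {1, 1, 3} are a made-up example of a sequence whose sum exceeds 1.

```java
import java.util.Arrays;

class KraftDemo {
    // Sum of 2^(-d) over all code lengths; Math.scalb keeps each term exact.
    static double kraftSum(int[] depths) {
        double sum = 0.0;
        for (int d : depths) sum += Math.scalb(1.0, -d);
        return sum;
    }

    // Builds a prefix code with the given lengths, assuming kraftSum(depths) <= 1
    // and lengths below 63. Codewords are handed out in order of increasing
    // length, each one the binary successor of the previous, padded with zeros.
    static String[] prefixCode(int[] depths) {
        int[] sorted = depths.clone();
        Arrays.sort(sorted);
        String[] words = new String[sorted.length];
        long code = 0;
        int prev = 0;
        for (int i = 0; i < sorted.length; i++) {
            code <<= (sorted[i] - prev); // extend to the new length
            prev = sorted[i];
            StringBuilder sb = new StringBuilder();
            for (int bit = sorted[i] - 1; bit >= 0; bit--) {
                sb.append((code >> bit) & 1L);
            }
            words[i] = sb.toString();
            code++;
        }
        return words;
    }

    public static void main(String[] args) {
        int[] ok = {3, 3, 2, 2};
        System.out.println(kraftSum(ok));                     // 0.75
        System.out.println(Arrays.toString(prefixCode(ok)));  // [00, 01, 100, 101]
        System.out.println(kraftSum(new int[]{1, 1, 3}));     // 1.125, so no prefix code exists
    }
}
```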


Tries: Kraft's inequality

Two ways to prove Kraft's inequality:

1. Geometric argument. Put each node of the tree in correspondence with a subinterval of $[0, 1]$ on the real line: root is $[0, 1]$, its children get $[0, \frac{1}{2}]$ and $[\frac{1}{2}, 1]$. Each node at depth $d$ receives an interval of length $2^{-d}$ and splits it in half for its children. The union of the intervals at the leaves is $\subseteq [0, 1]$, and the intervals at the leaves are pairwise disjoint (apart from shared endpoints), so the sum of their interval lengths is $\le 1$.

2. Induction (rewriting) argument. The list of valid codeword length sequences can be generated from the initial sequence $1, 1$ (codewords $\{0, 1\}$) by the rewrite rules $k \to k + 1, k + 1$ (expand a node into two children) and $k \to k + 1$ (expand a node to have a single child). Base case: with $1, 1$ obviously $2^{-1} + 2^{-1} = 1$. Induction step: if the sum is $\le 1$, consider expanding a single element of the sequence: we have either the rewrite $k \to k + 1, k + 1$, and $2^{-k} = 2^{-k-1} + 2^{-k-1}$; or the rewrite $k \to k + 1$, and $2^{-k} > 2^{-k-1}$. So rewrites never increase the weight of a node.
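The interval argument can be checked numerically on a small example. In this sketch each codeword is mapped to a half-open interval $[lo, hi)$ (half-open so that sibling intervals are strictly disjoint); the function name `interval` and the example codewords 00, 01, 100, 101 are illustrative choices, not taken from the slides.

```python
from fractions import Fraction


def interval(codeword):
    """Half-open interval [lo, hi) assigned to the node reached by codeword."""
    lo, width = Fraction(0), Fraction(1)
    for bit in codeword:
        width /= 2              # each level halves the parent's interval
        if bit == "1":
            lo += width         # a 1-edge takes the upper half
    return lo, lo + width


leaves = ["00", "01", "100", "101"]   # hypothetical prefix code
spans = sorted(interval(c) for c in leaves)

# Leaves of a prefix code get disjoint intervals inside [0, 1).
disjoint = all(a[1] <= b[0] for a, b in zip(spans, spans[1:]))
total = sum(hi - lo for lo, hi in spans)
print(spans, disjoint, total)
```

The total length of the leaf intervals equals the Kraft sum of the code lengths, which is why disjointness inside $[0, 1]$ forces the sum to be at most 1.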


Tries: Kraft's inequality I

It is occasionally useful to have an infinite set of codewords handy, in case we do not know in advance how many different objects we might need to code.
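One standard example of such an infinite codeword set is the Elias gamma code for the positive integers. The lecture does not name a specific code, so this choice is an assumption made for illustration: integer $n$ is written as $\lfloor \log_2 n \rfloor$ zeros followed by the binary representation of $n$.

```python
from fractions import Fraction


def elias_gamma(n):
    """Elias gamma codeword for an integer n >= 1."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for n >= 1")
    binary = format(n, "b")
    return "0" * (len(binary) - 1) + binary


codes = [elias_gamma(n) for n in range(1, 8)]
print(codes)

# Partial Kraft sum over the first seven codewords.
partial = sum(Fraction(1, 2 ** len(c)) for c in codes)
print(partial)
```

The codeword for $n$ has length $2\lfloor \log_2 n \rfloor + 1$, and the Kraft sum over all positive integers converges to exactly 1, so the code uses the whole code space without knowing in advance how many objects will be coded.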

For an infinite set of codewords (or infinite binary tree), Kraft's inequality implies

d

k

c + log

+

k + log log

+

log

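where

\[ \log^{+} x \equiv \log x + \log\log x + \log\log\log x + \cdots \]

with the sum taken only over the positive terms, and $\log^{\star} x$ is the iterated logarithm

\[ \log^{\star} x = \begin{cases} 0 & \text{if } x \le 1 \\ 1 + \log^{\star}(\log x) & \text{otherwise} \end{cases} \]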


Tries: Kraft's inequality II

See e.g., [4, 20].

Where does this bound come from? Well, a necessary condition for

\[ \sum_{k=0}^{\infty} 2^{-d_k} \le 1 \]

to hold is that the series $\sum_{k=0}^{\infty} 2^{-d_k}$ converges. For example, if $d_k = \log k$, then $2^{-d_k} = \frac{1}{k}$, the Harmonic series. The Harmonic series diverges, so Kraft's inequality can't hold.

We can parlay this into an inequality by remembering the comparison test for convergence of series: if $a_k$, $b_k$ are two positive series, and $a_k \le b_k$ for all $k$, then $\sum a_k \le \sum b_k$. If we stick the Harmonic series in for $a_k$ and $2^{-d_k}$ for $b_k$, we get:

If $\frac{1}{k} \le 2^{-d_k}$ for all $k$ then $\sum 2^{-d_k} = \infty$.

The premiss of this test must be false if $\sum 2^{-d_k}$ does not diverge to infinity. Therefore $2^{-d_k}$ must be $< \frac{1}{k}$ for at least some $k$. If $2^{-d_k} < \frac{1}{k}$ for only some finite number of choices of $k$, the series would still diverge. So, a necessary condition for $\sum 2^{-d_k}$ to converge is that $2^{-d_k} < \frac{1}{k}$


Tries: Kraft's inequality III

for infinitely many terms. Taking logarithms and multiplying through by $-1$ we get $d_k > \log k$ for infinitely many $k$.

We can generalize this by saying that if $g$ is any diverging function, then $d_k > -\log g^{(1)}(k)$ for infinitely many $k$, where $g^{(1)}$ is the first derivative of $g$. (The Harmonic series bound follows from choosing $g(x) = \log x$.) Unfortunately there is no slowest growing function $g(x)$ from which we could obtain a tightest possible bound.

Eqn. (2) is from [4]; Bentley credits the result to Ronald Graham and

Fan Chung, apparently unpublished.
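The divergence argument can be observed numerically. The sketch below compares partial Kraft sums for two hypothetical length assignments: $d_k = \lceil \log_2 k \rceil$, which is too short and behaves like the Harmonic series, and $d_k = 2\lfloor \log_2 k \rfloor + 1$, which is long enough for the sum to stay below 1. The function names and both length assignments are assumptions made for the illustration.

```python
from fractions import Fraction


def ceil_log2(k):
    """ceil(log2 k) for an integer k >= 1."""
    return (k - 1).bit_length()


def doubled_log2(k):
    """2 * floor(log2 k) + 1 for an integer k >= 1."""
    return 2 * (k.bit_length() - 1) + 1


def partial_kraft_sum(length, n):
    """Exact sum of 2^-length(k) for k = 1..n."""
    return sum(Fraction(1, 2 ** length(k)) for k in range(1, n + 1))


for m in (4, 7, 10):
    n = 2 ** m
    print(n, partial_kraft_sum(ceil_log2, n), partial_kraft_sum(doubled_log2, n - 1))
```

With lengths $\lceil \log_2 k \rceil$ the partial sum up to $k = 2^m$ is $1 + m/2$, which grows without bound, whereas the longer lengths give $1 - 2^{-m}$, which never exceeds 1.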


Tries: Variations on a theme I

There are many useful variants of tries [12]:

• Multiway tries: instead of a binary alphabet, one can choose any finite alphabet $\Sigma$, and allow each node to have $|\Sigma|$ children.

• Require each internal node to have a minimum number of leaves descended from it; when this threshold is not met, the subtree is converted into a compact form (e.g., an array of keys and values) suitable for secondary storage. This technique can also be used to increase performance in main memory [15].

• PATRICIA tries (Practical Algorithm To Retrieve Information Coded in Alphanumeric; see footnote 6): introduce skip pointers to avoid long sequences of single-branch nodes.

[Figure: a chain of single-branch nodes with edge labels 0, 1, 1, 0.]


Tries: Variations on a theme II

• Level compression: a subtrie is often almost a complete binary tree of some depth, which can be collapsed into an array of pointers to tries [18].

• Ternary search trees: a combination of a trie and a BST; can require substantially less space than a trie. For a large $|\Sigma|$, replace a $|\Sigma|$-way branch at each internal node with a BST of depth $\log |\Sigma|$.

Footnote 6: Almost better than my all-time favourite strained CS acronym, PERIDOT: Programming by Example for Real-time Interface Design Obviating Typing. Great project, despite the acronym.
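As a minimal sketch of the multiway variant, the following Python trie stores one child per alphabet symbol in a dictionary, so a node only pays for the branches it actually uses. The class and method names (`TrieNode`, `Trie`, `insert`, `keys_with_prefix`) are hypothetical, and the dictionary-per-node layout is one possible design rather than the lecture's.

```python
class TrieNode:
    """One node of a multiway trie; children are keyed by symbol."""

    def __init__(self):
        self.children = {}
        self.is_key = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for symbol in key:
            node = node.children.setdefault(symbol, TrieNode())
        node.is_key = True

    def _find(self, prefix):
        """Return the node reached by prefix, or None if it is absent."""
        node = self.root
        for symbol in prefix:
            node = node.children.get(symbol)
            if node is None:
                return None
        return node

    def __contains__(self, key):
        node = self._find(key)
        return node is not None and node.is_key

    def keys_with_prefix(self, prefix):
        """List all stored keys that start with prefix."""
        found = []
        start = self._find(prefix)
        stack = [(start, prefix)] if start is not None else []
        while stack:
            node, text = stack.pop()
            if node.is_key:
                found.append(text)
            for symbol, child in node.children.items():
                stack.append((child, text + symbol))
        return found


trie = Trie()
for word in ["the", "and", "for", "are", "but", "not", "you", "all"]:
    trie.insert(word)
print("the" in trie, "th" in trie, sorted(trie.keys_with_prefix("a")))
```

Replacing the per-node dictionary with a small BST over the child symbols gives the ternary-search-tree flavour mentioned above, trading some lookup time for space when the alphabet is large.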


Bibliography I

[1] A. Andersson and T. Ottmann. Faster uniquely represented dictionaries. In Proceedings: 32nd Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, October 1–4, 1991, pages 642–649. IEEE Computer Society Press, 1991.

[2] Rudolf Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Inf., 1:290–306, 1972.


Bibliography II

[3] Jon L. Bentley and Robert Sedgewick. Fast algorithms for sorting and searching strings. In SODA '97: Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 360–369, Philadelphia, PA, USA, 1997. Society for Industrial and Applied Mathematics.

[4] Jon Louis Bentley and Andrew Chi-Chih Yao. An almost optimal algorithm for unbounded searching. Information Processing Lett., 5(3):82–87, 1976.

[5] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 16–26, Puerto Vallarta, Mexico, June 1998.


Bibliography III

[6] Adam L. Buchsbaum and Robert E. Tarjan. Confluently persistent deques via data-structural bootstrapping. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 155–164. ACM Press, 1993.

[7] Thomas H. Cormen, Charles E. Leiserson, and Ronald R. Rivest. Introduction to algorithms. McGraw Hill, 1991.

[8] Luc Devroye. A note on the height of binary search trees. Journal of the ACM (JACM), 33(3):489–498, 1986.


Bibliography IV

[9] P. F. Dietz. Fully persistent arrays. In F. Dehne, J.-R. Sack, and N. Santoro, editors, Proceedings of the Workshop on Algorithms and Data Structures, volume 382 of LNCS, pages 67–74, Berlin, August 1989. Springer.

[10] James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. In ACM Symposium on Theory of Computing, pages 109–121, 1986.


Bibliography V

[11] Amos Fiat and Haim Kaplan. Making data structures confluently persistent. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA-01), pages 537–546, New York, January 7–9, 2001. ACM Press.

[12] Philippe Flajolet. The ubiquitous digital tree. In Bruno Durand and Wolfgang Thomas, editors, STACS 2006, 23rd Annual Symposium on Theoretical Aspects of Computer Science, Marseille, France, February 23-25, 2006, Proceedings, volume 3884 of Lecture Notes in Computer Science, pages 1–22. Springer, 2006.


Bibliography VI

[13] Edward Fredkin. Trie memory. Commun. ACM, 3(9):490–499, 1960.

[14] Leonidas J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In FOCS, pages 8–21. IEEE, 1978.

[15] Steffen Heinz, Justin Zobel, and Hugh E. Williams. Burst tries: a fast, efficient data structure for string keys. ACM Trans. Inf. Syst., 20(2):192–223, 2002.

[16] J. M. Lucas, D. R. van Baronaigien, and F. Ruskey. On rotations and the generation of binary trees. Journal of Algorithms, 15(3):343–366, November 1993.


Bibliography VII

[17] Donald R. Morrison. PATRICIA: practical algorithm to retrieve information coded in alphanumeric. J. ACM, 15(4):514–534, 1968.

[18] Stefan Nilsson and Gunnar Karlsson. IP-address lookup using LC-tries. IEEE Journal on Selected Areas in Communications, 17:1083–1092, June 1999.

[19] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, Cambridge, UK, 1998.


Bibliography VIII

[20] Jorma Rissanen. Stochastic Complexity in Statistical Inquiry, volume 15 of Series in Computer Science. World Scientific, 1989.

[21] Neil Sarnak and Robert E. Tarjan. Planar point location using persistent search trees. Commun. ACM, 29(7):669–679, 1986.

[22] Raimund Seidel and Cecilia R. Aragon. Randomized search trees. Algorithmica, 16(4/5):464–497, 1996.


Bibliography IX

[23] Lawrence Snyder. On uniquely representable data structures. In 18th Annual Symposium on Foundations of Computer Science, pages 142–146, Long Beach, CA, USA, October 1977. IEEE Computer Society Press.

[24] R. Sundar and R. E. Tarjan. Unique binary search tree representations and equality-testing of sets and sequences. In Baruch Awerbuch, editor, Proceedings of the 22nd Annual ACM Symposium on the Theory of Computing, pages 18–25, Baltimore, MD, May 1990. ACM Press.


Bibliography X

[25] Jean Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229–239, 1980.

[26] M. A. Weiss. A note on construction of treaps and Cartesian trees. Information Processing Letters, 54(2):127, April 1995.

[27] Mark Allen Weiss. Linear-time construction of treaps and Cartesian trees. Information Processing Letters, 52(5):253–257, December 1994.
