
2E3: Binary Search Tree

Dr. Ivana Dusparic


http://www.scss.tcd.ie/Ivana.Dusparic/2E3/

Binary Search Tree




So far we used:

Unordered list (e.g., array or linked list of MP3s)
Ordered list (e.g., linked list of students arranged alphabetically)

Searching for an item in an unordered or ordered list is not very efficient

Use binary search trees instead

Binary Search Tree




Searching for an item in an unordered list with N nodes:
  average N/2 comparisons if item in list
  N comparisons if item NOT in list (need to search to end of list)

Searching for an item in an ordered list with N nodes:
  average N/2 comparisons if item in list
  average N/2 comparisons if item NOT in list (can end search as soon as we reach an item greater than the item we are looking for)

O(n) - order n in big O notation

Not very efficient!
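
The early exit for an ordered list can be sketched as follows. This is only an illustration: the function name findInOrdered and the use of std::vector instead of a linked list are assumptions, not part of the course code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of target in an ascending list,
// or -1 if it is absent. 'examined' reports how many items were looked at.
int findInOrdered(const std::vector<int> &items, int target, int &examined) {
    examined = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        ++examined;
        if (items[i] == target) return static_cast<int>(i);
        if (items[i] > target) break;   // passed the place where target would be
    }
    return -1;
}
```

Even with the early exit, the number of items examined still grows in proportion to the list length, which is why the search stays O(n).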

Algorithm Complexity



Expressed in Big O Notation


Performance of an algorithm depends on the number of times its dominant operation needs to be executed

E.g., in search algorithms, comparison is the dominant operation

Big O expresses how the number of dominant operations grows as the size of the problem grows

Algorithm Complexity


For example, consider algorithms with the following complexities:

log2 n
n log2 n
n^2
2^n

Algorithm Complexity

n     log2 n   n log2 n   n^2     2^n
4     2        8          16      16
8     3        24         64      256
16    4        64         256     65536
32    5        160        1024    4294967296
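
The growth rates above can be recomputed with a few small functions. This is a minimal sketch; the helper names (ilog2, nLogN, nSquared, twoToN) are illustrative and do not come from the slides.

```cpp
#include <cstdint>

// Integer log base 2; exact for powers of two such as 4, 8, 16, 32.
int ilog2(std::uint64_t n) {
    int result = 0;
    while (n > 1) {
        n /= 2;
        ++result;
    }
    return result;
}

std::uint64_t nLogN(std::uint64_t n)    { return n * ilog2(n); }
std::uint64_t nSquared(std::uint64_t n) { return n * n; }

// 2^n computed by shifting; only valid for n < 64.
std::uint64_t twoToN(std::uint64_t n)   { return std::uint64_t{1} << n; }
```

For n = 32 these give 5, 160, 1024 and 4294967296: doubling n adds only one step to log2 n but squares 2^n.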

Binary Tree

Binary tree
1. Has a special node called root node
2. Has two sets of nodes called the left subtree and the right subtree
3. Left subtree and right subtree are binary trees

Binary Search Tree

Binary Search Tree Definition
1. Has a special node called root node
2. Has two sets of nodes called the left subtree and the right subtree
3. For any node N, every node in the left subtree is less than N and every node in the right subtree is greater than or equal to N
4. Left subtree and right subtree are binary search trees
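
One way to make the definition concrete is a function that checks whether a tree satisfies it. The sketch below is an assumption-laden illustration: the stand-alone Node struct and the name isBst are hypothetical, chosen only to keep the example self-contained.

```cpp
#include <climits>

struct Node {               // minimal stand-in for the course's Node class
    int data;
    Node *left;
    Node *right;
};

// True if every value in the subtree rooted at n lies in [low, high].
// Left subtree values must be < n->data, right subtree values >= n->data.
bool isBst(const Node *n, long long low, long long high) {
    if (n == nullptr) return true;                       // empty tree is a BST
    if (n->data < low || n->data > high) return false;
    return isBst(n->left, low, static_cast<long long>(n->data) - 1)
        && isBst(n->right, n->data, high);
}

bool isBst(const Node *root) {
    return isBst(root, LLONG_MIN, LLONG_MAX);
}
```

The bounds are passed down because rule 3 applies to every node in a subtree, not just to the immediate children.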

Binary Search Tree Terminology











Parent node - has left child and/or right child
Branch - connects a child and its parent
Leaf - a node that has no left or right children
Path - sequence of branches leading from node a to node b
Length of a path - number of branches on the path
Level/depth of a node - number of branches on the path between root and that node
Siblings - nodes that share the same parent
Ancestor/descendant - if a path exists from node p to node q where p is closer to the root than q, then p is an ancestor of q and q is a descendant of p
Height of a tree - length of the path from root to deepest node

Binary Search Tree





Each node contains information (e.g., integer, Student object, etc.)
Each node contains a left and a right pointer

Inserting Items


Insert nodes in order 4, 1, 5, 2, 7, 3, 6

[Figure: resulting tree with root 4 and height = 3; labels mark an ancestor of 3, a pair of siblings, a descendant of 1, a node at depth 3 and a leaf]

Average path to a node:
(1*0 + 2*1 + 2*2 + 2*3)/7 = 1.71

Inserting items


Examine the root and recursively insert the new node:
- to the left subtree if the new value is less than the root
- to the right subtree if the new value is greater than or equal to the root
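
The recursive rule can be sketched directly in code. This is an illustrative free function, not the course's Tree::insert; the stand-alone Node struct with a one-argument constructor is an assumption made to keep the example self-contained.

```cpp
struct Node {               // minimal stand-in for the course's Node class
    int data;
    Node *left;
    Node *right;
    explicit Node(int d) : data(d), left(nullptr), right(nullptr) {}
};

// Inserts n into the subtree rooted at root and returns the subtree's root
// (which is n itself when the subtree was empty).
Node *insert(Node *root, Node *n) {
    if (root == nullptr) return n;              // empty subtree: n becomes its root
    if (n->data < root->data)
        root->left = insert(root->left, n);     // smaller values go left
    else
        root->right = insert(root->right, n);   // greater or equal values go right
    return root;
}
```

A caller would start with an empty tree (Node *root = nullptr;) and write root = insert(root, new Node(4)); for each value.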

Inserting items

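Insert in order 1, 2, 3, 4, 5, 6, 7
- VERY unbalanced tree
- a balanced tree is one where the depths of the leaves differ by at most 1
- worst case: the tree behaves as a list
- average path to a node: (0+1+2+3+4+5+6)/7 = 3

[Figure: degenerate tree with root 1 in which every node has only a right child: 1, 2, 3, 4, 5, 6, 7]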
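
The effect of insertion order on the average path length can be checked with a short program. This is a sketch under stated assumptions: the Node struct and the functions insert, depthSum and averageDepth are hypothetical names, and nodes are never freed, to keep the example short.

```cpp
#include <vector>

struct Node {               // minimal stand-in for the course's Node class
    int data;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int d) : data(d) {}
};

Node *insert(Node *root, int value) {
    if (root == nullptr) return new Node(value);
    if (value < root->data) root->left = insert(root->left, value);
    else                    root->right = insert(root->right, value);
    return root;
}

// Sum of the depths of all nodes; the root is at depth 0.
int depthSum(const Node *n, int depth) {
    if (n == nullptr) return 0;
    return depth + depthSum(n->left, depth + 1) + depthSum(n->right, depth + 1);
}

// Builds a tree from the given insertion order and returns the average depth.
double averageDepth(const std::vector<int> &order) {
    Node *root = nullptr;
    for (int v : order) root = insert(root, v);
    if (order.empty()) return 0.0;
    return static_cast<double>(depthSum(root, 0)) / order.size();
}
```

Calling averageDepth({4, 1, 5, 2, 7, 3, 6}) gives 12/7 (about 1.71), while averageDepth({1, 2, 3, 4, 5, 6, 7}) gives 3: the same seven values, but a much longer average path when they arrive already sorted.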

Inserting items



Insert nodes in order 4, 2, 6, 1, 3, 5, 7
- a full binary tree is a tree in which every node other than the leaves has two children
- a perfect binary tree is a full binary tree in which all leaves are at the same depth

[Figure: perfect binary tree with root 4, children 2 and 6, and leaves 1, 3, 5, 7]
Binary Search Tree


Perfect binary tree
- average path length to a node: ((1*0) + (2*1) + (4*2)) / 7 = 1.43
- average number of comparisons needed to find an integer in the tree: ((1*1) + (2*2) + (4*3)) / 7 = 2.43
- for a balanced binary search tree, search time is proportional to log2(n); if n = 7, log2(n) = 2.8 [an approximation]
- search time is O(log2 n)

[Figure: the same perfect binary tree with root 4]

Binary Search Tree


If N = 10000, average search cost:
list: N/2 = 5000 comparisons
balanced binary search tree: log2(10000) = 13.29 comparisons

5000/13.29 = about 376 times faster

Binary Search Tree Implementation




Need to consider
- inserting a node
- removing a node
- finding a node
- traversing a tree (visiting ALL nodes in a tree)

Need to define
- class Node
- class Tree

Binary Search Tree Implementation


class Node {
public:
    int data;
    Node *left;
    Node *right;
    Node();                 // default constructor
};

class Tree {
public:
    Node *root;             // initially NULL
    Tree();                 // default constructor
    void insert(Node*);
    Node *find(int);
    Node *remove(int);      // returns pointer to removed node, or NULL
};

Binary Search Tree Constructor


// Node constructor
Node::Node()
{
    data = 0;
    left = right = NULL;
}

// Tree constructor
Tree::Tree()
{
    root = NULL;
}

Insert Node

void Tree::insert(Node *n)
{
    if (root == NULL) {            // empty tree
        root = n;
        return;
    }
    Node *curr = root;
    Node *prev = NULL;
    // find the correct leaf to insert at
    while (curr != NULL) {
        prev = curr;
        if (n->data < curr->data)
            curr = curr->left;
        else
            curr = curr->right;    // greater than or equal goes right
    }
    if (n->data < prev->data)
        prev->left = n;            // insert node as a left child of the leaf
    else
        prev->right = n;           // insert node as a right child of the leaf
}

Finding nodes




Iterative and recursive find
- Iteration - while, for
- Recursion
  - Recursive algorithm - algorithm that finds a solution by reducing the problem to a smaller version of itself
  - Recursive function - function that calls itself
  - E.g., binary search tree definition - left/right subtrees are binary search trees themselves
- Some problems are naturally suited to (and much more easily implemented using) iteration, some to recursion

Finding nodes - Iterative


Node* Tree::find(int data)
{
    Node *n = root;              // make local copy of root
    while (n) {
        if (data == n->data)     // match
            return n;            // return pointer to found node
        if (data > n->data) {
            n = n->right;        // go right
        } else {
            n = n->left;         // go left
        }
    }
    return NULL;                 // NOT found
}

Finding nodes - Recursive


Node* Tree::find(int v)          // public method
{
    return find(root, v);        // call helper with root of tree
}

The root passed in to find(root, v) changes each time the function is called recursively: it is first the tree root, then the root of one of its subtrees, etc.

Finding nodes - Recursive


Node *Tree::find(Node *n, int data)     // private helper method
{
    if (n == NULL)                      // if empty subtree
        return NULL;                    // return NULL
    if (n->data == data)                // if found
        return n;                       // return pointer to Node
    if (data < n->data)                 // if data less than n->data
        return find(n->left, data);     // search left subtree
    return find(n->right, data);        // otherwise search right subtree
}

Can add cout << "calling function find" << endl; to keep track of how many times it is called recursively

Finding nodes - Recursive

Find 5 recursively (find(4, 5) means: search for 5 in the tree whose root is node 4)

find(5)                  // search for 5
  find(4, 5)             // search 5 in tree with root 4
    find(6, 5)           // search 5 in tree with root 6
      find(5, 5)         // search 5 in tree with root 5: found
      return 5
    return 5
  return 5
return 5

[Figure: tree with root 4, children 2 and 6, and leaves 1, 3, 5, 7]

Finding nodes


Binary search tree - finding nodes

Iterative version is more efficient, but there is probably not much in it
Largely a matter of programming style

Finding Tree Height - Recursive


int Tree::height()                  // public method
{
    return height(root) - 1;        // call helper with root node; NB. -1
}

int Tree::height(Node *n)           // private helper method
{
    if (n) {                                   // if n != NULL
        int lh = height(n->left);              // calculate height of left subtree
        int rh = height(n->right);             // calculate height of right subtree
        return (lh > rh) ? lh + 1 : rh + 1;    // return greater of lh and rh plus 1
    }
    return 0;                                  // empty subtree
}

Finding Tree Height - Recursive


height()                      -- height of tree
  height(4)                   -- height of tree with root 4
    height(2)                 -- height of tree with root 2
      height(1)               -- height of tree with root 1
        height(NULL)          -- left
        return 0
        height(NULL)          -- right
        return 0
      return 1                -- 1 + max(0, 0)
      height(NULL)            -- right
      return 0
    return 2                  -- 1 + max(1, 0)
    height(6)                 -- height of tree with root 6
      height(NULL)            -- left
      return 0
      height(NULL)            -- right
      return 0
    return 1                  -- 1 + max(0, 0)
  return 3                    -- 1 + max(2, 1)
return 2                      -- subtract 1

[Figure: tree with root 4, children 2 and 6, and node 1 as left child of 2]

Finding Tree Height - Improved


int Tree::height()                                   // public method
{
    return (root) ? height(root) - 1 : -1;           // check for root == NULL; NB. -1
}

int Tree::height(Node *n)                            // private helper method
{
    if (n) {
        int lh = (n->left) ? height(n->left) : 0;    // check n->left
        int rh = (n->right) ? height(n->right) : 0;  // check n->right
        return (lh > rh) ? lh + 1 : rh + 1;          // return greater of lh and rh + 1
    }
    return 0;                                        // return 0
}

Finding Tree Height - Improved


height()                 -- height of tree
  height(4)              -- height of tree with root 4
    height(2)            -- height of tree with root 2
      height(1)          -- height of tree with root 1
      return 1           -- 1 + max(0, 0)
    return 2             -- 1 + max(1, 0)
    height(6)            -- height of tree with root 6
    return 1             -- 1 + max(0, 0)
  return 3               -- 1 + max(2, 1)
return 2                 -- subtract 1

[Figure: tree with root 4]

Eliminated all the calls to height(NULL) that the previous version had

Tree Traversal
- visit ALL nodes in the tree and perform some action on each node (e.g. print)
- in-order - traverse left subtree, then node, then right subtree
  1, 2, 4, 6
- reverse order - traverse right subtree, then node, then left subtree
  6, 4, 2, 1
- pre-order - visit the node, then traverse left subtree, then right subtree
  4, 2, 1, 6
- post-order - traverse left subtree, then right subtree, then node
  1, 2, 6, 4

[Figure: tree with root 4, children 2 and 6, and node 1 as left child of 2]

Tree Traversal

Recursive implementation is by far the easier approach
Iterative versions are possible, but are surprisingly complex

In-order tree traversal


void Tree::inOrder()                // public method
{
    inOrder(root);                  // call helper with root node
}

void Tree::inOrder(Node *n)         // private helper method
{
    if (n) {                        // if n != NULL
        inOrder(n->left);           // handle left subtree
        cout << n->data << endl;    // perform action on node
        inOrder(n->right);          // handle right subtree
    }
}

Reverse-order tree traversal


void Tree::revOrder()               // public method
{
    revOrder(root);                 // call helper with root node
}

void Tree::revOrder(Node *n)        // private helper method
{
    if (n) {                        // if n != NULL
        revOrder(n->right);         // handle right subtree
        cout << n->data << endl;    // perform action on node
        revOrder(n->left);          // handle left subtree
    }
}

Pre-order tree traversal


void Tree::preOrder()               // public method
{
    preOrder(root);                 // call helper with root node
}

void Tree::preOrder(Node *n)        // private helper method
{
    if (n) {                        // if n != NULL
        cout << n->data << endl;    // perform action on node
        preOrder(n->left);          // handle left subtree
        preOrder(n->right);         // handle right subtree
    }
}

How would we implement post-order tree traversal?


Node deletion

3 cases to consider:
1. deleting a leaf node (e.g. node 8)
2. deleting a node with one subtree (e.g. node 7)
3. deleting a node with two subtrees (e.g. node 6)

[Figure: example tree with root 10; nodes shown include 6, 12, 2, 7 and 1]

Node deletion - leaf node

E.g. delete 8
Simply remove the node

[Figure: the example tree with root 10 after node 8 has been removed]

Node deletion - one subtree

E.g. delete 7
Delete the node
Make its parent point to the subtree

[Figure: the example tree with root 10, showing the parent of node 7 linked to node 7's subtree]

Node deletion - two subtrees

E.g. delete node 6
- find the largest node in the left subtree of node 6 (value 4) (or the smallest in the right subtree)
- overwrite "node 6" with the largest value
- make the parent of the "largest node" point to the node pointed to by the "largest node's" left link [NB. its right link will be NULL, as nothing in the subtree is larger than that node]
- overwriting the content of a node may not be such a good idea if other pointers are pointing to it [e.g. node may be in more than one tree] or if a pointer to the deleted node is to be returned
- may need to move the "largest node" in the left subtree instead

[Figure: the example tree with root 10, with node 6 being replaced by value 4]

Delete implementation


Node* remove(int data)
- finds the Node to remove and calls removeN
- returns a pointer to the removed Node, or NULL

Helper method void removeN(Node*& np)
- parameter is a reference to a Node*
- need to modify the pointer to the Node which is being removed
- the pointer to the Node being removed is stored in its parent node, hence pass a reference to it so it can be modified
- handles the 3 cases

Good example in Malik!

Delete implementation
Node* Tree::remove(int v)
{
    Node *pp = NULL, *p = root;
    while (p) {
        if (v == p->data) {
            if (pp == NULL)
                removeN(root);                               // special case if root
            else                                             // pass reference to pointer to be modified
                removeN(v < pp->data ? pp->left : pp->right);
            p->left = p->right = NULL;                       // set left and right pointers to NULL
            return p;                                        // return pointer to removed node
        }
        pp = p;                                              // remember parent
        p = (v < p->data) ? p->left : p->right;              // go left or right
    }
    return NULL;                                             // NOT found
}

Delete implementation
void Tree::removeN(Node*& np)                 // removeN helper
{
    if ((np->left == NULL) && (np->right == NULL)) {     // case 1
        np = NULL;                            // update pointer to removed Node
        return;
    }
    if ((np->left == NULL) || (np->right == NULL)) {     // case 2
        np = (np->left == NULL) ? np->right : np->left;  // update pointer to removed Node
        return;
    }
    Node *p = np->left, *pp = NULL;           // case 3
    while (p->right) {                        // find largest Node in left subtree
        pp = p;                               // pointer to parent
        p = p->right;                         // go right
    }
    if (pp == NULL)                           // bypass largest node
        np->left = p->left;                   // need to think about this!
    else
        pp->right = p->left;                  // need to think about this!
    p->left = np->left;                       // node p replaces node np
    p->right = np->right;                     // copy left and right
    np = p;                                   // make switch
}

Delete example

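Delete node 4
- node 3 is the largest node in the left subtree of node 4
- node 4 must be made to point to node 2 [np->left = p->left]
- need to swap nodes 3 and 4 so that a pointer to a detached node 4 can be returned

[Figure: tree with root 8; np points to node 4 and p points to node 3, whose left child is node 2; other nodes shown are 12, 6 and 7]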

Delete example
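"Swap" node 3 and node 4:
p->left = np->left     [1]
p->right = np->right   [2]
np = p                 [3]

Left and right pointers of removed node 4 are set to NULL in remove()

[Figure: tree with root 8 after the swap; node 3 (p) takes the place of node 4, with the updated links labelled [1], [2] and [3]]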

Delete example



Check out the animation of inserting, searching and removing items in a binary search tree:
http://www.cs.jhu.edu/~goodrich/dsa/trees/btree.html

Destructor



Destructor for the Node class

Node::~Node()
{
    cout << "node destructor" << endl;
    if (left != NULL) delete left;     // recursively deletes the left subtree
    if (right != NULL) delete right;   // recursively deletes the right subtree
}

Destructor for the Tree class

Tree::~Tree()
{
    cout << "tree destructor" << endl;
    if (root != NULL) delete root;
}

Nonrecursive in order traversal




Why does the recursive version work?

void Tree::inOrder(Node *n)         // private helper method
{
    if (n) {                        // if n != NULL
        inOrder(n->left);           // handle left subtree
        cout << n->data << endl;    // perform action on node
        inOrder(n->right);          // handle right subtree
    }
}

- each recursive call creates a procedure frame on the stack
- each frame contains its own local copy of parameter n
- when the function returns, the previous stack frame and its n are restored
- hence a stack is used to remember n
- the non-recursive solution is based on creating our own stack

Nonrecursive in order traversal


void Tree::inOrder()                     // non-recursive version
{
    Stack stack;                         // local stack of Node*
    Node *p = root;
    while (p || !stack.isEmpty()) {      // keep going while p != NULL or stack not empty
        if (p) {                         // before going left
            stack.push(p);               // push p on stack
            p = p->left;                 // then go left
        } else {
            p = stack.pop();             // pop p from stack
            cout << p->data << endl;     // output value
            p = p->right;                // go right
        }
    }
}
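
The Stack class used above is not defined in these slides. A self-contained variant can be sketched with std::stack from the standard library instead. This is an assumption-based illustration: the free function inOrderIterative, the stand-alone Node struct and the choice to return the visited values in a vector are not part of the course code.

```cpp
#include <stack>
#include <vector>

struct Node {               // minimal stand-in for the course's Node class
    int data;
    Node *left;
    Node *right;
};

// Iterative in-order traversal; returns the values in the order visited.
std::vector<int> inOrderIterative(Node *root) {
    std::vector<int> visited;
    std::stack<Node *> pending;          // plays the role of the slides' Stack
    Node *p = root;
    while (p || !pending.empty()) {
        if (p) {
            pending.push(p);             // remember p before going left
            p = p->left;
        } else {
            p = pending.top();           // std::stack::pop() returns void,
            pending.pop();               // so read top() first, then pop()
            visited.push_back(p->data);  // "output" the value
            p = p->right;                // go right
        }
    }
    return visited;
}
```

The only structural difference from the slide version is that std::stack splits "pop" into top() followed by pop().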

Nonrecursive in order traversal


- go left first, pushing pointers to nodes 8, 4 and 3 on stack
- as p == NULL and stack not empty, pop stack [p -> node 3]
- output "3"
- go right [p = NULL]
- as p == NULL and stack not empty, pop stack [p -> node 4]
- output "4"
- go right [p -> node 6]
- push -> node 6 and go left [p = NULL]
- as p == NULL and stack not empty, pop stack [p -> node 6]
- output "6"
- go right [p -> node 7]
- etc.

[Figure: tree with root 8 and nodes 4, 12, 6 and 7, next to a stack holding ->8, ->4 and ->3, with the stack pointer sp]