
Unit 3: Advanced abstract data types and structures

Computers can store and process vast amounts of data. Formal data structures enable a
programmer to mentally structure large amounts of data into conceptually manageable
relationships.

Sometimes we use data structures to allow us to do more: for example, to accomplish fast
searching or sorting of data. Other times, we use data structures so that we can do less: for
example, the concept of the stack is a limited form of a more general data structure. These
limitations provide us with guarantees that allow us to reason about our programs more easily.
Data structures also provide guarantees about algorithmic complexity — choosing an
appropriate data structure for a job is crucial for writing good software.

A data structure in which the data items are organized sequentially, with the elements attached
one after another, is called a linear data structure. The elements of a linear data structure
are traversed one after the other, and only one element can be reached directly at each step.
All the data items in a linear data structure can be traversed in a single run. These kinds of
data structures are easy to implement because computer memory is also organized in a linear
fashion. Examples of linear data structures are arrays, stacks, queues, and linked lists.

A data structure in which the data items are not organized sequentially is called a nonlinear
data structure. In other words, a data element of a nonlinear data structure can be connected
to more than one other element to reflect a special relationship among them. The data elements
of a nonlinear data structure cannot all be traversed in a single run. Examples of nonlinear
data structures are trees and graphs.

Chapter 7: Trees
We have introduced and used several sequential structures such as the array, linked list, stacks,
and queues. These structures organize data in a linear fashion in which the data elements have
a "before" and "after" relationship. They work well with many types of problems, but some
problems require data to be organized in a nonlinear fashion. In this chapter, we explore the
tree data structure, which can be used to arrange data in a hierarchical order. Trees can be used
to solve many different problems, including those encountered in data mining, database
systems, encryption, artificial intelligence, computer graphics, and operating systems.

I. Definitions
A tree structure consists of nodes and edges that organize data in a hierarchical fashion. The
relationships between data elements in a tree are similar to those of a family tree: "child,"
"parent," "ancestor," etc. The data elements are stored in nodes and pairs of nodes are
connected by edges. The edges represent the relationship between the nodes that are linked
with arrows or directed edges to form a hierarchical structure resembling an upside-down tree
complete with branches, leaves, and even a root.
Formally, we can define a tree as a set of nodes that either is empty or has a node called the
root that is connected by edges to zero or more subtrees to form a hierarchical structure. Each
subtree is itself by definition a tree.
A classic example of a tree structure is the representation of directories and subdirectories in a
file system. The top tree in Figure 1 illustrates the hierarchical nature of a student’s home
directory in the UNIX file system. Trees can be used to represent structured data, which results
in the subdivision of data into smaller and smaller parts. A simple example of this use is the
division of a book into its various parts of chapters, sections, and subsections, as illustrated by
the bottom tree in Figure 1. Trees are also used for making decisions. One that you are most
likely familiar with is the phone, or menu, tree. When you call customer service for most
businesses today, you are greeted with an automated menu that you have to traverse. The
various menus are nodes in a tree and the menu options from which you can choose are
branches to other nodes.
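
The recursive definition and the book example above can be sketched in Python. This is only an illustration: the class name TreeNode, its fields, the count_nodes helper, and the small book outline are assumptions introduced here, and the empty tree is represented by None.

```python
class TreeNode:
    """A node of a general tree: a data item plus zero or more subtrees."""

    def __init__(self, data, children=None):
        self.data = data
        # Each child is itself the root of a subtree.
        self.children = list(children) if children else []


def count_nodes(root):
    """Count the nodes of a tree; None stands for the empty tree."""
    if root is None:
        return 0
    return 1 + sum(count_nodes(child) for child in root.children)


# A hypothetical book outline: two chapters, the first with two sections.
book = TreeNode("Book", [
    TreeNode("Chapter 1", [TreeNode("Section 1.1"), TreeNode("Section 1.2")]),
    TreeNode("Chapter 2"),
])
print(count_nodes(book))  # 5
```

Because every child is the root of its own subtree, count_nodes needs no special cases beyond the empty tree.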

Figure 1: Example tree structures: a UNIX file system home directory (top) and the
subdivision of a book into its parts (bottom).

II. Characteristics of a tree


We use many terms to describe the different characteristics and components of trees. Most of
the terminology comes from that used to describe family relationships or botanical descriptions
of trees. Knowing some of these terms will help you grasp the tree structure and its use in
various applications.

Root
The topmost node of the tree is known as the root node. It provides the single access point into
the structure. The root node is the only node in the tree that does not have an incoming edge
(an edge directed toward it). Consider the sample tree in Figure 2(a). The node with value T is
the root of the tree. By definition, every non-empty tree must contain a root node.

Figure 2: A sample tree with: (a) the root node; and (b) a path from T to K.

Path
The other nodes in the tree are accessed by following the edges starting with the root and
progressing in the direction of the arrow until the destination node is reached. The nodes
encountered when following the edges from a starting node to a destination form a path. As
shown in Figure 2(b), the nodes labeled T, C, R, and K form a path from node T to node K.

Parent
The organization of the nodes forms relationships between the data elements. Every node,
except the root, has a parent node, which is identified by the incoming edge. A node can have
only one parent (or incoming edge) resulting in a unique path from the root to any other node
in the tree. There are a number of parent nodes in the sample tree: one is node X, which is the
parent of B and G, as shown in Figure 3(a).

Children
Each node can have zero or more child nodes, resulting in a parent-child hierarchy. The children
of a node are identified by the outgoing edges (directed away from the node). For example,
nodes B and G are the children of X. All nodes that have the same parent are known as siblings,
but there is no direct access between siblings. Thus, we cannot directly access node C from
node X or vice versa.

Nodes
Nodes that have at least one child are known as interior nodes while nodes that have no
children are known as leaf nodes. The interior nodes of the sample tree are shown with gray
backgrounds in Figure 3(b) and the leaf nodes are shown in white.

Figure 3: The sample tree with: (a) the parent, child, and sibling relationships; and (b)
the distinction between interior and leaf nodes.

Subtree
A tree is by definition a recursive structure. Every node can be the root of its own subtree,
which consists of a subset of nodes and edges of the larger tree. Figure 4 shows the subtree
with node C as its root.

Figure 4: A subtree with root node C.

Relatives
All of the nodes in a subtree are descendants of the subtree’s root. In the example tree, nodes
J, R, K, and M are descendants of node C. The ancestors of a node include the parent of the
node, its grandparent, its great-grandparent, and so on all the way up to the root. The ancestors
of a node can also be identified by the nodes along the path from the root to the given node.
The root node is the ancestor of every node in the tree and every node in the tree is a descendant
of the root node.
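
The ancestor and descendant relationships can be illustrated with a short Python sketch. The Node class, its parent link, and the two helper functions are assumptions made for this example. The tree reproduces only the relationships stated in the text for the sample tree; the exact placement of J and M below C is assumed.

```python
class Node:
    """Tree node that records its parent so ancestors can be followed upward."""

    def __init__(self, data, parent=None):
        self.data = data
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def ancestors(node):
    """Return the labels from the node's parent up to the root."""
    result = []
    current = node.parent
    while current is not None:
        result.append(current.data)
        current = current.parent
    return result


def descendants(node):
    """Return the labels of every node below the given node."""
    result = []
    for child in node.children:
        result.append(child.data)
        result.extend(descendants(child))
    return result


# Part of the sample tree; the positions of J and M are assumed.
t = Node("T")
x = Node("X", t)
c = Node("C", t)
b = Node("B", x)
g = Node("G", x)
j = Node("J", c)
r = Node("R", c)
k = Node("K", r)
m = Node("M", r)

print(ancestors(k))            # ['R', 'C', 'T']
print(sorted(descendants(c)))  # ['J', 'K', 'M', 'R']
```

Reading the ancestors of K in reverse gives the path T, C, R, K from the root.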

Note: The trees illustrated above used directed edges to indicate the parent-child relationship
between the nodes. But it’s not uncommon to see trees drawn using straight lines or undirected
edges. When a tree is drawn without arrows, we have to be able to deduce the parent-child
relationship from the placement of the nodes. Thus, the parent is always placed above its
children. With binary trees, the left and right children are always drawn offset from the parent
in the appropriate direction in order to easily identify the specific child node.

III. The Binary Tree


Trees can come in many different shapes, and they can vary in the number of children allowed
per node or in the way they organize data values within the nodes. One of the most commonly
used trees in computer science is the binary tree. A binary tree is a tree in which each node can
have at most two children. One child is identified as the left child and the other as the right
child. In the remainder of the chapter, we focus on the use and construction of the binary tree.
In the next chapter, we will continue our discussion of binary trees but also explore other types.

1) Properties

Binary trees come in many different shapes and sizes. The shapes vary depending
on the number of nodes and how the nodes are linked. Figure 5 illustrates three
different shapes of a binary tree consisting of nine nodes. There are a number of
properties and characteristics associated with binary trees, all of which depend on
the organization of the nodes within the tree.

Figure 5: Three different arrangements of nine nodes in a binary tree.

Tree Size
The nodes in a binary tree are organized into levels, with the root node at level 0, its children
at level 1, the children of the level 1 nodes at level 2, and so on. In family tree terminology,
each level corresponds to a generation. The binary tree in Figure 5(a), for example, contains
two nodes at level 1 (B and C), four nodes at level 2 (D, E, F, and G), and two nodes at
level 3 (H and I). The root node always occupies level 0.

The depth of a node is its distance from the root, with distance being the number of levels that
separate the two. A node’s depth corresponds to the level it occupies. Consider node G in the
three trees of Figure 5. In tree (a), G has a depth of 2, in tree (b) it has a depth of 3, and in (c)
its depth is 6.

The height of a binary tree is the number of levels in the tree. For example, the three binary
trees in Figure 5 have different heights: (a) has a height of 4, (b) has a height of 6, and (c),
with its nine nodes arranged one per level, has a height of 9.

The width of a binary tree is the number of nodes on the level containing the most nodes. In
the three binary trees of Figure 5, (a) has a width of 4, (b) has a width of 3, and (c) has a width
of 1. Finally, the size of a binary tree is simply the number of nodes in the tree. An empty tree
has a height of 0 and a width of 0, and its size is 0.
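
The size, height, and width defined above can be computed as in the following sketch. The Node class and the function names are assumptions made for illustration, and the small tree is hypothetical rather than one of the trees in Figure 5.

```python
from collections import deque


class Node:
    """Minimal binary tree node, assumed here for illustration."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def size(node):
    """Number of nodes in the tree rooted at node."""
    if node is None:
        return 0
    return 1 + size(node.left) + size(node.right)


def height(node):
    """Number of levels; an empty tree has a height of 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def width(root):
    """Number of nodes on the level containing the most nodes."""
    if root is None:
        return 0
    widest = 0
    level = deque([root])
    while level:
        widest = max(widest, len(level))
        # Replace the current level with the children of its nodes.
        for _ in range(len(level)):
            node = level.popleft()
            if node.left is not None:
                level.append(node.left)
            if node.right is not None:
                level.append(node.right)
    return widest


# Hypothetical tree with levels: A / B C / D E / F
tree = Node("A", Node("B", Node("D", Node("F")), Node("E")), Node("C"))
print(size(tree), height(tree), width(tree))  # 6 4 2
```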

Figure 6: Possible slots for the placement of nodes in a binary tree.


A binary tree of size n can have a maximum height of n, which results when there is one node
per level. This is the case with the binary tree in Figure 5(c). What is the minimum height of a
binary tree with n nodes? To determine this, we need to consider the maximum number of
nodes at each level since the nodes will have to be organized with each level at full capacity.
Figure 6 illustrates the slots for the possible placement of nodes within a binary tree. Since
each node can have at most two children, each successive level in the tree doubles the number
of nodes contained on the previous level. This corresponds to a given tree level i having a
capacity for $2^i$ nodes. If we sum the size of each level, when all of the levels are filled to
capacity, except possibly the last one, we find that the minimum height of a binary tree of size
n is $\lfloor \log_2 n \rfloor + 1$.
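
As a quick check of these bounds, the following sketch computes the capacity of a level and the minimum and maximum heights for a given number of nodes. The function names are assumptions made for this example. It relies on the fact that, for a positive Python integer n, n.bit_length() equals floor(log2(n)) + 1.

```python
def level_capacity(i):
    """Maximum number of nodes on level i of a binary tree."""
    return 2 ** i


def min_height(n):
    """Smallest possible height of a binary tree with n nodes."""
    if n == 0:
        return 0
    # For n >= 1, n.bit_length() equals floor(log2(n)) + 1.
    return n.bit_length()


def max_height(n):
    """Largest possible height: one node per level."""
    return n


# Three full levels hold 1 + 2 + 4 = 7 nodes, so nine nodes need a fourth level.
print(sum(level_capacity(i) for i in range(3)))  # 7
print(min_height(9), max_height(9))              # 4 9
```

The result for nine nodes agrees with the heights quoted for Figure 5: 4 for the most compact arrangement (a).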

Tree Structure
The height of the tree will be important in analysing the time-complexities of
various algorithms applied to binary trees. The structural properties of binary
trees can also play a role in the efficiency of an algorithm. In fact, some algorithms
require specific tree structures.

A full binary tree is a binary tree in which each interior node contains two children. Full trees
come in many different shapes, as illustrated in Figure 7.

Figure 7: Examples of full binary trees.

A perfect binary tree is a full binary tree in which all leaf nodes are at the same level. The
perfect tree has all possible node slots filled from top to bottom with no gaps, as illustrated in
Figure 8.

Figure 8: A perfect binary tree.


A binary tree of height h is a complete binary tree if it is a perfect binary tree down to height
h - 1 and the nodes on the lowest level fill the available slots from left to right leaving no gaps.
Consider the two complete binary trees in Figure 9. If any of the three leaf nodes labeled A, B,
or C in the left tree were missing, that tree would not be complete. Likewise, if either leaf node
labeled X or Y in the right tree were missing, it would not be complete.
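
The three structural properties can be tested as in the sketch below. The Node class and the function names are assumptions made for illustration. The completeness check walks the tree level by level and fails as soon as a node appears after an empty slot; this is one common way to test the property, not necessarily the approach intended by the text.

```python
from collections import deque


class Node:
    """Minimal binary tree node, assumed here for illustration."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def size(node):
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def is_full(node):
    """Every node has either no children or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False
    return is_full(node.left) and is_full(node.right)


def is_perfect(root):
    """All levels are filled: a tree of height h then holds 2**h - 1 nodes."""
    return size(root) == 2 ** height(root) - 1


def is_complete(root):
    """Level by level, no node may appear after an empty slot."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


# Hypothetical trees used only to exercise the three checks.
perfect = Node(1, Node(2), Node(3))
complete_only = Node(1, Node(2, Node(4)), Node(3))
full_only = Node(1, Node(2), Node(3, Node(6), Node(7)))

print(is_full(perfect), is_perfect(perfect), is_complete(perfect))  # True True True
print(is_full(complete_only), is_complete(complete_only))           # False True
print(is_full(full_only), is_complete(full_only))                   # True False
```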

2) Implementation

Binary trees are commonly implemented as a dynamic structure in the same fashion
as linked lists. A binary tree is a data structure that can be used to implement
many different abstract data types. Since the operations that a binary tree supports depend on
its application, we are going to create and work with the trees directly
instead of creating a generic binary tree class.

Trees are generally illustrated as abstract structures with the nodes represented
as circles or boxes and the edges as lines or arrows. To implement a binary tree,
however, we must explicitly store in each node the links to the two children along
with the data stored in that node. We define the BinTreeNode storage class,
shown in Listing 1, for creating the nodes in a binary tree. Like other storage
classes, the tree node class is meant for internal use only. Figure 10 illustrates
the physical implementation of the sample binary tree from Figure 4.

Figure 9: Examples of complete binary trees.

Listing 1: The binary tree node class.
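
A minimal sketch of such a storage class might look like the following. The field names data, left, and right are assumptions (left and right match the names used in the traversal pseudocode of this chapter), as is the constructor signature.

```python
class BinTreeNode:
    """Storage class for the nodes of a binary tree (illustrative sketch)."""

    def __init__(self, data):
        self.data = data
        self.left = None   # link to the left child, or None if absent
        self.right = None  # link to the right child, or None if absent


# Linking three hypothetical nodes by hand: A becomes the parent of B and C.
root = BinTreeNode("A")
root.left = BinTreeNode("B")
root.right = BinTreeNode("C")
print(root.left.data, root.right.data)  # B C
```

A leaf node is simply a node whose two links are both None.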

Figure 10: The physical implementation of a binary tree.

3) Tree Traversals
The operations that can be performed on a binary tree depend on the application, especially the
construction of the tree. In this section, we explore the tree traversal operation, which is one of
the most common operations performed on collections of data. Remember, a traversal iterates
through a collection, one item at a time, in order to access or visit each item. The actual
operation performed when "visiting" an item is application dependent, but it could involve
something as simple as printing the data item or saving it to a file.

With a linear structure such as a linked list, the traversal is rather easy since we can start with
the first node and iterate through the nodes, one at a time, by following the links between the
nodes. But how do we visit every node in a binary tree? There is no single path from the root
to every other node in the tree. Remember, the links between the nodes lead us down into the
tree. If we were to simply follow the links, once we reach a leaf node we cannot directly access
another node in the tree.

Preorder Traversal
A tree traversal must begin with the root node, since that is the only access into
the tree. After visiting the root node, we can then traverse the nodes in its left
subtree followed by the nodes in its right subtree. Since every node is the root
of its own subtree, we can repeat the same process on each node, resulting in a
recursive solution. The base case occurs when a null child link is encountered since
there will be no subtree to be processed from that link. The recursive operation
can be viewed graphically, as illustrated in Figure 11.

Figure 11: Trees are traversed recursively.

Figure 12: The logical ordering of the nodes with a preorder traversal.

Consider the binary tree in Figure 12. The dashed lines show the logical order in which
the nodes would be visited during the traversal: A, B, D, E, H, C, F, G, I, J. This traversal
is known as a preorder traversal since we first visit the node followed by the subtree traversals.
The recursive function for a preorder traversal of a binary tree is rather simple, as shown
below. The node argument will either be a null reference or a reference to the root of a subtree
in the binary tree. If the reference is not null, the node is first visited and then the two subtrees
are traversed. By convention, the left subtree is always visited before the right subtree. The
argument will be a null reference when the binary tree is empty, or when we attempt to follow
a non-existent link for one or both of the children.

preorder(node)
    if node = null then return
    visit(node)
    preorder(node.left)
    preorder(node.right)
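
The preorder traversal can be written in Python roughly as follows. The BinTreeNode constructor and the visit callback are assumptions made for this sketch, and the shape of the example tree is inferred from the visiting orders quoted in the text rather than taken from the figure.

```python
class BinTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def preorder(subtree, visit):
    """Visit the node first, then traverse its left and right subtrees."""
    if subtree is not None:
        visit(subtree.data)
        preorder(subtree.left, visit)
        preorder(subtree.right, visit)


# Example tree; its shape is inferred from the orderings given in the text.
N = BinTreeNode
tree = N("A",
         N("B", N("D"), N("E", N("H"))),
         N("C", N("F"), N("G", N("I"), N("J"))))

order = []
preorder(tree, order.append)
print(" ".join(order))  # A B D E H C F G I J
```

Passing the visit operation as a function keeps the traversal independent of what is done at each node.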

Figure 13: The logical ordering of the nodes with an inorder traversal.

Inorder Traversal
In the preorder traversal, we chose to first visit the node and then traverse both
subtrees. Another traversal that can be performed is the inorder traversal, in
which we first traverse the left subtree and then visit the node followed by the
traversal of the right subtree. Figure 13 shows the logical ordering of the node
visits in the example tree: D, B, H, E, A, F, C, I, G, J.

The recursive function for an inorder traversal of a binary tree is provided
below. It is almost identical to the preorder traversal function. The only
difference is that the visit operation is moved to follow the traversal of the left subtree.

inorder(node)
    if node = null then return
    inorder(node.left)
    visit(node)
    inorder(node.right)
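
One possible Python version of the inorder traversal is sketched below as a generator, so the caller decides what visiting means. The BinTreeNode constructor is an assumption, and the tree shape is again inferred from the orderings quoted in the text.

```python
class BinTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def inorder(subtree):
    """Yield the left subtree's items, then the node, then the right subtree's."""
    if subtree is not None:
        yield from inorder(subtree.left)
        yield subtree.data
        yield from inorder(subtree.right)


# Example tree; its shape is inferred from the orderings given in the text.
N = BinTreeNode
tree = N("A",
         N("B", N("D"), N("E", N("H"))),
         N("C", N("F"), N("G", N("I"), N("J"))))

print(" ".join(inorder(tree)))  # D B H E A F C I G J
```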

Postorder Traversal
We can also perform a postorder traversal, which can be viewed as the opposite
of the preorder traversal. In a postorder traversal, the left and right subtrees of each node are
traversed before the node is visited. The recursive function is provided below.

postorder(node)
    if node = null then return
    postorder(node.left)
    postorder(node.right)
    visit(node)

The example tree with the logical ordering of the node visits in a postorder
traversal is shown in Figure 14. The nodes are visited in this order: D, H, E,
B, F, I, J, G, C, A. You may notice that the root node is always visited first
in a preorder traversal but last in a postorder traversal.
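
A postorder traversal can be sketched in Python as a function that returns the visiting order as a list. The BinTreeNode constructor is an assumption, and the tree shape is inferred from the orderings quoted in the text.

```python
class BinTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def postorder(subtree):
    """Return the items of both subtrees followed by the node itself."""
    if subtree is None:
        return []
    return postorder(subtree.left) + postorder(subtree.right) + [subtree.data]


# Example tree; its shape is inferred from the orderings given in the text.
N = BinTreeNode
tree = N("A",
         N("B", N("D"), N("E", N("H"))),
         N("C", N("F"), N("G", N("I"), N("J"))))

result = postorder(tree)
print(" ".join(result))  # D H E B F I J G C A
```

The root appears last in the result, as noted above.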

Figure 14: The logical ordering of the nodes with a postorder traversal.

Breadth-First Traversal
The preorder, inorder, and postorder traversals are all examples of a depth-first traversal. That
is, the nodes are traversed deeper in the tree before returning to higher-level nodes. Another
type of traversal that can be performed on a binary tree is the breadth-first traversal. In a
breadth-first traversal, the nodes are visited by level, from left to right. Figure 15 shows the
logical ordering of the nodes in a breadth-first traversal of the example tree.

Figure 15: The logical ordering of the nodes with a breadth-first traversal.

The recursive approach used for the depth-first traversals does not directly produce a
breadth-first ordering, since the recursive calls follow the links that lead deeper into the
tree. Instead, we must devise another approach. Your
first attempt might be to visit a node followed by its two children. Thus, in the example tree
we would visit node A followed by nodes B and C, which is the correct ordering. But what
happens when we visit node B? We can’t visit its two children, D and E, until after we have
visited node C. What we need is a way to remember or save the two children of B until after C
has been visited. Likewise, when visiting node C, we will have to save its two children until
after the children of B have been visited. After visiting node C, we have saved four nodes: D,
E, F, and G, which are the next four to be visited, in the order they were saved. The best way
to save a node’s children for later access is to use a queue. We can then use an iterative loop to
move across the tree in the correct node order to produce a breadth-first traversal.

The following algorithm uses a queue to implement the breadth-first traversal. The process
starts by saving the root node and in turn priming the iterative loop. During each
iteration, we remove a node from the queue, visit it, and then add its children to
the queue. The loop terminates after all nodes have been visited.
levelorder(root)
    if root = null then return
    queue<node> q
    q.push(root)
    while not q.empty do
        node = q.pop
        visit(node)
        if node.left ≠ null then q.push(node.left)
        if node.right ≠ null then q.push(node.right)
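
The queue-based algorithm can be sketched in Python with collections.deque serving as the queue. The BinTreeNode constructor and the function name breadth_first are assumptions, and the tree shape is inferred from the orderings quoted in the text.

```python
from collections import deque


class BinTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def breadth_first(root):
    """Return the node data level by level, from left to right."""
    order = []
    if root is None:
        return order
    queue = deque([root])  # prime the loop with the root node
    while queue:
        node = queue.popleft()
        order.append(node.data)
        # Save the children so they are visited after the current level.
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return order


# Example tree; its shape is inferred from the orderings given in the text.
N = BinTreeNode
tree = N("A",
         N("B", N("D"), N("E", N("H"))),
         N("C", N("F"), N("G", N("I"), N("J"))))

print(" ".join(breadth_first(tree)))  # A B C D E F G H I J
```

A deque is used because removing from the front of a plain Python list takes time proportional to its length.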

Examples of Tree Traversals

preorder: 50, 30, 20, 40, 90, 100
inorder: 20, 30, 40, 50, 90, 100
postorder: 20, 40, 30, 100, 90, 50
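
These three orderings can be reproduced with a short sketch. The tree built below is an assumption: it is the shape that the preorder and inorder sequences together imply (root 50, left child 30 with children 20 and 40, right child 90 with a single right child 100). The BinTreeNode constructor and the traverse function are also introduced only for this example.

```python
class BinTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def traverse(node, order):
    """Return the node data in 'pre', 'in', or 'post' order."""
    if node is None:
        return []
    left = traverse(node.left, order)
    right = traverse(node.right, order)
    if order == "pre":
        return [node.data] + left + right
    if order == "in":
        return left + [node.data] + right
    return left + right + [node.data]


# Tree shape implied by the preorder and inorder sequences above.
N = BinTreeNode
tree = N(50, N(30, N(20), N(40)), N(90, None, N(100)))

print(traverse(tree, "pre"))   # [50, 30, 20, 40, 90, 100]
print(traverse(tree, "in"))    # [20, 30, 40, 50, 90, 100]
print(traverse(tree, "post"))  # [20, 40, 30, 100, 90, 50]
```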

Exercise: types of trees
