
Introduction to Data Structure 1

Unit 1: Introduction to Data Structure


Notes
Structure
1.1 Introduction
1.2 Data Type
1.3 Array
1.3.1 One Dimensional Array
1.3.2 Strings
1.3.3 Two Dimensional Array
1.3.4 Multi-dimensional Array
1.4 Pointer
1.4.1 Pointer Notation
1.4.2 Pointer Declaration
1.4.3 Initialization of Pointer Variables
1.4.4 Accessing the Address of Variable
1.5 Pointer Variable
1.6 Summary
1.7 Check Your Progress
1.8 Questions and Exercises
1.9 Key Terms
1.10 Further Readings

Objectives
After studying this unit, you should be able to:
• Understand the concept of data types.
• Discuss arrays and their types.
• Explain pointers.

1.1 Introduction
The power of a programming language depends, among other things, on the range of
different types of data it can handle.
Inside a digital computer, at the lowest level, all data and instructions are stored
using only binary digits (0 and 1). Thus, the decimal number 65 is stored as its binary
equivalent: 0100 0001. The character “A” is also stored as the binary equivalent of 65
(the ASCII code of “A”): 0100 0001. Both stored values are the same but represent
different types of values. How is that possible?
The interpretation of a stored value depends on the type of the variable in which the
value is stored. In memory or on a secondary storage device, the value is just the bit
pattern 0100 0001. If it is stored in an integer type variable, it is interpreted as the
integer value 65, whereas if it is stored in a character type variable, it represents “A”.

Amity Directorate of Distance & Online Education


2 Data and File Structure Using ‘C’

Therefore, the way a value stored in a variable is interpreted is known as its data
type. In other words, data type of a variable is the type of data it can store.
1.2 Data Type
Every computer language has its own set of data types it supports. Also, the size of the
data types (number of bytes necessary to store the value) varies from language to
language. Besides, it is also hardware platform dependent.
C has a rich set of data types that is capable of catering to all the programming
requirements of an application. The C-data types may be classified into two categories:
Primary and Composite data types as shown below.

Primary data types: void, char, int, float, double
Composite data types: array, pointer, structure, union, enum, etc.

Figure 1.1: C Data Type

Primary Data Types


There are five primary data types in C language.
1. char : stores a single character belonging to the character set of the C
language.
2. int : stores signed integers, i.e., positive or negative whole numbers.
3. float : stores real numbers with single precision (about six significant decimal
digits).
4. double : stores real numbers with double precision; it typically occupies twice the
storage space required by float.
5. void : specifies no value.
The following table shows the meaning and storage spaces required by various primary
data types.
Table 1.1: Primary Data Type

Data Type   Meaning                                    Storage Space   Format   Range of Values
char        A character                                1 byte          %c       ASCII character set
int         An integer                                 2 bytes         %d       –32768 to +32767
float       A single precision floating point number   4 bytes         %f       –3.4 × 10^38 to +3.4 × 10^38
double      A double precision floating point number   8 bytes         %lf      –1.7 × 10^308 to +1.7 × 10^308
void        Valueless or empty                         0 bytes         –        –

The sizes shown assume a 16-bit compiler; on most current compilers an int occupies 4 bytes.

In addition to these data types, C also has data type qualifiers – short, long, signed
and unsigned. Thus an integer type data may be defined in C as short int, int, unsigned
int, long int. The range of values and size of these qualified data-types is

implementation dependent. However, short is never larger than int, which in turn is
never larger than long. An unsigned int can hold larger positive values than an int of
the same size since it does not store negative integers.
Composite Data Types
Also known as derived data types, composite data types are derived from the basic
data types. They are 5 in number.
1. array : A sequence of objects, all of the same type, that share one name.
e.g.: int num [5];
reserves a sequence of five locations of 2 bytes each (assuming a 2-byte int) for
storing the integers num[0], num[1], num[2], num[3] and num[4].
2. pointer : Used to store the address of a memory location.
3. structure : A collection of variables of different types.
e.g.: A structure of an employee’s data, i.e., name, age, salary.
4. union : A collection of variables of different types sharing a common memory
space.
5. enumerated data type : Its members are constants that are written as identifiers
though they have signed integer values. These constants represent values that can
be assigned to the corresponding enumeration variables.
An enumeration may be defined as enum tag {member1, member2, …, member n};
e.g.: enum colors {red, green, blue, cyan};
enum colors foreground, background;
The first line defines an enumeration named “colors”, which may take any one of the
four values listed in the curly braces. In the second line, the variables foreground and
background of the enumerated data type “colors” are declared.

1.3 Array
An array is a group of data items of same data type that share a common name.
Ordinary variables are capable of holding only one value at a time. If we want to store
more than one value at a time in a single variable, we use arrays.
An array is a collective name given to a group of similar variables. Each member in
the group is referred to by its position in the group. Arrays are allotted the memory in a
strictly contiguous fashion. The simplest array is one dimensional array which is a list of
variables of same data type. An array of one dimensional arrays is called a two
dimensional array.

1.3.1 One Dimensional Array


A list of items can be given one variable name using only one subscript and such a
variable is called a one dimensional array.
e.g.: If we want to store a set of five numbers in an array variable named number, it
can be accomplished in the following way:
int number [5];
This declaration will reserve five contiguous memory locations as shown below:
number [0] number [1] number [2] number [3] number [4]

As C performs no bounds checking, care should be taken to ensure that the array
indices are within the declared limits. Also, indexing in C begins from 0 and not from 1.


Array Declaration

Arrays are defined in the same manner as ordinary variables, except that each array
name must be accompanied by the size specification.
The general form of array declaration is:
data-type array-name [size];
data-type specifies the type of array, size is a positive integer number or symbolic
constant that indicates the maximum number of elements that can be stored in the
array.
e.g.: float height [50];
This declaration declares an array named height containing 50 elements of type
float.
Note: The compiler will interpret the first element as height [0]. In C, the array
elements are indexed from 0 to size-1.

Array Initialization
The elements of an array can be initialized in the same way as the ordinary variables,
when they are declared. Given below are some examples which show how the arrays
are initialized.
static int num [6] = {2, 4, 5, 45, 12};
static int n [ ] = {2, 4, 5, 45, 12};
static float press [ ] = {12.5, 32.4, -23.7, -11.3};
In these examples note the following points:
(a) Until the elements of an automatic (local) array are given specific values, they
contain garbage values.
(b) Older (pre-ANSI) compilers required an array that is initialized where it is declared
to have storage class static or extern; ANSI C also allows automatic arrays to be
initialized. If the storage class is static, all elements that are not explicitly
initialized are set to 0. In the first example, num [5] is therefore 0.
(c) If the array is initialized where it is declared, mentioning the dimension of the array
is optional.

Accessing Elements of an Array


Once an array is declared, individual elements of the array are referred using subscript
or index number. This number specifies the element’s position in the array. All the
elements of the array are numbered starting from 0. Thus number [5] is actually the
sixth element of an array.

Entering Data into an Array


It can be explained by the following example:
    #include <stdio.h>

    int main(void)
    {
        int num[6];
        int count;

        for (count = 0; count < 6; count++)
        {
            printf("\n Enter element %d: ", count + 1);
            scanf("%d", &num[count]);
        }
        return 0;
    }
In this example, the for loop carries out the process of asking for and receiving the
values. When count has the value zero, the scanf() statement will cause the
value to be stored at num [0]. This process continues until count has a value greater
than 5.

Reading Data from an Array


Consider the program given above. It has entered 6 values in the array num. Now to
read values from this array, we will again use a for loop to access each cell. The given
program segment explains the retrieval of the values from the array.
    for (count = 0; count < 6; count++)
    {
        printf("\n Value %d = %d", count + 1, num[count]);
    }

Memory Representation of Array


Consider the following array declaration:
int arr[8];
16 bytes get immediately reserved in memory because each of the 8 integers would
be 2 bytes long (assuming a 2-byte int). Since the array is not being initialized, all
eight values present in it would be garbage values.
Whatever the initial values, all the array elements are always present in
contiguous memory locations. This arrangement of array elements in memory is shown
below.
In C, there is no check to see if the subscript used for an array exceeds the size of
the array. Data entered with a subscript exceeding the array size will simply be placed
in memory outside the array. This will lead to unpredictable results and there will be no
error message to warn you that you are going beyond the array size. Ensuring that
you do not reach beyond the array size is entirely the programmer’s responsibility and
not the compiler’s.
e.g.:
1. A program to find average marks obtained by a class of 30 students in a test.
    #include <stdio.h>

    int main(void)
    {
        float avg, sum = 0;
        int i;
        int marks[30];                  /* array declaration */

        for (i = 0; i < 30; i++)
        {
            printf("\n Enter marks: \t");
            scanf("%d", &marks[i]);     /* store data in array */
        }
        for (i = 0; i < 30; i++)
            sum = sum + marks[i];       /* read data from an array */
        avg = sum / 30;
        printf("\nAverage marks: %f", avg);
        return 0;
    }


2. Program to read in a one dimensional character array, convert all the elements to
upper case, and then write out the converted array
    #include <stdio.h>
    #include <ctype.h>              /* for toupper() */
    #define SIZE 80

    int main(void)
    {
        char letter[SIZE];
        int count;

        for (count = 0; count < SIZE; ++count)      /* read in the line */
            letter[count] = getchar();
        for (count = 0; count < SIZE; ++count)      /* write out the line in upper case */
            putchar(toupper((unsigned char)letter[count]));
        return 0;
    }
It is sometimes convenient to define an array size in terms of a symbolic constant
rather than a fixed integer quantity. This makes it easier to modify a program that
utilizes an array, since all references to the maximum array size can be altered simply
by changing the value of the symbolic constant.

1.3.2 Strings
Just as a group of integers can be stored in an integer array, group of characters can be
stored in a character array or “strings”. The string constant is a one dimensional array of
characters terminated by null character (‘\0’). This null character ‘\0’ (ASCII value 0) is
different from ‘0’ (ASCII value 48).
The terminating null character is important because it is the only way the function
that works with string can know where the string ends.
e.g.: static char name[] = {'K', 'R', 'I', 'S', 'H', '\0'};
This example shows the declaration and initialization of a character array. The array
elements of a character array are stored in contiguous locations with each element
occupying one byte of memory.

Note:
1. Contrary to the numeric array where a 5 digit number can be stored in one array
cell, in the character arrays only a single character can be stored in one cell. So in
order to store an array of strings, a 2-dimensional array is required.
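2. As scanf() with the %s format stops at the first white space, it cannot receive a
multi-word string. Such strings were traditionally entered using gets(); because gets()
cannot limit the length of the input and was removed from the C standard in C11,
fgets() is the safer choice.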

1.3.3 Two Dimensional Array


It is possible to have an array of more than one dimension. A two dimensional array
(2-D array) is an array of 1-dimensional arrays.
A two dimensional array is also called a matrix. Consider the following table:
           Item 1   Item 2   Item 3
Sales 1    300      275      365
Sales 2    210      190      325
Sales 3    405      235      240
Sales 4    260      300      380

This is a table of four rows and three columns. Such a table of items can be defined
using two dimensional arrays.
The general form of declaring a 2-D array is
data_type array_name [row_size] [column_size];
e.g.: int marks [4] [2];
It will declare an integer array marks of four rows and two columns. An element of this
array can be accessed by the manipulation of both the indices.
printf ("%d", marks [2] [1]); will print the element present in the third row and second
column.

Initialization of a 2-Dimensional Array


Two dimensional arrays may be initialized by a list of initial values enclosed in braces
following their declaration.
e.g.: static int table [2] [3] = {0, 0, 0, 1, 1, 1};
initializes the elements of the first row to 0 and the second row to one. The initialization
is done by row.
The aforesaid statement can be equivalently written as
static int table [2] [3] = {{0, 0, 0}, {1, 1, 1}};
by surrounding the elements of each row by braces.
We can also initialize a two dimensional array in the form of a matrix as shown
below:
static int table [2] [3] = {{0, 0, 0},
{1, 1, 1}};
Note the syntax of the above statement: commas are required after each brace that
closes off a row, except in the case of the last row.
If the values are missing in an initializer, they are automatically set to 0. For
instance, the statement
static int table [2] [3] = {{1, 1},
{2}};
will initialize the first two elements of the first row to one, the first element of the second
row to two, and all the other elements to 0.
When all the elements are to be initialized to 0, the following short cut method may
be used.
static int m [3] [5] = {{0}, {0}, {0}};
The first element of each row is explicitly initialized to 0 while other elements are
automatically initialized to 0.
While initializing an array, it is necessary to mention the second (column)
dimension, whereas the first dimension (row) is optional. Thus, the following
declarations are acceptable.
static int arr [2] [3] = {12, 34, 23, 45, 56, 45};
static int arr [ ] [3] = {12, 34, 23, 45, 56, 45 };


Memory Representation of Two Dimensional Array

In memory, whether it is a one dimensional or a two dimensional array, the array
elements are stored in one continuous chain.
The arrangement of array elements of a two dimensional array of students, which
contains roll numbers in one column and the marks in the other (in memory), is shown
below:
e.g.:
1. Program that stores roll number and marks obtained by a student side by side in a
matrix
    #include <stdio.h>

    int main(void)
    {
        int stud[4][2];
        int i;

        for (i = 0; i <= 3; i++)
        {
            printf("\n Enter roll no. and marks");
            scanf("%d%d", &stud[i][0], &stud[i][1]);
        }
        for (i = 0; i <= 3; i++)
            printf("%d %d\n", stud[i][0], stud[i][1]);
        return 0;
    }
There are two parts to the program. In the first part, through a for loop, we read in
the values of roll number and marks, whereas in the second part, through another for
loop, we print out these values.
2. Program to print a multiplication table, using two dimensional array.
    #include <stdio.h>
    #define ROWS 5
    #define COLUMNS 5

    int main(void)
    {
        int row, column, product[ROWS][COLUMNS];
        int i, j;

        printf("MULTIPLICATION TABLE\n");
        printf("   ");
        for (j = 1; j <= COLUMNS; j++)
            printf("%4d", j);
        printf("\n");
        for (i = 0; i < ROWS; i++)
        {
            row = i + 1;
            printf("%2d ", row);
            for (j = 1; j <= COLUMNS; j++)
            {
                column = j;
                product[i][j - 1] = row * column;   /* column subscripts run from 0 to COLUMNS-1 */
                printf("%4d", product[i][j - 1]);
            }
            printf("\n");
        }
        return 0;
    }
Output:
MULTIPLICATION TABLE
      1   2   3   4   5
 1    1   2   3   4   5
 2    2   4   6   8  10
 3    3   6   9  12  15
 4    4   8  12  16  20
 5    5  10  15  20  25

1.3.4 Multi-dimensional Array


C allows arrays of three or more dimensions. Multi-dimensional arrays are defined in
much the same manner as one-dimensional arrays, except that a separate pair of
square brackets is required for each subscript.
The general form of a multi-dimensional array is
data_type array_name [s1] [s2] [s3] . . . [sm];
e.g.: int survey [3] [5] [12];
float table [5] [4] [5] [3];
Here, survey is a 3-dimensional array declared to contain 180 integer type
elements (3 × 5 × 12). Similarly, table is a 4-dimensional array containing 300 elements
(5 × 4 × 5 × 3) of floating point type.
An example of initializing a 3-dimensional array:
static int arr [3] [4] [2] = {{{2, 4}, {7, 8}, {3, 4}, {5, 6}},
                              {{7, 6}, {3, 4}, {5, 3}, {2, 3}},
                              {{8, 9}, {7, 2}, {3, 4}, {6, 1}}
                             };
In this example, the outer array has three elements, each of which is a two
dimensional array of four rows, each of which is a one dimensional array of two
elements.
e.g.:
1. Sorting an integer array.
    #include <stdio.h>

    int main(void)
    {
        int arr[5];
        int i, j, temp;

        printf("\n Enter the elements of the array:");
        for (i = 0; i < 5; i++)
            scanf("%d", &arr[i]);
        for (i = 0; i <= 4; i++)                /* repeat the pass over the array */
        {
            for (j = 0; j <= 3; j++)            /* compare adjacent elements */
                if (arr[j] > arr[j + 1])
                {
                    temp = arr[j];              /* swap the out-of-order pair */
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                }
        }
        printf("\n The sorted array is:");
        for (i = 0; i < 5; i++)
            printf("\t %d", arr[i]);
        return 0;
    }
2. To insert an element into an existing sorted array (Insertion Sort).
    #include <stdio.h>

    int main(void)
    {
        int i, k, y, x[20], n;

        for (i = 0; i < 20; i++)
            x[i] = 0;
        printf("\n Enter the number of items to be inserted (at most 20):\n");
        scanf("%d", &n);
        printf("\n Input %d values \n", n);
        for (k = 0; k < n; k++)
        {
            scanf("%d", &x[k]);
            y = x[k];                       /* the value to be inserted */
            for (i = k - 1; i >= 0 && y < x[i]; i--)
                x[i + 1] = x[i];            /* shift larger values one place right */
            x[i + 1] = y;
        }
        printf("\n The sorted numbers are:");
        for (i = 0; i < n; i++)
            printf("\n %d", x[i]);
        return 0;
    }
3. Accept character string and find its length.
Note: We will solve this question by looping instead of using Library function
strlen().
    #include <stdio.h>

    int main(void)
    {
        char name[20];
        int i, len;

        printf("\n Enter the name:");
        scanf("%19s", name);                /* at most 19 characters plus '\0' */
        for (i = 0; name[i] != '\0'; i++)
            ;                               /* count characters up to the null */
        len = i;
        printf("\n Length of the string is %d", len);
        return 0;
    }

1.4 Pointer
A memory variable is merely a symbolic reference given to a memory location. Now let
us consider that an expression in a C program is as follows:
int a = 10, b = 5, c;
c = a + b;
The above expression implies that a, b and c are the variables which can hold the
integer data. Now from the above mentioned statement let us assume that the variable
‘a’ occupies the address 3000 in the memory, ‘b’ occupies 3020 and the variable ‘c’
occupies 3040 in the memory. Then the compiler will generate the machine instruction
to transfer the data from the location 3000 and 3020 into the CPU, add them and
transfer the result to the location 3040 referenced as c. Hence we can conclude that
every variable holds two values:
• Address of the variable in the memory (l-value)
• Value stored at that memory location referenced by the variable (r-value)
Pointer is nothing but a simple data type in C programming language, which has a
special characteristic to hold the address of some other memory location as its r-value.
C programming language provides ‘&’ operator to extract the address of any object.
These addresses can be stored in the pointer variable and can be manipulated.
The syntax for declaring a pointer variable is,
<data type> *<identifier>;
Example:
int n;
int *ptr;/* pointer to an integer*/
The following statement assigns the address location of the variable n to ptr, and ptr
is a pointer to n.
ptr=&n;
Since a pointer variable points to a location, the content of that location is obtained
by prefixing the pointer variable by the unary operator * (also called the indirection or
dereferencing operator) like, *<pointer_variable>.
Example:
    #include <stdio.h>

    int main(void)
    {
        int a = 10, *ptr;

        ptr = &a;       /* ptr points to the location of a */
        /* printing the value of a through the pointer ptr */
        printf("The value of a pointed by the pointer ptr is: %d", *ptr);
        return 0;
    }
A null value can be assigned to a pointer when it does not point to any data. As a
good programming habit, every pointer that has no valid target yet should be initialized
with the null value. A pointer with a null value assigned to it is written in C as the
constant 0 or NULL and is conventionally described as holding the address zero.
The unary operators ‘&’ and ‘*’ have the same precedence in C. As a special case,
the ‘&’ operator cannot be applied to an arithmetic expression; it can only be used with
an operand which has a unique address.
Pointer is a variable which can hold the address of a memory location. The value
stored in a pointer type variable is interpreted as an address. Consider the following
declarative statement:
int num = 197;
This statement instructs the compiler to reserve a 2-byte memory location
(assuming that the target machine stores an int type in two bytes) and to put the value
197 in that location. Assume that the system allocates memory location 1001 for num.
Diagrammatically, it can be shown as:
num     Name of the variable
197     Bytes in the memory
1001    Address of the variable in the memory

Figure 1.2: Memory allocation

As the memory addresses are numbers, they can be assigned to some other
variable. Let ptr be the variable which holds the address of variable num. We can
access the value of num by the variable ptr. Thus, we can say “ptr points to num”.
Diagrammatically, it can be shown as:
                          num     ptr
Value stored              197     1001
Address of the variable   1001    2341

Figure 1.3: Pointer usage.

Note that ptr is itself a variable therefore it will also be stored at a location in the
memory having some address (2341 in above case). Here we say that – ptr is a pointer
variable which is currently pointing to an integer type variable num which holds the
value 197.

1.4.1 Pointer Notation


The actual address of a variable is not known immediately. We can determine the
address of a variable using ‘address of’ operator (&). We have already seen the use of
‘address of’ operator in the scanf() function.
Another pointer operator available in C is “*”, called the “value at address” operator. It
gives the value stored at a particular address. This operator is also known as
‘indirection operator’.
e.g.:
    #include <stdio.h>

    int main(void)
    {
        int i = 3;

        printf("\n Address of i = %p", (void *)&i);    /* prints the address of i */
        printf("\t Value of i = %d", *(&i));           /* prints the value at the address of i */
        return 0;
    }

1.4.2 Pointer Declaration


Since pointer variables contain addresses, which belong to a separate data type, they
must be declared as pointers before we use them. Pointers can be declared just like any
other variables. The declaration of a pointer variable takes the following form:
data_type *pt_name;
The above statement tells the compiler three things about the variable pt_name:
• The asterisk (*) tells that the variable pt_name is a pointer variable.
• pt_name needs a memory location.
• pt_name points to a variable of type data_type.
For example, the statement
int *p;
declares the variable p as a pointer variable that points to an integer data type (int).
The type int refers to the data type of the variable being pointed to by p and not the type
of the value of the pointer.
Table 1.2: Pointer Declaration

Pointer Declaration   Interpretation
int *rollnumber;      Creates a pointer variable rollnumber capable of pointing to an integer
                      type variable, i.e., capable of holding the address of an integer type
                      variable
char *name;           Creates a pointer variable name capable of pointing to a character type
                      variable, i.e., capable of holding the address of a character type variable
float *salary;        Creates a pointer variable salary capable of pointing to a float type
                      variable, i.e., capable of holding the address of a float type variable

Address Operator - &


Once a pointer variable has been declared, it can be made to point to a variable by
assigning the address of that variable to the pointer variable. The address of a variable
can be extracted using address operator - &.
An expression having & operator generates the address of the variable it precedes.
Thus, for example,
&num
produces the address of the variable num in the memory. This address can be assigned
to any pointer variable of appropriate type (i.e., the data type of variable num) using an
assignment statement such as p = &num; which causes p to point to num. That is, p
now contains the address of num.


The assignment shown above is known as pointer initialization. Before a pointer is
initialized, it should not be used. A pointer variable can be initialized in its declaration
itself.
int x;
int *p = &x;
These statements declare x as an integer variable and p as a pointer variable and then
initialize p to the address of x. This is an initialization of p, not *p. On the contrary, the
statement
int *p = &x, x;
is invalid because the target variable x is not declared before the pointer.

Indirection Operation - *
Since a pointer type variable contains the address of another variable, the
value stored in the target variable can be obtained using this address. The value stored
in a variable can be referred to through a pointer variable pointing to it by using the
indirection operator (*).
For example, consider the following code.
int x = 109;
int *p;
p = &x;
Then the following expression
*p
represents the value 109.

1.4.3 Initialization of Pointer Variables


Since pointer variables contain address that belongs to a separate data type, they must
be declared as pointers before we use them.
The declaration of a pointer variable takes the following form:
data_type *pt_name;
This tells the compiler three things about the variable pt_name:
• The asterisk (*) tells that the variable pt_name is a pointer variable.
• pt_name needs a memory location.
• pt_name points to a variable of type data_type.
e.g.: int *p; declares the variable p as a pointer variable that points to an integer
data type. The type int refers to the data type of the variable being pointed to by p and
not the type of the value of the pointer.
Once a pointer variable has been declared, it can be made to point to a variable
using an assignment statement such as p = & quantity; which causes p to point to
quantity. That is, p now contains the address of quantity. This is known as pointer
initialization. Before a pointer is initialized, it should not be used. A pointer variable can
be initialized in its declaration itself.
e.g.: int x, *p = &x; This statement declares x as an integer variable and p as a pointer
variable and then initializes p to the address of x. This is an initialization of p, not *p. On
the contrary, the statement int *p = &x, x; is invalid because the target variable x is
not declared first.

1.4.4 Accessing the Address of Variable
The actual location of a variable in the memory is system dependent and therefore, the
address of a variable is not known to us immediately. How can we then determine the
address of a variable? This can be done with the help of the operator & available in C.
The operator & immediately preceding a variable returns the address of the variable
associated with it.
For example, if the variable quantity is stored at address 5000, the statement
p = &quantity;
would assign the address 5000 to the variable p. The & operator can be remembered
as ‘address of’.
The & operator can be used only with an operand that occupies a memory location, such
as a simple variable or an array element. The following uses of the address operator
are illegal:
&125 (pointing at a constant).
&(x+y) (pointing at an expression).
Given int x[10];, the expression &x is accepted by ANSI C compilers, but it yields a
pointer to the whole array (type int (*)[10]) rather than a pointer to an int, so it should
not be assigned to an int pointer.
If x is an array, then expressions such as
&x[0] and &x[i+3]
are valid and represent the addresses of the 0th and (i+3)th elements of x.

1.5 Pointer Variable



Accessing a Variable through its Pointer


Consider the following statements:
int q, * i, n;
q = 35;
i = & q;
n = * i;
i is a pointer to an integer containing the address of q. In the fourth statement we have
assigned the value at the address contained in i to another variable n. Thus, we have
indirectly accessed the variable q, using the pointer variable i, and copied its value into n.

1.6 Summary
An array is a group of memory locations related by the fact that they all have the same
name and same data type. An array with more than one dimension is called a
multidimensional array. The size of an array should be a positive number. If an array is
declared without a size and is initialized to a series of values, it is implicitly given the
size of the number of initializers. Array subscripts always start with 0. The last element’s
subscript is always one less than the size of the array, e.g., an array with 10 elements
contains elements 0 to 9. The size of an array must be a constant number.
Pointers are often passed to a function as arguments, which gives the effect of passing
by reference. This allows data items within the calling function to be accessed, altered
by the called function, and then returned to the calling function in the altered form.
There is an intimate relationship between pointers and arrays, as an array name used in
an expression is treated as a pointer to the first element in the array. The elements of
an array are accessed through a pointer by adding the respective subscript to the pointer
value (i.e., the address of the zeroth element) and preceding the expression with the
indirection operator, as in *(arr + i).

1.7 Check Your Progress


Multiple Choice Questions
1. What is the output of this C code?
#include <stdio.h>
void main()
{
int a[2][3] = {1, 2, 3, 4, 5};
int i = 0, j = 0;
for (i = 0; i < 2; i++)
for (j = 0; j < 3; j++)
printf("%d", a[i][j]);
}
(a) 1 2 3 4 5 0
(b) 1 2 3 4 5 junk
(c) 1 2 3 4 5 5
(d) Run time error
2. What is the output of this C code?
#include <stdio.h>
void main()
{
int a[2][3] = {1, 2, 3,, 4, 5};
int i = 0, j = 0;
for (i = 0; i < 2; i++)
for (j = 0; j < 3; j++)
printf("%d", a[i][j]);
}
(a) 1 2 3 junk 4 5
(b) Compile time error
(c) 1 2 3 0 4 5
(d) 1 2 3 3 4 5
3. What is the output of this C code?
#include <stdio.h>
void f(int a[][3])
{
a[0][1] = 3;
int i = 0, j = 0;
for (i = 0; i < 2; i++)
for (j = 0; j < 3; j++)
printf("%d", a[i][j]);
}
void main()
{
int a[2][3] = {0};
f(a);
}
(a) 0 3 0 0 0 0
(b) Junk 3 junk junk junk junk
(c) Compile time error
(d) All junk values
4. What is the output of this C code?
#include <stdio.h>
void f(int a[][])
{
a[0][1] = 3;
int i = 0, j = 0;
for (i = 0;i < 2; i++)
for (j = 0;j < 3; j++)
printf("%d", a[i][j]);
}
void main()
{
int a[2][3] = {0};
f(a);
}
(a) 0 3 0 0 0 0
(b) Junk 3 junk junk junk junk
(c) Compile time error
(d) All junk values
5. What is the output of this C code?
#include <stdio.h>
void f(int a[2][])
{
a[0][1] = 3;

int i = 0, j = 0;
for (i = 0;i < 2; i++)
for (j = 0;j < 3; j++)
printf("%d", a[i][j]);
}
void main()
{
int a[2][3] = {0};
f(a);
}
(a) 0 3 0 0 0 0
(b) Junk 3 junk junk junk junk
(c) Compile time error
(d) All junk values
6. What is the output of this C code?
#include <stdio.h>
void foo(int*);
int main()
{
int i = 10;
foo((&i)++);
}
void foo(int *p)
{
printf("%d\n", *p);
}
(a) 10
(b) Some garbage value
(c) Compile time error
(d) Segmentation fault/code crash
7. What is the output of this C code?
#include <stdio.h>
void foo(int*);
int main()
{
int i = 10, *p = &i;
foo(p++);
}
void foo(int *p)
{
printf("%d\n", *p);
}
(a) 10
(b) Some garbage value
(c) Compile time error
(d) Segmentation fault
8. What is the output of this C code?
#include <stdio.h>
void foo(float *);
int main()
{
int i = 10, *p = &i;
foo(&i);
}
void foo(float *p)
{
printf("%f\n", *p);
}
(a) 10.000000
(b) 0.000000
(c) Compile time error
(d) Undefined behaviour
9. What is the output of this C code?
#include <stdio.h>
int main()
{
int i = 97, *p = &i;
foo(&i);
printf("%d ", *p);
}
void foo(int *p)
{
int j = 2;
p = &j;
printf("%d ", *p);
}
(a) 2 97
(b) 2 2
(c) Compile time error
(d) Segmentation fault/code crash
10. What is the output of this C code?
#include <stdio.h>
int main()
{
int i = 97, *p = &i;

foo(&p);
printf("%d ", *p);
return 0;
}
void foo(int **p)
{
int j = 2;
*p = &j;
printf("%d ", **p);
}
(a) 2 2
(b) 2 97
(c) Undefined behaviour
(d) Segmentation fault/code crash

1.8 Questions and Exercises


1. Define the term ‘Pointer’. List down the various advantages of using pointers in a C
program.
2. How are pointers initialized and implemented in C? Write a program to explain the
concept.
3. Explain, with the help of a C program, the concept of pointer arithmetic in C.
4. How do pointers in C relate to arrays? Write a suitable program to
demonstrate the concept.
5. Twenty-five numbers are entered from the keyboard into an array. Write a program
to find out how many of them are positive, how many are negative, how many are
even and how many odd.
6. How are integer and character data types similar and different from each other?
7. What are the usages of enum type data?
8. Compare the sizes of different types of integer data types.
9. What are composite data types?

1.9 Key Terms


 Array of Pointer: A multi-dimensional array can be expressed in terms of an array
of pointers rather than as a pointer to a group of contiguous arrays. In such
situations the newly defined array will have one less dimension than the original
multi-dimensional array. Each pointer will indicate the beginning of a separate
(n – 1) dimensional array.
 Double: It stores real numbers with double precision, i.e., typically twice the
storage space required by float.
 Pointer Notation: The actual address of a variable is not known in advance. We
can determine the address of a variable using the ‘address of’ operator (&).
 Pointer: It is a variable which can hold the address of a memory location rather
than the value at the location.
 Structure: It is a collection of variables of different types.

Check Your Progress: Answers


1. (a) 1 2 3 4 5 0
2. (b) Compile time error
3. (a) 0 3 0 0 0 0
4. (c) Compile time error
5. (c) Compile time error
6. (c) Compile time error
7. (a) 10
8. (b) 0.000000 (typical output; strictly, reading an int through a float pointer is
undefined behaviour)
9. (a) 2 97
10. (c) Undefined behaviour (after foo returns, p points at the local variable j, which no
longer exists; many systems happen to print 2 2)

1.10 Further Readings


 Balagurusamy, Programming in ANSI C, 5E, Tata McGraw-Hill Education, 2011.
 Harsha Priya, R. Ranjee, Programming and Problem Solving Through “C”
Language, Firewall Media, 2006.
 V. Rajaraman, Computer Programming in C, PHI Learning Pvt. Ltd., 2006.
 Yogish Sachdeva, Beginning Data Structures Using C, 2003.
 Hanan Samet, Foundations of Multidimensional and Metric Data Structures, Morgan
Kaufmann, 2006.
 Ajay Kumar, Data Structure for C Programming, Laxmi Publications, 2004.


Unit 2: Searching and Sorting Techniques


Structure
2.1 Introduction
2.2 Sorting
2.3 Simple Sorting Schemes
2.3.1 Bubble Sort
2.3.2 Insertion Sort
2.3.3 Heap Sort
2.3.4 Quick Sort
2.3.5 Merge Sort
2.3.6 Selection Sort
2.3.7 Hash Table
2.3.8 Radix Sort
2.4 Comparison of Sorting Algorithm
2.5 Order Statistics
2.5.1 ith Order Statistic
2.5.2 A Median
2.5.3 Selection Problem
2.5.4 Minimum and Maximum
2.5.5 Simultaneous Minimum and Maximum
2.6 Searching
2.7 Summary
2.8 Check Your Progress
2.9 Questions and Exercises
2.10 Key Terms
2.11 Further Readings

Objectives
After studying this unit, you should be able to:
 Describe internal and external sorting.
 Explain various simple sorting schemes like quick sort, merge sort, etc.
 Compare various sorting algorithms
 Define and explain order statistics
 Describe linear and binary search.

2.1 Introduction
This unit covers four topics. The first is internal sorting, with its definition and
examples. The second is a set of simple sorting schemes: bubble sort, insertion sort,
heap sort, quick sort, merge sort and bin sort, each with its algorithm and an example.
The third compares the various sorting algorithms. The last introduces order statistics:
the ith order statistic, the median, the selection problem, minimum and maximum, and
simultaneous minimum and maximum.

2.2 Sorting
A sorting algorithm is an algorithm which puts the elements of a list in a certain order. The
order can be either numerical or lexicographical. Efficient sorting is important
for optimizing the use of other algorithms which require input data to be in sorted lists.
Data is often taken to be in an array, which allows random access, rather than a list,
which only allows sequential access, though often algorithms can be applied with
suitable modification to either type of data.
Sorting algorithms are prevalent in introductory computer science classes, where
the abundance of algorithms for the problem provides a gentle introduction to a variety
of core algorithm concepts, such as big O notation, divide and conquer algorithms, data
structures such as heaps and binary trees, randomized algorithms, best, worst and
average case analysis, time-space trade-offs, and upper and lower bounds.

Types of Sorting
There are two types of sorting: internal and external.
Internal Sorting: An internal sort is a sorting process that takes place entirely within
the high-speed main memory of the computer. This is possible only when the data to be
sorted is small enough to be held in main memory. For sorting larger datasets, it may
be necessary to hold only a small amount (a chunk) of the data in memory at a time,
since it will not all fit. The rest of the data is normally held on some larger but slower
medium, such as a hard disk. Reading and writing data to and from this slower medium
can slow the sorting process considerably. This issue has implications for different
sorting algorithms.
Example:
 Bubble sort
 Insertion sort
 Quick sort
 Two-way merge sort
 Heap sort
Consider a Bubble sort, where adjacent records are swapped in order to get them
into the right order, so that records appear to “bubble” up and down through the data
space. If this has to be done in chunks, then when we have sorted all the records in
chunk 1, we move on to chunk 2, but we find that some of the records in chunk 1 need
to “bubble through” chunk 2, and vice versa (i.e., there are records in chunk 2 that
belong in chunk 1, and records in chunk 1 that belong in chunk 2 or later chunks). This
will cause the chunks to be read and written back to disk many times as records cross
over the boundaries between them, resulting in a considerable degradation of
performance. If the data can all be held in memory as one large chunk, then this
performance hit is avoided.
On the other hand, some algorithms handle external sorting rather better. A Merge
sort breaks the data up into chunks, sorts the chunks by some other algorithm (maybe
bubble sort or Quick sort) and then recombines the chunks two by two so that each
recombined chunk is in order. This approach minimises the number of reads and writes
of data chunks from disk, and is a popular external sort method.

External Sorting: External sorting methods are employed when the data to be sorted
is too large to fit in primary memory.
Characteristics of External Sorting
 During the sort, some of the data must be stored externally. Typically the data will
be stored on tape or disk.
 The cost of accessing data is significantly greater than either bookkeeping or
comparison costs.
 There may be severe restrictions on access. For example, if tape is used, items
must be accessed sequentially.

Criteria for Developing an External Sorting Algorithm


 Minimize number of times an item is accessed.
 Access items in sequential order

2.3 Simple Sorting Schemes


Sorting is any process of arranging items according to a certain meaningful sequence,
such as increasing, decreasing or alphabetical order.
Here we will discuss some popular but simple sorting schemes.

2.3.1 Bubble Sort


In this sorting algorithm, multiple swaps take place in one pass. Smaller elements
move (or ‘bubble’ up) to the top of the list, hence the name given to the algorithm.
In this method, adjacent members of the list to be sorted are compared. If the item
on top is greater than the item immediately below it, then they are swapped. This
process is carried on until the list is sorted. The detailed algorithm follows:
Algorithm: Bubble Sort (A[ ], n)
{
    for i = 1 to n
        for j = n downto i+1
            if A[j] < A[j–1]
                swap(A[j], A[j–1])
}

Bubble Sort Process


At most (n – 1) passes are required.
 During the first pass A1 and A2 are compared, and if they are out of order, then A1
and A2 are interchanged. This process is repeated for records A2 and A3, A3 and A4,
… An – 2 and An – 1, An – 1 and An.
 This method will cause records with small keys to move or “bubble up”.
 After the first pass, the record with the largest key will be in the nth position. On
each successive pass, the next largest element will be placed in position (n – 1), (n
– 2) …, 2 respectively, thereby resulting in a sorted order.
 After each pass, a check can be made to determine whether any interchanges were
made during that pass. If no interchanges occurred, then the elements are sorted
and no further passes are required.
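The process above, including the early exit when a pass makes no interchanges, can be sketched in C as follows. This is an illustrative version using 0-based indices; the function name bubble_sort_passes and the returned pass count are assumptions added here so that the early exit can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Sorts a[0..n-1] in ascending order and returns the number of passes made.
   After each pass the largest remaining value sits at index `last`. */
int bubble_sort_passes(int a[], int n)
{
    assert(n >= 0);
    int passes = 0;
    for (int last = n - 1; last > 0; last--) {
        bool swapped = false;
        passes++;
        for (int i = 0; i < last; i++) {
            if (a[i] > a[i + 1]) {          /* adjacent pair out of order */
                int t = a[i];
                a[i] = a[i + 1];
                a[i + 1] = t;
                swapped = true;
            }
        }
        if (!swapped)                        /* no interchanges: already sorted */
            break;
    }
    return passes;
}
```

For the array 10, 2, 5, 12, 7, 9 this sketch makes three passes, the third of which performs no interchange and ends the sort.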

Analysis of Bubble Sort

Pass        No. of Comparisons
1           (n – 1)
2           (n – 2)
3           (n – 3)
…           …
i           (n – i)
(n – 1)     n – (n – 1) = 1

Thus the total number of comparisons is
= (n – 1) + (n – 2) + … + (n – (n – 1))
= (n – 1) + (n – 2) + … + 1
= n(n – 1)/2 comparisons and, in the worst case, as many exchanges.
Thus the time complexity is T(n) = O(n^2).
The worst case occurs when the array elements are in descending order.
Example: n = 6

Position   1   2   3   4   5   6
A         10   2   5  12   7   9

Pass 1
           2  10   5  12   7   9
           2   5  10  12   7   9
           2   5  10   7  12   9
           2   5  10   7   9  12    (the 6th, i.e. nth, location is sorted)

Pass 2
           2   5   7  10   9  12
           2   5   7   9  10  12    (the 5th, i.e. (n – 1)th, location is sorted)

Pass 3
No exchanges occur, so all elements are correctly placed. Thus the sorted array is
           2   5   7   9  10  12

Implementation of bubble sort in c:


#include <stdio.h>

int main(void)
{
    int a[100];
    int i, j, temp, n;

    printf("how many numbers you want to sort : \n");
    /* a[] holds at most 100 values */
    if (scanf("%d", &n) != 1 || n < 1 || n > 100)
        return 1;
    printf("Enter %d number values you want to sort\n", n);
    for (j = 0; j < n; j++)
        scanf("%d", &a[j]);

    /* pass j moves the j-th largest value into position n - j */
    for (j = 1; j < n; j++)
    {
        /* stop at n - j so that a[i + 1] stays inside the array */
        for (i = 0; i < n - j; i++)
        {
            if (a[i] > a[i + 1])
            {
                temp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = temp;
            }
        }
    }

    printf("\n\nArray after sorting:\n");
    for (i = 0; i < n; i++)
        printf("%d\t", a[i]);
    printf("\n");
    return 0;
}
Output

Figure 1.1: Output of bubble sort

2.3.2 Insertion Sort


This is a naturally occurring sorting method exemplified by a card player arranging the
cards dealt to him. He picks up the cards as they are dealt and inserts them into the
required position. Thus at every step, we insert an item into its proper place in an
already ordered list.
Algorithm: Insertion Sort (A[ ], n)
{
    for p = 2 to n
        temp = A[p]
        i = p – 1
        while (i >= 1 and temp < A[i])
            A[i+1] = A[i]
            i = i – 1
        A[i+1] = temp
}
Thus, to find the correct position, search the sorted part of the list backwards from the
target until an item not greater than the target is found. Every item passed over on the
way is shifted one position down the list, and the target is inserted in the vacated slot.
Repeating this process for all the elements in the list results in a sorted list.
Example: Sort the following array using insertion sort.

A 10 2 5 12 7

Here n = 5
p = 2
A 2 10 5 12 7

p = 3
A 2 5 10 12 7

p = 4
A 2 5 10 12 7

p = 5
A 2 5 7 10 12

Sorted Array is
A 2 5 7 10 12

Insertion Sort – Worst-case Analysis

The worst-case running time w(n) is given by

\[
w(n) = c_1 n + c_2 (n-1) + c_3 (n-1) + c_4 \sum_{j=2}^{n} j + c_5 \sum_{j=2}^{n} (j-1) + c_6 \sum_{j=2}^{n} (j-1) + c_7 (n-1)
\]

\[
= c_1 n + c_2 (n-1) + c_3 (n-1) + c_4 \left( \frac{n(n+1)}{2} - 1 \right) + c_5 \, \frac{n(n-1)}{2} + c_6 \, \frac{n(n-1)}{2} + c_7 (n-1)
\]

where

\[
\sum_{j=2}^{n} j = 2 + 3 + \dots + n = \frac{n(n+1)}{2} - 1
\]

\[
\sum_{j=2}^{n} (j-1) = 1 + 2 + 3 + \dots + (n-1) = \frac{n(n-1)}{2}
\]

So, T(n) = O(n^2).

Insertion Sort – Best-case Analysis


The best case occurs when the array is already sorted. The body of the while loop is
then never executed, so the two terms for the loop body vanish and
T(n) = c1 n + c2 (n – 1) + c3 (n – 1) + c4 (n – 1) + c7 (n – 1). Hence T(n) = O(n).
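The gap between the best and worst cases can be checked by counting key comparisons. The sketch below is an illustration only; the function name insertion_sort_comparisons is hypothetical, and it counts comparisons rather than the per-line costs c1 to c7 used in the analysis.

```c
#include <assert.h>

/* Insertion sort on a[0..n-1] that returns the number of key comparisons. */
long insertion_sort_comparisons(int a[], int n)
{
    assert(n >= 0);
    long comps = 0;
    for (int p = 1; p < n; p++) {
        int temp = a[p];
        int i = p - 1;
        while (i >= 0) {
            comps++;                 /* one comparison of temp with a[i] */
            if (a[i] <= temp)
                break;               /* correct slot found */
            a[i + 1] = a[i];         /* shift the larger item down */
            i--;
        }
        a[i + 1] = temp;
    }
    return comps;
}
```

For n = 7, an already sorted input needs n – 1 = 6 comparisons, while a reversed input needs n(n – 1)/2 = 21, in line with the O(n) and O(n^2) bounds.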

Implementation of insertion sorting in C:


/* C Program to Implement Insertion Sort */
#include <stdio.h>
#define MAX 7
void insertion_sort(int *);
int main(void)
{
    int a[MAX], i;
    printf("enter elements to be sorted:");
    for (i = 0; i < MAX; i++)
    {
        scanf("%d", &a[i]);
    }
    insertion_sort(a);
    printf("sorted elements:\n");
    for (i = 0; i < MAX; i++)
    {
        printf(" %d", a[i]);
    }
    printf("\n");
    return 0;
}
/* sorts the MAX elements of x in ascending order */
void insertion_sort(int *x)
{
    int temp, i, j;

    for (i = 1; i < MAX; i++)
    {
        temp = x[i];
        j = i - 1;
        /* test j >= 0 first so that x[-1] is never read */
        while (j >= 0 && temp < x[j])
        {
            x[j + 1] = x[j];
            j = j - 1;
        }
        x[j + 1] = temp;
    }
}
Output
$ cc insertionsort.c
/* Average case */
$ a.out
enter elements to be sorted:8 2 4 9 3 6 1
sorted elements:
 1 2 3 4 6 8 9
/* Best case */
$ a.out
enter elements to be sorted:1 2 3 4 5 6 7
sorted elements:
 1 2 3 4 5 6 7
/* Worst case */
$ a.out
enter elements to be sorted:7 6 5 4 3 2 1
sorted elements:
 1 2 3 4 5 6 7

2.3.3 Heap Sort

Heaps
Heap sort introduces another algorithm design technique which is using a data structure
called “heap”, to manage information during the execution of the algorithm. Not only is
the heap data structure useful for heap sort, but it also makes an efficient priority queue.
The term “heap” was originally coined in the context of heap sort, but it has since
come to refer to “garbage-collected storage,” such as the programming languages Lisp
and Java provide. Our heap data structure is not garbage-collected storage, and
whenever we refer to heaps in this book, we shall mean the structure defined in this
chapter.
The (binary) heap data structure is an array object that can be viewed as a nearly
complete binary tree, as shown in following figure. Each node of the tree corresponds to
an element of the array that stores the value in the node. The tree is completely filled on
all levels except possibly the lowest, which is filled from the left up to a point.

Figure 1.2: A heap viewed as (a) a binary tree and (b) an array. In array form, at
indices 1 to 10, the heap shown is 16, 14, 10, 8, 7, 9, 3, 2, 4, 1.

An array A that represents a heap is an object with two attributes: length[A], which
is the number of elements in the array, and heap-size[A], the number of elements in the
heap stored within array A. That is, although A[1 .. length[A]] may contain valid
numbers, no element past A[heap-size[A]], where heap-size[A] ≤ length[A], is an
element of the heap.
The root of the tree is A[1], and given the index i of a node, the indices of its parent
PARENT(i), left child LEFT(i), and right child RIGHT(i) can be computed simply:
PARENT(i) return ⌊i/2⌋
LEFT(i) return 2i
RIGHT(i) return 2i + 1
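Because C arrays start at index 0 while the text numbers heap nodes from 1, the index formulas need a small adjustment in C code. The helpers below are a hedged sketch of both conventions; all six function names are hypothetical.

```c
#include <assert.h>

/* 1-based indices, as in the text: the root is at index 1. */
int parent1(int i) { assert(i > 1); return i / 2; }   /* integer division = floor */
int left1(int i)   { assert(i >= 1); return 2 * i; }
int right1(int i)  { assert(i >= 1); return 2 * i + 1; }

/* 0-based equivalents for a C array whose root is at index 0. */
int parent0(int i) { assert(i > 0); return (i - 1) / 2; }
int left0(int i)   { assert(i >= 0); return 2 * i + 1; }
int right0(int i)  { assert(i >= 0); return 2 * i + 2; }
```

In the heap of Figure 1.2, for instance, node 5 (value 7) has parent 2 (value 14) under the 1-based scheme; the same element sits at index 4 with parent index 1 in a 0-based C array.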
There are two kinds of binary heaps: max-heaps and min-heaps. In both kinds, the
values in the nodes satisfy a heap property, the specifics of which depend on the kind of
heap. In a max-heap, the max-heap property is that for every node i other than the root,

A[PARENT(i)] ≥ A[i]

that is, the value of a node is at most the value of its parent. Thus, the largest element
in a max-heap is stored at the root, and the subtree rooted at a node contains values
no larger than that contained at the node itself. A min-heap is organized in the opposite
way; the min-heap property is that for every node i other than the root,

A[PARENT(i)] ≤ A[i]

The smallest element in a min-heap is at the root.


For the heapsort algorithm, we use max-heaps. Min-heaps are commonly used in
priority queues.

Maintaining the Heap Property


HEAPIFY is an important subroutine for restoring the heap property after an insert or
delete operation. Its inputs are an array A and an index i into the array. When
HEAPIFY is called, it is assumed that the binary trees rooted at LEFT(i) and RIGHT(i)
are heaps, but that A[i] may be smaller than its children, thus violating the heap
property. The function of HEAPIFY is to let the value at A[i] “float down” in the heap so
that the subtree rooted at index i becomes a heap.
Algorithm HEAPIFY(A, i)
{
    l ← LEFT(i)
    r ← RIGHT(i)
    if (l ≤ heap-size[A] and A[l] > A[i])
        largest ← l
    else
        largest ← i
    if (r ≤ heap-size[A] and A[r] > A[largest])
        largest ← r
    if (largest ≠ i)
        exchange A[i] ↔ A[largest]
        HEAPIFY(A, largest)
}
Following figure illustrates the action of HEAPIFY. At each step, the largest of the
elements A[i], A[LEFT(i)], and A[RIGHT(i)] is determined, and its index is stored in
largest. If A[i] is largest, then the subtree rooted at node i is a heap and the procedure
terminates. Otherwise, one of the two children has the largest element, and A[i] is
swapped with A[largest], which causes node i and its children to satisfy the heap
property. The node indexed by largest, however, now has the original value A[i], and
thus the subtree rooted at largest may violate the heap property. Consequently,
HEAPIFY must be called recursively on that subtree.
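A minimal C sketch of this procedure follows. It assumes 0-based indexing and replaces the recursive call with a loop, which performs the same sequence of exchanges; the name max_heapify and the explicit heap_size parameter are choices made for this illustration.

```c
#include <assert.h>

/* Restore the max-heap property for the subtree rooted at index i,
   assuming both child subtrees are already max-heaps.
   The heap occupies a[0..heap_size-1] (0-based, unlike the text). */
void max_heapify(int a[], int heap_size, int i)
{
    assert(i >= 0 && i < heap_size);
    for (;;) {
        int l = 2 * i + 1;                  /* left child */
        int r = 2 * i + 2;                  /* right child */
        int largest = i;
        if (l < heap_size && a[l] > a[largest])
            largest = l;
        if (r < heap_size && a[r] > a[largest])
            largest = r;
        if (largest == i)
            return;                         /* subtree is a heap */
        int t = a[i];                       /* float the value down one level */
        a[i] = a[largest];
        a[largest] = t;
        i = largest;
    }
}
```

Applied at the second node of the array 16, 4, 10, 14, 7, 9, 3, 2, 8, 1, the value 4 floats down two levels and the array becomes 16, 14, 10, 8, 7, 9, 3, 2, 4, 1.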
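Figure 1.3: The action of HEAPIFY(A, 2) on a 10-element heap. (a) Initially
A = 16, 4, 10, 14, 7, 9, 3, 2, 8, 1, and A[2] = 4 violates the heap property.
(b) After exchanging A[2] with A[4], A = 16, 14, 10, 4, 7, 9, 3, 2, 8, 1 and the
violation moves to node 4. (c) After exchanging A[4] with A[9],
A = 16, 14, 10, 8, 7, 9, 3, 2, 4, 1 and the heap property holds.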

The running time of HEAPIFY


The running time of HEAPIFY on a subtree of size n rooted at a given node i is the Θ(1)
time to fix up the relationships among the elements A[i], A[LEFT(i)], and A[RIGHT(i)],
plus the time to run HEAPIFY on a subtree rooted at one of the children of node i. The
children’s subtrees each have size at most 2n/3 (the worst case occurs when the last
row of the tree is exactly half full), and the running time of HEAPIFY can therefore be
described by the recurrence

\[ T(n) \le T(2n/3) + \Theta(1) \]

The solution to this recurrence, by case 2 of the master theorem, is T(n) = O(lg n).
Equivalently, we can characterize the running time of HEAPIFY on a node of height h as O(h).

Building a Heap
We can use the procedure HEAPIFY in a bottom-up manner to convert an array
A[1 .. n], where n = length[A], into a heap. It can be proven that the elements in
the sub-array A[(⌊n/2⌋ + 1) .. n] are all leaves of the tree, and so each is a 1-element
heap to begin with. The procedure BUILD-HEAP goes through the remaining nodes of
the tree and runs HEAPIFY on each one.
Algorithm BUILD-HEAP (A)
{
    heap-size[A] ← length[A]
    for (i ← ⌊length[A]/2⌋ downto 1)
        HEAPIFY(A, i)
}
The following figure shows an example of the action of BUILD-HEAP on the array
A = 4, 1, 3, 2, 16, 9, 10, 14, 8, 7.

Figure 1.4: The operation of BUILD-HEAP. Panels (a) to (e) show the tree before the
calls to HEAPIFY with i = 5, 4, 3, 2 and 1 respectively; panel (f) shows the finished
heap, A = 16, 14, 10, 8, 7, 9, 3, 2, 4, 1.

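We can compute a simple upper bound on the running time of BUILD-HEAP as follows.
Each call to HEAPIFY costs O(log n) time, and there are O(n) such calls. Thus, the
running time is O(n log n). This upper bound, though correct, is not asymptotically tight.
A tighter bound uses two facts: HEAPIFY on a node of height h takes O(h) time, and an
n-element heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉ nodes of height h. The total
cost of BUILD-HEAP is therefore

\[
\sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right)
\]

The last summation can be evaluated by substituting x = 1/2 in the formula
\sum_{k=0}^{\infty} k x^{k} = x/(1-x)^{2}, valid for |x| < 1, which yields

\[
\sum_{h=0}^{\infty} \frac{h}{2^{h}} = \frac{1/2}{(1 - 1/2)^{2}} = 2
\]

Thus, the running time of BUILD-HEAP can be bounded as

\[
O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right) = O\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right) = O(n)
\]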

Hence, we can build a heap from an unordered array in linear time.
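The bottom-up construction can be sketched in C as below. The sketch assumes 0-based indexing, so the last non-leaf node is at index n/2 – 1 rather than ⌊n/2⌋; the names sift_down and build_max_heap are hypothetical, and sift_down is an iterative stand-in for HEAPIFY.

```c
#include <assert.h>

/* Float a[i] down until the subtree rooted at i is a max-heap (0-based). */
static void sift_down(int a[], int n, int i)
{
    for (;;) {
        int l = 2 * i + 1, r = 2 * i + 2, largest = i;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == i) return;
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        i = largest;
    }
}

/* Turn an arbitrary array a[0..n-1] into a max-heap, bottom-up. */
void build_max_heap(int a[], int n)
{
    assert(n >= 0);
    /* indices n/2 .. n-1 are leaves, so start at the last internal node */
    for (int i = n / 2 - 1; i >= 0; i--)
        sift_down(a, n, i);
}
```

Run on 4, 1, 3, 2, 16, 9, 10, 14, 8, 7, this sketch produces 16, 14, 10, 8, 7, 9, 3, 2, 4, 1.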

Heap Sort Algorithm


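The heap sort algorithm starts by using BUILD-HEAP to build a heap on the input array
A[1 .. n], where n = length[A]. The maximum element is then at the root A[1], so it can
be put into its final position by exchanging it with A[n]. The heap is then shrunk by one
element and HEAPIFY restores the heap property at the root.
Algorithm HEAPSORT(A)
{
    BUILD-HEAP(A)
    for (i ← length[A] downto 2)
        exchange A[1] ↔ A[i]
        heap-size[A] ← heap-size[A] – 1
        HEAPIFY(A, 1)
}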

Figure 1.5: The Operation of Heap Sort. Panel (a) shows the max-heap just after it has
been built by BUILD-HEAP; panels (b) to (j) show the heap after each exchange of the
root with the last heap element followed by HEAPIFY; panel (k) shows the resulting
sorted array A = 1, 2, 3, 4, 7, 8, 9, 10, 14, 16.

Analysis of Heap Sort


The time complexity of heap sort is O(n log n): BUILD-HEAP takes O(n) time, and each of
the n – 1 calls to HEAPIFY takes O(log n) time.

Implementation of Heap Sort in C


/* C Program to sort an array based on heap sort algorithm (MAX heap) */
#include <stdio.h>

int main(void)
{
    int heap[10], no, i, j, c, root, temp;

    printf("\n Enter no of elements :");
    /* heap[] holds at most 10 values */
    if (scanf("%d", &no) != 1 || no < 1 || no > 10)
        return 1;
    printf("\n Enter the nos : ");
    for (i = 0; i < no; i++)
        scanf("%d", &heap[i]);

    /* create the MAX heap by moving each new element up towards the root */
    for (i = 1; i < no; i++)
    {
        c = i;
        do
        {
            root = (c - 1) / 2;
            if (heap[root] < heap[c])
            {
                temp = heap[root];
                heap[root] = heap[c];
                heap[c] = temp;
            }
            c = root;
        } while (c != 0);
    }
    printf("Heap array : ");
    for (i = 0; i < no; i++)
        printf("%d\t ", heap[i]);

    for (j = no - 1; j > 0; j--)
    {
        /* swap max element with rightmost leaf element */
        temp = heap[0];
        heap[0] = heap[j];
        heap[j] = temp;
        root = 0;
        /* rearrange heap[0..j-1] into a max heap again */
        while ((c = 2 * root + 1) < j) /* c is the left child of root */
        {
            if (c + 1 < j && heap[c] < heap[c + 1])
                c++; /* the right child is the larger one */
            if (heap[root] >= heap[c])
                break;
            temp = heap[root];
            heap[root] = heap[c];
            heap[c] = temp;
            root = c;
        }
    }
    printf("\n The sorted array is : ");
    for (i = 0; i < no; i++)
        printf("\t %d", heap[i]);
    printf("\n Complexity : \n Best case = Avg case = Worst case = O(n logn) \n");
    return 0;
}
Output

$ cc heap.c
/* Average case */
$ a.out
Enter no of elements: 7
Enter the no’s: 6
5
3
1
8
7
2
Heap array: 8 6 7 1 5 3 2
The sorted array is: 1 2 3 5 6 7 8
Complexity:
Best case = Avg case = Worst case = O(n logn)
/* Best case */
$ a.out
Enter no of elements: 7
Enter the no’s: 12
10
8
9
7
4
2
Heap array: 12 10 8 9 7 4 2
The sorted array is: 2 4 7 8 9 10 12
Complexity:
Best case = Avg case = Worst case = O(n logn)
/* Worst case */
$ a.out
Enter no of elements: 7
Enter the no’s: 5
7
12
6
9
10
14
Heap array: 14 9 12 5 6 7 10
The sorted array is: 5 6 7 9 10 12 14
Complexity:
Best case = Avg case = Worst case = O(n logn)

2.3.4 Quick Sort
This is one of the most widely used internal sorting algorithms. Its popularity lies in its
ease of implementation, moderate use of resources and acceptable behaviour across a
variety of sorting cases. The basis of quick sort is the divide and conquer strategy, i.e.,
divide the problem into sub-problems until the sub-problems are small enough to be
solved directly. In its basic form, it was invented by C.A.R. Hoare in 1960.
Algorithm quicksort( a, low, high )
{
    if ( high > low )
        q = partition (a, low, high)
        quicksort (a, low, q-1)
        quicksort (a, q+1, high)
}
Algorithm partition(a[], lo, hi)
{
    // lo is the lower index, hi is the upper index
    piv = a[lo]            // the first element is the pivot
    i = lo + 1
    j = hi
    // partition
    while (i <= j)
        while (i <= hi and a[i] <= piv) i++
        while (a[j] > piv) j--
        if (i < j)
            a[i] ↔ a[j]    // swapping
            i++
            j--
    a[lo] ↔ a[j]           // move the pivot to its final position
    return j
}
Example: Given list 50, 40, 20, 60, 80, 100, 45, 70, 105, 30, 90, 75

The first element, 50, is the pivot. I scans from the left for an element greater than
the pivot, and J scans from the right for an element not greater than the pivot.

Location   1   2   3   4   5   6   7   8   9  10  11  12
Numbers   50  40  20  60  80 100  45  70 105  30  90  75

I moves through locations 1, 2, 3 and stops at I = 4 (60 > 50).
J moves through locations 12, 11 and stops at J = 10 (30 < 50).
Exchange A[4] ↔ A[10], I++, J--

Numbers   50  40  20  30  80 100  45  70 105  60  90  75

I stops at once at I = 5 (80 > 50).
J moves through locations 9, 8 and stops at J = 7 (45 < 50).
Exchange A[5] ↔ A[7], I++, J--

Numbers   50  40  20  30  45 100  80  70 105  60  90  75

I stops at I = 6 (100 > 50). J moves from location 6 to J = 5 (45 < 50).
Here I > J, so exchange the pivot A[1] ↔ A[5]:

Numbers   45  40  20  30  50 100  80  70 105  60  90  75

and partition this list into two lists around the pivot:

Numbers   45  40  20  30 | 50 | 100  80  70 105  60  90  75

Use this method recursively for these two lists. Now we get the sorted array as:

Numbers   20  30  40  45  50  60  70  75  80  90 100 105

Analysis of Quick Sort
We count only the number of element comparisons c(n). The frequency count of the
other operations is of the order of c(n).
Worst-case: T(n) = O(n^2)

Best-case: T(n) = O(n log n)

Average-case: T(n) = O(n log n)

Quick Sort Algorithm Steps


1. Place the items that need to be sorted in an array.
2. Choose any element of the array as the pivot element. (The complexity depends
largely on the choice of the pivot element.)
3. Rearrange the array so that all the elements with values less than the pivot
come before the pivot and the elements with greater values come after the pivot,
within the same array. This puts the pivot element in its final sorted position.
(Reversing the two conditions reverses the sorting order, giving descending order.)
4. Recursively apply step 3 to the sub-array of elements with smaller values
and, separately, to the sub-array of elements with greater values.
Quick sort performs badly in its worst case. When the first element is chosen as the
pivot, this happens when the data is already arranged in order, whether increasing or
decreasing.
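The effect of already ordered input on a first-element pivot can be observed by counting comparisons. The sketch below is an illustration under stated assumptions: it uses a simpler single-scan (Lomuto-style) partition in place of the two-index scheme described in this section so that each partition step costs exactly one comparison per non-pivot element, and the names comparisons, quicksort_count and count_quicksort_comparisons are hypothetical.

```c
#include <assert.h>

static long comparisons;   /* hypothetical counter, for illustration only */

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Quick sort on a[lo..hi] with the first element as pivot. */
static void quicksort_count(int a[], int lo, int hi)
{
    if (lo >= hi)
        return;
    int pivot = a[lo];
    int last_small = lo;                 /* a[lo+1..last_small] < pivot */
    for (int k = lo + 1; k <= hi; k++) {
        comparisons++;
        if (a[k] < pivot)
            swap_int(&a[++last_small], &a[k]);
    }
    swap_int(&a[lo], &a[last_small]);    /* pivot reaches its final position */
    quicksort_count(a, lo, last_small - 1);
    quicksort_count(a, last_small + 1, hi);
}

/* Sorts a[0..n-1] and returns how many element comparisons were made. */
long count_quicksort_comparisons(int a[], int n)
{
    assert(n >= 0);
    comparisons = 0;
    quicksort_count(a, 0, n - 1);
    return comparisons;
}
```

With 8 elements already in increasing or decreasing order, every partition is maximally unbalanced and the count is 7 + 6 + … + 1 = 28 = n(n – 1)/2, whereas an input whose pivots split the array more evenly needs noticeably fewer comparisons.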

Implementation of Quick Sort in C


#include <stdio.h>

/* quick sort function to sort an integer array */
void quicksort(int array[], int firstIndex, int lastIndex)
{
    /* declaring index variables */
    int pivotIndex, temp, index1, index2;

    if (firstIndex < lastIndex)
    {
        /* assigning first element index as pivot element */
        pivotIndex = firstIndex;
        index1 = firstIndex;
        index2 = lastIndex;

        /* sorting in ascending order with quick sort */
        while (index1 < index2)
        {
            while (array[index1] <= array[pivotIndex] && index1 < lastIndex)
            {
                index1++;
            }
            while (array[index2] > array[pivotIndex])
            {
                index2--;
            }

            if (index1 < index2)
            {
                /* swapping operation */
                temp = array[index1];
                array[index1] = array[index2];
                array[index2] = temp;
            }
        }

        /* at the end of the partitioning loop, swap pivot element with index2 element */
        temp = array[pivotIndex];
        array[pivotIndex] = array[index2];
        array[index2] = temp;

        /* recursive calls for quick sort on the two partitions */
        quicksort(array, firstIndex, index2 - 1);
        quicksort(array, index2 + 1, lastIndex);
    }
}

int main(void)
{
    /* declaring variables */
    int array[100], n, i;

    /* number of elements in array from user input; array[] holds at most 100 */
    printf("Enter the number of element you want to Sort : ");
    if (scanf("%d", &n) != 1 || n < 1 || n > 100)
        return 1;

    /* ask the user to enter n elements */
    printf("Enter Elements in the list : ");
    for (i = 0; i < n; i++)
    {
        scanf("%d", &array[i]);
    }

    /* calling quicksort function defined above */
    quicksort(array, 0, n - 1);

    /* print sorted array */
    printf("Sorted elements: ");
    for (i = 0; i < n; i++)
        printf(" %d", array[i]);
    printf("\n");
    return 0;
}
Output

Figure 1.6: Output of Quick Sort

2.3.5 Merge Sort


Merge sort also belongs to the ‘divide and conquer’ class of algorithms. The basic idea
is to divide the list into a number of sub-lists, sort each of these sub-lists and
merge them to get a single sorted list. The illustrative implementation of 2-way merge
sort sees the input initially as n lists of size 1. These are merged pairwise to get n/2
lists of size 2. These n/2 lists are merged pairwise, and so on, till a single list is
obtained. This can be better understood by the following example. This is also called
concatenate sort. Figure 1.7 depicts 2-way merge sort.

Figure 1.7: 2-way Merge Sort

Merge sort is well suited to sorting linked lists in random order. The total
computing time is O(n log2 n).
The disadvantage of using merge sort on arrays is that the merge phase requires a
second array of the same size. That is, to sort a list of size n, it needs space for 2n
elements.

Algorithm Merge sort (A)
{
    if array size > 1
        Divide array in half
        Call Merge sort on first half.
        Call Merge sort on second half.
        Merge two halves.
}

Algorithm Merge (Passed two arrays)
{
    While both input arrays are non-empty
        Compare leading element in each array
        Select the smaller value and place it in the temporary array.
    (If one input array is empty then place remainder of other array
    in output array)
}
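The merge step on its own can be sketched in C as follows. This is an illustrative helper, separate from the full program in this section; the name merge_sorted and its parameter layout (two input arrays and a caller-supplied output array of size na + nb) are assumptions.

```c
#include <assert.h>

/* Merge sorted a[0..na-1] and sorted b[0..nb-1] into out[0..na+nb-1]. */
void merge_sorted(const int a[], int na, const int b[], int nb, int out[])
{
    assert(na >= 0 && nb >= 0);
    int i = 0, j = 0, k = 0;
    /* while both inputs are non-empty, copy the smaller leading element */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    /* one input is exhausted: copy the remainder of the other */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Taking from a when the leading elements are equal keeps equal keys in their original order, which is what makes merge sort stable; the separate output array is the extra space mentioned above.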

Analysis of Merge Sort


In all cases time complexity of merge sort is O(n log n)

Program to Implement Merge Sort in C


/* C program for merge sorting */
#include <stdio.h>
#define SIZE 30 /* capacity of arr[] and tmp[] */
void merge(int [], int, int, int);
void part(int [], int, int);
int main(void)
{
    int arr[SIZE];
    int i, size;
    printf("\n\t------- Merge sorting method -------\n\n");
    printf("Enter total no. of elements : ");
    if (scanf("%d", &size) != 1 || size < 1 || size > SIZE)
        return 1;
    for (i = 0; i < size; i++)
    {
        printf("Enter %d element : ", i + 1);
        scanf("%d", &arr[i]);
    }
    part(arr, 0, size - 1);
    printf("\n\t------- Merge sorted elements -------\n\n");
    for (i = 0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}

/* recursively sort arr[min..max] by sorting each half and merging them */
void part(int arr[], int min, int max)
{
    int mid;
    if (min < max)
    {
        mid = (min + max) / 2;
        part(arr, min, mid);
        part(arr, mid + 1, max);
        merge(arr, min, mid, max);
    }
}

/* merge the sorted runs arr[min..mid] and arr[mid+1..max] */
void merge(int arr[], int min, int mid, int max)
{
    int tmp[SIZE];
    int i, j, k, m;
    j = min;
    m = mid + 1;
    /* copy the smaller leading element while both runs are non-empty */
    for (i = min; j <= mid && m <= max; i++)
    {
        if (arr[j] <= arr[m])
        {
            tmp[i] = arr[j];
            j++;
        }
        else
        {
            tmp[i] = arr[m];
            m++;
        }
    }
    /* copy whatever remains of the run that is not yet exhausted */
    if (j > mid)
    {
        for (k = m; k <= max; k++)
        {
            tmp[i] = arr[k];
            i++;
        }
    }
    else
    {
        for (k = j; k <= mid; k++)
        {
            tmp[i] = arr[k];
            i++;
        }
    }
    /* copy the merged run back into arr */
    for (k = min; k <= max; k++)
        arr[k] = tmp[k];
}
Output

Figure 1.8: Output of merge Sort

2.3.6 Selection Sort


Selection sort is rather simple: we repeatedly find the next largest (or smallest) element
in the array and move it to its final position in the sorted array. Assume that we wish to
sort the array in increasing order, i.e. the smallest element at the beginning of the array
and the largest element at the end. We begin by selecting the largest element and
moving it to the highest index position. We can do this by swapping the element at the
highest index and the largest element. We then reduce the effective size of the array by
one element and repeat the process on the smaller (sub) array. The process stops
when the effective size of the array becomes 1 (an array of 1 element is already sorted).
For example, consider the following array, shown with array elements in sequence
separated by commas:
63, 75, 90, 12, 27
The leftmost element is at index zero, and the rightmost element is at the highest
array index, in our case, 4 (the effective size of our array is 5). The largest element in
this effective array (index 0-4) is 90, at index 2. The element at the highest index is 27,
at index 4. We then swap the element at index 2 with that at index 4. The result is:
63, 75, 27, 12, 90
We reduce the effective size of the array to 4, making the highest index in the
effective array now 3. The largest element in this effective array (index 0-3) is 75, at
index 1, so we swap the elements at index 1 and 3:
63, 12, 27, 75, 90
The next two steps give us:
27, 12, 63, 75, 90
12, 27, 63, 75, 90

The last effective array has only one element and needs no sorting. The entire array
is now sorted. The algorithm for an array, x, with lim elements is easy to write down:
for (eff_size = lim; eff_size > 1; eff_size--)
    find maxpos, the location of the largest element in the effective
    array: index 0 to eff_size - 1
    swap elements of x at index maxpos and index eff_size - 1
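
The loop above translates almost line for line into C. The following is a minimal sketch of
that largest-element-first variant. The function name selection_sort_max is an assumption
made for this example. The variable names follow the pseudocode.

```c
#include <stddef.h>

/* Sort x[0..lim-1] into increasing order by repeatedly moving the largest
   element of the effective array x[0..eff_size-1] to its last position. */
void selection_sort_max(int x[], size_t lim)
{
    size_t eff_size, i, maxpos;
    for (eff_size = lim; eff_size > 1; eff_size--) {
        maxpos = 0;
        for (i = 1; i < eff_size; i++)      /* find the largest element */
            if (x[i] > x[maxpos])
                maxpos = i;
        if (maxpos != eff_size - 1) {       /* swap it into the last slot */
            int tmp = x[maxpos];
            x[maxpos] = x[eff_size - 1];
            x[eff_size - 1] = tmp;
        }
    }
}
```

The full program that follows takes the mirror-image approach: it selects the smallest
remaining element and moves it to the front of the unsorted part.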

Program to Implement Selection Sort in C


#include <stdio.h>

int main()
{
    int array[100], n, c, d, position, swap;

    printf("Enter number of elements\n");
    /* reject a count that would overflow array[] */
    if (scanf("%d", &n) != 1 || n < 1 || n > 100)
        return 1;

    printf("Enter %d integers\n", n);
    for (c = 0; c < n; c++)
        scanf("%d", &array[c]);

    for (c = 0; c < (n - 1); c++)
    {
        /* find the position of the smallest element in array[c..n-1] */
        position = c;
        for (d = c + 1; d < n; d++)
        {
            if (array[position] > array[d])
                position = d;
        }
        /* move it to the front of the unsorted part */
        if (position != c)
        {
            swap = array[c];
            array[c] = array[position];
            array[position] = swap;
        }
    }

    printf("Sorted list in ascending order:\n");
    for (c = 0; c < n; c++)
        printf("%d\n", array[c]);

    return 0;
}
Output

Figure 1.9: Output of Selection Sort

2.3.7 Hash Table


Hash tables are a simple and effective method to implement dictionaries. Average time
to search for an element is O(1), while worst-case time is O(n). A hash table is simply
an array that is addressed via a hash function. For example, in Figure 1.10, hashTable
is an array with 8 elements. Each element is a pointer to a linked list of numeric data.
The hash function for this example simply divides the data key by 8, and uses the
remainder as an index into the table. This yields a number from 0 to 7. Since the range
of indices for hashTable is 0 to 7, we are guaranteed that the index is valid.

Figure 1.10: A Hash Table

To insert a new item in the table, we hash the key to determine which list the item
goes on, and then insert the item at the beginning of the list. For example, to insert 11,
we divide 11 by 8 giving a remainder of 3. Thus, 11 goes on the list starting at
hashTable[3]. To find a number, we hash the number and chain down the correct list to
see if it is in the table. To delete a number, we find the number and remove the node
from the linked list.

Entries in the hash table are dynamically allocated and entered on a linked list
associated with each hash table entry. This technique is known as chaining. An
alternative method, where all entries are stored in the hash table itself, is known as
open addressing and may be found in the references.
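
The insert, find and delete operations described above can be sketched in C as follows.
This is an illustrative sketch only: the names Node, TABLE_SIZE, hash_key, ht_insert,
ht_find and ht_delete are assumptions made for this example, and the table size of 8
simply mirrors the example in the text.

```c
#include <stdlib.h>

#define TABLE_SIZE 8                 /* 8 slots, as in the example (illustrative) */

typedef struct Node {
    int key;
    struct Node *next;
} Node;

static Node *hashTable[TABLE_SIZE];  /* every list starts out empty (NULL) */

static unsigned hash_key(int key)
{
    return (unsigned)key % TABLE_SIZE;   /* remainder after division by the table size */
}

/* Insert key at the beginning of its list. Returns 0 on success, -1 if out of memory. */
int ht_insert(int key)
{
    unsigned i = hash_key(key);
    Node *p = malloc(sizeof *p);
    if (p == NULL) return -1;
    p->key = key;
    p->next = hashTable[i];
    hashTable[i] = p;
    return 0;
}

/* Chain down the list for key. Returns the node holding it, or NULL if it is absent. */
Node *ht_find(int key)
{
    Node *p = hashTable[hash_key(key)];
    while (p != NULL && p->key != key)
        p = p->next;
    return p;
}

/* Unlink and free the first node holding key. Returns 1 if removed, 0 if not found. */
int ht_delete(int key)
{
    Node *victim;
    Node **link = &hashTable[hash_key(key)];
    while (*link != NULL && (*link)->key != key)
        link = &(*link)->next;
    if (*link == NULL) return 0;
    victim = *link;
    *link = victim->next;
    free(victim);
    return 1;
}
```

With this sketch, ht_insert(11) places 11 on the list starting at hashTable[3], as in the
example above.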
If the hash function is uniform, or equally distributes the data keys among the hash
table indices, then hashing effectively subdivides the list to be searched. Worst-case
behavior occurs when all keys hash to the same index. Then we simply have a single
linked list that must be sequentially searched. Consequently, it is important to choose a
good hash function. Several methods may be used to hash key values. To illustrate the
techniques, I will assume unsigned char is 8-bits, unsigned short int is 16-bits and
unsigned long int is 32-bits.
Division method (tablesize = prime). This technique was used in the preceding
example. A hashValue, from 0 to (HASH_TABLE_SIZE - 1), is computed by dividing
the key value by the size of the hash table and taking the remainder. For example:
typedef int HashIndexType;
HashIndexType hash(int key) {
    /* assumes key >= 0; a negative key would give a negative remainder */
    return key % HASH_TABLE_SIZE;
}
Selecting an appropriate HASH_TABLE_SIZE is important to the success of this
method. For example, a HASH_TABLE_SIZE divisible by two would yield even hash
values for even keys, and odd hash values for odd keys. This is an undesirable
property, as all keys would hash to even values if they happened to be even. If
HASH_TABLE_SIZE is a power of two, then the hash function simply selects a subset
of the key bits as the table index. To obtain a more random scattering,
HASH_TABLE_SIZE should be a prime number not too close to a power of two.
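
The effect of the table size can be seen with a small experiment. The sketch below (the
helper names hash_div and slots_used are assumptions made for this example) counts how
many distinct slots a set of keys occupies under the division method.

```c
#include <stddef.h>

/* Division-method hash for non-negative keys (table_size must be > 0). */
unsigned hash_div(unsigned key, unsigned table_size)
{
    return key % table_size;
}

/* Count how many distinct slots the keys occupy. table_size must not exceed 64. */
unsigned slots_used(const unsigned keys[], size_t n, unsigned table_size)
{
    unsigned char used[64] = {0};
    unsigned count = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        unsigned h = hash_div(keys[i], table_size);
        if (!used[h]) {
            used[h] = 1;
            count++;
        }
    }
    return count;
}
```

For the keys 8, 16, 24, 32, 40, 48 and 56, a table of size 8 (a power of two) puts every
key in slot 0, because only the low three bits of each key are used. A table of size 7
(a prime) spreads the same keys over all seven slots.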
Multiplication method (tablesize = 2^n). The multiplication method may be used
for a HASH_TABLE_SIZE that is a power of 2. The key is multiplied by a constant, and
then the necessary bits are extracted to index into the table. Knuth recommends using
the fractional part of the golden ratio, (sqrt(5) - 1)/2 (about 0.618), to determine the
constant. Assume the hash table contains 32 (2^5) entries and is indexed by an unsigned
char (8 bits). First construct a multiplier based on the index and golden ratio. In this
example, the multiplier is 2^8 x (sqrt(5) - 1)/2, or 158. This scales the golden ratio so
that the first bit of the multiplier is “1”.
         xxxxxxxx   key
         xxxxxxxx   multiplier (158)
         xxxxxxxx
       x xxxxxxx
      xx xxxxxx
     xxx xxxxx
    xxxx xxxx
   xxxxx xxx
  xxxxxx xx
 xxxxxxx x
xxxxxxxx bbbbbxxx   product
Multiply the key by 158 and extract the 5 most significant bits of the least significant
word. These bits are indicated by “bbbbb” in the above example, and represent a
thorough mixing of the multiplier and key. The following definitions may be used for the
multiplication method:

/* Use ONE of the three alternatives below, according to the index width. */
/* 8-bit index */
typedef unsigned char HashIndexType;
static const HashIndexType M = 158;
/* 16-bit index */
typedef unsigned short int HashIndexType;
static const HashIndexType M = 40503;
/* 32-bit index */
typedef unsigned long int HashIndexType;
static const HashIndexType M = 2654435769UL;
/* w=bitwidth(HashIndexType), size of table=2**n */
static const int S = w - n;
HashIndexType hashValue = (HashIndexType)(M * key) >> S;
For example, if HASH_TABLE_SIZE is 1024 (2^10), then a 16-bit index is sufficient and S
would be assigned a value of 16 - 10 = 6. Thus, we have:
typedef unsigned short int HashIndexType;
HashIndexType hash(int key) {
    static const HashIndexType M = 40503;
    static const int S = 6;
    /* multiply as unsigned so that overflow wraps around instead of being undefined */
    return (HashIndexType)(M * (unsigned int)key) >> S;
}
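
As a worked check of the 16-bit case, the sketch below uses the fixed-width types from
<stdint.h> so that the word size does not depend on the compiler. The names hash_mul16 and
INDEX_BITS are assumptions made for this example. The multiplier 40503 and the shift of
16 - 10 = 6 come from the text.

```c
#include <stdint.h>

enum { INDEX_BITS = 10 };   /* table of 2^10 = 1024 slots, as in the example */

/* Multiplication method: multiply by M, keep the low 16 bits of the product,
   then take the top INDEX_BITS bits of that 16-bit word. */
unsigned hash_mul16(unsigned key)
{
    const uint32_t M = 40503u;                        /* 2^16 x (sqrt(5) - 1)/2, truncated */
    uint16_t low_word = (uint16_t)(M * (uint32_t)key);
    return (unsigned)(low_word >> (16 - INDEX_BITS));
}
```

For key 1 the product is 40503, and 40503 >> 6 gives index 632. For key 2 the product is
81006, whose low 16 bits are 15470, and 15470 >> 6 gives index 241. Consecutive keys thus
land far apart in the table.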
Variable string addition method (tablesize = 256). To hash a variable-length
string, each character is added, modulo 256, to a total. A hashValue, range 0-255, is
computed.
unsigned char hash(char *str) {
    unsigned char h = 0;
    while (*str) h += *str++;
    return h;
}
Variable string exclusive-or method (tablesize = 256). This method is similar to
the addition method, but successfully distinguishes similar words and anagrams. To
obtain a hash value in the range 0-255, all bytes in the string are exclusive-or’d
together. However, in the process of doing each exclusive-or, a random component is
introduced.
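unsigned char rand8[256];   /* table of random bytes, filled in before first use */
unsigned char hash(char *str) {
    unsigned char h = 0;
    /* the cast keeps a negative char from producing a negative index */
    while (*str) h = rand8[h ^ (unsigned char)*str++];
    return h;
}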
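
The difference between the two string methods shows up on anagrams. The sketch below is
illustrative only: the names hash_add, hash_xor and init_rand8 are assumptions made for
this example, and the table is filled with a fixed permutation of 0..255 rather than
truly random values so that the results are reproducible.

```c
static unsigned char rand8[256];

/* Fill rand8 with a fixed permutation of 0..255. The multiplier 167 and offset 13
   are arbitrary choices for this sketch; any odd multiplier yields a permutation. */
void init_rand8(void)
{
    int i;
    for (i = 0; i < 256; i++)
        rand8[i] = (unsigned char)((i * 167 + 13) & 255);
}

/* Addition method: sum of the characters modulo 256. */
unsigned char hash_add(const char *str)
{
    unsigned char h = 0;
    while (*str) h = (unsigned char)(h + (unsigned char)*str++);
    return h;
}

/* Exclusive-or method: each step is scrambled through the rand8 table. */
unsigned char hash_xor(const char *str)
{
    unsigned char h = 0;
    while (*str) h = rand8[h ^ (unsigned char)*str++];
    return h;
}
```

After init_rand8(), the addition method gives "ab" and "ba" the same value (97 + 98 = 195),
while the exclusive-or method gives 71 for "ab" and 131 for "ba" with this particular table.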

2.3.8 Radix Sort


Radix sort is a simple method that many people intuitively use when alphabetizing a
large list of names. (Here the radix is 26, for the 26 letters of the alphabet.)
Specifically, the list of names is first sorted according to the first letter of each name,
that is, the names are arranged in 26 classes. Intuitively, one might want to sort numbers
on their most significant digit. Radix sort, counter-intuitively, sorts on the least
significant digit first. On the first pass, the numbers are sorted on the least significant
digit and combined in an array. On the second pass, the numbers are sorted again on the
second least significant digit and combined in an array, and so on.
The following example shows how radix sort operates on seven 3-digit numbers.
INPUT    1st pass    2nd pass    3rd pass
329      720         720         329
457      355         329         355
657      436         436         436
839      457         839         457
436      657         355         657
720      329         457         720
355      839         657         839

In the above example, the first column is the input. The remaining columns show the list
after successive sorts on increasingly significant digit positions. The code for radix sort
assumes that each element in the n-element array A has d digits, where digit 1 is the
lowest-order digit and digit d is the highest-order digit.
RADIX_SORT (A, d)
    for i ← 1 to d do
        use a stable sort to sort A on digit i
        // counting sort will do the job

Analysis
The running time depends on the stable sort used as the intermediate sorting algorithm.
When each digit is in the range 1 to k, and k is not too large, COUNTING_SORT is the
obvious choice. In case of counting sort, each pass over n d-digit numbers takes Θ(n + k)
time. There are d passes, so the total time for radix sort is Θ(dn + dk). When d is
constant and k = O(n), radix sort runs in linear time.

Program to Implement Radix Sort in C++


/*
 * C++ Program To Implement Radix Sort
 * (handles non-negative integers only)
 */
#include <iostream>
#include <cstdlib>
#include <vector>
using namespace std;
/*
 * get maximum value in arr[]
 */
int getMax(int arr[], int n)
{
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max)
            max = arr[i];
    return max;
}
/*
 * count sort of arr[] on the digit selected by exp (1, 10, 100, ...)
 */
void countSort(int arr[], int n, int exp)
{
    vector<int> output(n);   // standard C++ has no variable-length arrays
    int i, count[10] = {0};
    // count occurrences of each digit
    for (i = 0; i < n; i++)
        count[(arr[i] / exp) % 10]++;
    // turn the counts into end positions
    for (i = 1; i < 10; i++)
        count[i] += count[i - 1];
    // fill output from the back so that the sort is stable
    for (i = n - 1; i >= 0; i--)
    {
        output[count[(arr[i] / exp) % 10] - 1] = arr[i];
        count[(arr[i] / exp) % 10]--;
    }
    for (i = 0; i < n; i++)
        arr[i] = output[i];
}
/*
 * sorts arr[] of size n using Radix Sort
 */
void radixsort(int arr[], int n)
{
    int m = getMax(arr, n);
    for (int exp = 1; m / exp > 0; exp *= 10)
        countSort(arr, n, exp);
}

/*
 * Main
 */
int main()
{
    int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};
    int n = sizeof(arr)/sizeof(arr[0]);
    radixsort(arr, n);
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}

Output
$ g++ radix_sort.cpp
$ a.out
2 24 45 66 75 90 170 802

2.4 Comparison of Sorting Algorithm


Different sorting algorithms behave differently with respect to time and space. The
following table compares the sorting algorithms discussed above.
Table 1.1: Comparison of above discussed Sorting Algorithms

Name                    Average       Worst         Memory      Stable   Method
Bubble sort             O(n^2)        O(n^2)        O(1)        Yes      Exchanging
Selection sort          O(n^2)        O(n^2)        O(1)        No       Selection
Insertion sort          O(n^2)        O(n^2)        O(1)        Yes      Insertion
Merge sort              O(n log n)    O(n log n)    O(n)        Yes      Merging
Heapsort                O(n log n)    O(n log n)    O(1)        No       Selection
Quicksort               O(n log n)    O(n^2)        O(log n)    No       Partitioning
Bin Sort/Bucket Sort    O(n)          O(n)          3n          No       Other than Exchanging

2.5 Order Statistics


2.5.1 ith Order Statistic
The ith order statistic of a set of n elements is the ith smallest element. For example,
the minimum of a set of elements is the first order statistic (i = 1), and the maximum is
the nth order statistic (i = n).

2.5.2 A Median
A median, informally, is the “halfway point” of the set. When n is odd, the median is
unique, occurring at i = (n + 1)/2. When n is even, there are two medians, occurring at
i = n/2 and i = n/2 + 1. Thus, regardless of the parity of n, medians occur at
i = ⌊(n + 1)/2⌋ (the lower median) and i = ⌈(n + 1)/2⌉ (the upper median).
Algorithm MEDIAN (A, n)
{
    if (n mod 2 = 0)
        print A[n/2] and A[n/2 + 1]
    else
        print A[(n+1)/2]
}
Assuming A is already sorted and indexed from 1, this requires only 1 comparison, thus
time complexity is O(1).
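
Since C arrays are indexed from 0, the positions used by MEDIAN shift down by one. The
sketch below (the function name median_indices is an assumption made for this example)
computes the 0-based positions of the lower and upper medians of a sorted array.

```c
#include <stddef.h>

/* For a sorted array of n >= 1 elements, store the 0-based indices of the lower
   and upper medians in *lower and *upper (they coincide when n is odd). */
void median_indices(size_t n, size_t *lower, size_t *upper)
{
    *lower = (n + 1) / 2 - 1;   /* floor((n + 1)/2), shifted to 0-based */
    *upper = (n + 2) / 2 - 1;   /* ceil((n + 1)/2), shifted to 0-based  */
}
```

For n = 5 both indices are 2. For n = 6 they are 2 and 3, which correspond to the 1-based
positions n/2 = 3 and n/2 + 1 = 4.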

2.5.3 Selection Problem


The selection problem can be specified formally as follows:
Input: A set A of n (distinct) numbers and a number i, with 1 ≤ i ≤ n.

Output: The element x ∈ A that is larger than exactly i - 1 other elements of A.


The selection problem can be solved in O(n lg n) time, since we can sort the
numbers using heapsort or merge sort and then simply index the ith element in the
output array.
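
The sort-then-index approach can be sketched in C as follows. The names cmp_int and
select_by_sorting are assumptions made for this example, and the standard library's qsort
is used as a stand-in for heapsort or merge sort (the C standard does not specify which
algorithm qsort uses or its running time).

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders ints ascending. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Store the ith smallest element (1 <= i <= n) of a[0..n-1] in *out and return 0.
   Returns -1 on bad arguments or allocation failure. The input array is not modified. */
int select_by_sorting(const int *a, size_t n, size_t i, int *out)
{
    int *copy;
    if (i < 1 || i > n) return -1;
    copy = malloc(n * sizeof *copy);
    if (copy == NULL) return -1;
    memcpy(copy, a, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_int);
    *out = copy[i - 1];            /* ith order statistic, converted to a 0-based index */
    free(copy);
    return 0;
}
```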

2.5.4 Minimum and Maximum


How many comparisons are necessary to determine the minimum of a set of n
elements? We can easily obtain an upper bound of n - 1 comparisons: examine each
element of the set in turn and keep track of the smallest element seen so far. In the
following procedure, we assume that the set resides in array A, where length[A] = n.
Algorithm MINIMUM (A)
{
    min = A[1]
    for(i = 2 to length[A])
        if (min > A[i])
            min = A[i]
    return min
}
Algorithm MAXIMUM (A)
{
    max = A[1]
    for(i = 2 to length[A])
        if (max < A[i])
            max = A[i]
    return max
}
Finding the maximum can be accomplished with n - 1 comparisons as well.

2.5.5 Simultaneous Minimum and Maximum


In some applications, we must find both the minimum and the maximum of a set of n
elements. For example, a graphics program may need to scale a set of (x, y) data to fit
onto a rectangular display screen or other graphical output device. To do so, the
program must first determine the minimum and maximum of each coordinate.
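It is not too difficult to devise an algorithm that can find both the minimum and the
maximum of n elements using the asymptotically optimal Θ(n) number of comparisons.
Simply find the minimum and maximum independently, using n - 1 comparisons for
each, for a total of 2n - 2 comparisons.

In fact, at most 3⌊n/2⌋ comparisons are sufficient to find both the minimum and the
maximum: process the elements in pairs, compare the two elements of each pair with each
other, and then compare the smaller with the current minimum and the larger with the
current maximum, at a cost of 3 comparisons for every 2 elements. The simpler algorithm
below checks every element against both the minimum and the maximum.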
Algorithm MINIMAX (A)
{
    min = A[1]
    max = A[1]
    for(i = 2 to length[A])
    {
        if (min > A[i]) min = A[i]
        if (max < A[i]) max = A[i]
    }
    print min and max
}
This algorithm takes O(n) time. As written, it makes 2(n - 1) comparisons.
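
The 3⌊n/2⌋ bound quoted above can be reached by processing the elements in pairs: compare
the two elements of a pair with each other, then compare the smaller one with the current
minimum and the larger one with the current maximum. The following C sketch (the function
name min_max_pairs is an assumption made for this example) does this and returns the number
of element comparisons it performed.

```c
#include <stddef.h>

/* Find the minimum and maximum of a[0..n-1] (n >= 1), processing elements in pairs.
   Returns the number of element comparisons performed. */
size_t min_max_pairs(const int a[], size_t n, int *min, int *max)
{
    size_t i, comparisons = 0;
    if (n % 2 == 1) {                 /* odd n: seed both with the first element */
        *min = *max = a[0];
        i = 1;
    } else {                          /* even n: one comparison seeds both */
        comparisons++;
        if (a[0] < a[1]) { *min = a[0]; *max = a[1]; }
        else             { *min = a[1]; *max = a[0]; }
        i = 2;
    }
    for (; i + 1 < n; i += 2) {       /* an even number of elements remains */
        int small, large;
        comparisons++;
        if (a[i] < a[i + 1]) { small = a[i];     large = a[i + 1]; }
        else                 { small = a[i + 1]; large = a[i]; }
        comparisons++;
        if (small < *min) *min = small;
        comparisons++;
        if (large > *max) *max = large;
    }
    return comparisons;
}
```

For the 5-element array 63, 75, 90, 12, 27 this uses 6 comparisons, against 8 for two
independent passes. For an 8-element array it uses 10, against 14.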

2.6 Searching
The process of finding a particular element of an array is called “searching”. If the item
is not present in the array, then the search is unsuccessful. There are two types of
search: linear search and binary search.

Linear Search
The linear search compares each element of the array with the search key until the
search key is found. To determine that a value is not in the array, the program must
compare the search key to every element in the array. It is also called “Sequential
Search” because it traverses the data sequentially to locate the element.
/* This C++ program uses linear search in an array to find the LOCATION of the given
   Key value */
/* This program is an example of the Linear Search */
#include <iostream>
using namespace std;
const int N = 10;
int LinearSearch(int [], int); // Function prototype
int main()
{
    int A[N] = {9, 4, 5, 1, 7, 78, 22, 15, 96, 45}, Skey, LOC;
    cout << " Enter the Search Key\n";
    cin >> Skey;
    LOC = LinearSearch(A, Skey); // call the function
    if (LOC == -1)
        cout << " The search key is not in the array\n Un-Successful Search\n";
    else
        cout << " The search key " << Skey << " is at location " << LOC << endl;
    return 0;
}
int LinearSearch(int b[], int skey) // function definition
{
    int i;
    for (i = 0; i <= N - 1; i++)
        if (b[i] == skey)
            return i;
    return -1;
}

Algorithm: (Linear Search)


LINEAR (A, SKEY)
Here A is a Linear Array with N elements and SKEY is a given
item of information to search. This algorithm finds the location
of SKEY in A and if successful, it returns its location
otherwise it returns -1 for unsuccessful.
1. Repeat for i = 0 to N-1
2.     if(A[i] = SKEY) return i [Successful Search]
   [ End of loop ]
3. return -1 [Un-Successful]
4. Exit.

Binary Search
Binary search is useful for large sorted arrays. The algorithm can only be used with a
sorted array, and it eliminates one half of the elements in the array being searched
after each comparison. The algorithm locates the middle element of the array and
compares it to the search key. If they are equal, the search key is found and the array
subscript of that element is returned. Otherwise the problem is reduced to searching
one half of the array. If the search key is less than the middle element of the array, the
first half of the array is searched; otherwise, the second half is searched. If the search
key is not the middle element of the specified sub array either, the algorithm is repeated
on one quarter of the original array. The search continues until the search key equals the
middle element of a sub array (search successful) or the sub array becomes empty. If the
search key is not in the array, the value of END of the newly selected range will
eventually be less than its START. This is illustrated by the following two examples,
which both search the same array:
A[9] 68
A[8] 37
A[7] 25
A[6] 22
A[5] 17
A[4] 15
A[3] 11
A[2] 9
A[1] 5
A[0] 3

Search-Key = 22
Start = 0
End = 9
Mid = int((Start + End)/2)
Mid = int((0 + 9)/2)
Mid = 4 (A[4] = 15 is less than 22)
_________________
Start = 4 + 1 = 5
End = 9
Mid = int((5 + 9)/2) = 7 (A[7] = 25 is greater than 22)
_________________
Start = 5
End = 7 - 1 = 6
Mid = int((5 + 6)/2) = 5 (A[5] = 17 is less than 22)
_________________
Start = 5 + 1 = 6
End = 6
Mid = int((6 + 6)/2) = 6 (A[6] = 22)
Found at location 6
Successful Search

Search-Key = 8
Start = 0
End = 9
Mid = int((Start + End)/2)
Mid = int((0 + 9)/2)
Mid = 4 (A[4] = 15 is greater than 8)
_________________
Start = 0
End = 4 - 1 = 3
Mid = int((0 + 3)/2) = 1 (A[1] = 5 is less than 8)
_________________
Start = 1 + 1 = 2
End = 3
Mid = int((2 + 3)/2) = 2 (A[2] = 9 is greater than 8)
_________________
Start = 2
End = 2 - 1 = 1
End is < Start
Un-Successful Search
Algorithm: (Binary Search)
Here A is a sorted Linear Array with N elements and SKEY is a given item of
information to search. This algorithm finds the location of SKEY in A and if successful, it
returns its location otherwise it returns -1 for unsuccessful.
BinarySearch (A, SKEY)
1. [Initialize segment variables.]
   Set START=0, END=N-1 and MID=INT((START+END)/2).
2. Repeat Steps 3 and 4 while START ≤ END and A[MID]≠SKEY.
3.     If SKEY< A[MID]. Then
           Set END=MID-1.
       Else Set START=MID+1.
       [End of If Structure.]
4.     Set MID=INT((START +END)/2).
   [End of Step 2 loop.]
5. If A[MID]= SKEY then Set LOC= MID
   Else:
       Set LOC = -1
   [End of IF structure.]
6. Return LOC and Exit
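
The algorithm above can be written in C as follows. This is a minimal sketch: the
function name binary_search is an assumption made for this example, and the midpoint is
computed as start + (end - start)/2, which gives the same value as (start + end)/2 but
cannot overflow for very large arrays.

```c
/* Return the index of skey in the sorted array a[0..n-1], or -1 if it is absent. */
int binary_search(const int a[], int n, int skey)
{
    int start = 0, end = n - 1;
    while (start <= end) {
        int mid = start + (end - start) / 2;
        if (a[mid] == skey)
            return mid;              /* successful search */
        if (skey < a[mid])
            end = mid - 1;           /* continue in the first half */
        else
            start = mid + 1;         /* continue in the second half */
    }
    return -1;                       /* end < start: unsuccessful search */
}
```

On the example array 3, 5, 9, 11, 15, 17, 22, 25, 37, 68, searching for 22 returns 6 and
searching for 8 returns -1, matching the two traces above.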

Computational Complexity of Binary Search


The Computational Complexity of the Binary Search algorithm is measured by the
maximum (worst case) number of comparisons it performs for searching operations.
The searched array is halved by each comparison/iteration. Therefore, the maximum
number of iterations is ⌊log2(n)⌋ + 1, that is, about log2(n), where n is the size of the
array.
Example:
If a given sorted array has 1024 elements, then the maximum number of iterations
required is:
⌊log2(1024)⌋ + 1 = 11 (10 halvings reduce 1024 elements to 1, which is then examined)
Computational Complexity of Linear Search
Note that the Computational Complexity of the Linear Search is the maximum
number of comparisons you need to search the array. As you are visiting all the array
elements in the worst case, the number of comparisons required is:
n (n is the size of the array)
Example:
If a given array has 1024 elements, then the maximum number of comparisons
required is:
n = 1024 (As many as 1024 comparisons may be required)
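
The worst-case figures can be checked by counting directly. The sketch below (the names
linear_search_count and binary_search_count are assumptions made for this example) returns
the location as before and also reports how many array elements were examined.

```c
/* Linear search that also reports how many elements were compared with skey. */
int linear_search_count(const int a[], int n, int skey, int *comparisons)
{
    int i;
    *comparisons = 0;
    for (i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == skey)
            return i;
    }
    return -1;
}

/* Binary search that also reports how many middle elements were examined. */
int binary_search_count(const int a[], int n, int skey, int *probes)
{
    int start = 0, end = n - 1;
    *probes = 0;
    while (start <= end) {
        int mid = start + (end - start) / 2;
        (*probes)++;
        if (a[mid] == skey)
            return mid;
        if (skey < a[mid])
            end = mid - 1;
        else
            start = mid + 1;
    }
    return -1;
}
```

For a sorted array of 1024 elements and a key larger than every element, the linear search
examines all 1024 elements, while the binary search examines 11.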

2.7 Summary
Computer systems are often used to store large amounts of data from which individual
records must be retrieved according to some search criterion. Thus the efficient storage
of data to facilitate fast searching is an important issue. We’re interested in the average
time, the worst-case time and the best possible time.
However, we will generally be most concerned with the worst-case time as
calculations based on worst-case times can lead to guaranteed performance
predictions. Conveniently, the worst-case times are generally easier to calculate than
average times.
If there are n items in our collection - whether it is stored as an array or as a linked
list - then it is obvious that in the worst case, when there is no item in the collection with
the desired key, then n comparisons of the key with keys of the items in the collection
will have to be made.
To simplify analysis and comparison of algorithms, we look for a dominant operation
and count the number of times that dominant operation has to be performed. In the
case of searching, the dominant operation is the comparison. Since the search requires
n comparisons in the worst case, we say this is an O(n) (pronounced “big-Oh-n” or
“Oh-n”) algorithm. The best case, in which the first comparison returns a match,
requires a single comparison and is O(1). The average time depends on the probability
that the key will be found in the collection - this is something that we would not expect to
know in the majority of cases. Thus in this case, as in most others, estimation of the
average time is of little utility. If the performance of the system is vital, i.e. it’s part of a
life-critical system, then we must use the worst case in our design calculations as it
represents the best guaranteed performance.
2.8 Check Your Progress
Multiple Choice Questions
1. The worst case occurs in linear search algorithm when ……………….
(a) Item is somewhere in the middle of the array
(b) Item is not in the array at all
(c) Item is the last element in the array
(d) Item is the last element in the array or item is not there at all
2. If the number of records to be sorted is small, then ……………… sorting can be
efficient.
(a) Merge
(b) Heap
(c) Selection
(d) Bubble
3. The complexity of sorting algorithm measures the ……………… as a function of the
number n of items to be sorted.
(a) average time
(b) running time
(c) average-case complexity
(d) case-complexity
4. Which of the following is not a limitation of binary search algorithm?
(a) must use a sorted array
(b) requirement of sorted array is expensive when a lot of insertion and deletions
are needed
(c) there must be a mechanism to access middle element directly
(d) binary search algorithm is not efficient when there are more than 1500 data elements.
5. The Average case occurs in linear search algorithm ……………….
(a) when item is somewhere in the middle of the array
(b) when item is not in the array at all
(c) when item is the last element in the array
(d) Item is the last element in the array or item is not there at all
6. Binary search algorithm cannot be applied to ……………….
(a) sorted linked list
(b) sorted binary trees
(c) sorted linear array
(d) pointer array
7. Complexity of linear search algorithm is ………………
(a) O(n)
(b) O(logn)
(c) O(n^2)
(d) O(n logn)

8. Sorting algorithm can be characterized as ………………
(a) Simple algorithms which require on the order of n^2 comparisons to sort n items.
(b) Sophisticated algorithms that require O(n log2 n) comparisons to sort n items.
(c) Both of the above
(d) None of the above
9. The complexity of bubble sort algorithm is ………………
(a) O(n)
(b) O(logn)
(c) O(n^2)
(d) O(n logn)
10. State True or False for internal sorting algorithms.
(i) Internal sorting is applied when the entire collection of data to be sorted is
small enough that the sorting can take place within main memory.
(ii) The time required to read or write is considered to be significant in evaluating
the performance of internal sorting.
(a) (i) True, (ii) True
(b) (i) True, (ii) False
(c) (i) False, (ii) True
(d) (i) False, (ii) False

2.9 Questions and Exercises


1. What is searching?
2. What are the various techniques of searching?
3. What is linear search?
4. How can you perform binary search?
5. How is bubble sorting done?
6. How is insertion sorting implemented in C?
7. How is heap sorting done?
8. Write an algorithm of selection sort.

2.10 Key Terms


• Bubble Sort: A sorting algorithm in which multiple swaps take place in one pass.
• External Sorting: Sorting methods that are employed when the data to be sorted
is too large to fit in primary memory.
• Internal Sorting: Sorting in which all the data to be sorted is available in the
high-speed main memory of the computer.
• Merge Sort: A ‘divide and conquer’ algorithm. The basic idea is to divide the list
into a number of sub lists, sort each of these sub lists and merge them to get a
single sorted list.
• Sorting: Any process of arranging items according to a certain meaningful
sequence.

Check Your Progress: Answers


1. (d) Item is the last element in the array or item is not there at all
2. (c) Selection
3. (b) running time

4. (d) binary search algorithm is not efficient when there are more than 1500 data elements.
5. (a) when item is somewhere in the middle of the array
6. (a) sorted linked list
7. (a) O(n)
8. (c) Both of the above
9. (c) O(n^2)
10. (b) (i) True, (ii) False

2.11 Further Readings


• Yogish Sachdeva, Beginning Data Structures Using C, 2003.
• Hanan Samet, Foundations of Multidimensional and Metric Data Structures, Morgan
Kaufmann, 2006.
• Ajay Kumar, Data Structure for C Programming, Laxmi Publications, 2004.
• Balagurusamy, Programming in ANSI C, 5E, Tata McGraw-Hill Education, 2011.
• Harsha Priya, R. Ranjee, Programming and Problem Solving Through “C”
Language, Firewall Media, 2006.
