Digital assignment – Design and Analysis of Algorithms

JOE SHANNEL ARUJA


17MES0003
Problem 1
The Voronoi polygon for a point p of a set S of points in the plane is defined to be
the perimeter of the set of all points in the plane closer to p than to any other point
in S. The union of all the Voronoi polygons of the points in S is called the Voronoi
diagram of S.

a. What is the Voronoi diagram for a set of three points?

b. Find a visualization of an algorithm for generating the Voronoi diagram on the
Web and study a few examples of such diagrams. Based on your observations, can
you tell how the solution to the previous question is generalized to the general case?
With your own programming language proficiency, implement the algorithm and
study the performance using profiling tools.

Problem Definition 1.a

A Voronoi diagram is a polygonal structure in which each point is enclosed by its
own convex polygonal boundary. If there are n points, the Voronoi diagram for
the n points contains n convex polygonal regions, one per point; the regions on the
outside of the diagram are unbounded. The Voronoi diagram for a set of three points
can be developed by plotting the three distinct points on a plane and joining each
pair with an imaginary line. The Voronoi boundaries are obtained by taking the
PERPENDICULAR BISECTOR of each of these imaginary lines. If the three points
are not collinear, the three bisectors meet at a single point (the circumcenter of the
triangle), and the diagram consists of the three rays that leave this point along the
bisectors. If the points are collinear, the diagram consists of two parallel lines.

In this problem we take three arbitrary points: P1(1.2, 3.3), P2(2.7, 1.5) and
P3(1.5, 1.3).
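
The perpendicular-bisector construction described above can also be checked numerically. The following is a minimal sketch in C, not part of the submitted MATLAB program: it computes the point where the three bisectors meet, which is the only vertex of the three-point Voronoi diagram. The names `Point`, `dist2`, `voronoi_vertex` and `print_example` are hypothetical helpers introduced for illustration.

```c
#include <stdio.h>

typedef struct { double x, y; } Point;

/* Squared distance between two points. */
double dist2(Point p, Point q)
{
    double dx = p.x - q.x, dy = p.y - q.y;
    return dx * dx + dy * dy;
}

/*
 * Computes the point where the perpendicular bisectors of the three
 * segments ab, bc and ca meet (the circumcenter of triangle abc).
 * Returns 1 and stores the point in *out, or returns 0 if the points
 * are collinear, in which case the bisectors are parallel.
 */
int voronoi_vertex(Point a, Point b, Point c, Point *out)
{
    double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
    double a2, b2, c2;

    if (d > -1e-12 && d < 1e-12)
        return 0; /* collinear points: no common intersection */

    a2 = a.x * a.x + a.y * a.y;
    b2 = b.x * b.x + b.y * b.y;
    c2 = c.x * c.x + c.y * c.y;

    out->x = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
    out->y = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
    return 1;
}

/* Prints the Voronoi vertex for the three points used in this problem. */
void print_example(void)
{
    Point p1 = {1.2, 3.3}, p2 = {2.7, 1.5}, p3 = {1.5, 1.3}, v;

    if (voronoi_vertex(p1, p2, p3, &v))
        printf("Voronoi vertex: (%.4f, %.4f)\n", v.x, v.y);
}
```

Calling `print_example()` from a `main` function prints the vertex for P1, P2 and P3. The vertex is equidistant from all three points, which can be verified with `dist2`.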

Program – (MATLAB)

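clear all;   % remove variables left over from earlier runs
close all;   % close any open figures
clc;         % clear the command window
figure()
X = [1.2 3.3; 2.7 1.5; 1.5 1.3];   % one point per row: [x y]
voronoi(X(:,1), X(:,2))            % plot the Voronoi diagram of the points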
FIG 1: Plotted Voronoi Diagram

RESULT
PROFILE SUMMARY

The profile summary shows that the entire program executes in 1.853 sec and that
the function ‘voronoi’ takes the most time, 0.787 sec. It is also observed that the
total time grows as the number of points to be plotted increases.

CONCLUSION

The source code for obtaining the Voronoi diagram was written and the diagram was
plotted in MATLAB. The code was profiled with the MATLAB profiling tool and the
execution time was obtained.

Problem Definition 1.b


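A Voronoi diagram can be generated and visualized in different ways, for example
with Fortune's algorithm, Lloyd's algorithm or the Bowyer-Watson algorithm.
Fortune's algorithm constructs the diagram directly, the Bowyer-Watson algorithm
builds the Delaunay triangulation (whose dual is the Voronoi diagram), and Lloyd's
algorithm repeatedly recomputes a Voronoi diagram to even out its regions. Fortune's
sweep line method is the most efficient of these, running in O(n log n) time.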

A Sweepline Algorithm for Voronoi Diagrams.


The sweep line technique conceptually sweeps a horizontal line upward across the
plane, noting the regions intersected by the line as the line moves. Computing the
Voronoi diagram directly with a sweep line technique is difficult, because the
Voronoi region of a site may be intersected by the sweep line long before the site
itself is intersected by the sweep line. Rather than compute the Voronoi diagram,
we compute a geometric transformation of it. The transformed Voronoi diagram has
the property that the lowest point of the transformed Voronoi region of a site
appears at the site itself. Thus the sweep line algorithm need consider the Voronoi
region of a site only when the site has been intersected by the sweep line.

The sweepline algorithm computes the Voronoi diagram of n sites in O(n log n) time
and O(n) space.
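
The geometric transformation mentioned above can be written down explicitly. As a sketch, assuming the notation $d(z)$ for the distance from a point $z$ to its nearest site in $S$ (the symbols are chosen here for illustration), the map used in Fortune's sweepline construction raises each point by that distance:

```latex
\[
d(z) = \min_{q \in S} \lVert z - q \rVert, \qquad
{*}(z) = \bigl(z_x,\; z_y + d(z)\bigr).
\]
For a point $z$ in the Voronoi region of a site $p$ we have
$d(z) = \lVert z - p \rVert \ge p_y - z_y$, hence
\[
z_y + d(z) \;\ge\; p_y ,
\]
with equality at $z = p$, where $d(p) = 0$ and ${*}(p) = p$.
```

No transformed point of the region therefore lies below the site itself, which is why the region need only be considered once the upward-moving sweep line has reached the site.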

FIG : sweep line methodology

Problem 2
Write a computer program that uses hashing for the following problem. Given a
natural-language text, generate a list of distinct words with the number of occurrences
of each word in the text. Insert appropriate counters in the program to compare the
empirical efficiency of hashing with the corresponding theoretical results.

Problem Definition
Here the input is a character string entered by the user. A function should be
developed that extracts each word from the input and displays it with its occurrence
count. A word that is repeated must be displayed only once. Hashing is used to obtain
an address for each word, and this address is then used to detect repeated occurrences
when the unique words and their counts are displayed. Creating a hash table is the
essence of the approach: words with different hash addresses cannot be the same word,
although distinct words may occasionally share an address.

The usual way to create and store a hash table is with linked lists, but in this case
we use an array of structures to store the hash address and the rest of the data
associated with each word extracted from the entered string. The mod value used to
obtain the hash address can be any suitable positive integer; this program uses the
length of the entered string. Each function created for the problem is invoked in
sequence from the main function.

Approach Followed(Algorithm)
1. Accept the string from the user; it may be of any length up to the
size of the input buffer.
2. Extract the string word by word and store each word in a
structure.
3. The structure holds the following data: the word, the length of
the word, the hash address of the word, the count of the word and a
flag.
4. Invoke the function that calculates the hash address.
5. Invoke the function that counts the occurrences of each word in
the string.
6. Invoke the function that checks for repetition; it sets the flag
to 1 for the first occurrence of each word in the entered string.
7. Display the structure according to the flag variable
[removes the repeating words].

NOTE - The hash function used in this program is the sum of the letter
positions of the word (a = 1, ..., z = 26) mod the length of the entered
string, and a word is searched for in the stored data using the hash
address generated.
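
For comparison with the approach above, the following is a minimal sketch of the more usual linked-list (separate chaining) hash table, with a counter for key comparisons as the problem statement asks. It is not the submitted program: `TABLE_SIZE`, `MAX_WORD`, `WordTable`, `count_words` and `lookup` are hypothetical names, and the table size is an arbitrary assumption. The hash reuses the same letter-sum idea (a = 1, ..., z = 26).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 31 /* assumed table size (a small prime) */
#define MAX_WORD 32   /* assumed maximum stored word length + 1 */

typedef struct Entry {
    char word[MAX_WORD];
    int count;
    struct Entry *next; /* next entry in the same bucket */
} Entry;

typedef struct {
    Entry *bucket[TABLE_SIZE];
    long comparisons; /* counter: string comparisons made while inserting */
    int distinct;     /* number of distinct words stored */
    int total;        /* number of words processed */
} WordTable;

/* Letter-sum hash: a = 1, ..., z = 26, reduced mod the table size. */
static unsigned hash_word(const char *s)
{
    unsigned sum = 0;
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char)*s))
            sum += (unsigned)(tolower((unsigned char)*s) - 'a' + 1);
    return sum % TABLE_SIZE;
}

/* Increments the count of a word, inserting it if it is new. */
static void add_word(WordTable *t, const char *word)
{
    unsigned h = hash_word(word);
    Entry *e;

    t->total++;
    for (e = t->bucket[h]; e != NULL; e = e->next) {
        t->comparisons++;
        if (strcmp(e->word, word) == 0) {
            e->count++;
            return;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return; /* out of memory: the word is skipped */
    strncpy(e->word, word, MAX_WORD - 1);
    e->word[MAX_WORD - 1] = '\0';
    e->count = 1;
    e->next = t->bucket[h];
    t->bucket[h] = e;
    t->distinct++;
}

/* Resets the table, then splits text on non-letters and counts each word. */
void count_words(WordTable *t, const char *text)
{
    char buf[MAX_WORD];
    int i, n = 0;

    for (i = 0; i < TABLE_SIZE; i++)
        t->bucket[i] = NULL;
    t->comparisons = 0;
    t->distinct = 0;
    t->total = 0;

    for (;; text++) {
        if (isalpha((unsigned char)*text)) {
            if (n < MAX_WORD - 1)
                buf[n++] = (char)tolower((unsigned char)*text);
        } else {
            if (n > 0) {
                buf[n] = '\0';
                add_word(t, buf);
                n = 0;
            }
            if (*text == '\0')
                break;
        }
    }
}

/* Returns the number of occurrences of a lowercase word, or 0 if absent. */
int lookup(const WordTable *t, const char *word)
{
    const Entry *e;
    for (e = t->bucket[hash_word(word)]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}
```

After `count_words`, the ratio `comparisons / total` gives the measured comparisons per word. It can be set against the standard textbook estimates for separate chaining with load factor α = `distinct / TABLE_SIZE`: about 1 + α/2 comparisons for a successful search and about α for an unsuccessful one. The sketch never frees its entries, so a longer-running program would need a cleanup routine.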

SOURCE CODE
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
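/* Function prototypes */
void address(); /* Function to calculate the hash address */
void number();  /* Function to count the number of repetitions */
void repeat();  /* Function to remove the repeating words */

char arr[256]; /* input string entered by the user */

struct word /* Structure defined: one record per extracted word */
{
    char ar[15]; /* the word itself */
    int length;  /* length of the word */
    int add;     /* hash address of the word */
    int count;   /* number of occurrences of the word */
    int fl;      /* 1 for the first occurrence of the word */
}w[20];

int gcnt=0; /* total number of words extracted */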

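int main()
{
    int i, len, z, j;
    const int maxwords = (int)(sizeof(w) / sizeof(w[0]));
    const int maxlen = (int)sizeof(w[0].ar) - 1;

    j = 0;
    z = 0;
    printf("\n ENTER THE STRING : ");
    /* fgets replaces gets, which cannot limit the length of the input */
    if (fgets(arr, sizeof(arr), stdin) == NULL)
    {
        return 1;
    }
    arr[strcspn(arr, "\n")] = '\0'; /* drop the trailing newline */

    len = strlen(arr);
    for (i = 0; i <= len; i++)
    {
        if (arr[i] == ' ' || arr[i] == '\0')
        {
            if (z > 0 && j < maxwords) /* end of a word; repeated spaces are skipped */
            {
                w[j].ar[z] = '\0';
                w[j].length = z;
                gcnt++;
                j++;
            }
            z = 0;
        }
        else if (j < maxwords && z < maxlen)
        {
            w[j].ar[z] = arr[i]; /* copy the next character of the word */
            z++;
        }
    }

    address();
    number();
    repeat();

    for (i = 0; i < gcnt; i++)
    {
        if (w[i].fl == 1) /* print each distinct word only once */
        {
            printf("\n| word = %s |", w[i].ar);
            printf(" length = %d |", w[i].length);
            printf(" address = %d |", w[i].add);
            printf(" count = %d |", w[i].count);
        }
    }
    printf("\n total number of words = %d", gcnt);
    return 0;
}
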
void address() /* Hash address function definition */
{
    int i, j, sum, len;

    len = strlen(arr); /* modulus: length of the entered string */
    for (j = 0; j < gcnt; j++)
    {
        sum = 0;
        for (i = 0; i < w[j].length; i++)
        {
            char c = w[j].ar[i];

            /* a/A = 1, b/B = 2, ..., z/Z = 26; other characters add nothing */
            if (c >= 'a' && c <= 'z')
            {
                sum = sum + (c - 'a' + 1);
            }
            else if (c >= 'A' && c <= 'Z')
            {
                sum = sum + (c - 'A' + 1);
            }
        }
        w[j].add = sum % len; /* Mod function formula */
    }
}

void number() /* Count function definition */
{
    int i, j, sum;

    for (j = 0; j < gcnt; j++)
    {
        sum = 0;
        for (i = 0; i < gcnt; i++)
        {
            /* Compare hash addresses first; strcmp guards against two
               different words that share the same address. */
            if (w[j].add == w[i].add && strcmp(w[j].ar, w[i].ar) == 0)
            {
                sum = sum + 1;
            }
        }
        w[j].count = sum;
    }
}

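void repeat() /* Duplicate word extraction function definition */
{
    int i, j, sum;

    for (i = 0; i < gcnt; i++)
    {
        sum = 0;
        for (j = 0; j <= i; j++) /* count matches up to and including word i */
        {
            if (w[i].add == w[j].add && strcmp(w[i].ar, w[j].ar) == 0)
            {
                sum = sum + 1;
            }
        }
        w[i].fl = sum; /* equals 1 only for the first occurrence of a word */
    }
}
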
OUTPUT
The same source code was run on two different systems and the resulting output was
saved.

PROFILING

The source code was profiled with the ‘gprof’ tool on Linux and the following
result was obtained.
The functions ‘address’, ‘number’ and ‘repeat’ are each called once, as the
program is invoked for one particular input.

RESULT

The program was executed for the entered string “algorithm is beautiful and
amazing”. The unique words ‘algorithm’, ‘is’, ‘beautiful’, ‘and’ and ‘amazing’ were
printed in the terminal together with their word length, hash address and total
number of occurrences.

CONCLUSION

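The source code written for the program is of the order of O(n^2) in the number of
words, because the counting and duplicate-removal functions each compare every word
with the other words. The code execution was studied for further improvement in
efficiency, and the improvements judged most apt were implemented.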
PROBLEM 3

Conduct a study on the Bloom filter, which is a hash-based data structure used in
several search and query applications. Implement it using your known
programming language and analyze its performance.

PROBLEM DEFINITION
The problem is to design a Bloom filter that searches for stored words in a
given or updated database which the user is allowed to access. The Bloom
filter is a well-known structure for testing whether an item is present in a
huge database. It is an efficient way of searching because it uses hashing to
map each entered word to a position in a table of bits. It does not save the
entire word; only a single bit at the position given by the word's hash
address is set. This reduces both the time and the space needed for a
search. Here the table is built as a linked list and coded in C. A search
traverses the linked list, which is predefined to a particular length, up to
the position given by the hash address.

A Bloom filter may use more than one hash function to map each word to
positions in the table. In this case we use only one hash function. A Bloom
filter can report only that a word is possibly present in the database or that
it is definitely not present. This is because more than one word may have the
same hash address, so that different words set the same bit.
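
To illustrate the multi-function variant mentioned above, the following is a minimal sketch of a Bloom filter that uses a bit array and two hash functions. It is not the submitted program, which uses a linked list and a single hash. `BLOOM_BITS`, `Bloom`, `bloom_init`, `bloom_add` and `bloom_maybe_contains` are hypothetical names, the filter size is an arbitrary assumption, and the two string hashes (djb2 and sdbm) are swapped in here only as simple, well-known examples.

```c
#include <string.h>

#define BLOOM_BITS 1024 /* assumed filter size in bits */

typedef struct {
    unsigned char bits[BLOOM_BITS / 8]; /* one bit per table position */
} Bloom;

/* djb2 string hash: h = h * 33 + c, starting from 5381. */
static unsigned long hash_djb2(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* sdbm string hash: h = c + (h << 6) + (h << 16) - h. */
static unsigned long hash_sdbm(const char *s)
{
    unsigned long h = 0;
    while (*s != '\0')
        h = (unsigned char)*s++ + (h << 6) + (h << 16) - h;
    return h;
}

static void set_bit(Bloom *b, unsigned long i)
{
    b->bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

static int get_bit(const Bloom *b, unsigned long i)
{
    return (b->bits[i / 8] >> (i % 8)) & 1;
}

/* Clears every bit of the filter. */
void bloom_init(Bloom *b)
{
    memset(b->bits, 0, sizeof b->bits);
}

/* Sets the bit selected by each hash function for the given word. */
void bloom_add(Bloom *b, const char *word)
{
    set_bit(b, hash_djb2(word) % BLOOM_BITS);
    set_bit(b, hash_sdbm(word) % BLOOM_BITS);
}

/*
 * Returns 1 if the word is possibly present (all of its bits are set)
 * and 0 if it is definitely absent (at least one bit is clear).
 */
int bloom_maybe_contains(const Bloom *b, const char *word)
{
    return get_bit(b, hash_djb2(word) % BLOOM_BITS)
        && get_bit(b, hash_sdbm(word) % BLOOM_BITS);
}
```

A word that was added always tests as possibly present, so there are no false negatives. A word that was never added is reported as present only if both of its bits happen to have been set by other words, which is less likely than a single-bit collision.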
APPROACH(Algorithm)
1. A linked list of length 50 is predefined.
2. A word is accepted from the user and its corresponding
HASH address is calculated.
3. The linked list is traversed to the position given by the HASH
address obtained.
4. The data bit of that node of the linked list is set to 1.
5. The user is asked whether to enter more words to be stored or to
search for a particular word.
6. Steps 2 to 5 are repeated until the user chooses a different option.
7. If the user opts for search, the word is accepted from the user, its
HASH address is calculated and the address is looked up in the
HASH table.
8. The linked list is traversed to the position of the HASH address
and the bit there is checked for 1 or 0.
9. If the bit is ‘1’ the word is reported as found; if the bit is ‘0’ the
word is not in the database.

SOURCE CODE

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>

struct node /* Linked-list node holding one bit and a pointer to the next node */
{
int bit;
struct node *next;
}*start=NULL;
struct node *current=NULL;
char arr[20];
int add=0;
int cnt=0;

void hash();
void define();
void display();
void insert();
void search();

int main(void)
{
    int choice, x = 1;

    define();
    while (x == 1)
    {
        printf("\nEnter your option\n1.to insert words\n2.to display the hash table\n3.to search for a particular word\n4.to exit :");
        if (scanf("%d", &choice) != 1)
        {
            break; /* stop on invalid input */
        }

        switch (choice)
        {
        case 1: insert();
            break;
        case 2: display();
            break;
        case 3: search();
            break;
        case 4: /* exit */
            return 0;
        }
        printf("\n do you want to continue ? (yes 1/ NO 0)");
        if (scanf("%d", &x) != 1)
        {
            break;
        }
    }
    return 0;
}

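void define()
{
    int i;
    struct node *temp;

    for (i = 0; i < 50; i++) /* build the list of 50 nodes, all bits 0 */
    {
        temp = (struct node *)malloc(sizeof(struct node));
        if (temp == NULL)
        {
            exit(1); /* out of memory */
        }
        temp->bit = 0;
        temp->next = NULL;
        if (start == NULL)
        {
            start = temp;
            current = temp;
            cnt++;
        }
        else
        {
            current->next = temp;
            current = temp;
            cnt++;
        }
    }
}
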
void display()
{
    int i;
    struct node *temp;

    printf("\n");
    temp = start;
    for (i = 0; i < 50 && temp != NULL; i++) /* print the bit of every node */
    {
        printf("%d--->", temp->bit);
        temp = temp->next;
    }
}

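void hash()
{
    int len, i, y, sum = 0;

    len = strlen(arr);
    for (i = 0; i < len; i++)
    {
        if (!isalpha((unsigned char)arr[i]))
        {
            continue; /* only letters contribute to the sum */
        }
        y = toupper((unsigned char)arr[i]);
        y = y - 64; /* 'A' is 65 in ASCII, so A = 1, B = 2, ..., Z = 26 */
        printf("\n %d", y);
        sum = sum + y;
    }

    add = sum % 13; /* hash address in the range 0 to 12 */
    printf("\nthe hash address of the word is %d ", add);
}
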
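void insert()
{
    int i;
    struct node *temp;

    printf("\n Enter The word to be inserted : ");
    scanf("%19s", arr); /* arr holds at most 19 characters plus '\0' */
    hash();

    temp = start;
    for (i = 0; i < add; i++) /* walk to the node at the hash address */
    {
        temp = temp->next;
    }
    temp->bit = 1; /* mark this position as set */
}
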
void search()
{
    int i;
    struct node *temp;

    temp = start;
    printf("\n Enter the word to be searched ");
    scanf("%19s", arr); /* arr holds at most 19 characters plus '\0' */
    hash();
    for (i = 0; i < add; i++) /* walk to the node at the hash address */
    {
        temp = temp->next;
    }
    if (temp->bit == 1)
    {
        printf("\n the word is positively found");
    }
    else
    {
        printf("\n The word is not stored in the database");
    }
}

OUTPUT
RESULT
The source code for the BLOOM filter was executed in the Linux terminal
and the result was as shown above. The words ‘algorithm’, ‘daa’ and ‘joe’
were entered and stored in the HASH table. Later the words ‘joe’ and
‘weather’ were searched for from the menu-driven program and the outputs
obtained were ‘positively found’ and ‘word not present in the database’, as
expected. The order of time complexity was obtained as O(n).
PROFILING

The source code was profiled with the ‘gprof’ tool on Linux. The profiling
summary reveals one unique function, and shows that the execution time
depends on how many words exist in the HASH table, since this leads to
more traversal to check for the existence of a particular word.

CONCLUSION

The Bloom filter is an efficient way of testing whether an item exists in a
given database, but the drawback is that more than one word may have the
same HASH address, which leads to a collision and possibly a faulty result
(a false positive). Using more than one function to build the HASH table
may reduce this to an extent, but the execution time may increase
correspondingly.
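
The trade-off described above can be quantified with the standard approximation for Bloom filters. As a sketch, assuming a table of $m$ bits, $n$ stored words and $k$ independent, uniformly distributed hash functions (idealized assumptions that the letter-sum hash used in this program does not meet), the probability of a false positive is approximately:

```latex
\[
p \;\approx\; \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\text{opt}} = \frac{m}{n}\,\ln 2 .
\]
```

Each insertion or search evaluates all $k$ functions, so the time per operation grows linearly with $k$, while $p$ falls only until $k$ reaches roughly $k_{\text{opt}}$.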
