
Compiler Construction

Symbol Table
➢ A symbol table is a data structure that holds information about
the identifiers used in the source program, stored in the form of
various attributes.

➢ There are two types of identifiers:
1. Variable names
2. Function names

Variable Names
➢ For a variable, the fields in the symbol table are:
  ➢ Type of the variable
  ➢ Size of the variable
  ➢ Scope of the variable
  ➢ Storage class of the variable
  ➢ Offset address of the variable

Function Names
➢ For a function name, the fields in the symbol table are:
  ➢ Arguments and return type of the function
➢ The arguments field contains:
  ➢ The type of each argument
  ➢ The number of arguments
  ➢ The method of passing each argument, i.e. by value or by reference
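
The two kinds of entry listed above can be sketched as record types. This is only an illustration in Python: the class names, field names and sample values (VariableEntry, FunctionEntry, Argument, "swap") are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariableEntry:
    """Hypothetical record for a variable name."""
    name: str
    type_name: str       # type of the variable, e.g. "int"
    size: int            # size in bytes
    scope: int           # nesting level in which it was declared
    storage_class: str   # e.g. "auto" or "static"
    offset: int          # offset address within its storage area


@dataclass
class Argument:
    """Hypothetical description of one function argument."""
    type_name: str
    by_reference: bool   # True: passed by reference, False: by value


@dataclass
class FunctionEntry:
    """Hypothetical record for a function name."""
    name: str
    return_type: str
    arguments: List[Argument]

    @property
    def arg_count(self) -> int:
        # The number of arguments is derived from the argument list.
        return len(self.arguments)


x = VariableEntry("x", "int", size=4, scope=0, storage_class="auto", offset=0)
swap = FunctionEntry("swap", "void",
                     [Argument("int", True), Argument("int", True)])
```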

Size of Symbol Table
➢ The size of the symbol table can be static or dynamic.
➢ If the size of the symbol table is fixed, it wastes memory for
short programs and is inadequate (too small) for very large programs.
➢ So a dynamic symbol table is more suitable in most
situations.

Operation on Symbol Table
➢ The operations most frequently performed on the symbol
table are:
1. Lookup Operation
2. Insertion Operation
3. Deletion Operation

Operation on Symbol Table
1). Lookup Operation
➢ Retrieving information from the symbol table when it is
needed, e.g. type, address and dimension information
is needed during semantic analysis and code
generation.
➢ In a linear list, the search starts from the position pointed to by
‘available’ and moves up toward the initial entry in the
list until the desired identifier is found.
➢ The data of the record found lies between the
identifier name and the next pointer.

Operation on Symbol Table
2). Insert Operation
➢ Entering values for the attributes of an identifier in the symbol
table; different attributes are filled in during different phases of compilation.

3). Delete Operation
➢ When the scope of an identifier ends, the identifier is
removed from the symbol table.

Data Structure Used for Symbol Table

➢ The data structures suitable for a symbol table are:
1. Linear List (array implementation)
2. Hash Table

1). Linear List
➢ It is one of the easiest approaches.
➢ A single array, or equivalently several arrays, is
used to store names and their associated information.
➢ The identifiers are entered in the order in
which they are encountered.

Linear List
➢ A pointer ‘available’ points to the insertion point, i.e. the point
where data can be stored.
[Figure: a list holding id1, info1, id2, info2, …, idn, infon in order, with ‘available’ pointing just after infon]

Insertion in Linear List
➢ New entries are made as soon as they appear in the
program, e.g.
a: integer
b: real
➢ These identifiers are entered at ‘available’ and
below it, in the order in which they appear.
➢ The compiler should adopt a mechanism to avoid
duplicate entries in the list.

Lookup Operation in Linear List
➢ This operation starts from the position pointed to by
‘available’ and moves up toward the initial entry in the
list until the desired identifier is found.
➢ The data of the record found lies between the
identifier name and the next pointer.

Deletion in Linear List
➢ The identifiers in the current scope lie nearest to the pointer
‘available’. When the current scope ends, those
identifiers are deleted.
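
The insertion, lookup and deletion behaviour described for the linear list can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class LinearSymbolTable, its method names, and the use of a scope-level counter to decide which entries belong to the current scope are all hypothetical choices for illustration. The slides only say that entries near ‘available’ are removed when a scope ends.

```python
class LinearSymbolTable:
    """Hypothetical linear-list symbol table; 'available' is the next free index."""

    def __init__(self):
        self.entries = []      # (name, info, scope_level), in order of appearance
        self.scope_level = 0   # assumed way of tracking the current scope

    @property
    def available(self):
        # The insertion point is just past the last stored entry.
        return len(self.entries)

    def enter_scope(self):
        self.scope_level += 1

    def insert(self, name, info):
        # Avoid duplicates: scan the current scope's entries, which sit
        # together at the end of the list.
        for entry_name, _, level in reversed(self.entries):
            if level != self.scope_level:
                break
            if entry_name == name:
                raise KeyError(f"duplicate declaration of {name!r}")
        self.entries.append((name, info, self.scope_level))

    def lookup(self, name):
        # Start at 'available' and move up toward the initial entry.
        for entry_name, info, _ in reversed(self.entries):
            if entry_name == name:
                return info
        return None

    def exit_scope(self):
        # Delete the entries nearest to 'available' that belong to this scope.
        while self.entries and self.entries[-1][2] == self.scope_level:
            self.entries.pop()
        self.scope_level -= 1


table = LinearSymbolTable()
table.insert("a", "integer")
table.insert("b", "real")
table.enter_scope()
table.insert("a", "real")          # inner declaration shadows the outer 'a'
inner_a = table.lookup("a")
table.exit_scope()
outer_a = table.lookup("a")
```

Because the search runs backwards from ‘available’, the most recent declaration of a name is found first, which is why the inner `a` hides the outer one until its scope ends.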

2). Hash Table
➢ This data structure maps a key into an integer.
➢ A key is something to be searched for or stored.
➢ The hash table implementation maps a lexeme
into an integer, which is used to index the
record for the lexeme.
➢ To map a lexeme into an integer, a hash
function h(s) is applied to it.
➢ Whatever the hash value of the lexeme is, the record
for the lexeme is stored at that index.

Hash Table
➢ There are two parts of this data structure:
1. A hash table
  ➢ It is an array of size ‘m’
  ➢ It is used to index the records
  ➢ It contains pointers to the records of the symbol table
2. Linked lists
  ➢ Used to store the actual records
  ➢ These records are called buckets (loads)
➢ The hash function maps a lexeme into an integer that falls
in the range 0 to m-1

A Hash Table of size 211
[Figure: an array of list headers, indexed by hash value from 0 to 210, with the lists of elements created for the names shown: cp and n at index 9, match at index 20, and last, action and ws at index 32]

Hash Table
➢ The hash function works as follows:
1. First, each character of the string is converted into a
corresponding integer, and a value h is calculated from
them:
S = C1 C2 C3 C4 … Cn
h = C1 + C2 + C3 + C4 + … + Cn
2. Then h is divided by m, i.e. the size of the hash table,
and the remainder is the hash value. So:
h(S) = h % m
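
The two steps above can be written out directly. The sketch below is in Python and makes two assumptions not stated on the slide: each character is converted to an integer with its character code (ord), and the function name hash_lexeme is made up for the example. The default m = 211 is the table size used in the figure.

```python
def hash_lexeme(s: str, m: int = 211) -> int:
    """Additive hash: sum the character codes, then take the remainder mod m."""
    h = sum(ord(c) for c in s)   # step 1: h = C1 + C2 + ... + Cn
    return h % m                 # step 2: h(S) = h % m


index_ab = hash_lexeme("ab")     # 97 + 98 = 195, and 195 % 211 = 195
index_ba = hash_lexeme("ba")     # same characters, same sum, same index
```

Since addition ignores character order, lexemes that are permutations of each other, such as "ab" and "ba", always land on the same index, so a table built on this function has to handle collisions.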

Hash Table
➢ A hash collision may occur, in which more than one lexeme maps to
the same index.
➢ Collisions are resolved with the chaining technique.
➢ The symbol table uses open hashing, in which the size of
the symbol table is not bounded.

Lookup Operation in Hash Table
➢ The complexity of the operation is Θ(1) on average, provided the
chains stay short.
➢ To look up an identifier, the hash function is
applied to the identifier, and the record for the
identifier is found in the list at that index.

Insert Operation in Hash Table
➢ The hash function is applied to the identifier, and the
identifier is stored at the index obtained as the result of the
hash function.

Delete Operation in Hash Table
➢ If the list has only one bucket, set the pointer in the hash table
to null and delete the bucket.
➢ If the list has two buckets and the 2nd is to be deleted, set the
next pointer of the first bucket to null and delete the 2nd bucket.
➢ In general, if the ith bucket is to be deleted, put the pointer to the
(i+1)th bucket into the next field of the (i-1)th bucket and delete the
ith bucket.
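
The lookup, insert and delete operations on a chained hash table can be sketched together. This Python sketch rests on several assumptions: the names Bucket and HashSymbolTable are hypothetical, the hash is the additive character-sum function from the earlier slide with ord as the character-to-integer conversion, and new buckets are placed at the head of their list, which the slides do not specify.

```python
class Bucket:
    """One record in a chain: identifier name, its data, and the next pointer."""

    def __init__(self, name, info, next_bucket=None):
        self.name = name
        self.info = info
        self.next = next_bucket


class HashSymbolTable:
    """Hypothetical open-hashing (chained) symbol table with m list headers."""

    def __init__(self, m=211):
        self.m = m
        self.table = [None] * m          # array of list headers

    def _hash(self, name):
        return sum(ord(c) for c in name) % self.m

    def insert(self, name, info):
        index = self._hash(name)
        # Assumed policy: the new bucket becomes the head of its chain.
        self.table[index] = Bucket(name, info, self.table[index])

    def lookup(self, name):
        bucket = self.table[self._hash(name)]
        while bucket is not None:        # walk the chain at that index
            if bucket.name == name:
                return bucket.info
            bucket = bucket.next
        return None

    def delete(self, name):
        index = self._hash(name)
        prev, bucket = None, self.table[index]
        while bucket is not None:
            if bucket.name == name:
                if prev is None:
                    # First bucket: the header takes over its next pointer
                    # (null when it was the only bucket).
                    self.table[index] = bucket.next
                else:
                    # ith bucket: (i-1)th now points to the (i+1)th.
                    prev.next = bucket.next
                return True
            prev, bucket = bucket, bucket.next
        return False


symbols = HashSymbolTable()
symbols.insert("ab", "integer")
symbols.insert("ba", "real")             # collides with "ab" under this hash
found_ab = symbols.lookup("ab")
deleted = symbols.delete("ab")
```

After the deletion, "ba" is still reachable because only the pointer around the removed bucket was changed.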
