Class Test I (7th Sem. B. Tech - CS & IT), 2010
Sub: Compiler Design
Time: 60 mins. Max. Marks: 10 Date: 11.08.10
(Answer any Four including Q.1)

Q.1. Short type (any Five) [0.5 x 5 = 2.5]
a) What is a compiler and how is it different from an interpreter?
b) Define regular expression.
c) Differentiate DFA and NFA.
d) Differentiate the left sentential form and the right sentential form of a string.
e) What is a cross compiler and what is its advantage?

Q.2. Consider the following while statement: [2.5]
While A > B && A <= 2 * B - 5 do A = A + B
Identify the tokens. Generate the parse tree, intermediate code and optimized code.

Q.3. Show that the following grammar is ambiguous. [2.5]
E → E+E | E*E | (E) | id

Q.4. Using Thompson's construction rules, construct the ε-NFA for the following regular expressions. [2.5]
a) (a | b)*
b) ab (a | b)*
c) (a* | b*)* a
d) (a | b)* a (a | b)+
e) (a | b)+ abb

Q.5. Give an equivalent DFA for the regular expression (a|b)*abb. [2.5]

Good Luck
Solution to
b) Regular Expression: Regular expressions over $\Sigma$ can be defined recursively as follows:
1. Any terminal symbol (i.e. an element of $\Sigma$), $\varepsilon$ and $\emptyset$ are regular expressions.
2. The union of two regular expressions R1 and R2, written as R1 + R2, is also a regular expression.
3. The concatenation of two regular expressions R1 and R2, written as R1 R2, is also a regular expression.
4. The iteration (or closure) of a regular expression R, written as R*, is also a regular expression.
5. If R is a regular expression, then (R) is also a regular expression.
6. Applying rules 1-5 recursively, once or several times, yields a regular expression.

c)
Non-deterministic Finite Automaton (NFA): A non-deterministic finite automaton is a 5-tuple $(Q, \Sigma, \delta, q_0, F)$, where
i) $Q$ is a finite nonempty set of states;
ii) $\Sigma$ is a finite nonempty set of inputs;
iii) $\delta$ is the transition function mapping $Q \times \Sigma$ into $2^Q$, the power set of $Q$ (the set of all subsets of $Q$);
iv) $q_0 \in Q$ is the initial state; and
v) $F \subseteq Q$ is the set of final states.

Deterministic Finite Automaton (DFA): A deterministic finite automaton is a 5-tuple $(Q, \Sigma, \delta, q_0, F)$, where
i) $Q$ is a finite nonempty set of states;
ii) $\Sigma$ is a finite nonempty set of inputs;
iii) $\delta$ is the transition function, which maps $Q \times \Sigma$ into $Q$;
iv) $q_0 \in Q$ is the initial state; and
v) $F \subseteq Q$ is the set of final states.

NFA:
a) In an NFA, when an input is given to a state, it may move to several states (or to none).
b) An NFA cannot be run directly by following a single path; it has to be simulated by tracking a set of states, or converted to a DFA first.

DFA:
a) In a DFA, when an input is given to a state, it moves to exactly one state.
b) A DFA can be implemented directly, for example as a table-driven program.

d) If $\alpha \Rightarrow \beta$ by a step in which the leftmost non-terminal in $\alpha$ is replaced, we write $\alpha \Rightarrow_{lm} \beta$. Every leftmost step has the form $wA\gamma \Rightarrow_{lm} w\delta\gamma$, in which $w$ consists of terminals only and $A \to \delta$ is the production applied. If $\alpha$ derives $\beta$ by a leftmost derivation, we write $\alpha \Rightarrow^{*}_{lm} \beta$. If $S \Rightarrow^{*}_{lm} \alpha$, then we say $\alpha$ is a left sentential form of the given grammar.

Similarly, if $\alpha \Rightarrow \beta$ by a step in which the rightmost non-terminal in $\alpha$ is replaced, we write $\alpha \Rightarrow_{rm} \beta$. Every rightmost step has the form $\gamma A w \Rightarrow_{rm} \gamma\delta w$, in which $w$ consists of terminals only. If $\alpha$ derives $\beta$ by a rightmost derivation, we write $\alpha \Rightarrow^{*}_{rm} \beta$. If $S \Rightarrow^{*}_{rm} \alpha$, then we say $\alpha$ is a right sentential form of the given grammar.

For example, consider the grammar
S → i C t S | i C t S e S | a
C → b

The string w = i b t i b t a e a can be derived using a leftmost derivation as follows:
$S \Rightarrow_{lm} iCtS \Rightarrow_{lm} ibtS \Rightarrow_{lm} ibtiCtSeS \Rightarrow_{lm} ibtibtSeS \Rightarrow_{lm} ibtibtaeS \Rightarrow_{lm} ibtibtaea$

It has the form $S \Rightarrow_{lm} \alpha_1 \Rightarrow_{lm} \alpha_2 \Rightarrow_{lm} \dots \Rightarrow_{lm} \alpha_n = w$. Each $\alpha_i$, $1 \le i \le n$, is a left sentential form of the given grammar.

Similarly, the rightmost derivation of the string is as follows:
$S \Rightarrow_{rm} iCtS \Rightarrow_{rm} iCtiCtSeS \Rightarrow_{rm} iCtiCtSea \Rightarrow_{rm} iCtiCtaea \Rightarrow_{rm} iCtibtaea \Rightarrow_{rm} ibtibtaea$

It has the form $S \Rightarrow_{rm} \alpha_1 \Rightarrow_{rm} \alpha_2 \Rightarrow_{rm} \dots \Rightarrow_{rm} \alpha_n = w$. Each $\alpha_i$, $1 \le i \le n$, is a right sentential form of the given grammar. The rightmost derivation is also known as the canonical derivation.

e) A compiler is characterized by three languages: its source language, its object language, and the language in which it is written. These languages may all be quite different. A compiler may run on one machine and produce object code for another machine. Such a compiler is often called a cross-compiler.
In what follows, $C^{XY}_{Z}$ denotes a compiler that translates source language X into object language Y and is itself written in language Z.

Suppose a new language L is to be made available on two different machines A and B. As a first step we may write, for machine A, a small compiler $C^{SA}_{A}$ that translates a subset S of language L into the machine or assembly code of A. Then we write a compiler $C^{LA}_{S}$ for the full language L in the subset S. This compiler, when run through $C^{SA}_{A}$, becomes $C^{LA}_{A}$, a compiler for L that runs on machine A and produces code for A:

$C^{LA}_{S} \xrightarrow{\; C^{SA}_{A} \;} C^{LA}_{A}$

Now suppose we want to have another compiler for L to run on machine B and to produce code for B. If machine B is not that different from machine A, then with little modification we can convert $C^{LA}_{S}$ into a compiler $C^{LB}_{L}$, which produces object code for B. Using $C^{LB}_{L}$ to produce $C^{LB}_{B}$, a compiler for L on B, is then a two-step process. We first run $C^{LB}_{L}$ through $C^{LA}_{A}$ to produce $C^{LB}_{A}$, a cross-compiler for L which runs on machine A but produces code for machine B. Then we run $C^{LB}_{L}$ through this cross-compiler to produce the desired compiler $C^{LB}_{B}$ for L that runs on machine B and produces object code for B:

$C^{LB}_{L} \xrightarrow{\; C^{LA}_{A} \;} C^{LB}_{A}$

$C^{LB}_{L} \xrightarrow{\; C^{LB}_{A} \;} C^{LB}_{B}$
Hence a cross-compiler helps us to create a compiler for another machine without starting from scratch, which greatly reduces the amount of code that has to be written for the new compiler.

2) The given while statement is
while A > B & A <= 2*B - 5 do A := A + B;

The tokens present in the given statement are:
while [id, n1] > [id, n2] & [id, n1] <= [const, n3] * [id, n2] - [const, n4] do [id, n1] := [id, n1] + [id, n2] ;

Here n1, n2, n3 and n4 stand for pointers to the symbol table entries for A, B, 2, and 5, respectively. The parse tree for the given statement is given below.
statement
  while-statement
    while
    condition
      condition
        relation
          exp: id(A)
          relop: >
          exp: id(B)
      &
      condition
        relation
          exp: id(A)
          relop: <=
          exp
            exp
              exp: const(2)
              *
              exp: id(B)
            -
            exp: const(5)
    do
    statement
      assignment
        location: id(A)
        :=
        exp
          exp: id(A)
          +
          exp: id(B)
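The token list for the while statement can be reproduced with a small scanner. The following is a minimal sketch in Python, not the method used in the solution: the token class names, the `TOKEN_SPEC` table and the `tokenize` function are all assumptions made for illustration. Identifiers and constants receive a symbol-table index on first sight, which plays the role of the pointers n1 to n4.

```python
import re

# Hypothetical token classes; names are illustrative only.
TOKEN_SPEC = [
    ("keyword", r"\b(?:while|do)\b"),
    ("id",      r"[A-Za-z_][A-Za-z0-9_]*"),
    ("const",   r"\d+"),
    ("op",      r":=|<=|>=|[<>&+\-*;]"),
    ("skip",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (tokens, symtab) for a statement in the toy language."""
    symtab = {}   # lexeme -> symbol-table index, assigned on first sight
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "skip":
            continue
        if kind in ("id", "const"):
            # Identifiers and constants become (class, pointer) pairs.
            index = symtab.setdefault(lexeme, len(symtab) + 1)
            tokens.append((kind, index))
        else:
            # Keywords and operators stand for themselves.
            tokens.append((lexeme,))
    return tokens, symtab


tokens, symtab = tokenize("while A > B & A <= 2*B - 5 do A := A + B;")
```

With this input, A, B, 2 and 5 are entered in the symbol table in that order, so their indices correspond to n1, n2, n3 and n4.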
The intermediate code for the given statement is as follows:
L1: if A > B goto L2
    goto L3
L2: T1 := 2 * B
    T2 := T1 - 5
    if A <= T2 goto L4
    goto L3
L4: A := A + B
    goto L1
L3:

To improve the code we can apply local transformations. The intermediate code contains two instances of jumps over jumps. The first one is
    if A > B goto L2
    goto L3
L2:
This sequence can be replaced by the single statement
    if A <= B goto L3
The sequence around L4 is handled in the same way and becomes if A > T2 goto L3. After both replacements the labels L2 and L4 are no longer referenced. Removing them and renaming the exit label L3 to L2 gives the optimized code:
L1: if A <= B goto L2
    T1 := 2 * B
    T2 := T1 - 5
    if A > T2 goto L2
    A := A + B
    goto L1
L2:
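The jumps-over-jumps replacement can be sketched as a small peephole pass. This is an illustrative Python sketch under assumptions: the tuple encoding of instructions and the function names `jumps_over_jumps` and `drop_unused_labels` are invented here, and the exit label keeps its original name L3 instead of being renamed to L2.

```python
# Assumed instruction encoding (hypothetical):
#   ("label", name), ("if", lhs, relop, rhs, target),
#   ("goto", target), ("assign", dest, expression_text)
NEGATE = {">": "<=", "<=": ">", "<": ">=", ">=": "<", "==": "!=", "!=": "=="}


def jumps_over_jumps(code):
    """Rewrite `if c goto X; goto Y; X:` as `if not-c goto Y; X:`."""
    out = []
    i = 0
    while i < len(code):
        a = code[i]
        b = code[i + 1] if i + 1 < len(code) else None
        c = code[i + 2] if i + 2 < len(code) else None
        if (a[0] == "if" and b is not None and b[0] == "goto"
                and c == ("label", a[4])):
            out.append(("if", a[1], NEGATE[a[2]], a[3], b[1]))
            i += 2      # the goto is absorbed; the label is copied next round
        else:
            out.append(a)
            i += 1
    return out


def drop_unused_labels(code):
    """Remove labels that no jump refers to any more."""
    used = {ins[-1] for ins in code if ins[0] in ("if", "goto")}
    return [ins for ins in code if ins[0] != "label" or ins[1] in used]


original = [
    ("label", "L1"),
    ("if", "A", ">", "B", "L2"),
    ("goto", "L3"),
    ("label", "L2"),
    ("assign", "T1", "2 * B"),
    ("assign", "T2", "T1 - 5"),
    ("if", "A", "<=", "T2", "L4"),
    ("goto", "L3"),
    ("label", "L4"),
    ("assign", "A", "A + B"),
    ("goto", "L1"),
    ("label", "L3"),
]
optimized = drop_unused_labels(jumps_over_jumps(original))
```

The result has the same eight-line shape as the optimized code above: both conditional jumps are inverted and both unconditional jumps to the exit disappear.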
3) The given grammar is
E → E + E | E * E | (E) | id

A grammar is said to be ambiguous if it produces more than one parse tree for some sentence. Equivalently, a grammar is ambiguous if some string has more than one leftmost derivation (or more than one rightmost derivation).

Consider the sentence id + id * id. It has two leftmost derivations:
E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id
and
E ⇒ E * E ⇒ E + E * E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id

These two derivations correspond to two different parse trees, which are shown below.

Parse tree for the first derivation:
E
  E
    id
  +
  E
    E
      id
    *
    E
      id

Parse tree for the second derivation:
E
  E
    E
      id
    +
    E
      id
  *
  E
    id

Hence the grammar is ambiguous.
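The ambiguity can also be checked mechanically by counting the parse trees of a token sequence. The sketch below is an assumption-laden illustration in Python, not part of the original solution: `count_parse_trees` is a hypothetical helper that counts the trees deriving each token span from E by trying every operator position as the root of the tree.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E+E | E*E | (E) | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0        # E -> id
        total = 0
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):                 # E -> E+E or E -> E*E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


trees = count_parse_trees(["id", "+", "id", "*", "id"])
```

A count greater than one for any sentence is enough to establish ambiguity; for id + id * id the two trees are the ones rooted at + and at *.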
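4) a) The regular expression is (a | b)*. The ε-NFA for the given regular expression has states q0 to q7, with start state q0 and final state q7, and the following transitions:
q0 --ε--> q1, q0 --ε--> q7
q1 --ε--> q2, q1 --ε--> q4
q2 --a--> q3
q4 --b--> q5
q3 --ε--> q6, q5 --ε--> q6
q6 --ε--> q1, q6 --ε--> q7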
b) The regular expression is ab(a | b)*. The ε-NFA for the given regular expression is as below:
[Figure: ε-NFA for ab(a | b)* with states q0 to q9]
c) The regular expression is (a* | b*)*a. The ε-NFA for the given regular expression is as below:
[Figure: ε-NFA for (a* | b*)*a with states q0 to q12]
d) The regular expression is (a | b)*a(a | b)+. The ε-NFA for the given regular expression is as below:
[Figure: ε-NFA for (a | b)*a(a | b)+ with states q0 to q20]
e) The regular expression is (a | b)+abb. The ε-NFA for the given regular expression is as below:
[Figure: ε-NFA for (a | b)+abb]
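Thompson's construction for question 4 can be sketched in code. The following Python sketch rests on several assumptions: regular expressions are given as nested tuples (`"sym"`, `"cat"`, `"alt"`, `"star"`), the names `thompson` and `accepts` are hypothetical, states are numbered in creation order (so the numbering will not match the figures), and concatenation links the two parts with an ε-edge instead of merging states. R+ can be written as R followed by R*.

```python
from itertools import count


def thompson(ast):
    """Build an epsilon-NFA (start, accept, edges); label None means epsilon."""
    edges = []
    ids = count()

    def build(node):
        kind = node[0]
        if kind == "sym":                      # a single input symbol
            s, f = next(ids), next(ids)
            edges.append((s, node[1], f))
            return s, f
        if kind == "cat":                      # R1 R2
            s1, f1 = build(node[1])
            s2, f2 = build(node[2])
            edges.append((f1, None, s2))
            return s1, f2
        s, f = next(ids), next(ids)            # new start and accept states
        if kind == "alt":                      # R1 | R2
            for part in node[1:]:
                ps, pf = build(part)
                edges.extend([(s, None, ps), (pf, None, f)])
            return s, f
        if kind == "star":                     # R*
            ps, pf = build(node[1])
            edges.extend([(s, None, ps), (s, None, f),
                          (pf, None, ps), (pf, None, f)])
            return s, f
        raise ValueError(f"unknown node {kind!r}")

    start, accept = build(ast)
    return start, accept, edges


def accepts(nfa, text):
    """Simulate the epsilon-NFA by tracking the set of reachable states."""
    start, accept, edges = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        moved = {dst for src, label, dst in edges
                 if src in current and label == ch}
        current = closure(moved)
    return accept in current


a, b = ("sym", "a"), ("sym", "b")
a_or_b_star = ("star", ("alt", a, b))                       # (a | b)*
abb_nfa = thompson(("cat", ("cat", ("cat", a_or_b_star, a), b), b))
```

Here `abb_nfa` is the automaton for (a | b)*abb; the other expressions of question 4 can be built the same way from the four node kinds.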
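5) The given regular expression is (a | b)*abb. The ε-NFA for the given regular expression has states q0 to q10, with start state q0 and final state q10, and the following transitions:
q0 --ε--> q1, q0 --ε--> q7
q1 --ε--> q2, q1 --ε--> q4
q2 --a--> q3
q4 --b--> q5
q3 --ε--> q6, q5 --ε--> q6
q6 --ε--> q1, q6 --ε--> q7
q7 --a--> q8
q8 --b--> q9
q9 --b--> q10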
The equivalent DFA is obtained by the subset construction.

ε-closure({q0}) = {q0, q1, q2, q4, q7} = A. This is the start state of the DFA; every state in A can be reached from q0 without consuming any input.

From A:
On input a, the states in A move to {q3, q8}. ε-closure({q3, q8}) = {q1, q2, q3, q4, q6, q7, q8} = B.
On input b, the states in A move to {q5}. ε-closure({q5}) = {q1, q2, q4, q5, q6, q7} = C.

From B:
On input a, we get {q3, q8}. ε-closure({q3, q8}) = B.
On input b, we get {q5, q9}. ε-closure({q5, q9}) = {q1, q2, q4, q5, q6, q7, q9} = D.

From C:
On input a, we get {q3, q8}. ε-closure({q3, q8}) = B.
On input b, we get {q5}. ε-closure({q5}) = C.

From D:
On input a, we get {q3, q8}. ε-closure({q3, q8}) = B.
On input b, we get {q5, q10}. ε-closure({q5, q10}) = {q1, q2, q4, q5, q6, q7, q10} = E.

From E:
On input a, we get {q3, q8}. ε-closure({q3, q8}) = B.
On input b, we get {q5}. ε-closure({q5}) = C.

E contains the final NFA state q10, so E is the only final state of the DFA. The transition table for the DFA is shown below.
State    Input a    Input b
A        B          C
B        B          D
C        B          C
D        B          E
E*       B          C

(A is the start state; * marks the final state.)
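The subset construction above can be replayed in code. This Python sketch makes some assumptions: the NFA transitions in `NFA` are the ones implied by the ε-closures computed in the text (states written as integers 0 to 10), and the names `eps_closure`, `move` and `subset_construction` are illustrative. DFA states are named A, B, C, ... in the order in which they are discovered.

```python
EPS = None  # label used for epsilon moves

# Assumed epsilon-NFA for (a | b)*abb; keys are (state, label).
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4},
    (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}


def eps_closure(states):
    """All states reachable from `states` using epsilon moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in NFA.get((q, EPS), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def move(states, symbol):
    """States reachable from `states` on one occurrence of `symbol`."""
    return {r for q in states for r in NFA.get((q, symbol), ())}


def subset_construction(start=0, accept=10, alphabet="ab"):
    start_set = eps_closure({start})
    names = {start_set: "A"}          # set of NFA states -> DFA state name
    table, work = {}, [start_set]
    while work:
        current = work.pop(0)
        for symbol in alphabet:
            target = eps_closure(move(current, symbol))
            if target not in names:
                names[target] = chr(ord("A") + len(names))
                work.append(target)
            table[(names[current], symbol)] = names[target]
    finals = {name for subset, name in names.items() if accept in subset}
    return table, finals, names


table, finals, names = subset_construction()
```

Processing the work list in first-in, first-out order gives the same state names as in the table above.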
Now let us minimize the above DFA to get an equivalent minimized DFA.

The 0-equivalence classes are the set of all final states and the set of all non-final states: $Q_1^0 = \{E\}$, $Q_2^0 = \{A, B, C, D\}$. Hence $\pi_0 = \{Q_1^0, Q_2^0\}$.

For the 1-equivalence classes, D is separated from A, B and C because on input b it moves to the final state E while the others stay in $Q_2^0$. So $Q_1^1 = \{E\}$, $Q_2^1 = \{A, B, C\}$, $Q_3^1 = \{D\}$, and $\pi_1 = \{Q_1^1, Q_2^1, Q_3^1\}$.

For the 2-equivalence classes, B is separated from A and C because on input b it moves to D while A and C move to C. So $Q_1^2 = \{E\}$, $Q_2^2 = \{A, C\}$, $Q_3^2 = \{B\}$, $Q_4^2 = \{D\}$, and $\pi_2 = \{Q_1^2, Q_2^2, Q_3^2, Q_4^2\}$.

For the 3-equivalence classes we get $Q_1^3 = \{E\}$, $Q_2^3 = \{A, C\}$, $Q_3^3 = \{B\}$, $Q_4^3 = \{D\}$, and $\pi_3 = \{Q_1^3, Q_2^3, Q_3^3, Q_4^3\}$.

We can see that $\pi_2 = \pi_3$. Hence $\pi_2$ is the set of equivalence classes, and the states A and C can be merged into a single state, which we call A. The transition table for the minimized DFA is given below.

State    Input a    Input b
A        B          A
B        B          D
D        B          E
E*       B          A

(A is the start state; * marks the final state.)
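The sequence of partitions can be reproduced by repeated refinement. The Python sketch below is illustrative only: `DFA` restates the five-state transition table from the text as a dictionary, and `minimize` is a hypothetical helper that splits each block by the blocks its members move to, stopping when a round produces no new block.

```python
# Transition table of the five-state DFA from the text.
DFA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
FINALS = {"E"}


def minimize(dfa, finals, alphabet="ab"):
    """Return the final partition of the DFA states into equivalence classes."""
    # 0-equivalence: final states versus non-final states.
    partition = [set(finals), set(dfa) - set(finals)]
    while True:
        def block_of(state):
            return next(i for i, blk in enumerate(partition) if state in blk)

        refined = []
        for blk in partition:
            groups = {}
            for state in sorted(blk):
                # States stay together only if they agree, symbol by symbol,
                # on the block reached.
                key = tuple(block_of(dfa[state][x]) for x in alphabet)
                groups.setdefault(key, set()).add(state)
            refined.extend(groups.values())
        if len(refined) == len(partition):   # no block was split: done
            return refined
        partition = refined


classes = minimize(DFA, FINALS)
```

Each pass of the loop corresponds to moving from one level of k-equivalence to the next, so the returned partition plays the role of the final set of equivalence classes.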