UNIT II
GRAMMARS
B → 1 | 1S | 0BB. For the string 00110101, let us find its leftmost and
rightmost derivations and its derivation tree.
Solution:
(a)Leftmost derivation:
S ⇒ 0B
⇒ 00BB (B→0BB)
⇒ 001B (B→1)
⇒ 0011S (B→1S)
⇒ 00110B (S→0B)
⇒ 001101S (B→1S)
⇒ 0011010B (S→0B)
⇒ 00110101 (B→1)
(b)Rightmost derivation:
S ⇒ 0B
⇒ 00BB (B→0BB)
⇒ 00B1S (B→1S)
⇒ 00B10B (S→0B)
⇒ 00B101S (B→1S)
⇒ 00B1010B (S→0B)
⇒ 00B10101 (B→1)
⇒ 00110101 (B→1)
(c)Derivation tree:
Type 0 grammars:
Type 0 grammars are those where the rules are of the form
α→β
where α, β ∈ (Σ ∪ V ) ∗
Example:
C# → D# | E
aD → Da
aE → Ea
$E → ε
Type 1 Grammars:
α→β
Example:
S → aSBC | aBC
CB → HB
HB → HC
HC → BC
aB → ab
bB → bb
bC → bc
cC → cc
L(G) = {a^n b^n c^n | n ≥ 1}
The shortest string this grammar derives is abc, so n starts at 1.
Type 2 Grammars:
A→β
where A ∈ V and β ∈ (Σ ∪ V ) ∗ .
Example:
S → ε | 0S1
L(G) = {0^n 1^n | n ≥ 0}
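As a small illustration of how the Type 2 grammar S → ε | 0S1 produces its language, the Python sketch below applies S → 0S1 a chosen number of times and then S → ε. The function names `derive` and `in_language` are hypothetical.

```python
def derive(n):
    """Apply S -> 0S1 exactly n times, then S -> ε, and return the result."""
    form = "S"
    for _ in range(n):
        form = form.replace("S", "0S1")   # the form contains exactly one S
    return form.replace("S", "")          # S -> ε

def in_language(w):
    """Membership test for {0^n 1^n | n >= 0}."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w == "0" * half + "1" * half

if __name__ == "__main__":
    for n in range(4):
        print(n, repr(derive(n)), in_language(derive(n)))
```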
Type 3 Grammars:
A → aB or A → a
Example:
S → 1S | 0A
A → ε | 1A | 0S
S → aSb | aAb
A → bAa
A → ba
Sample derivations:
S ⇒ aSb ⇒ aaAbb ⇒ aababb
S ⇒ aAb ⇒ abab
S ⇒ aSb ⇒ aaSbb ⇒ aaaAbbb ⇒ aaababbb
S ⇒ aAb ⇒ abAab ⇒ abbaab
L = {a^n b^m a^m b^n | n, m ≥ 1}, that is, strings over Σ = {a, b} that start
with a, end with b and contain the substring ba in the middle.
Let us construct a CFG over {a, b} generating a language consisting of strings
with an equal number of a's and b's.
S → aSb | ab | bSa | ba
Next, let us discuss whether the language {a^m b^m c^m | m ≥ 0} is context
free or not.
Solution:
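Assume L is context free and let p be the constant of the pumping lemma.
Let z = a^p b^p c^p ∈ L
z = uvwxy with |vx| ≥ 1, and uv^i w x^i y ∈ L for every i ≥ 0.
Consider the split of z = a^p b^p c^p in which v and x lie inside the block of b's:
u = a^p
vx = b^(p−m), where |vx| ≥ 1 (so m < p)
y = c^p
uv^i w x^i y = uvwx (vx)^(i−1) y
= a^p b^p (b^(p−m))^(i−1) c^p
Let i = 0.
uwy = a^p b^m c^p ∉ L, since m < p.
Since |vwx| ≤ p, in any other split v and x touch at most two of the three
letters, so pumping fails in the same way. Hence L is not context free.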
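The pumping argument can be checked by brute force for a small p. The Python sketch below assumes the usual pumping-lemma conditions |vwx| ≤ p and |vx| ≥ 1 (the first one is not written out in the text) and tries every split of z = a^p b^p c^p; the names `in_language` and `survives_pumping` are illustrative.

```python
def in_language(s):
    """True when s has the form a^m b^m c^m for some m >= 0."""
    m, rest = divmod(len(s), 3)
    return rest == 0 and s == "a" * m + "b" * m + "c" * m

def survives_pumping(z, p, exponents=(0, 2)):
    """Return True if some split z = uvwxy with |vwx| <= p and |vx| >= 1
    keeps u v^t w x^t y inside the language for every exponent t tried."""
    n = len(z)
    for i in range(n + 1):                          # u = z[:i]
        for l in range(i + 1, min(n, i + p) + 1):   # vwx = z[i:l], 1 <= |vwx| <= p
            for j in range(i, l + 1):               # v = z[i:j]
                for k in range(j, l + 1):           # w = z[j:k], x = z[k:l]
                    if (j - i) + (l - k) == 0:
                        continue                    # |vx| >= 1 is required
                    u, v, w, x, y = z[:i], z[i:j], z[j:k], z[k:l], z[l:]
                    if all(in_language(u + v * t + w + x * t + y)
                           for t in exponents):
                        return True
    return False

if __name__ == "__main__":
    p = 4
    z = "a" * p + "b" * p + "c" * p
    print(survives_pumping(z, p))   # no split can be pumped
```

For p = 4 no split survives even the exponent i = 0, which is the case used in the solution above. This is only a finite check for one value of p, not a replacement for the proof.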
Given:
Solution:
S → AC | BC | DE | DF
A → 0 | 0A | 0A1
B → 1 | B1 | 0B1
C → 2 | 2C
D → 0| 0D
E → 1 | 1E | 1E2
F → 2 | F2 | 1F2
(1)Given:
Solution:
CFG:
S → aSb
S → aC|a|bD|b
C → aC|a
D → bD|b
(2)Given:
Solution:
CFG:
S →aSa
S →b.
Theorem:
Proof:
(i)Let L be regular.
(ii)Given a DFA (Finite Automata) for L, add a stack, but do not use the stack.
δ(p,a,z) = {(q,λ)}
(v)Therefore L is context-free
Theorem:
Let L1 and L2 be CFLs. Then L1∪L2 is also a CFL.
Proof:
(i)Let L1 have grammar (V1, T1, P1, S1) and let L2 have grammar (V2, T2, P2, S2)
where
•V3 = V1 ∪ V2 ∪ {S3}
•T3 = T1 ∪ T2
•P3 = P1 ∪ P2 ∪ {S3 → S1 | S2}
Proof:
(i)Let L1 have grammar (V1, T1, P1, S1) and L2 have grammar (V2, T2, P2, S2)
V = V1 ∪ V2 ∪ {S}
T = T1 ∪ T2
P = P1 ∪ P2 ∪ {S→S1S2}
S = start symbol
Theorem:
where
V = V1 ∪ {S}
T = T1
P = P1 ∪ {S → ε, S → SS1}
S = start symbol
(iii)Therefore, L is a CFL.
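The closure constructions for union, concatenation and star can be written down directly. The Python sketch below represents a grammar as a tuple (V, T, P, S) with productions as (lhs, rhs) pairs; it assumes that V1 and V2 are disjoint and that the new start symbol "S" is not already in use. The union rule S → S1 | S2 is the standard construction, and the helper `language`, which enumerates short words so that the result can be inspected, is a hypothetical aid with a heuristic bound on sentential-form length.

```python
def union(g1, g2, start="S"):
    (v1, t1, p1, s1), (v2, t2, p2, s2) = g1, g2
    return (v1 | v2 | {start}, t1 | t2,
            p1 + p2 + [(start, (s1,)), (start, (s2,))], start)

def concatenation(g1, g2, start="S"):
    (v1, t1, p1, s1), (v2, t2, p2, s2) = g1, g2
    return (v1 | v2 | {start}, t1 | t2, p1 + p2 + [(start, (s1, s2))], start)

def star(g1, start="S"):
    v1, t1, p1, s1 = g1
    # S -> ε and S -> S S1, as in the proof above
    return (v1 | {start}, set(t1), p1 + [(start, ()), (start, (start, s1))], start)

def language(grammar, max_len):
    """Enumerate the words of length <= max_len by leftmost derivations.
    Forms longer than 2*max_len + 2 symbols are pruned (a heuristic bound)."""
    variables, _, productions, start = grammar
    seen, frontier, words = {(start,)}, [(start,)], set()
    while frontier:
        form = frontier.pop()
        idx = next((i for i, s in enumerate(form) if s in variables), None)
        if idx is None:
            words.add("".join(form))
            continue
        for lhs, rhs in productions:
            if lhs != form[idx]:
                continue
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in variables)
            if (terminals <= max_len and len(new) <= 2 * max_len + 2
                    and new not in seen):
                seen.add(new)
                frontier.append(new)
    return words

# Example grammars (illustrative): G1 generates a^n b^n (n >= 1), G2 generates c^n (n >= 1).
G1 = ({"S1"}, {"a", "b"}, [("S1", ("a", "S1", "b")), ("S1", ("a", "b"))], "S1")
G2 = ({"S2"}, {"c"}, [("S2", ("c", "S2")), ("S2", ("c",))], "S2")

if __name__ == "__main__":
    print(sorted(language(union(G1, G2), 4)))
    print(sorted(language(concatenation(G1, G2), 4)))
    print(sorted(language(star(G1), 4)))
```

Each construction only adds a fresh start symbol and one or two productions, so the result is again a context-free grammar.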
Theorem:
Proof:
(ii)Need to show:
where M = (Q, Σ, Γ, δ, S, F)
(iii)Idea:
Construct a PDA, M that operates in the same way as M1 except that it also keeps
track of the change in state in M2 caused by reading the same input.
(iv)Construction:
Q = Q1 × Q2
Σ = Σ1∪Σ2
Γ = Γ1
S = (S1, S2)
F = F1 × F2
- for each transition ((q1, a, β), (p1, γ)) ∈ δ1 and for each state q2 ∈ Q2 add to δ the
transition (((q1, q2), a, β), ((p1, δ2(q2, a)), γ))
- for each transition ((q1, λ, β), (p1, γ)) ∈ δ1 and for each state q2 ∈ Q2 add to δ the
transition (((q1, q2), λ, β), ((p1, q2), γ))
•The fanout of G, φ(G), is the largest number of symbols on the RHS of any rule
in R.
•The height of a parse tree is defined as the length of the longest path from the
root to some leaf.
Let us prove that if ‘w’ is a string of a language then there is a parse tree with
yield ‘w’, and conversely that if A ⇒* w then ‘w’ is a string of
the language L defined by a CFG.
Theorem:
Proof:
Step 1:
We prove that A ⇒* α if and only if there is an A-tree which derives α. Once this
is proved, the theorem follows by taking A = S.
Let α be the yield of an A-tree T. We prove that A ⇒* α by induction on the number
of internal vertices in T.
When the tree has only one internal vertex, the remaining vertices are leaves and
are the sons of the root.
By the definition of derivation tree (iv), A → A1 A2 … Am = α is a production in G,
i.e. A ⇒ α.
This is the basis step for induction (k = 1). Now assume the result is true for k−1
internal vertices (k > 1).
Let T be an A-tree with k internal vertices (k≥2). Let v1, v2,..... vm be the sons of
the root in the left-to-right ordering.
Let their labels be x1, x2, …, xm. By the definition of derivation tree (iv),
A → x1 x2 … xm is one of the productions in P. Therefore:
A ⇒ x1 x2 … xm
By the left-to-right ordering of leaves, α can be written as α1 α2 … αm, where
αi is obtained by concatenating the labels of the leaves that are descendants of vertex vi.
Step 2:
When A ⇒ α, A → α is a production in P. If α = x1 x2 … xm, the A-tree with yield α
is the basis for induction: its root is labelled A and its sons are the leaves x1, x2, …, xm.
Assume the result for derivations in at most k steps. Let A ⇒* α in k+1 steps; we can split this as
A ⇒ x1 x2 … xm ⇒* α, where A → x1 x2 … xm is a production in P.
Given:
S→AB | aB
A→aab | ε
B → bbA
Solution:
Step 1:
The null production A → ε is eliminated and the resultant productions are:
S → AB | aB | B
A → aab
B → bbA | bb
Eliminating the unit production S → B then gives:
S → AB | aB | bbA | bb
A → aab
B → bbA | bb
Step 2:
Let G1 = (N', {a, b}, S, P'), where P' and N' are constructed as follows:
S→CaB|CbCbA|CbCb
A→CaCaCb
B→CbCbA|CbCb
Step 3:
Let G2 = (N", {a, b}, S, P") where P" and N" are constructed as follows:
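S → CbCbA is replaced by S → CbD1, D1 → CbA
A → CaCaCb is replaced by A → CaD2, D2 → CaCb
B → CbCbA is replaced by B → CbD1, D1 → CbA
The resulting grammar in CNF is:
S → AB | CaB | CbCb | CbD1
A → CaD2
B → CbCb | CbD1
D1 → CbA
D2 → CaCb
Ca → a
Cb → b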
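The result of the conversion can be checked mechanically. The Python sketch below assumes the final grammar with A → CaD2 and D2 → CaCb (the split of A → CaCaCb from Step 2), verifies that every production has the CNF shape, and uses the standard CYK algorithm to test membership. The names `CNF`, `is_cnf` and `cyk` are illustrative.

```python
# Final CNF grammar; D2 -> Ca Cb comes from splitting A -> Ca Ca Cb.
CNF = {
    "S": [("A", "B"), ("Ca", "B"), ("Cb", "Cb"), ("Cb", "D1")],
    "A": [("Ca", "D2")],
    "B": [("Cb", "Cb"), ("Cb", "D1")],
    "D1": [("Cb", "A")],
    "D2": [("Ca", "Cb")],
    "Ca": [("a",)],
    "Cb": [("b",)],
}

def is_cnf(grammar):
    """Every body must be two variables or a single terminal."""
    for bodies in grammar.values():
        for body in bodies:
            two_vars = len(body) == 2 and all(s in grammar for s in body)
            one_term = len(body) == 1 and body[0] not in grammar
            if not (two_vars or one_term):
                return False
    return True

def cyk(grammar, start, w):
    """CYK membership test for a grammar in CNF (the empty string is rejected)."""
    n = len(w)
    if n == 0:
        return False
    # table[i][length] = set of variables that derive w[i:i+length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(w):
        for lhs, bodies in grammar.items():
            if (ch,) in bodies:
                table[i][1].add(lhs)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for lhs, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[i][split]
                                and body[1] in table[i + split][length - split]):
                            table[i][length].add(lhs)
    return start in table[0][n]

if __name__ == "__main__":
    print(is_cnf(CNF))
    for w in ["bb", "abb", "aabbb", "bbaab", "ab"]:
        print(w, cyk(CNF, "S", w))
```

The original grammar S → AB | aB, A → aab | ε, B → bbA generates exactly six strings (bb, bbaab, abb, abbaab, aabbb, aabbbaab), so comparing `cyk` against that list is a direct check that the conversion preserved the language.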
2.5 Ambiguity
(i)S → i C t S
(ii)S → i C t S e S
(iii)S → a
(iv)C → b
where i, t, and e stand for if, then, and else, and C and S for “conditional”
and “statement” respectively.
(4)Remove ambiguity if any and prove that both grammars produce the
same language.
Solution :
w = ibtibtaea
Leftmost derivation 1:
S ⇒ i C t S (S → iCtS)
⇒ i b t S (C → b)
⇒ i b t i C t S e S (S → iCtSeS)
⇒ i b t i b t S e S (C → b)
⇒ i b t i b t a e S (S → a)
⇒ i b t i b t a e a (S → a)
Leftmost derivation 2:
S⇒ i C t S e S
⇒ i b t S e S (C→b)
⇒ i b t i C t S e S (S→iCtS)
⇒ i b t i b t S e S (C →b)
⇒ i b t i b t a e S (S→a)
⇒ i b t i b t a e a (S→a)
For the string w= i b t i b t a e a, the given grammar has two leftmost derivations.
Therefore it is ambiguous.
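Ambiguity for a particular string can also be confirmed by counting its parse trees, since each parse tree corresponds to exactly one leftmost derivation. The Python sketch below is a memoised counter that assumes the grammar has no ε-productions and no unit productions, which holds for S → iCtS | iCtSeS | a, C → b. The name `count_parse_trees` is illustrative.

```python
from functools import lru_cache

# The if-then-else grammar; symbols without an entry are terminals.
GRAMMAR = {
    "S": [("i", "C", "t", "S"), ("i", "C", "t", "S", "e", "S"), ("a",)],
    "C": [("b",)],
}

def count_parse_trees(grammar, start, w):
    """Count the parse trees of w rooted at `start`.
    Assumes no ε-productions and no unit productions, so every symbol
    covers at least one input position and the recursion terminates."""

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        if sym not in grammar:                       # terminal symbol
            return 1 if j == i + 1 and w[i] == sym else 0
        return sum(count_sequence(body, i, j) for body in grammar[sym])

    @lru_cache(maxsize=None)
    def count_sequence(body, i, j):
        if not body:
            return 1 if i == j else 0
        first, rest = body[0], body[1:]
        total = 0
        # leave at least one position for each remaining symbol
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_sequence(rest, k, j)
        return total

    return count_symbol(start, 0, len(w))

if __name__ == "__main__":
    print(count_parse_trees(GRAMMAR, "S", "ibtibtaea"))
```

A count greater than one means the string has more than one leftmost derivation. Because the input may be any sequence, the same counter also works on token tuples such as ("id", "+", "id", "+", "id") for the grammar E → E + E | id.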
Solution:
The given grammar is ambiguous.
The sentence id+id+id can be derived from more than one leftmost derivation or
rightmost derivation or parse tree.
Leftmost derivations:
(i)E ⇒ E + E
⇒ id + E (E→id)
⇒ id + E + E (E→E + E)
⇒ id + id + E (E→id)
⇒ id + id + id (E→id)
(ii)E ⇒ E + E
⇒ E + E + E (E→E + E)
⇒ id + E + E (E→id)
⇒ id + id + E (E→id)
⇒ id + id + id (E→id)
Let us see that the grammar S → a | Sa | bSS | SSb | SbS is ambiguous.
Given:
Let w = aabbaa
Solution:
Left most derivations
S⇒SbS
⇒SabS (S → Sa)
⇒aabS (S → a)
⇒aabbSS (S → bSS)
⇒aabbaS (S → a)
⇒aabbaa(S → a)
S⇒SbS
⇒SSbbS (S → SSb)
⇒aSbbS (S → a)
⇒aabbS (S → a)
⇒aabbSa (S → Sa)
⇒aabbaa (S → a)
Since the string w = aabbaa has two leftmost derivations, the grammar is ambiguous.
Let us prove that the grammar E → E + E | E * E | (E) | a is ambiguous.
Given:
Solution:
Leftmost derivations
(1)E ⇒ (E) (E → (E))
⇒ (E + E) (E → E+E)
⇒ (a + E) (E → a)
⇒ (a + E * E) (E → E*E)
⇒ (a + a * E) (E → a)
⇒ (a + a * a) (E → a)
(2)E ⇒ (E) (E → (E))
⇒ (E * E) (E → E*E)
⇒ (E + E * E) (E → E+E)
⇒ (a + E * E) (E → a)
⇒ (a + a * E) (E → a)
⇒ (a + a * a) (E → a)
Since (a + a * a) has two leftmost derivations, the grammar is ambiguous.
From the productions P of the given grammar G, two leftmost derivations are induced. They are:
S⇒ S+S
⇒ a + S (S→a)
The corresponding derivation tree is:
Example:
S⇒ S+S
⇒ a+S (S→a)
The corresponding derivation tree is:
String = (a+b)*c
Solution:
(i)Leftmost derivation
S ⇒ bA
⇒ baS (A → aS)
⇒ baaB (S → aB)
⇒ baaaBB (B → aBB)
⇒ baaabSB (B → bS)
⇒ baaabbAB (S → bA)
⇒ baaabbaB (A → a)
⇒ baaabbabS (B → bS)
⇒ baaabbabbA (S → bA)
⇒ baaabbabba (A → a)
(ii)Rightmost derivation
S ⇒ bA
⇒ baS (A → aS)
⇒ baaB (S → aB)
⇒ baaaBB (B → aBB)
⇒ baaaBbS (B → bS)
⇒ baaaBbbA (S → bA)
⇒ baaaBbba (A → a)
⇒ baaabSbba (B → bS)
⇒ baaabbAbba (S → bA)
⇒ baaabbabba (A → a)
(iii)Parse trees:
2.7 Simplification of CFG
For any context free language L, there exists a PDA M such that
L = L(M).
Proof:
There exists a grammar in Greibach Normal Form for L, so we can construct a PDA which simulates
leftmost derivations in this grammar.
Σ = terminal of grammar G
δ(q0, λ, z) = {(q1, Sz)}, so that after the first move of M, the stack contains the start
symbol S of the derivation. (The stack symbol z is a marker to allow us to detect
the end of the derivation)
Given:
S→ABC|BaB
B→bBb|a
C → CA | AC
Solution:
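The variables A and B are generating: each derives a terminal string. Every
C-production contains C itself, so C never derives a terminal string and
is a useless variable. After removing ‘C’ and the productions that contain it, the CFG is: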
S→BaB
A→aA | aaa
B→bBb|a.
Symbols that do not participate in the derivation of any terminal string are
called useless symbols; the productions that contain them are removed from the
grammar.
A symbol X is useful if it is both generating and reachable:
X is generating if X =>* w for some w in Vt*, i.e. X leads to a string of
terminal symbols.
X is reachable if there is a derivation S =>* αXβ for some
α and β.
First identify the non-generating symbols in the given CFG and eliminate the
productions which contain them; then eliminate the symbols that cannot be
reached from the start symbol.
Example: Remove the useless symbol from the given context free grammar:
S -> aB / bX
A -> Bad / bSX / a
B -> aSB / bBX
X -> SBD / aBx / ad
Solution:
A and X directly derive string of terminals a and ad, hence they are useful.
Example: Find the equivalent grammar free of useless symbols for the given grammar
A -> xyz / Xyzz
X -> Xz / xYz
Y -> yYy / Xz
Z -> Zy / z
Solution:
X and Y are not useful so all the production with X and Y in them should be
removed to eliminate non-generating symbols. The grammar then becomes
A -> xyz
Z -> Zy / z
Since A is the starting symbol this implies Z is the non-reachable symbol. So,
we remove it to get a grammar free of useless symbols:
A -> xyz.
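The two passes used in this example (remove non-generating symbols, then remove unreachable symbols) can be sketched in Python. The sketch assumes the convention of the example, where upper-case letters are variables and lower-case letters are terminals; the function names are illustrative.

```python
# Grammar of the example above: each body is a string of one-letter symbols.
GRAMMAR = {
    "A": ["xyz", "Xyzz"],
    "X": ["Xz", "xYz"],
    "Y": ["yYy", "Xz"],
    "Z": ["Zy", "z"],
}

def generating_symbols(grammar):
    """Variables that derive some string of terminals (fixed-point iteration)."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for lhs, bodies in grammar.items():
            if lhs not in gen and any(
                    all(s.islower() or s in gen for s in body) for body in bodies):
                gen.add(lhs)
                changed = True
    return gen

def reachable_symbols(grammar, start):
    """Variables that occur in some sentential form derived from `start`."""
    reach, stack = {start}, [start]
    while stack:
        for body in grammar.get(stack.pop(), []):
            for s in body:
                if s.isupper() and s not in reach:
                    reach.add(s)
                    stack.append(s)
    return reach

def remove_useless(grammar, start):
    """First drop non-generating symbols, then drop unreachable ones."""
    gen = generating_symbols(grammar)
    kept = {lhs: [b for b in bodies if all(s.islower() or s in gen for s in b)]
            for lhs, bodies in grammar.items() if lhs in gen}
    reach = reachable_symbols(kept, start)
    return {lhs: bodies for lhs, bodies in kept.items() if lhs in reach}

if __name__ == "__main__":
    print(generating_symbols(GRAMMAR))
    print(remove_useless(GRAMMAR, "A"))
```

The order of the two passes matters: removing non-generating symbols first can make further symbols unreachable, as happens with Z here.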
2.9 Unit productions
Select a unit production A -> B such that there exists a production B -> α,
where α is a terminal
Solution:
Now we have C -> D so we add a production C -> a to the grammar and delete
C -> D from the grammar.
Similarly we have B -> C by adding B -> a and removing B -> C we get the final
grammar free of unit production as:
S -> AB
A -> a
B -> a / b
C -> a
D -> a
E -> a
We can see that C, D and E are unreachable symbols so to get a completely
reduced grammar we remove them from the CFG. The final CFG is :
S -> AB
A -> a
B -> a / b
Example: Identify and remove the unit productions from the following
CFG
S -> S + T/ T
T -> T * F/ F
F -> (S)/a
Solution:
S -> T and T -> F are the two unit productions in the CFG.
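For production T -> F we have F -> (S)/a so we add T -> (S)/a to the grammar
and remove T -> F from the grammar.
Now for production S -> T we have productions T -> T * F/(S)/a so we add S -> T
* F/(S)/a to the grammar and remove S -> T. The grammar free of unit productions is:
S -> S + T/ T * F/(S)/a
T -> T * F/(S)/a
F -> (S)/a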
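The substitution carried out in this example can be sketched in Python. For each variable, the sketch collects every variable reachable through unit productions alone and copies over their non-unit bodies; the function name `remove_unit_productions` and the tuple representation of bodies are assumptions made for illustration.

```python
# Grammar of the example; symbols without an entry are terminals.
GRAMMAR = {
    "S": [("S", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "S", ")"), ("a",)],
}

def remove_unit_productions(grammar):
    def is_unit(body):
        return len(body) == 1 and body[0] in grammar

    result = {}
    for a in grammar:
        # every variable B with A =>* B using unit productions only
        closure, stack = {a}, [a]
        while stack:
            for body in grammar[stack.pop()]:
                if is_unit(body) and body[0] not in closure:
                    closure.add(body[0])
                    stack.append(body[0])
        bodies = []
        for b in grammar:                 # iterate in grammar order for stable output
            if b in closure:
                for body in grammar[b]:
                    if not is_unit(body) and body not in bodies:
                        bodies.append(body)
        result[a] = bodies
    return result

if __name__ == "__main__":
    for lhs, bodies in remove_unit_productions(GRAMMAR).items():
        print(lhs, "->", " / ".join(" ".join(b) for b in bodies))
```

For this grammar S receives the bodies S + T, T * F, (S) and a, T receives T * F, (S) and a, and F is unchanged.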
N => … => ϵ.
These resultant non ϵ-productions must be added to the grammar to keep the
language the same.
Solution:
Given:
S → ABb | a
A → aaA | B
B → bAb
Solution:
A1 → A2A3b | a
A2 → aaA2| A3
A3 → bA2b
A1 → a.
Solution:
Step 1:
Therefore the variables S and A are renamed as A1, A2. Hence the productions
are:
A1 → A2A2 | a, A2 → A1 A1 | b
Step 2:
Step 3:
A2 → aA1, A2→b
A2 → aA1Z2, A2 → bZ2
Z2 → A2A1, Z2 → A2A1Z2.
Lemma 2:
Let G = (N, T, P, S) be a CFG. Let the set of A-productions be A → Aα1 | Aα2 | … | Aαr |
β1 | β2 | … | βs, where no βi begins with A.
Let B be a new variable. Then G1 = (N ∪ {B}, T, P', S), where P' is defined as
(a)A → β1 | β2 | … | βs and A → β1B | β2B | … | βsB
(b)B → α1 | α2 | … | αr and B → α1B | α2B | … | αrB
(c)The productions for other variables are as in P,
is a CFG equivalent to G.
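Lemma 2 is the usual removal of immediate left recursion, and it can be sketched in Python. In the sketch the A-productions are split into the left-recursive bodies Aα and the remaining bodies β, and a new variable takes over the α parts. The input A2 → A2A2A1 | aA1 | b is assumed here to be the result of substituting A1 into A2 → A1A1; the function name is illustrative.

```python
def eliminate_left_recursion(a, bodies, new_var):
    """Apply Lemma 2 to the productions of variable `a`.
    Bodies are tuples of symbols; `new_var` is the fresh variable."""
    alphas = [body[1:] for body in bodies if body and body[0] == a]
    betas = [body for body in bodies if not body or body[0] != a]
    if not alphas:
        return {a: list(bodies)}          # nothing to do
    a_rules = betas + [beta + (new_var,) for beta in betas]
    new_rules = alphas + [alpha + (new_var,) for alpha in alphas]
    return {a: a_rules, new_var: new_rules}

if __name__ == "__main__":
    a2 = [("A2", "A2", "A1"), ("a", "A1"), ("b",)]
    for lhs, bodies in eliminate_left_recursion("A2", a2, "Z2").items():
        print(lhs, "->", " | ".join("".join(b) for b in bodies))
```

With Z2 as the new variable this reproduces the productions A2 → aA1 | b | aA1Z2 | bZ2 and Z2 → A2A1 | A2A1Z2 of the worked example.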
Step 4:
A2 → aA1 | b | aA1Z2 | bZ2
A1 → aA1A2 | bA2
A1 → aA1Z2A2 | bZ2A2
Step 5:
G1 = ({A1, A2, Z2}, {a, b}, P1, A1)
A1 → a | aA1A2 | bA2 | aA1Z2A2 | bZ2A2
Given:
S → A|CB
A → C|D
B → 1 B|1
C → 0C|0
D → 2D|2
Solution:
1.No null productions, the unit productions exist in the form of chain
productions.
S→A
A→C
S → 0C|0
Similarly
S→A
A→D
S → 2D|2
S → 0C|2D|CB|0|2
B → 1B | 1
C → 0C | 0
D → 2D | 2
S → 0|2|CB
B → 1, C → 0, D → 2 are added to P1
S → 0C | 2D, B → 1B, C → 0C,
D → 2D yield
S → A1C | A3D, B → A2B, C → A1C
If the longest path in the parse tree of a string w in a CNF grammar has length n, then |w|
≤ 2^(n−1).
Let us discuss and prove the Chomsky normal form for CFL.
Theorem:
Proof:
(i)All productions in P1 are added to P2 if they are in the required form. All the
variables in N1 are added to N".
and new variables C1, C2 ........Cm-2. These are added to P" and N" respectively.
Step 4:
Hence
Solution:
Step 1:
Step 2:
Let G1 = (V1, {a, b, d}, P1, S), where P1, V1 are constructed as follows:
∴ V1 = {S,A,B,D,Ca,Cb}
∴ P1 = {S → CaAD,
A → CaB | CbAB,
B → b,
D → d,
Ca → a,
Cb → b}
Step 3:
Let G2 = (V2, {a, b, d}, P2, S), where P2, V2 are constructed as follows:
∴ P2 = {S → CaC1
A → CaB | CbC2
C1 → AD
C2 → AB
B → b,
D → d,
Ca → a,
Cb → b }
Step 1:
S → a | AAB | AB |AA
A → ab | aB | a
B → aba
Step 2:
A → aba yields
S → AAB, A → C1 C2 | C1 B, B → C1 C2 C1
where C1 → a, C2 → b
Step 3:
(i)S → a, A → a, S → AB | AA,
A → C1 C2 | C1 B, C1 → a, C2 → b are added to P2
Solution:
A1 → A2 A3 | a
A2 → A3 A4 | a
A3 → b
A4 → b
A1 → a A3 | a
A2 → b A4 | a
A3 → b
A4 → b