J. Xue

COMP3131/9102: Programming Languages and Compilers

Jingling Xue
School of Computer Science and Engineering
The University of New South Wales
Sydney, NSW 2052, Australia
http://www.cse.unsw.edu.au/~cs3131
http://www.cse.unsw.edu.au/~cs9102
Copyright @2010, Jingling Xue

COMP3131/9102 Page 194 March 22, 2010

Assignment 1
• See the FAQs, feedback and test cases
• The marks are provisional – see the plagiarism policy
• The functionality of the scanner:
* When called, always returns the longest prefix of the
remaining input that can be interpreted as a token
* If the current char cannot begin a token, then:
- skip it if it is whitespace
- return an error token otherwise

COMP3131/9102 Page 195 March 22, 2010

Lecture 4: Top-Down Parsing: Recursive-Descent


1. Compare and contrast top-down and bottom-up parsing
2. Write a predictive (or non-backtracking) top-down parser
Grammar G Assignment 2!

Eliminating left recursion & common prefixes

The Transformed Grammar G′

Constructing First, Follow and Select Sets for G′

A Recursive-Descent Parser The LL(1) Parsing Table

The LL(1) Table-Driven Parser


COMP3131/9102 Page 196 March 22, 2010

The micro-English Grammar Revisited


1 ⟨sentence⟩ → ⟨subject⟩ ⟨predicate⟩
2 ⟨subject⟩ → NOUN
3 | ARTICLE NOUN
4 ⟨predicate⟩ → VERB ⟨object⟩
5 ⟨object⟩ → NOUN
6 | ARTICLE NOUN

The English Sentence

PETER PASSED THE TEST

COMP3131/9102 Page 197 March 22, 2010

The micro-English Grammar Revisited (Cont’d)


• The Leftmost Derivation:
⟨sentence⟩ ⇒lm ⟨subject⟩ ⟨predicate⟩ by P1
⇒lm NOUN ⟨predicate⟩ by P2
⇒lm NOUN VERB ⟨object⟩ by P4
⇒lm NOUN VERB ARTICLE NOUN by P6

• The Rightmost Derivation:
⟨sentence⟩ ⇒rm ⟨subject⟩ ⟨predicate⟩ by P1
⇒rm ⟨subject⟩ VERB ⟨object⟩ by P4
⇒rm ⟨subject⟩ VERB ARTICLE NOUN by P6
⇒rm NOUN VERB ARTICLE NOUN by P2

COMP3131/9102 Page 198 March 22, 2010

The Role of the Parser

PETER PASSED THE TEST
  ↓ Scanner
NOUN1 VERB ARTICLE NOUN2
  ↓ Parser
⟨sentence⟩
  ⟨subject⟩
    NOUN (PETER)
  ⟨predicate⟩
    VERB (PASSED)
    ⟨object⟩
      ARTICLE (THE)
      NOUN (TEST)

COMP3131/9102 Page 199 March 22, 2010

Two General Parsing Methods


1. Top-down parsing – Build the parse tree top-down:
• Productions used represent the leftmost derivation.
• The best known and widely used methods:
– Recursive descent
– Table-driven
– LL(k) (Left-to-right scan of input, Leftmost derivation, k tokens of
lookahead).
– Almost all programming languages can be specified by LL(1) grammars, but
such grammars may not reflect the structure of a language
– In practice, LL(k) for small k is used
• Implemented more easily by hand.
• Used in parser generators such as JavaCC
2. Bottom-up parsing – Build the parse tree bottom-up:
• Productions used represent the rightmost derivation in reverse.
• The best known and widely used method: LR(1) (Left-to-right scan of input,
Rightmost derivation in reverse, 1 token of lookahead)
• More powerful – every LL(1) grammar is LR(1) but the converse is false
• Used by parser generators (e.g., Yacc and JavaCUP).
COMP3131/9102 Page 200 March 22, 2010

Lookahead Token(s)

• Lookahead Token(s): The currently scanned token(s) in the input.


• In Recogniser.java, currentToken represents the lookahead token
• For most programming languages, one token of lookahead is enough.
• Initially, the lookahead token is the leftmost token in the input.

COMP3131/9102 Page 201 March 22, 2010

Top-Down Parse Of NOUN VERB ARTICLE NOUN


TREE ⟨sentence⟩
INPUT NOUN

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
INPUT NOUN

Notations:
• ↑ on the tree indicates the nonterminal being expanded or recognised
• ↑ on the sentence points to the lookahead token
– All tokens to the left of ↑ have been read
– All tokens to the right of ↑ have NOT been processed

COMP3131/9102 Page 202 March 22, 2010

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN
INPUT NOUN

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN
INPUT NOUN VERB

COMP3131/9102 Page 203 March 22, 2010

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
INPUT NOUN VERB

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
INPUT NOUN VERB ARTICLE

COMP3131/9102 Page 204 March 22, 2010

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
ARTICLE NOUN
INPUT NOUN VERB ARTICLE

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
ARTICLE NOUN
INPUT NOUN VERB ARTICLE NOUN

COMP3131/9102 Page 205 March 22, 2010

Top-Down Parsing

• Build the parse tree starting with the start symbol (i.e., the
root) towards the sentence being analysed (i.e., leaves).
• Use one token of lookahead, in general
• Discover the leftmost derivation,
i.e., the productions used in expanding the parse tree
represent a leftmost derivation

COMP3131/9102 Page 206 March 22, 2010

Predictive (Non-Backtracking) Top-Down Parsing


• To expand a nonterminal, the parser always predicts
(chooses) the right alternative for the nonterminal by
looking at the lookahead symbol only.
• Flow-of-control constructs, with their distinguishing
keywords, are detectable this way, e.g., in the VC
grammar:
⟨stmt⟩ → ⟨compound-stmt⟩
| if "(" ⟨expr⟩ ")" ⟨stmt⟩ (ELSE ⟨stmt⟩)?
| break ";"
| continue ";"
···
• Prediction happens before the actual match begins.
COMP3131/9102 Page 207 March 22, 2010

Bottom-Up Parse Of NOUN VERB ARTICLE NOUN

TREE
INPUT NOUN

TREE ⟨subject⟩
NOUN
INPUT NOUN

TREE ⟨subject⟩
NOUN
INPUT NOUN VERB

COMP3131/9102 Page 208 March 22, 2010

TREE ⟨subject⟩
NOUN
INPUT NOUN VERB ARTICLE

TREE ⟨subject⟩
NOUN
INPUT NOUN VERB ARTICLE NOUN

TREE ⟨subject⟩ ⟨object⟩
NOUN ARTICLE NOUN
INPUT NOUN VERB ARTICLE NOUN

Note: What if the parser had chosen ⟨subject⟩ → ARTICLE NOUN instead of ⟨object⟩ → ARTICLE NOUN?
In this case, the parser would not make any further progress.
Having read a ⟨subject⟩ and VERB, the parser has reached a state in which it should not parse another ⟨subject⟩.

COMP3131/9102 Page 209 March 22, 2010

TREE ⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
ARTICLE NOUN
INPUT NOUN VERB ARTICLE NOUN

TREE ⟨sentence⟩
⟨subject⟩ ⟨predicate⟩
NOUN VERB ⟨object⟩
ARTICLE NOUN
INPUT NOUN VERB ARTICLE NOUN

COMP3131/9102 Page 210 March 22, 2010

Bottom-Up Parsing

• Build the parse tree starting with the sentence being
analysed (i.e., leaves) towards the start symbol (i.e., the
root).
• Use one token of lookahead, in general.
• The basic (smallest) language constructs are recognised first,
then they are used to discover more complex constructs.
• Discover the rightmost derivation in reverse: the
productions used in building the parse tree represent a
rightmost derivation in reverse order

COMP3131/9102 Page 211 March 22, 2010

Lecture 4: Top-Down Parsing: Recursive-Descent



1. Compare and contrast top-down and bottom-up parsing
2. Write a predictive (or non-backtracking) top-down parser
Grammar G Assignment 2!

Eliminating left recursion & common prefixes

The Transformed Grammar G′

Constructing First, Follow and Select Sets for G′

A Recursive-Descent Parser The LL(1) Parsing Table

The LL(1) Table-Driven Parser


COMP3131/9102 Page 212 March 22, 2010

Which of the Two Alternatives on S to Choose?


• Grammar:
S → aA | bB
A → ···
B → ···
• Sentence: a · · ·
• The leftmost derivation:
S ⇒lm aA
⇒lm · · ·

Select the first alternative aA

COMP3131/9102 Page 213 March 22, 2010

Which of the Two Alternatives on S to Choose?


• Grammar:
S → Ab | Bc
A → Df | CA
B → gA | e
C → dC | c
D → h | i
• Sentence: gchfc
• The leftmost derivation:
S ⇒lm Bc ⇒lm gAc ⇒lm gCAc
⇒lm gcAc ⇒lm gcDfc ⇒lm gchfc

COMP3131/9102 Page 214 March 22, 2010

Intuition behind First Sets


• Grammar:
S → Ab | Bc
A → Df | CA
B → gA | e
C → dC | c
D → h | i
• All possible leftmost derivations:
S ⇒ Ab ⇒ Dfb ⇒ hfb
S ⇒ Ab ⇒ Dfb ⇒ ifb
S ⇒ Ab ⇒ CAb ⇒ dCAb ⇒ · · ·
S ⇒ Ab ⇒ CAb ⇒ cAb ⇒ · · ·
S ⇒ Bc ⇒ gAc ⇒ · · ·
S ⇒ Bc ⇒ ec
• The terminals that can come first:
First(Ab) = {c, d, h, i} First(Bc) = {e, g}
COMP3131/9102 Page 215 March 22, 2010

Definition of First Sets

First(α):
• The set of all terminals that can begin any string derived
from α.
• If α ⇒* ε, then ε is also in First(α)

Nullable Nonterminals

A nonterminal A is nullable if A ⇒* ε.

COMP3131/9102 Page 216 March 22, 2010

A Procedure to Compute First(α)


1. Case 1: α is a single symbol or ε:
If α is a terminal a, then First(α) = First(a) = {a}
else if α is ε, then First(α) = First(ε) = {ε}
else if α is a nonterminal and α → β1 | β2 | β3 | · · · then
First(α) = ∪k First(βk)
2. Case 2: α = X1 X2 · · · Xn:
If X1, X2, . . . , Xi are all nullable but Xi+1 is not, then
First(α) = (First(X1) ∪ First(X2) ∪ · · · ∪ First(Xi+1)) − {ε}
If i = n, i.e., X1, . . . , Xn are all nullable, then
First(α) = First(X1) ∪ · · · ∪ First(Xn), which includes ε
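
The two cases can be sketched as a fixed-point iteration that keeps enlarging the First sets until nothing changes. The Java below is only an illustration under stated assumptions, not code from the course: the class name FirstSetsDemo, the string "eps" standing for ε, and the map-of-lists grammar encoding are all invented for this example, and any symbol without productions is treated as a terminal. The grammar used is Grammar 2 from Slide 220.

```java
import java.util.*;

/** Fixed-point computation of First sets; "eps" stands for epsilon (hypothetical encoding). */
final class FirstSetsDemo {
    static final String EPS = "eps";

    // Grammar 2: nonterminal -> alternatives; an empty list is the epsilon alternative.
    static Map<String, List<List<String>>> grammar2() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("E", List.of(List.of("T", "Q")));
        g.put("Q", List.of(List.of("+", "T", "Q"), List.of("-", "T", "Q"), List.of()));
        g.put("T", List.of(List.of("F", "R")));
        g.put("R", List.of(List.of("*", "F", "R"), List.of("/", "F", "R"), List.of()));
        g.put("F", List.of(List.of("INT"), List.of("(", "E", ")")));
        return g;
    }

    /** Case 2: First of a symbol sequence, using the First sets computed so far. */
    static Set<String> firstOf(List<String> alpha, Map<String, Set<String>> first) {
        Set<String> result = new TreeSet<>();
        for (String x : alpha) {
            // Case 1: a symbol with no entry in `first` is a terminal, so First(a) = {a}.
            Set<String> fx = first.containsKey(x) ? first.get(x) : Set.of(x);
            for (String t : fx) if (!t.equals(EPS)) result.add(t);
            if (!fx.contains(EPS)) return result;  // x is not nullable: stop here
        }
        result.add(EPS);                            // every symbol was nullable (or alpha is empty)
        return result;
    }

    /** Repeat Case 1 for every nonterminal until no First set grows. */
    static Map<String, Set<String>> compute(Map<String, List<List<String>>> g) {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        for (String nt : g.keySet()) first.put(nt, new TreeSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (var e : g.entrySet())
                for (List<String> alt : e.getValue())
                    changed |= first.get(e.getKey()).addAll(firstOf(alt, first));
        }
        return first;
    }

    public static void main(String[] args) {
        compute(grammar2()).forEach((nt, f) -> System.out.println("First(" + nt + ") = " + f));
    }
}
```

The printed sets can be compared with the First sets listed for Grammar 2 on Slide 222.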

COMP3131/9102 Page 217 March 22, 2010

Case 2 of the Procedure for Computing First


S → ABCd
A → e | f | ε
B → g | h | ε
C → p | q

S ⇒ ABCd ⇒ eBCd ⇒ · · ·
         ⇒ fBCd ⇒ · · ·
         ⇒ BCd ⇒ gCd ⇒ · · ·
               ⇒ hCd ⇒ · · ·
               ⇒ Cd ⇒ pd
                    ⇒ qd
First(ABCd) = {e, f, g, h, p, q}
COMP3131/9102 Page 218 March 22, 2010

Case 2 of the Procedure for Computing First Again


S → ABC (d was deleted from the grammar in Slide 218)
A → e | f | ε
B → g | h | ε
C → p | q | ε

S ⇒ ABC ⇒ eBC ⇒ · · ·
        ⇒ fBC ⇒ · · ·
        ⇒ BC ⇒ gC ⇒ · · ·
             ⇒ hC ⇒ · · ·
             ⇒ C ⇒ p
                 ⇒ q
                 ⇒ ε
First(ABC) = {e, f, g, h, p, q, ε}
COMP3131/9102 Page 219 March 22, 2010

The Expression Grammar


• The grammar with left recursion:
Grammar 1: E → E + T | E − T | T
           T → T ∗ F | T / F | F
           F → INT | (E)
• The transformed grammar without left recursion:
Grammar 2: E → T Q
           Q → + T Q | − T Q | ε
           T → F R
           R → ∗ F R | / F R | ε
           F → INT | (E)

COMP3131/9102 Page 220 March 22, 2010

First Sets for Grammar 1 (with Recursion)


First(E) = First(E + T ) = First(E − T ) =
First(T ) = First(T ∗ F ) = First(T /F ) =
First(F ) = {(, INT}
First((E)) = {(}
First(INT) = {INT}

The explicit construction is left as an exercise.

COMP3131/9102 Page 221 March 22, 2010

First Sets for Grammar 2 (without Recursion)


First(E) = First(TQ) = {(, INT}
First(T) = First(FR) = {(, INT}
First(Q) = {+, −, ε}
First(R) = {∗, /, ε}
First(F) = {INT, (}
First(+TQ) = {+}
First(−TQ) = {−}
First(∗FR) = {∗}
First(/FR) = {/}
First((E)) = {(}
First(INT) = {INT}

COMP3131/9102 Page 222 March 22, 2010

Why Follow Sets?

• First sets do not tell us when to apply A → α such that
α ⇒* ε (the important special case is A → ε)
• Follow sets do
• Follow sets are constructed only for nonterminals
• By convention, assume every input is terminated by a
special end marker (i.e., the EOF marker), denoted $
• Follow sets do not contain ε

COMP3131/9102 Page 223 March 22, 2010

Definition of Follow Sets

Let A be a nonterminal. Define Follow(A) to be the set of
terminals that can appear immediately to the right of A in
some sentential form. That is,
Follow(A) = {a | S ⇒* · · · Aa · · · }
where S is the start symbol of the grammar.

COMP3131/9102 Page 224 March 22, 2010

A Procedure to Compute Follow Sets

1. If A is the start symbol, add $ to Follow(A).


2. Look through the grammar for all occurrences of A on the
right of productions. Let a typical production be:
B → αAβ
There are two cases – both may be applicable:
(a) Follow(A) includes First(β) − {ε}.
(b) If β ⇒* ε, then include Follow(B) in Follow(A).
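
Rules 1, 2(a) and 2(b) can likewise be applied repeatedly until no Follow set grows. The Java below is a rough sketch for Grammar 2 of Slide 220, not the course's code: the class name FollowSetsDemo and the "eps" encoding of ε are assumptions, and the First sets of the nonterminals are simply copied in from Slide 222 rather than computed.

```java
import java.util.*;

/** Fixed-point computation of Follow sets for Grammar 2; "$" is the end marker. */
final class FollowSetsDemo {
    static final String EPS = "eps";   // hypothetical encoding of epsilon
    static final Map<String, List<List<String>>> G = new LinkedHashMap<>();
    static final Map<String, Set<String>> FIRST = new HashMap<>();
    static {
        G.put("E", List.of(List.of("T", "Q")));
        G.put("Q", List.of(List.of("+", "T", "Q"), List.of("-", "T", "Q"), List.of()));
        G.put("T", List.of(List.of("F", "R")));
        G.put("R", List.of(List.of("*", "F", "R"), List.of("/", "F", "R"), List.of()));
        G.put("F", List.of(List.of("INT"), List.of("(", "E", ")")));
        // First sets of the nonterminals, taken from Slide 222.
        FIRST.put("E", Set.of("(", "INT"));
        FIRST.put("T", Set.of("(", "INT"));
        FIRST.put("F", Set.of("(", "INT"));
        FIRST.put("Q", Set.of("+", "-", EPS));
        FIRST.put("R", Set.of("*", "/", EPS));
    }

    /** First(beta) for a symbol sequence; contains EPS exactly when beta is nullable. */
    static Set<String> firstOf(List<String> beta) {
        Set<String> result = new TreeSet<>();
        for (String x : beta) {
            Set<String> fx = FIRST.containsKey(x) ? FIRST.get(x) : Set.of(x);
            for (String t : fx) if (!t.equals(EPS)) result.add(t);
            if (!fx.contains(EPS)) return result;
        }
        result.add(EPS);
        return result;
    }

    static Map<String, Set<String>> compute(String start) {
        Map<String, Set<String>> follow = new LinkedHashMap<>();
        for (String nt : G.keySet()) follow.put(nt, new TreeSet<>());
        follow.get(start).add("$");                               // rule 1
        boolean changed = true;
        while (changed) {
            changed = false;
            for (var e : G.entrySet()) {
                String b = e.getKey();                            // production B -> alpha A beta
                for (List<String> rhs : e.getValue()) {
                    for (int i = 0; i < rhs.size(); i++) {
                        String a = rhs.get(i);
                        if (!G.containsKey(a)) continue;          // Follow is for nonterminals only
                        Set<String> fb = firstOf(rhs.subList(i + 1, rhs.size()));
                        boolean betaNullable = fb.remove(EPS);
                        changed |= follow.get(a).addAll(fb);      // rule 2(a)
                        if (betaNullable)                         // rule 2(b)
                            changed |= follow.get(a).addAll(List.copyOf(follow.get(b)));
                    }
                }
            }
        }
        return follow;
    }

    public static void main(String[] args) {
        compute("E").forEach((nt, f) -> System.out.println("Follow(" + nt + ") = " + f));
    }
}
```

The result can be checked against the Follow sets for Grammar 2 on Slide 227.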

COMP3131/9102 Page 225 March 22, 2010

Follow Sets for Grammar 1 (with Recursion)

Follow(E) = {+, −, ), $}
Follow(T ) = Follow(F ) = {+, −, ∗, /, ), $}

The explicit construction is left as an exercise.

COMP3131/9102 Page 226 March 22, 2010

Follow Sets for Grammar 2 (without Recursion)

Follow(E) = {), $}
Follow(Q) = {), $}
Follow(T ) = {+, −, ), $}
Follow(R) = {+, −, ), $}
Follow(F ) = {+, −, ∗, /, ), $}

COMP3131/9102 Page 227 March 22, 2010

Select Sets for Productions

• One Select set for every production in the grammar
• The Select set for a production of the form A → α is:
– If ε ∈ First(α), then
Select(A→α) = (First(α) − {ε}) ∪ Follow(A)
– Otherwise:
Select(A→α) = First(α)
• The Select set predicts A → α to be used in a derivation.
• Thus, the Select set is not needed if A has only one alternative
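
The definition translates almost directly into code. A minimal sketch, assuming First(α) and Follow(A) are already available as sets of token names and that the string "eps" stands for ε (both the class name SelectSetDemo and that encoding are made up for this example):

```java
import java.util.*;

final class SelectSetDemo {
    static final String EPS = "eps";   // hypothetical encoding of epsilon

    /** Select(A -> alpha), given First(alpha) and Follow(A). */
    static Set<String> select(Set<String> firstAlpha, Set<String> followA) {
        Set<String> result = new TreeSet<>(firstAlpha);
        // remove() returns true exactly when eps was in First(alpha), i.e. alpha is nullable.
        if (result.remove(EPS)) result.addAll(followA);
        return result;
    }

    public static void main(String[] args) {
        // Q -> eps in Grammar 2: First(eps) = {eps}, Follow(Q) = {), $}
        System.out.println(select(Set.of(EPS), Set.of(")", "$")));
    }
}
```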

COMP3131/9102 Page 228 March 22, 2010

Select Sets for Grammar 1

Follow sets are not used since the grammar has no ε-productions


Select(E→E + T ) = First(E + T ) = {(, INT}
Select(E→E − T ) = First(E − T ) = {(, INT}
Select(E→T ) = First(T ) = {(, INT}
Select(T →T ∗ F ) = First(T ∗ F ) = {(, INT}
Select(T →T /F ) = First(T /F ) = {(, INT}
Select(T →F ) = First(F ) = {(, INT}
Select(F →INT) = First(INT) = {INT}
Select(F →(E)) = First((E)) = {(}

COMP3131/9102 Page 229 March 22, 2010

Select Sets for Grammar 2

Select(E→TQ) = First(TQ) = {(, INT}
Select(Q→+TQ) = First(+TQ) = {+}
Select(Q→−TQ) = First(−TQ) = {−}
Select(Q→ε) = (First(ε) − {ε}) ∪ Follow(Q) = {), $}
Select(T→FR) = First(FR) = {(, INT}
Select(R→∗FR) = First(∗FR) = {∗}
Select(R→/FR) = First(/FR) = {/}
Select(R→ε) = (First(ε) − {ε}) ∪ Follow(R) = {+, −, ), $}
Select(F→INT) = First(INT) = {INT}
Select(F→(E)) = First((E)) = {(}

COMP3131/9102 Page 230 March 22, 2010

Lecture 4: Top-Down Parsing: Recursive-Descent


1. Compare and contrast top-down and bottom-up parsing
2. Write a predictive (or non-backtracking) top-down parser
Grammar G Assignment 2!

Eliminating left recursion & common prefixes

The Transformed Grammar G′

Constructing First, Follow and Select Sets for G′

A Recursive-Descent Parser The LL(1) Parsing Table

The LL(1) Table-Driven Parser


COMP3131/9102 Page 231 March 22, 2010

Writing a Predictive Recursive-Descent Parser


• The variable currentToken is the lookahead token, which
is initialised to the leftmost token in the program
• A method, called match, for matching the tokens on
production right-hand sides
void match(int tokenExpected) {
  if (currentToken.kind == tokenExpected) {
    currentToken = scanner.getToken();
  } else {
    // error: "tokenExpected" expected but "currentToken" found
    syntacticError(...);
  }
}

• A method, called parseA, for every nonterminal A

COMP3131/9102 Page 232 March 22, 2010

Parsing Method parseA for A → α1 | · · · | αn

void parseA() {
  switch (currentToken.kind) {
  cases in Select(A→α1):
    parse α1
    break;
  ···
  cases in Select(A→αn):
    parse αn
    break;
  default:
    syntacticError(...);
    break;
  }
}
• All alternatives form the basis of the method body
• Attempt to find a substring of the input, beginning with the
currentToken, that can be interpreted as the nonterminal A
COMP3131/9102 Page 233 March 22, 2010

parseA for A → α1 | · · · | αn: if-then-else

void parseA() {
  if (currentToken.kind in Select(A→α1))
    parse α1
  else if (currentToken.kind in Select(A→α2))
    parse α2
  ···
  else if (currentToken.kind in Select(A→αn))
    parse αn
  else
    syntacticError(...);
}

COMP3131/9102 Page 234 March 22, 2010

parseA for A → α When A Has a Single Alternative

void parseA() {
  parse α
}

COMP3131/9102 Page 235 March 22, 2010

Coding parse αi

• Suppose αi = aABbC, where A, B and C are nonterminals
• parse αi is implemented as:
  match("a");
  parseA();
  parseB();
  match("b");
  parseC();
• If αi = ε, then parse αi is implemented as:
  /* empty statement */

COMP3131/9102 Page 236 March 22, 2010

Coding parse αi : A Concrete Example

void parseWhileStmt() throws SyntaxError {
  match(Token.WHILE);
  match(Token.LPAREN);
  parseExpr();
  match(Token.RPAREN);
  parseStmt();
}

COMP3131/9102 Page 237 March 22, 2010

Parsing Method for the Start Symbol


• If the start symbol S does not appear on the right-hand side of any production, then
void parseS() {
  code for the alternatives of S
  match(Token.EOF);
}
• Otherwise, introduce a new start symbol, Goal:
void parseGoal() {
  parseS();
  match(Token.EOF);
}
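
Putting match, the parseA scheme and the Goal wrapper together, a recogniser for Grammar 2 (Slide 220) might look like the sketch below. It is an illustration only: the class name Grammar2Recogniser, the use of plain strings as token kinds, the pre-scanned token list standing in for the scanner, and the use of IllegalStateException for syntax errors are all assumptions of this example. The case labels follow the Select sets for Grammar 2 (Slide 230).

```java
import java.util.*;

/** A recogniser for Grammar 2 over pre-scanned token kinds; "$" marks end of input. */
final class Grammar2Recogniser {
    private final List<String> tokens;
    private int pos = 0;
    private String currentToken;                  // the lookahead token

    Grammar2Recogniser(List<String> tokens) {
        this.tokens = tokens;
        this.currentToken = nextToken();          // initially the leftmost token
    }

    private String nextToken() { return pos < tokens.size() ? tokens.get(pos++) : "$"; }

    private void match(String expected) {
        if (!currentToken.equals(expected))
            throw new IllegalStateException(
                "\"" + expected + "\" expected but \"" + currentToken + "\" found");
        currentToken = nextToken();
    }

    // E also occurs in F -> ( E ), so a separate Goal method matches the end marker.
    void parseGoal() { parseE(); match("$"); }

    private void parseE() { parseT(); parseQ(); }                 // E -> T Q

    private void parseQ() {
        switch (currentToken) {
            case "+": match("+"); parseT(); parseQ(); break;      // Select(Q -> + T Q)
            case "-": match("-"); parseT(); parseQ(); break;      // Select(Q -> - T Q)
            case ")": case "$": break;                            // Select(Q -> eps) = Follow(Q)
            default: throw new IllegalStateException("unexpected \"" + currentToken + "\"");
        }
    }

    private void parseT() { parseF(); parseR(); }                 // T -> F R

    private void parseR() {
        switch (currentToken) {
            case "*": match("*"); parseF(); parseR(); break;      // Select(R -> * F R)
            case "/": match("/"); parseF(); parseR(); break;      // Select(R -> / F R)
            case "+": case "-": case ")": case "$": break;        // Select(R -> eps) = Follow(R)
            default: throw new IllegalStateException("unexpected \"" + currentToken + "\"");
        }
    }

    private void parseF() {
        switch (currentToken) {
            case "INT": match("INT"); break;                      // Select(F -> INT)
            case "(": match("("); parseE(); match(")"); break;    // Select(F -> ( E ))
            default: throw new IllegalStateException("unexpected \"" + currentToken + "\"");
        }
    }

    /** Convenience wrapper: true if the token sequence is a sentence of Grammar 2. */
    static boolean accepts(String... tokens) {
        try {
            new Grammar2Recogniser(Arrays.asList(tokens)).parseGoal();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For instance, accepts("INT", "+", "INT") succeeds, while accepts("INT", "+") fails inside parseF because the lookahead is the end marker.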

COMP3131/9102 Page 238 March 22, 2010

Term (Predictive) Recursive Descent?


• Predictive (or non-backtracking): the parser always
predicts the right production to use at every derivation step
• Recursive: a parsing method may call itself recursively
either directly or indirectly.
• Descent: the parser builds the parse tree (or AST) by
descending through it as it parses the program
(Assignment 3).

COMP3131/9102 Page 239 March 22, 2010

Outline for the Rest of the Lecture

1. Definition of LL(1) grammar


2. One simplification in the presence of a nullable alternative
3. Eliminate left recursion and common prefixes
4. Write parsing methods in the presence of regular operators
5. LL(k) for small k often necessary for some constructs
6. Assignment 2

COMP3131/9102 Page 240 March 22, 2010

Definition of LL(1) Grammar

• A grammar is LL(1) if for every nonterminal with productions of the form:
A → α1 | · · · | αn
the Select sets are pairwise disjoint, i.e.:
Select(A→αi) ∩ Select(A→αj) = ∅
for all i and j such that i ≠ j.
• This implies there can be at most one nullable alternative
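
The disjointness test for one nonterminal is easy to mechanise. A small sketch, assuming the Select sets of A's alternatives are given as sets of token names (the class and method names are hypothetical):

```java
import java.util.*;

final class LL1Check {
    /** True if the Select sets of one nonterminal's alternatives are pairwise disjoint. */
    static boolean pairwiseDisjoint(List<Set<String>> selectSets) {
        Set<String> seen = new HashSet<>();
        for (Set<String> s : selectSets)
            for (String t : s)
                if (!seen.add(t)) return false;   // t already predicts an earlier alternative
        return true;
    }

    public static void main(String[] args) {
        // Q in Grammar 2: {+}, {-}, {), $}
        System.out.println(pairwiseDisjoint(List.of(Set.of("+"), Set.of("-"), Set.of(")", "$"))));
        // E in Grammar 1: all three alternatives have Select set {(, INT}
        System.out.println(pairwiseDisjoint(
            List.of(Set.of("(", "INT"), Set.of("(", "INT"), Set.of("(", "INT"))));
    }
}
```

A grammar is LL(1) when this check succeeds for every nonterminal.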

COMP3131/9102 Page 241 March 22, 2010

One Simplification When A Has a Nullable Alternative

void parseA() {
  switch (currentToken.kind) {
  cases in First(α1):
    parse α1
    break;
  ···
  cases in First(αn−1):
    parse αn−1
    break;
  default: /* A → αn as the default (item 4, p. 193/229, Red/Purple Dragon) */
    parse αn
    break;
  }
}
• Suppose αn ⇒* ε is the only nullable alternative
• Then Select(A→αi) = First(αi) for 1 ≤ i < n
• In fact, the code is still correct even if none of the alternatives is nullable!
COMP3131/9102 Page 242 March 22, 2010

The Simplified Parsing Method Illustrated


• The grammar:
S → A b       Select(A→a) = First(a) = {a}
A → a | ε     Select(A→ε) = Follow(A) = {b}
• The language: {b, ab}

V1:
void parseS() {
  parseA();
  match("b");
  match("EOF");
}

void parseA() {
  if (lookahead is "a")
    match("a");
  else if (lookahead is "b")
    /* do nothing */
  else
    print an error message
}

V2:
void parseS() {
  parseA();
  match("b");
  match("EOF");
}

void parseA() {
  if (lookahead is "a")
    match("a");
}

/* In V2, some error detection is postponed but this cannot cause
any error to be missed */
COMP3131/9102 Page 243 March 22, 2010

The Simplified Parsing Method Illustrated (Cont’d)


• The grammar generating the language: {ab, cb}:
S → A b       Select(A→a) = First(a) = {a}
A → a | c     Select(A→c) = First(c) = {c}

V1:
void parseS() {
  parseA();
  match("b");
  match("EOF");
}

void parseA() {
  if (lookahead is "a")
    match("a");
  else if (lookahead is "c")
    match("c");
  else
    print an error message
}

V2:
void parseS() {
  parseA();
  match("b");
  match("EOF");
}

void parseA() {
  if (lookahead is "a")
    match("a");
  else
    match("c");
}

In V2, some error detection is postponed but no error will be missed
COMP3131/9102 Page 244 March 22, 2010

Left-Recursive Grammars Are Not LL(1)


The parsing method for E of Grammar 1 in Slide 220:
void parseE() {
  switch (currentToken.kind) {
  case Token.INT: case Token.LPAREN:   // E → T
    parseT();
    break;
  case Token.INT: case Token.LPAREN:   // E → E + T
    parseE();
    match(Token.PLUS);
    parseT();
    break;
  case Token.INT: case Token.LPAREN:   // E → E − T
    parseE();
    match(Token.MINUS);
    parseT();
    break;
  default:
    syntacticError(...);
    break;
  }
} /* this does not work */

COMP3131/9102 Page 245 March 22, 2010

Left Recursion
• Direct left-recursion:
A → Aα
• Non-direct left-recursion:
A → Bα
B → Aβ
– Algorithm 4.1 of text eliminates both kinds of left recursion
– In real programming languages, non-direct left-recursion is rare
– Not required
• A grammar with left recursion is not LL(1)

COMP3131/9102 Page 246 March 22, 2010

Eliminating Direct Left Recursion


• The grammar G1 :
A → α1 | α2 | · · · | αn // αi does not begin with A
A → Aβ1 | Aβ2 | · · · | Aβm
• The transformed grammar G2:
A → α1 A′ | α2 A′ | · · · | αn A′
A′ → β1 A′ | β2 A′ | · · · | βm A′ | ε
• G1 and G2 define the same language: L(G1 ) = L(G2 )
• Example: in Slide 220, Grammar 2 is the transformed
version of Grammar 1
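
The transformation is mechanical enough to sketch in code. The Java below is an illustrative sketch rather than Algorithm 4.1 of the text: it handles direct left recursion on a single nonterminal only, assumes at least one left-recursive alternative with a non-empty β, names the new nonterminal by appending a quote mark, and represents ε as an empty list. The class and method names are hypothetical.

```java
import java.util.*;

final class LeftRecursionDemo {
    /**
     * Rewrites A -> A b1 | ... | A bm | a1 | ... | an
     * as       A -> a1 A' | ... | an A'   and   A' -> b1 A' | ... | bm A' | eps.
     */
    static Map<String, List<List<String>>> eliminate(String a, List<List<String>> alts) {
        String aPrime = a + "'";
        List<List<String>> forA = new ArrayList<>();
        List<List<String>> forAPrime = new ArrayList<>();
        for (List<String> alt : alts) {
            if (!alt.isEmpty() && alt.get(0).equals(a)) {        // A -> A beta
                List<String> beta = new ArrayList<>(alt.subList(1, alt.size()));
                beta.add(aPrime);
                forAPrime.add(beta);                             // A' -> beta A'
            } else {                                             // A -> alpha
                List<String> alpha = new ArrayList<>(alt);
                alpha.add(aPrime);
                forA.add(alpha);                                 // A -> alpha A'
            }
        }
        forAPrime.add(List.of());                                // A' -> eps
        Map<String, List<List<String>>> result = new LinkedHashMap<>();
        result.put(a, forA);
        result.put(aPrime, forAPrime);
        return result;
    }

    public static void main(String[] args) {
        // E -> E + T | E - T | T from Grammar 1
        System.out.println(eliminate("E",
            List.of(List.of("E", "+", "T"), List.of("E", "-", "T"), List.of("T"))));
    }
}
```

Applied to E of Grammar 1, this yields E → T E′ and E′ → + T E′ | − T E′ | ε, which is Grammar 2's E and Q up to renaming.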

COMP3131/9102 Page 247 March 22, 2010

Eliminating Direct Left Recursion: Special Case


• The grammar G1 :
A → α // α does not begin with A
A → Aβ
• The transformed grammar G2:
A → αA′
A′ → βA′ | ε

COMP3131/9102 Page 248 March 22, 2010

Grammars with Common Prefixes Are Not LL(1)


• The dangling-else grammar:
⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩ ELSE ⟨stmt⟩
⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩
⟨stmt⟩ → other
• The parsing method according to Slide 233:
void parseStmt() {
  switch (currentToken.kind) {
  case Token.IF:
    accept();
    match(Token.LPAREN);
    parseExpr();
    match(Token.RPAREN);
    parseStmt();
    match(Token.ELSE);
    parseStmt();
    break;
  case Token.IF:   // duplicate case label: both alternatives begin with IF
    accept();
    match(Token.LPAREN);
    parseExpr();
    match(Token.RPAREN);
    parseStmt();
    break;
  ...
COMP3131/9102 Page 249 March 22, 2010

Eliminating Common Prefixes: Left-Factoring


• The grammar G1 :
A → αβ1 | αβ2 | · · · | αβm
A → γ
• The transformed grammar G2 :
A → αA′
A → γ
A′ → β1 | β2 | · · · | βm
• L(G1 ) = L(G2)
• A grammar with common prefixes is not LL(1)

COMP3131/9102 Page 250 March 22, 2010

Example: Eliminating Common Prefixes

⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩ ELSE ⟨stmt⟩
⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩
⟨stmt⟩ → other

⇓

⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩ ⟨else-clause⟩
⟨else-clause⟩ → ELSE ⟨stmt⟩ | ε
⟨stmt⟩ → other

COMP3131/9102 Page 251 March 22, 2010

Eliminating Direct Left Recursion Using Regular Operators


• The grammar G1 :
A → α1 | α2 | · · · | αn
A → Aβ1 | Aβ2 | · · · | Aβm
• The transformed grammar G2 :
A → (α1 | · · · | αn)(β1 | · · · | βm)∗
• G1 and G2 define the same language: L(G1) = L(G2)
• Recommended for use in Assignment 2, where n = m = 1
for most left-recursive cases:
A → αβ∗

COMP3131/9102 Page 252 March 22, 2010

The Expression Grammar


• The grammar with left recursion:
Grammar 1: E → E + T | E − T | T
           T → T ∗ F | T / F | F
           F → INT | (E)
• Eliminating left recursion using the Kleene Closure
Grammar 3: E → T ("+" T | "-" T)∗
           T → F ("*" F | "/" F)∗
           F → INT | "(" E ")"
All tokens are enclosed in double quotes to distinguish
them from the regular operators: (, ) and ∗
• Compare with Slide 220
COMP3131/9102 Page 253 March 22, 2010

Eliminating Common Prefixes using Choice Operator


• The grammar G1 :
A → αβ1 | αβ2 | · · · | αβm
A → γ
• The transformed grammar G2 :
A → α(β1 | β2 | · · · | βm )
A → γ
• Recommended to use in Assignment 2

COMP3131/9102 Page 254 March 22, 2010

Example: Eliminating Common Prefixes

⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩ ELSE ⟨stmt⟩
⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩
⟨stmt⟩ → other

⇓

⟨stmt⟩ → IF "(" ⟨expr⟩ ")" ⟨stmt⟩ ( ELSE ⟨stmt⟩ )?
⟨stmt⟩ → other

Compare with Slide 251
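
The optional else part becomes a plain if statement in the parsing method. The sketch below recognises the left-factored statement grammar under some simplifying assumptions that are not from the slides: ⟨expr⟩ is abstracted to a single token kind "EXPR", "other" to "OTHER", token kinds are plain strings, and the class name IfStmtRecogniser is hypothetical. Taking the else branch whenever ELSE is the lookahead attaches each ELSE to the nearest unmatched IF.

```java
import java.util.*;

/** Recogniser for: stmt -> IF "(" EXPR ")" stmt ( ELSE stmt )? | OTHER */
final class IfStmtRecogniser {
    private final List<String> tokens;
    private int pos = 0;
    private String currentToken;                    // lookahead; "$" marks end of input

    IfStmtRecogniser(List<String> tokens) {
        this.tokens = tokens;
        currentToken = nextToken();
    }

    private String nextToken() { return pos < tokens.size() ? tokens.get(pos++) : "$"; }

    private void match(String expected) {
        if (!currentToken.equals(expected))
            throw new IllegalStateException(
                "\"" + expected + "\" expected but \"" + currentToken + "\" found");
        currentToken = nextToken();
    }

    private void parseStmt() {
        switch (currentToken) {
            case "IF":
                match("IF"); match("("); match("EXPR"); match(")");
                parseStmt();
                if (currentToken.equals("ELSE")) {  // ( ELSE stmt )?
                    match("ELSE");
                    parseStmt();
                }
                break;
            case "OTHER":
                match("OTHER");
                break;
            default:
                throw new IllegalStateException("unexpected \"" + currentToken + "\"");
        }
    }

    /** True if the tokens form exactly one statement followed by end of input. */
    static boolean accepts(String... tokens) {
        try {
            IfStmtRecogniser p = new IfStmtRecogniser(Arrays.asList(tokens));
            p.parseStmt();
            p.match("$");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```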

COMP3131/9102 Page 255 March 22, 2010

Coding parse αi in the Presence of Regular Operators


• Suppose αi = a(A)∗ (B)+ b(C)?, where each of A, B and C is a
nonterminal or a sequence of terminals and nonterminals
• parse αi is implemented as:
  match("a");
  while (currentToken.kind is in First(A))
    parse A;
  do {
    parse B;
  } while (currentToken.kind is in First(B));
  match("b");
  if (currentToken.kind is in First(C))
    parse C;
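
As a concrete instance of the while-loop scheme, here is a sketch of a recogniser for Grammar 3 (Slide 253), where each ( ... )∗ group becomes a loop guarded by the First set of the group. It is not the course's Recogniser.java: the class name Grammar3Recogniser, the string token kinds, the pre-scanned token list and the exception type are assumptions made for this example.

```java
import java.util.*;

/** Recogniser for Grammar 3, with each ( ... )* written as a while loop. */
final class Grammar3Recogniser {
    private final List<String> tokens;   // token kinds, already scanned
    private int pos = 0;
    private String currentToken;         // the lookahead token; "$" marks end of input

    Grammar3Recogniser(List<String> tokens) {
        this.tokens = tokens;
        currentToken = nextToken();
    }

    private String nextToken() { return pos < tokens.size() ? tokens.get(pos++) : "$"; }

    private void match(String expected) {
        if (!currentToken.equals(expected))
            throw new IllegalStateException(
                "\"" + expected + "\" expected but \"" + currentToken + "\" found");
        currentToken = nextToken();
    }

    // E -> T ( "+" T | "-" T )*
    private void parseE() {
        parseT();
        while (currentToken.equals("+") || currentToken.equals("-")) {
            match(currentToken);         // consume the operator that was just tested
            parseT();
        }
    }

    // T -> F ( "*" F | "/" F )*
    private void parseT() {
        parseF();
        while (currentToken.equals("*") || currentToken.equals("/")) {
            match(currentToken);
            parseF();
        }
    }

    // F -> INT | "(" E ")"
    private void parseF() {
        if (currentToken.equals("INT")) {
            match("INT");
        } else {
            match("(");                  // default alternative; match reports the error otherwise
            parseE();
            match(")");
        }
    }

    /** True if the tokens form one complete expression followed by end of input. */
    static boolean accepts(String... tokens) {
        try {
            Grammar3Recogniser p = new Grammar3Recogniser(Arrays.asList(tokens));
            p.parseE();
            p.match("$");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Compared with the recursive methods needed for Q and R in Grammar 2, the loops avoid the extra nonterminals altogether.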
COMP3131/9102 Page 256 March 22, 2010

LL(k) Grammar and Parsing

• A grammar is LL(k) if it can be parsed deterministically
using k tokens of lookahead
• A formal definition for LL(k) grammars can be found in
Grune and Jacobs’ book (cs3131/9102 on-line resources)
• Grammar 1 in Slide 220 is not LL(k) for any k!
• However, Grammar 2 in Slide 220 is LL(1)
• Only an understanding of LL(1) is required this year

COMP3131/9102 Page 257 March 22, 2010

Assignment 2

• A subset of VC is already implemented for you
• For expressions, you need to eliminate left-recursion on
several nonterminals as illustrated in Slide 253
• You also need to eliminate some common prefixes (e.g.,
one for ⟨primary-expr⟩) as illustrated in Slide 255.
• A simple left-factoring can fix the LL(2) construct:
⟨prog⟩ → ( ⟨func-decl⟩ | ⟨var-decl⟩ )∗
• Everything else should be quite straightforward

COMP3131/9102 Page 258 March 22, 2010

Reading
• Chapter 2
• Red Dragon: Pages 176 – 178 and 188 – 190
• Purple Dragon: §4.3.3 – 4.3.4 and Pages 217 – 226

Week 5:
• Pages 190 – 191 (Red) and Pages 226 – 228 (Purple)
• More on recursive-descent
• LL(1) table-driven parsing
• Error recovery

COMP3131/9102 Page 259 March 22, 2010