
Compiler Construction

Course Code: CSIT-21403


MCS (4th Semester)

Sajeel Zulfiqar
Lecture Plan
Week 6:
Token Recognition, Transition Diagrams, and DFAs

Transition Diagrams, Examples,
Implementation of Transition Diagrams,
DFAs.
Recognition of tokens
• The starting point is the language grammar, which shows which
tokens must be recognized:
stmt -> if expr then stmt
| if expr then stmt else stmt

expr -> term relop term
| term
term -> id
| number
Recognition of tokens (cont.)
• The next step is to formalize the patterns:
digit -> [0-9]
digits -> digit+
number -> digits (. digits)? (E [+-]? digits)?
letter -> [A-Za-z_]
id -> letter (letter | digit)*
if -> if
then -> then
else -> else
relop -> < | > | <= | >= | = | <>
• We also need to handle whitespace:
ws -> (blank | tab | newline)+
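The patterns above can be sketched as small hand-written matchers. This is only an illustration, not part of the lecture: the function names (match_ws, match_id, match_number) are hypothetical, each returns the length of the longest matching prefix (0 if none), and the number pattern is read as digits (. digits)? (E [+-]? digits)?.

```c
#include <ctype.h>
#include <stddef.h>

/* letter -> [A-Za-z_] */
static int is_letter(int c) { return isalpha(c) || c == '_'; }

/* digits -> digit+ ; length of the leading run of digits (0 if none) */
static size_t match_digits(const char *s) {
    size_t i = 0;
    while (isdigit((unsigned char)s[i])) i++;
    return i;
}

/* ws -> (blank | tab | newline)+ */
size_t match_ws(const char *s) {
    size_t i = 0;
    while (s[i] == ' ' || s[i] == '\t' || s[i] == '\n') i++;
    return i;
}

/* id -> letter (letter | digit)* */
size_t match_id(const char *s) {
    size_t i = 1;
    if (!is_letter((unsigned char)s[0])) return 0;
    while (is_letter((unsigned char)s[i]) || isdigit((unsigned char)s[i])) i++;
    return i;
}

/* number -> digits (. digits)? (E [+-]? digits)? */
size_t match_number(const char *s) {
    size_t n = match_digits(s);
    if (n == 0) return 0;
    if (s[n] == '.') {                      /* optional fraction */
        size_t f = match_digits(s + n + 1);
        if (f > 0) n += 1 + f;
    }
    if (s[n] == 'E') {                      /* optional exponent */
        size_t k = n + 1;
        if (s[k] == '+' || s[k] == '-') k++;
        size_t e = match_digits(s + k);
        if (e > 0) n = k + e;               /* accept only a complete exponent */
    }
    return n;
}
```

For instance, match_number("12.") consumes only "12": the optional fraction needs at least one digit after the dot, so the dot is left for the next token.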
Transition Diagram
Patterns are converted into stylized flowcharts, called
"transition diagrams."

They are a graphical way to represent how the scanner works.


Transition Diagram (cont.)

Transition diagrams are obtained by converting regular-expression patterns.

A transition diagram has a collection of nodes (circles), called states.
Each state represents a condition that could occur
during the process of scanning.

Edges are directed from one state of the transition diagram to another.
Each edge is labeled by a symbol or a set of symbols.
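One possible way to hold such a diagram in a program is a list of edges, each with a source state, a label (a set of symbols, written here as a predicate), and a target state. The sketch below is an assumption for illustration only: the names Edge, run_diagram, and scan_identifier are hypothetical, and the two-state identifier diagram is a simplified stand-in for the diagrams on the slides.

```c
#include <ctype.h>
#include <stddef.h>

/* An edge label is a set of symbols, expressed as a predicate on a character. */
typedef int (*Label)(int c);

typedef struct {
    int   from;   /* state the edge leaves */
    Label label;  /* symbols that label the edge */
    int   to;     /* state the edge enters */
} Edge;

static int letter(int c)          { return isalpha(c) || c == '_'; }
static int letter_or_digit(int c) { return isalnum(c) || c == '_'; }

/* Hypothetical diagram for identifiers:
   0 --letter--> 1,  1 --letter|digit--> 1, with state 1 accepting. */
static const Edge id_edges[] = {
    { 0, letter,          1 },
    { 1, letter_or_digit, 1 },
};

/* Start in state 0 and follow matching edges for as long as possible.
   Returns the number of characters consumed if the walk stops in the
   accepting state, otherwise 0. */
size_t run_diagram(const Edge *edges, size_t n_edges, int accepting,
                   const char *input) {
    int state = 0;
    size_t i = 0;
    int moved = 1;
    while (moved && input[i] != '\0') {
        unsigned char c = (unsigned char)input[i];
        moved = 0;
        for (size_t e = 0; e < n_edges; e++) {
            if (edges[e].from == state && edges[e].label(c)) {
                state = edges[e].to;   /* take the edge and consume c */
                i++;
                moved = 1;
                break;
            }
        }
    }
    return state == accepting ? i : 0;
}

size_t scan_identifier(const char *input) {
    return run_diagram(id_edges, sizeof id_edges / sizeof id_edges[0], 1, input);
}
```

With this layout, scan_identifier("x1+y") stops at '+' in state 1 and reports a lexeme of length 2.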
Transition diagrams (cont.)
• Transition diagram for reserved words and identifiers
Transition diagrams (cont.)
• Transition diagram for whitespace
Transition diagrams (cont.)
• Transition diagram for unsigned numbers
Transition Diagrams for >=

• Start state: state 0 in the above example.
• If the input character is >, go to state 6.
• "other" refers to any character that is not
indicated by any of the other edges leaving state s.
Transition Diagrams for Relational Operators

token attribute-value
Architecture of a transition-diagram-based
lexical analyzer
TOKEN getRelop()
{
    TOKEN retToken = new(RELOP);
    while (1) { /* repeat character processing until a
                   return or failure occurs */
        switch (state) {
        case 0: c = nextchar();
                if (c == '<') state = 1;
                else if (c == '=') state = 5;
                else if (c == '>') state = 6;
                else fail(); /* lexeme is not a relop */
                break;
        case 1: …

        case 8: retract(); /* give back the extra character that was read */
                retToken.attribute = GT;
                return(retToken);
        }
    }
}
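A self-contained version of the same idea might look like the sketch below. The slide shows only cases 0 and 8; the remaining states here follow the usual textbook relop diagram (1 after <, 2 for <=, 3 for <>, 4 for <, 5 for =, 6 after >, 7 for >=), which is an assumption. The Input cursor, next_char, NOT_RELOP, and relop_at are hypothetical helpers standing in for the analyzer's real buffer handling and fail() routine.

```c
#include <stddef.h>

typedef enum { LT, LE, EQ, NE, GT, GE, NOT_RELOP } RelopAttr;

/* Hypothetical input cursor standing in for nextchar()/retract(). */
typedef struct { const char *text; size_t pos; } Input;

static int  next_char(Input *in) { return (unsigned char)in->text[in->pos++]; }
static void retract(Input *in)   { in->pos--; }

RelopAttr get_relop(Input *in) {
    size_t start = in->pos;
    int state = 0;
    for (;;) {
        int c;
        switch (state) {
        case 0:
            c = next_char(in);
            if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else { in->pos = start; return NOT_RELOP; } /* plays the role of fail() */
            break;
        case 1:
            c = next_char(in);
            if (c == '=') state = 2;
            else if (c == '>') state = 3;
            else state = 4;                 /* "other" edge */
            break;
        case 2: return LE;
        case 3: return NE;
        case 4: retract(in); return LT;     /* give back the extra character */
        case 5: return EQ;
        case 6:
            c = next_char(in);
            state = (c == '=') ? 7 : 8;     /* 8 is the "other" edge */
            break;
        case 7: return GE;
        case 8: retract(in); return GT;
        }
    }
}

/* Convenience wrapper: classify the relop at the start of s. */
RelopAttr relop_at(const char *s) {
    Input in = { s, 0 };
    return get_relop(&in);
}
```

States 4 and 8 are the ones reached on an "other" edge, so they retract one character before returning; after scanning ">5" the cursor sits just past the ">".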
Transition Diagrams for Identifiers and
Keywords

• gettoken(): looks up the lexeme in the symbol table and
returns its token (id, if, then, …).
• install_id(): returns 0 if the lexeme is a keyword, or a pointer
to the symbol table entry if it is an id.
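The two routines can be sketched with a tiny symbol table that is pre-loaded with the reserved words. This is a minimal sketch under stated assumptions: the Entry layout, the Token names, the fixed-size array, and the linear lookup are all hypothetical choices made for brevity (a real analyzer would more likely use a hash table and copy the lexeme).

```c
#include <stddef.h>
#include <string.h>

typedef enum { TOK_ID, TOK_IF, TOK_THEN, TOK_ELSE } Token;

typedef struct { const char *lexeme; Token token; } Entry;

#define MAX_SYMBOLS 64

/* Symbol table, initialized with the reserved words. */
static Entry symtab[MAX_SYMBOLS] = {
    { "if", TOK_IF }, { "then", TOK_THEN }, { "else", TOK_ELSE },
};
static size_t n_symbols = 3;

static Entry *lookup(const char *lexeme) {
    for (size_t i = 0; i < n_symbols; i++)
        if (strcmp(symtab[i].lexeme, lexeme) == 0) return &symtab[i];
    return NULL;
}

/* Returns the keyword's token if the lexeme is reserved, otherwise TOK_ID. */
Token gettoken(const char *lexeme) {
    Entry *e = lookup(lexeme);
    return e ? e->token : TOK_ID;
}

/* Returns NULL (0) for a keyword; otherwise a pointer to the identifier's
   entry, inserting it first if it is not yet in the table. */
Entry *install_id(const char *lexeme) {
    Entry *e = lookup(lexeme);
    if (e) return e->token == TOK_ID ? e : NULL;
    if (n_symbols == MAX_SYMBOLS) return NULL;  /* table full; sketch only */
    symtab[n_symbols].lexeme = lexeme;          /* caller keeps the string alive */
    symtab[n_symbols].token  = TOK_ID;
    return &symtab[n_symbols++];
}
```

Keeping keywords in the same table as identifiers means one lookup settles both questions, so the identifier diagram needs no separate edges for if, then, and else.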
Finite Automata
• This part covers how Lex turns its input program into a lexical
analyzer.
• At the heart of that transition is the formalism known as
finite automata.
• Finite automata are like transition diagrams, with a few differences.
• They are recognizers: they simply say "yes" or "no" about each
possible input string.
Finite Automata (cont.)
Finite automata come in two types:

a) Nondeterministic finite automata (NFA)

b) Deterministic finite automata (DFA)

Both are capable of recognizing the same languages (the regular languages).


Nondeterministic Finite Automata

NFAs have no restrictions on the labels of their
edges. A symbol can label several edges out of the same
state, and ε, the empty string, is a possible label.
NFA (cont.)
An NFA consists of:
1. A finite set of states S.
2. A set of input symbols Σ, the input alphabet. (We
assume that ε, which stands for the empty string, is never a member of
Σ.)
3. A transition function that gives, for each state and for each
symbol in Σ ∪ {ε}, a set of next states.
4. A state s0 from S that is distinguished as the start state (or initial state).
5. A set of states F, a subset of S, that is distinguished as the
accepting states (or final states).
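The five components map naturally onto a few tables. The sketch below simulates an NFA by tracking the set of states it could be in, stored as a bit mask. The particular machine is an assumption chosen for illustration: a five-state NFA for aa*|bb* with ε-edges from state 0 to states 1 and 3. The names StateSet, move_on, eps, and nfa_accepts are hypothetical.

```c
#include <stddef.h>

enum { N_STATES = 5, N_SYMBOLS = 2 };  /* S = {0..4}, alphabet = {a, b} */
typedef unsigned StateSet;             /* bit i is set when state i is in the set */

/* Transition function on input symbols: move_on[s][x] is a SET of next
   states (x = 0 for 'a', 1 for 'b'). */
static const StateSet move_on[N_STATES][N_SYMBOLS] = {
    /* 0 */ { 0,       0       },
    /* 1 */ { 1u << 2, 0       },
    /* 2 */ { 1u << 2, 0       },
    /* 3 */ { 0,       1u << 4 },
    /* 4 */ { 0,       1u << 4 },
};
/* Transition function on the empty string: 0 --eps--> 1 and 0 --eps--> 3. */
static const StateSet eps[N_STATES] = { (1u << 1) | (1u << 3), 0, 0, 0, 0 };
/* F = {2, 4}; the start state s0 is state 0. */
static const StateSet accepting = (1u << 2) | (1u << 4);

/* Add every state reachable through eps-edges alone. */
static StateSet eps_closure(StateSet set) {
    StateSet prev;
    do {
        prev = set;
        for (int s = 0; s < N_STATES; s++)
            if (set & (1u << s)) set |= eps[s];
    } while (set != prev);
    return set;
}

/* Returns 1 if some path through the NFA spells out input and ends in F. */
int nfa_accepts(const char *input) {
    StateSet current = eps_closure(1u << 0);
    for (size_t i = 0; input[i] != '\0'; i++) {
        int x;
        if (input[i] == 'a') x = 0;
        else if (input[i] == 'b') x = 1;
        else return 0;                    /* symbol outside the alphabet */
        StateSet next = 0;
        for (int s = 0; s < N_STATES; s++)
            if (current & (1u << s)) next |= move_on[s][x];
        current = eps_closure(next);
    }
    return (current & accepting) != 0;
}
```

Tracking a set of states avoids guessing which edge to follow: every possible branch is carried forward at once, and the string is accepted if any branch ends in F.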
Deterministic Finite Automata

DFAs have, for each state and for each
symbol of the input alphabet, exactly one
edge with that symbol leaving that state.
Deterministic Finite Automata

Accepting aa*|bb*
DFA (cont.)

A DFA is a special case of an NFA in which:

1) There are no moves on input ε, and

2) For each state s and input symbol a, there is exactly one
edge out of s labeled a.
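Because every state has exactly one edge per symbol, a DFA can be run with a plain two-dimensional table and a single current state. The sketch below is an assumed four-state DFA for aa*|bb*; the state numbering, the explicit dead state, and the names delta, is_accepting, and dfa_accepts are illustrative choices, not taken from the slide's figure.

```c
#include <stddef.h>

/* States: 0 = start, 1 = one or more a's, 2 = one or more b's, 3 = dead. */
enum { DEAD = 3 };

/* delta[state][symbol], with symbol 0 = 'a' and 1 = 'b'.
   Every state has exactly one entry per symbol, as a DFA requires. */
static const int delta[4][2] = {
    /* 0 */ { 1,    2    },
    /* 1 */ { 1,    DEAD },
    /* 2 */ { DEAD, 2    },
    /* 3 */ { DEAD, DEAD },
};
static const int is_accepting[4] = { 0, 1, 1, 0 };

/* Returns 1 if the DFA ends in an accepting state after reading input. */
int dfa_accepts(const char *input) {
    int state = 0;
    for (size_t i = 0; input[i] != '\0'; i++) {
        if (input[i] != 'a' && input[i] != 'b') return 0; /* not in the alphabet */
        state = delta[state][input[i] == 'b'];
    }
    return is_accepting[state];
}
```

The dead state makes the table total: a string such as "ab" falls into state 3 and can never leave it, so it is rejected without any search.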
DFA vs. NFA

DFA: For every symbol of the alphabet, each state has exactly one transition.
NFA: A transition need not be specified for every symbol; a state may have zero, one, or several transitions on the same symbol.

DFA: Cannot use empty-string (ε) transitions.
NFA: Can use empty-string (ε) transitions.

DFA: Can be understood as one machine.
NFA: Can be understood as multiple little machines computing at the same time.

DFA: Rejects the string if it ends in a state other than an accepting state.
NFA: Rejects the string only if all of its branches die or reject the string.
DFA vs. NFA (cont.)

DFA: No backtracking is needed, because each state has exactly one move per input symbol.
NFA: A simulation that follows one choice at a time may have to backtrack when a branch dies.

DFA: More difficult to construct.
NFA: Easier to construct.
