
Scanner

Lexical Analyzer
Part Two
Lecture Three
Parts of a Compiler
Source → Front End → 'Middle End' → Back End → Target

Front End:    chars → Scan → tokens → Parse → AST → Semantics → IR
'Middle End': IR → Optimize → IR
Back End:     IR → Select Instructions → IR → Allocate Registers → IR → Emit → Machine Code

AST = Abstract Syntax Tree
IR = Intermediate Representation

Jim Hogg – Washington University - CSE P501


Structure of a Compiler
Front end: analyze
• Read source program; understand its structure and meaning
• Specific to the source language used

Back end: synthesize
• Generate equivalent target language program
• Mostly unaware of the source language used

Source → Front End → 'Middle End' → Back End → Target




Finite Automata (FA)
A finite automaton is a 5-tuple (S, Σ, δ, s0, F) where:
S is a finite set of states.
Σ is a finite set of symbols, the input alphabet.
δ, the transition function, is a mapping from S × Σ → set of states (in a DFA, each such set holds at most one state).
s0 ∈ S is the start state.
F ⊆ S is the set of accepting (or final) states.
Finite Automata
• States are nodes, transitions are directed labeled edges,
some states are marked as final, one state is marked as
starting.

The mapping δ of an FA can be represented in a transition table.

[State diagram: 0 --a--> 1 --b--> 2 --b--> 3, with self-loops on state 0 for a and b]

S = {0,1,2,3}
Σ = {a,b}
s0 = 0
F = {3}

δ(0,a) = {0,1}
δ(0,b) = {0}
δ(1,b) = {2}
δ(2,b) = {3}

State    Input a    Input b
  0      {0, 1}     {0}
  1                 {2}
  2                 {3}
 *3

(* marks the final state)
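The transition table above can be run directly. The following is a minimal sketch, assuming the table is stored as a nested dictionary; the names DELTA, START, FINAL and nfa_accepts are illustrative and do not come from the slides. Because δ returns a set of states, the simulation tracks every state the machine could be in after each symbol.

```python
# Transition table from the slide: DELTA[state][symbol] -> set of next states.
# Missing entries mean "no transition" (the empty set).
DELTA = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}
START = 0
FINAL = {3}


def nfa_accepts(text):
    """Return True if some path through the FA ends in a final state."""
    current = {START}
    for symbol in text:
        # Follow every possible transition from every state we might be in.
        current = {
            nxt
            for state in current
            for nxt in DELTA[state].get(symbol, set())
        }
    return bool(current & FINAL)


print(nfa_accepts("abb"))   # True
print(nfa_accepts("abba"))  # False
```

With this table the machine accepts exactly the strings over {a, b} that end in abb.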
DFA Example (2)
Accepting Strings Ending in “ing”.
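States: S0 (start), i, in, ing (accepting)

Transitions:
  S0  --i-->          i
  S0  --not i-->      S0
  i   --n-->          in
  i   --i-->          i
  i   --not i or n--> S0
  in  --g-->          ing
  in  --i-->          i
  in  --not i or g--> S0
  ing --i-->          i
  ing --not i-->      S0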
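As an illustration, the "ing" DFA can be coded with one case analysis on the current state and the input character. This is a sketch only; the function names step and ends_in_ing are made up for the example, and matching is assumed to be case-sensitive.

```python
def step(state, ch):
    """One DFA transition; states are named as in the diagram."""
    if ch == "i":
        return "i"                      # every state moves to "i" on an i
    if state == "i" and ch == "n":
        return "in"
    if state == "in" and ch == "g":
        return "ing"
    return "S0"                         # any other character starts over


def ends_in_ing(text):
    state = "S0"
    for ch in text:
        state = step(state, ch)
    return state == "ing"               # "ing" is the only accepting state


print(ends_in_ing("scanning"))  # True
print(ends_in_ing("singer"))    # False
```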
CS510 Spring 2015 Shahira Azazy 7
DFA Example (3)
Rejecting Strings containing 11

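States: S0 (start), S1, S2

Transitions:
  S0 --0-->   S0
  S0 --1-->   S1
  S1 --1-->   S2
  S2 --0,1--> S2
  S1 --0-->   S0   (implied by the language; this label is not legible in the figure)

S2 is a trap state: once 11 has been seen, the string is rejected.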



NFA vs DFA
An NFA has either:
• Multiple transitions from one state on the same input, or
• ε-moves
A string is accepted by an NFA if there exists a sequence of transitions that leads from the start state to some final state.

DFA: S0 --a--> S1
NFA: S0 --ε--> S1
NFA: S0 --a--> S1 and S0 --a--> S2
Finite Automata (Example)
1) Design an FSM that accepts RE = [a-z]+|[0-9]+
2) Match RE = [a-z]+|[0-9]+ with “Abcd 2004”
3) Which of the following strings are accepted by the FSM?
a. a
b. Abcd 2004
c. Abcd 2004
d. 20000007
e. abcd2004

Answers:
1) S0 --a-z--> S1, S1 --a-z--> S1; S0 --0-9--> S2, S2 --0-9--> S2
2) {bcd, 2004}
3) a, d
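The answers can be checked with Python's standard re module. This is a sketch; the helper names matches and accepted are made up here. findall reports every substring the RE matches (question 2), while fullmatch asks whether the whole string is accepted (question 3).

```python
import re

TOKEN_RE = re.compile(r"[a-z]+|[0-9]+")


def matches(text):
    """All non-overlapping substrings matched by the RE, left to right."""
    return TOKEN_RE.findall(text)


def accepted(text):
    """True only if the entire string is matched, as the FSM requires."""
    return TOKEN_RE.fullmatch(text) is not None


print(matches("Abcd 2004"))  # ['bcd', '2004']
for s in ["a", "Abcd 2004", "20000007", "abcd2004"]:
    print(repr(s), accepted(s))
```

The uppercase A and the space fall outside both character classes, which is why only bcd and 2004 are matched, and abcd2004 is rejected because the RE allows letters or digits but not a mix.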
Finite Automata (Example)

1) Define the following FSM
2) Define the language accepted by this machine
3) Is the machine a DFA or an NFA?

Answers:
1) The FA is a 5-tuple (S, Σ, δ, s0, F) where:
1. S = {q0,q1,q2,q3,q4}
2. Σ = {0,1}
3. s0 = q0
4. F = {q2,q4}
5. δ =

2) The language contains strings with two consecutive 0’s or two consecutive 1’s.
3) NFA, because q0 has more than one transition on the same input.
Scanner DFA Example – Part 1

start --whitespace or comments--> start (loop)
start --end of input--> 1   Accept EOF
start --"("-->          2   Accept LPAREN
start --")"-->          3   Accept RPAREN
start --";"-->          4   Accept SEMI
Scanner DFA Example – Part 2

start --"!"--> 5
5 --"="-->     6    Accept NEQ
5 --[other]--> 7    Accept NOT

start --"<"--> 8
8 --"="-->     9    Accept LEQ
8 --[other]--> 10   Accept LESS
Scanner DFA Example – Part 3

start --[0-9]--> 11
11 --[0-9]-->    11 (loop)
11 --[other]-->  12   Accept ILIT
Scanner DFA Example – Part 4

start --[a-zA-Z]-->   13
13 --[a-zA-Z0-9_]-->  13 (loop)
13 --[other]-->       14   Accept ID or keyword

• Strategies for handling identifiers vs keywords
• Hand-written scanner: look up identifier-like things in a table of keywords
• Machine-generated scanner: generate a DFA with appropriate transitions to recognize keywords
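A hand-written scanner for the DFA in Parts 1 to 4 might look like the sketch below. It follows the first strategy: identifier-like lexemes are scanned first and then looked up in a keyword table. The KEYWORDS set, the KEYWORD token kind and the function name scan are assumptions made for illustration; the slides do not list the keywords, and comment skipping is left out because the comment syntax is not given.

```python
import string

KEYWORDS = {"if", "else", "while"}   # hypothetical keyword table, not from the slides
LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)
ID_REST = LETTERS | DIGITS | {"_"}
SINGLE = {"(": "LPAREN", ")": "RPAREN", ";": "SEMI"}


def scan(text):
    """Return a list of (kind, lexeme) pairs, ending with EOF."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():                         # start state: skip whitespace
            i += 1
        elif ch in SINGLE:                       # states 2, 3, 4
            tokens.append((SINGLE[ch], ch))
            i += 1
        elif ch == "!" or ch == "<":             # states 5-10: peek for '='
            if i + 1 < n and text[i + 1] == "=":
                tokens.append(("NEQ" if ch == "!" else "LEQ", ch + "="))
                i += 2
            else:
                tokens.append(("NOT" if ch == "!" else "LESS", ch))
                i += 1
        elif ch in DIGITS:                       # states 11, 12
            j = i
            while j < n and text[j] in DIGITS:
                j += 1
            tokens.append(("ILIT", text[i:j]))
            i = j
        elif ch in LETTERS:                      # states 13, 14
            j = i
            while j < n and text[j] in ID_REST:
                j += 1
            lexeme = text[i:j]
            # Keyword check by table lookup, as in the hand-written strategy.
            tokens.append(("KEYWORD" if lexeme in KEYWORDS else "ID", lexeme))
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    tokens.append(("EOF", ""))                   # state 1
    return tokens


print(scan("while (x1 <= 10) !done;"))
```

The [other] edges in the diagrams correspond to the inner loops stopping without consuming the character that ended the token.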



Scanner Phases
The scanning process is a pattern matching process, and it can be divided into two main phases:
• Pattern specification using regular expressions
• Pattern recognition using finite automata

RE (Regular Expression)
  → NFA (Non-Deterministic Finite Automata)
  → DFA (Deterministic Finite Automata)
  → Implementation
      1- Using Doubly Nested Case Analysis
      2- Using the Transition Table (Table Driven)
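To illustrate the table-driven implementation, here is a minimal sketch that encodes only the integer-literal and identifier parts of the scanner DFA (states 11 and 13) as a lookup table. The names char_class, TABLE, ACCEPT and longest_match are illustrative, not taken from the slides. The driver loop is the same for any DFA; only the table changes.

```python
import string


def char_class(ch):
    """Map a single character to the column it selects in the table."""
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"


# (state, character class) -> next state; a missing entry means "stop".
TABLE = {
    ("start", "digit"): 11,
    (11, "digit"): 11,
    ("start", "letter"): 13,
    (13, "letter"): 13,
    (13, "digit"): 13,
    (13, "underscore"): 13,
}
ACCEPT = {11: "ILIT", 13: "ID"}


def longest_match(text, pos=0):
    """Run the DFA from pos and return (kind, lexeme), or None if no token starts there."""
    state, i = "start", pos
    while i < len(text):
        nxt = TABLE.get((state, char_class(text[i])))
        if nxt is None:          # the [other] edge: stop without consuming
            break
        state, i = nxt, i + 1
    kind = ACCEPT.get(state)
    return (kind, text[pos:i]) if kind else None


print(longest_match("count_1 = 0"))  # ('ID', 'count_1')
print(longest_match("42;"))          # ('ILIT', '42')
```

In the doubly nested case analysis, the same transitions are written as an outer case on the state and an inner case on the character instead of being stored as data.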
RE to NFA
• An NFA can be constructed from a regular expression. The ε-transitions are used to “glue together” the machines for each piece of a regular expression to form a machine that corresponds to the whole expression.
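RE to NFA (Continued)
NFA for ab: the a-machine, then an ε-transition, then the b-machine
  ○ --a--> ○ --ε--> ○ --b--> ◎

NFA for a|b: ε-transitions branch from a new start state into the a-machine and the b-machine, and ε-transitions join both to a new final state
  ○ --ε--> ○ --a--> ○ --ε--> ◎
  ○ --ε--> ○ --b--> ○ --ε--> ◎   (same start and final states as the line above)

NFA for a*: ε-transitions enter and leave the a-machine, loop back to repeat it, and bypass it for zero occurrences
  ○ --ε--> ○ --a--> ○ --ε--> ◎

* Jim Hogg: 2014, Washington University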
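The gluing step can be sketched in code. The following assumes the ε-placement of the usual Thompson-style construction, which may differ in small details from the figures; every name in it (Fragment, symbol, concat, alt, star, closure, accepts) is illustrative. Each helper returns an NFA fragment with one start state and one accepting state, and larger machines are built by adding ε-edges between fragments.

```python
import itertools

_ids = itertools.count()   # source of fresh state numbers
EPS = None                 # label used for ε-transitions


class Fragment:
    """An NFA piece: start state, accepting state, list of (src, label, dst) edges."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def symbol(ch):
    s, t = next(_ids), next(_ids)
    return Fragment(s, t, [(s, ch, t)])


def concat(f, g):
    # Glue f's accepting state to g's start state with one ε-edge.
    return Fragment(f.start, g.accept, f.edges + g.edges + [(f.accept, EPS, g.start)])


def alt(f, g):
    s, t = next(_ids), next(_ids)
    glue = [(s, EPS, f.start), (s, EPS, g.start), (f.accept, EPS, t), (g.accept, EPS, t)]
    return Fragment(s, t, f.edges + g.edges + glue)


def star(f):
    s, t = next(_ids), next(_ids)
    # Enter, leave, repeat, or skip the inner machine.
    glue = [(s, EPS, f.start), (f.accept, EPS, t), (f.accept, EPS, f.start), (s, EPS, t)]
    return Fragment(s, t, f.edges + glue)


def closure(states, edges):
    """All states reachable from `states` using only ε-edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(frag, text):
    current = closure({frag.start}, frag.edges)
    for ch in text:
        moved = {dst for src, label, dst in frag.edges if src in current and label == ch}
        current = closure(moved, frag.edges)
    return frag.accept in current


a_bc_star = concat(symbol("a"), star(alt(symbol("b"), symbol("c"))))   # a(b|c)*
ab_c_star = concat(concat(symbol("a"), symbol("b")), star(symbol("c")))  # abc*
print(accepts(a_bc_star, "abcb"), accepts(a_bc_star, "b"))    # True False
print(accepts(ab_c_star, "abccc"), accepts(ab_c_star, "ac"))  # True False
```

Scanning the edge list on every step keeps the sketch short; a real scanner generator would index edges by source state and then convert the NFA to a DFA.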
RE to NFA(Example-1)
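• NFA for a(b|c)*

[Figure: the a-machine, followed by an ε-transition into the starred alternation. Inside the loop, ε-transitions branch to the b-machine and the c-machine and rejoin, and further ε-transitions repeat or skip the loop.]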
RE to NFA(Example-2)
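• NFA for abc*

[Figure: the a-machine, an ε-transition, the b-machine, then an ε-transition into the c* machine, whose ε-transitions repeat or skip the c-machine.]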
