
9/8/10

CS240 • Language Theory and Automata • Fall 2010


Abstract Machines

Language Recognition Problems

Power

• We are interested in designing the most powerful computer, i.e., the one that can solve the widest range of language recognition problems
–  our notion of power does not say anything about how fast a computation can be done
–  it reflects a more fundamental notion of whether or not it is even possible to perform some computation in a finite number of steps

Computing Machine

[Diagram: a CPU connected to input memory, output memory, temporary memory, and program memory]

Example

[Diagram sequence: the program memory holds "Compute x*x" and "Compute x^2*x"; with x = 2 in input memory, the CPU produces f(x) = 8 in output memory]

Different Kinds of Automata

Finite Automaton

[Diagram: CPU with input memory, output memory, and program memory; the temporary memory is crossed out]

Pushdown Automaton

[Diagram: the same machine with a stack as its temporary memory]

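Turing Machine

[Diagram: the same machine with random access memory as its temporary memory, plus input memory, output memory, and program memory]

Power of Automata

[Diagram: the automata arranged from less power to more power; more powerful machines solve more computational problems]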

Finite Automaton Definition


•  Perhaps the simplest type of machine that is still interesting to study
–  Many of its important properties carry over to more complicated machines
–  To understand more complicated machines, we first study FAs
•  Captures the basic elements of an abstract machine
–  Reads in a string, and depending on the input and the way the machine was designed, it outputs true (yes) or false (no)
•  A useful practical abstraction because
–  FAs retain sufficient flexibility to perform interesting tasks
–  Hardware requirements for building them are relatively minimal
•  An important way to describe certain simple, but highly useful languages called regular languages


Operation
•  An FA is always in one of n states, which we name 0 through n-1
–  Each state is labeled true ("yes") or false ("no")
•  Begins in the start state
•  As the input characters are read in one at a time, it changes from one state to another in a pre-specified way
–  the new state is completely determined by the current state and the character just read in
•  When input is exhausted, outputs true (yes, the string is in the language) or false (no, the string is not in the language) according to the label of the state it is currently in

Transition Diagram
• Represent visually by a graph:
–  nodes = states
–  arc from q to p is labeled by the set of input symbols a such that δ(q, a) = p
–  No arc if no such a
–  Start state indicated by word "start" and an arrow
–  Accepting states get double circles

Example

[Diagram: three-state FA with states labeled 0 mod 3, 1 mod 3, and 2 mod 3, a start arrow, and arcs labeled 0 and 1, shown on the sample input 1 1 0 1]

FAs in Action
•  Used in
–  Text editors for pattern matching
–  Compilers for lexical analysis
–  Web browsers for html parsing
–  Operating systems for graphical user interfaces
•  Serve as the control unit in many physical systems, including
–  Vending machines, elevators, automatic traffic signals
–  Computer microprocessors
–  Network protocol stacks and old VCR clocks
•  Play a key role in natural language processing and machine learning

DFA.java simulates a DFA that accepts inputs with a multiple of 3 bs.
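The operation rules above (numbered states, a true/false label per state, one transition per character) can be sketched in a few lines. This is a minimal Python illustration, not the DFA.java program itself; the alphabet {a, b}, the table layout, and the names `run_dfa`, `TRANSITIONS`, and `ACCEPTING` are assumptions made for this sketch.

```python
def run_dfa(transitions, accepting, text, start=0):
    """Simulate an FA whose states are numbered 0..n-1.

    transitions[state][char] gives the next state;
    accepting[state] is the true/false label of that state.
    """
    state = start
    for ch in text:
        state = transitions[state][ch]
    return accepting[state]


# Assumed machine: state k means "the number of b's read so far is k mod 3".
TRANSITIONS = [
    {"a": 0, "b": 1},
    {"a": 1, "b": 2},
    {"a": 2, "b": 0},
]
ACCEPTING = [True, False, False]


def accepts_multiple_of_three_bs(text):
    """True when text (over the assumed alphabet {a, b}) has a multiple of 3 b's."""
    return run_dfa(TRANSITIONS, ACCEPTING, text)
```

The new state depends only on the current state and the character just read, so the whole machine fits in one small table.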


String searching FAs
•  One of the most important applications of FAs is searching for patterns in strings
–  at the heart of Web search engines like Google
•  FA providing a simplified example of such a tool over the binary alphabet {a, b}
–  Accepts all string inputs that contain the pattern aabaaabb as a substring

Another example
• CommentStripper.java
–  reads in a Java (or C++) program from standard input, removes all comments, and prints the result to standard output
–  removes /* */ and // style comments using a 5 state finite state automaton
–  Shows DFA power, but to properly strip Java comments, you would need a few more states to handle extra cases, e.g., quoted string literals like s = "/***//*"
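One way to build the string-searching FA for the pattern aabaaabb is sketched below. It is an illustrative construction, not necessarily the one used in the course: state j is assumed to mean "the longest prefix of the pattern that is a suffix of the input so far has length j", and the names `build_search_dfa` and `contains` are made up for this sketch.

```python
def build_search_dfa(pattern, alphabet):
    """Build a transition table with len(pattern) + 1 states.

    State j records that the last j characters read equal pattern[:j]
    (and no longer prefix matches). State len(pattern) is accepting.
    """
    m = len(pattern)
    dfa = []
    for j in range(m):
        row = {}
        for ch in alphabet:
            seen = pattern[:j] + ch
            # Longest prefix of the pattern that is a suffix of what was seen.
            k = min(m, len(seen))
            while k > 0 and not seen.endswith(pattern[:k]):
                k -= 1
            row[ch] = k
        dfa.append(row)
    # Once the pattern has been found, stay in the accepting state.
    dfa.append({ch: m for ch in alphabet})
    return dfa


def contains(pattern, text, alphabet="ab"):
    """True when text contains pattern; text must use only alphabet symbols."""
    dfa = build_search_dfa(pattern, alphabet)
    state = 0
    for ch in text:
        state = dfa[state][ch]
    return state == len(pattern)
```

For the eight-character pattern aabaaabb this yields nine states, and each input character is examined exactly once.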

FA for Newspaper Vending Machine

[Diagram: the FA, with a start state and arcs labeled nickel, dime, and quarter]


Picture courtesy of David Eppstein via http://www.cs.princeton.edu/introcs/



FA that recognizes simple identifiers

[Diagram: FA with a start state and arcs labeled "letter", "letter or digit", and "other character (delimiter)"]

FA for coin flipping
•  Four ways of arranging two coins, depending on which is heads (H) and which is tails (T)
–  HH, HT, TT, TH
•  Two operations:
–  Flip the first coin (a)
–  Flip the second coin (b)
•  Assume initially coins are laid out as HH
•  What are all possible ways of applying the operations so that the configuration is TT?

Conventions
• It helps if we can avoid mentioning the type of every name by following some rules:
–  Input symbols are a, b, etc., or digits.
–  Strings of input symbols are u, v, . . . , z.
–  States are q, p, etc.

Model as an FA

[Diagram: four states HH (start), TH, HT, and TT (final state); in every state, an arc labeled a flips the first coin and an arc labeled b flips the second coin]
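The coin-flipping FA can be simulated directly from its description: a flips the first coin, b flips the second, the start state is HH, and TT is the final state. The sketch below is a minimal Python rendering; the function names `step` and `reaches_tt` are illustrative only.

```python
FLIP = {"H": "T", "T": "H"}


def step(state, op):
    """Apply one operation to a two-coin configuration such as 'HH'."""
    first, second = state
    if op == "a":  # flip the first coin
        return FLIP[first] + second
    if op == "b":  # flip the second coin
        return first + FLIP[second]
    raise ValueError("unknown operation: " + repr(op))


def reaches_tt(ops, start="HH"):
    """True when applying the operations in order ends in configuration TT."""
    state = start
    for op in ops:
        state = step(state, op)
    return state == "TT"
```

Running a few strings through it suggests the answer to the question posed: the accepted strings are those with an odd number of a's and an odd number of b's.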


Determinism

[Diagram: the three-state FA with states 0 mod 3, 1 mod 3, and 2 mod 3 and arcs labeled 0 and 1]

Formal Definition of DFA
•  Finite set of states, Q.
•  Alphabet of input symbols, Σ.
•  One state is the start/initial state, q0.
•  Zero or more final/accepting states, the set is F.
•  A transition function, δ. This function:
–  Takes a state and input symbol as arguments.
–  Returns a state.
–  One "rule" of δ would be written δ(q, a) = p, where q and p are states, and a is an input symbol.
–  Intuitively: if the FA is in state q, and input a is received, then the FA goes to state p (note: q = p OK).

An FA is represented as the five-tuple: A = (Q, Σ, δ, q0, F).

Example: Clamping Logic
•  We may think of an accepting state as representing a "1" output and non-accepting states as representing "0" output.
•  A "clamping" circuit waits for a 1 input, and forever after makes a 1 output. However, to avoid clamping on spurious noise, we'll design an FA that waits for two 1's in a row, and "clamps" only then.
•  In general, we may think of a state as representing a summary of the history of what has been seen on the input so far.
•  The states we need are:
–  State q0, the start state, says that the most recent input (if there was one) was not a 1, and we have never seen two 1's in a row.
–  State q1 says we have never seen 11, but the previous input was 1.
–  State q2 is the only accepting state, it says that we have at some time seen 11.
–  Thus, A = ({q0, q1, q2}, {0, 1}, δ, q0, {q2}), where δ is given by:

          0     1
  >q0     q0    q1
   q1     q0    q2
  *q2     q2    q2

By marking the start state with > and accepting states with *, the transition table that defines δ also specifies the entire FA.
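The clamping FA's transition table translates directly into a lookup table. The sketch below, in Python, reports the 0/1 output after each input symbol, following the convention that accepting states represent a "1" output; the names `DELTA` and `clamp_outputs` are illustrative.

```python
# Transition table of the clamping FA, one entry per (state, input) pair.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
START = "q0"
FINAL = {"q2"}


def clamp_outputs(bits):
    """Return the circuit's output (as a string of 0s and 1s) after each input bit."""
    state = START
    outputs = []
    for bit in bits:
        state = DELTA[(state, bit)]
        outputs.append("1" if state in FINAL else "0")
    return "".join(outputs)
```

On the input 0101101 the isolated 1's are ignored, and the output switches to 1 only after the first 11 and stays there.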


Transition Graph

[Diagram: transition graph of an FA with a start state, arcs labeled 0 and 1, and a self-loop labeled 0,1]

Extension of δ to Paths

Intuitively, an FA accepts a string w = a1a2…an if there is a path in the transition diagram that:
1.  Begins at the start state,
2.  Ends at an accepting state, and
3.  Has sequence of labels a1, a2, …, an.

Formally, we extend transition function δ to δ̂(q, w), where w can be any string of input symbols:
–  Basis: δ̂(q, ε) = q (i.e., on no input, the FA doesn't go anywhere).
–  Induction: δ̂(q, wa) = δ(δ̂(q, w), a), where w is a string, and a a single symbol (i.e., see where the FA goes on w, then look for the transition on the last symbol from that state).

Important fact with a straightforward, inductive proof: δ̂ really represents paths. That is, if w = a1a2…an, and δ(pi, ai+1) = pi+1 for all i = 0, 1, . . . , n-1, then δ̂(p0, w) = pn.

• Acceptance of Strings
An FA A = (Q, Σ, δ, q0, F) accepts string w if δ̂(q0, w) ∈ F

• Language of a FA
–  FA A accepts the language L(A) = {w | δ̂(q0, w) ∈ F}
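The basis and induction cases of the extended transition function map one-to-one onto a recursive function. The sketch below is a minimal Python version; the two-state parity machine `PARITY` is a hypothetical example that does not appear in the slides, and the names `delta_hat` and `accepts` are illustrative.

```python
def delta_hat(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    if w == "":
        return q  # basis: on no input, the FA does not go anywhere
    # induction: see where the FA goes on all but the last symbol,
    # then take the transition on the last symbol from that state
    return delta[(delta_hat(delta, q, w[:-1]), w[-1])]


def accepts(delta, start, final, w):
    """The FA accepts w exactly when delta_hat(start, w) is a final state."""
    return delta_hat(delta, start, w) in final


# Hypothetical two-state DFA (not from the slides): tracks the parity of 1's.
PARITY = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

The recursion peels symbols off the right end of w, mirroring the form δ̂(q, wa) in the definition; an iterative left-to-right loop computes the same state.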


Type Errors
•  A major source of confusion when dealing with automata (or mathematics in general) is making "type errors."
–  Example: Don't confuse A, a FA, i.e., a program, with L(A), which is of type "set of strings."
–  Example: the start state q0 is of type "state," but the accepting states F is of type "set of states."
–  Trickier example: Is a a symbol or a string of length 1?
Answer: it depends on the context, e.g., is it used in δ(q, a), where it is a symbol, or δ̂(q, a), where it is a string?

Non-deterministic Finite Automata
• Allow (deterministic) FA to have a choice of 0 or more next states for each state-input pair.
• Important tool for designing string processors, e.g., grep, lexical analyzers.
• But "imaginary," in the sense that it has to be implemented deterministically.

Example
•  Design an NFA to accept strings over alphabet {1, 2, 3} such that the last symbol appears previously, without any intervening higher symbol, e.g.,
•  … 11
•  … 21112
•  … 312123
–  Trick: use start state to mean "I guess I haven't seen the symbol that matches the ending symbol yet."
–  Three other states represent a guess that the matching symbol has been seen, and remembers what that symbol is.

[Diagram: NFA with states p, q, r, s, and t, arcs labeled 1, 2, and 3, and self-loops labeled 1,2,3 and 1,2]
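An NFA like this one can be run deterministically by tracking the set of states it could be in. The Python sketch below rebuilds the transitions from the "trick" described above; the assignment of roles to state names (p as the start state, q, r, s remembering a guessed 1, 2, 3, and t accepting) is an assumption about the diagram, not something the text states.

```python
# Assumed transitions: p keeps guessing, q/r/s remember a guessed 1/2/3
# and tolerate only lower symbols, t is reached when the match arrives.
NFA_DELTA = {
    ("p", "1"): {"p", "q"},
    ("p", "2"): {"p", "r"},
    ("p", "3"): {"p", "s"},
    ("q", "1"): {"t"},
    ("r", "1"): {"r"},
    ("r", "2"): {"t"},
    ("s", "1"): {"s"},
    ("s", "2"): {"s"},
    ("s", "3"): {"t"},
}


def nfa_accepts(w, delta=NFA_DELTA, start="p", final=frozenset({"t"})):
    """Simulate the NFA by keeping the set of all states reachable on w."""
    current = {start}
    for symbol in w:
        nxt = set()
        for state in current:
            # A missing entry means "no next state" for that pair.
            nxt |= delta.get((state, symbol), set())
        current = nxt
    return bool(current & final)
```

Because t has no outgoing arcs, a guess made too early simply dies, while another branch of the computation keeps waiting in p.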


Formal NFA
•  N = (Q, Σ, δ, q0, F) where all is as DFA, but:
–  δ(q, a) is a set of states, rather than a single state.
•  Extension to δ̂
–  Basis: δ̂(q, ε) = {q}.
–  Induction: Let:
•  δ̂(q, w) = {p1, p2, …, pk}.
•  δ(pi, a) = Si for i = 1, 2, …, k.
•  Then δ̂(q, wa) = S1 ∪ S2 ∪ … ∪ Sk.
•  Language of an NFA
–  An NFA accepts w if any path from the start state to an accepting state is labeled w. Formally:
L(N) = {w | δ̂(q0, w) ∩ F ≠ ∅}.

Example
Here is a DFA that accepts a language L consisting of all strings over Σ = {a, b} that begin with either aa or bb

[Diagram: DFA with states 0 through 5 and arcs labeled a, b, and a,b]

Suppose we want to make an automaton to recognize REV(L), the language of all strings that end in aa or bb

Easy solution: reverse all transitions and interchange start and final states:

[Diagram: the previous automaton with every arc reversed and the start and final states interchanged]

But this is not a DFA!
–  More than one start state
–  More than one transition labeled with the same symbol
This is an NFA

NFAs and DFAs
• It might seem that because there is a degree of choice available in an NFA that it might be more powerful than a DFA
–  That is, NFAs might be able to recognize languages a DFA could not
• But this is not the case!


Equivalence of NFAs and DFAs
• NFAs and DFAs recognize the same class of languages
• A bit surprising: NFAs seem more powerful
• Useful: easier to specify an NFA for a language, convert to DFA

Theorem
Every non-deterministic finite automaton has an equivalent deterministic finite automaton

Two machines are equivalent if they recognize the same language

Proof Idea
• If a language is recognized by an NFA, show the existence of a DFA that also recognizes it
• Convert NFA to an equivalent DFA that simulates the NFA
–  Proof by construction
• Intuitively, can simulate the NFA by keeping track of all the states you can get to on a given input

Proof
•  Let N = (Q, Σ, δ, q0, F) be an NFA recognizing some language A
•  Construct a DFA recognizing A
–  M = (Q', Σ, δ', q0', F')
1.  Q' = the set of subsets of Q
2.  For R ∈ Q' and a ∈ Σ let δ'(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}
–  If R is a state of M, it is also a set of states of N (because of 1 above). When M reads a symbol a in a state R, it shows where a takes each state in R. Because each state may go to a set of states, we take the union of all these sets. This can be written as:
δ'(R, a) = ⋃ δ(r, a), taken over all r ∈ R
3.  q0' = {q0}
•  M starts in the state corresponding to the collection containing just the start state of N.
4.  F' = {R ∈ Q' | R contains an accept state of N}.
•  The machine M accepts if one of the possible states that N could be in at this point is an accept state.


Visual version

[Diagram: NFA with states 1 through 10 and arcs labeled a and b]

Start state q0 = {1}
δ(1,a) = {2,3}
δ(1,b) = {4}
δ({2,3},a) = δ({2},a) ∪ δ({3},a) = ∅ ∪ {7,9} = {7,9}
δ({2,3},b) = δ({2},b) ∪ δ({3},b) = {5,6} ∪ {8} = {5,6,8}
δ({4},a) = ∅
δ({4},b) = {10}
δ({7,9},a) = ∅
δ({7,9},b) = ∅
δ({5,6,8},a) = ∅
δ({5,6,8},b) = ∅
δ({10},a) = ∅
δ({10},b) = ∅
Final states F' = {{5,6,8},{10}}

The DFA

[Diagram: DFA with states {1}, {2,3}, {4}, {7,9}, {5,6,8}, {10}, and a dead state for ∅, with arcs labeled a, b, and a,b]

Start state q0 = {1}
Final states F' = {{5,6,8},{10}}
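The subset construction from the proof can be checked mechanically against the values listed for this example. The Python sketch below differs from the proof in one respect: it generates only the subsets reachable from the start state rather than all subsets of Q. The NFA's own accepting states are not recoverable from the text, so {5, 6, 8, 10} is an assumption chosen to be consistent with the listed F'; the function name `subset_construction` is illustrative.

```python
def subset_construction(nfa_delta, alphabet, start, nfa_final):
    """Build the reachable part of the DFA M from an NFA N.

    Returns (states, delta, start_state, final_states), where every
    DFA state is a frozenset of NFA states.
    """
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        R = todo.pop()
        for a in alphabet:
            # delta'(R, a) is the union of delta(r, a) over all r in R.
            target = frozenset().union(
                *(nfa_delta.get((r, a), set()) for r in R)
            )
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # F' holds every reachable subset containing an accept state of N.
    dfa_final = {R for R in seen if R & nfa_final}
    return seen, dfa_delta, start_set, dfa_final


# NFA transitions as listed for the example; missing pairs mean the empty set.
EXAMPLE_DELTA = {
    (1, "a"): {2, 3}, (1, "b"): {4},
    (2, "b"): {5, 6},
    (3, "a"): {7, 9}, (3, "b"): {8},
    (4, "b"): {10},
}
ASSUMED_NFA_FINAL = {5, 6, 8, 10}  # assumption, see the note above

STATES, DELTA, START, FINAL = subset_construction(
    EXAMPLE_DELTA, "ab", 1, ASSUMED_NFA_FINAL
)
```

The empty set shows up as an ordinary DFA state that loops to itself on every symbol, matching the dead state in the diagram.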

Try this for the NFA constructed before:

[Diagram: the earlier NFA over {1, 2, 3} with states p, q, r, s, and t]

NFA With ε-Transitions
• Allow ε to be a label on arcs
–  Nothing else changes: acceptance of w is still the existence of a path from the start state to an accepting state with label w.
–  But ε can appear on arcs, and means the empty string (i.e., no visible contribution to w)
–  When an arc labeled ε is traversed, no input is consumed


Example

[Diagram: NFA-ε with states q, r, and s and arcs labeled 0, 1, and ε]

001 is accepted by the path q, s, r, q, r, s, with label 0 ε 0 1 ε = 001.

DFAs and NFA-ε's
•  ε-transitions are a convenience, but do not increase the power of FA's.
• For any NFA-ε there is an equivalent (i.e., accepts the same language) DFA
• The construction is similar to the NFA-to-DFA construction

Creating a DFA from a NFA-ε
(or, eliminating ε-transitions)

1.  Compute the ε-closure for each state (set of states reachable from that state on ε-transitions only).
2.  Start state is ε-CLOSE(q0).
3.  Compute δ for each a ∈ Σ and each set S (each of the ε-CLOSE'd sets) as follows:
  If a state p ∈ S can reach state q on input a (not ε), then add a transition on input a from S to ε-CLOSE(q).
4.  The set of final states includes those sets that contain at least one accepting state of the NFA-ε.

Example

[Diagram: the NFA-ε with states q, r, and s and arcs labeled 0, 1, and ε]

1. Compute ε-closure for all states:
ECLOSE(q) = {q}
ECLOSE(r) = {r,s}
ECLOSE(s) = {r,s}

2. Compute δ:
δ({q},0) = ECLOSE({s}) = {r,s}
δ({q},1) = ECLOSE({r}) = {r,s}
δ({r,s},0) = ECLOSE({q}) = {q}
δ({r,s},1) = ECLOSE({q}) = {q}

3. Final states FD = {{r,s}}

RESULTING DFA:

[Diagram: two states, {q} and {r,s}, with an arc labeled 0,1 in each direction]
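The ε-closure steps can be traced in code on the three-state example. In the Python sketch below, the arcs of the NFA-ε are reconstructed from the accepted path for 001 and from the listed ECLOSE and δ values, and the choice of s as the accepting state is an assumption consistent with FD = {{r,s}}. The names `eclose`, `move`, and `accepts`, and the use of the empty string as the ε label, are conventions of this sketch.

```python
EPS = ""  # label used here for ε-arcs

# Reconstructed arcs (an assumption about the diagram).
NFA_E = {
    ("q", "0"): {"s"},
    ("q", "1"): {"r"},
    ("r", "0"): {"q"},
    ("r", EPS): {"s"},
    ("s", "1"): {"q"},
    ("s", EPS): {"r"},
}


def eclose(states, delta=NFA_E):
    """All states reachable from `states` using ε-transitions only."""
    closure = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for nxt in delta.get((p, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def move(S, a, delta=NFA_E):
    """DFA transition: follow arcs labeled a (not ε) from S, then ε-close."""
    reached = set()
    for p in S:
        reached |= delta.get((p, a), set())
    return eclose(reached, delta)


def accepts(w, start="q", final=frozenset({"s"})):
    """Run the resulting DFA on w, starting from the ε-closure of the start state."""
    S = eclose({start})
    for a in w:
        S = move(S, a)
    return bool(S & final)
```

Under these assumptions the code reproduces the closures and the four δ entries of the worked example, and it accepts 001.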
