
CS240 • Language Theory and Automata • Fall 2006

Deterministic and Non-deterministic Finite Automata

Finite Automata

• A formal way to describe certain simple, but highly useful languages called regular languages.
• A graph with a finite number of nodes, called states (sometimes called "finite state machines").
• Arcs are labeled with one or more symbols from some alphabet.
• One state is designated the start state or initial state.
• Some states are final states or accepting states.
• The language of the FA is the set of strings that label paths that go from the start state to some accepting state.

Transition Diagram

• An FA can be represented by a graph:
– nodes = states
– arc from q to p is labeled by the set of input symbols a such that δ(q, a) = p.
– No arc if no transition from state q on input a.
– Start state indicated by an arrow coming out of nowhere
– Accepting states get double circles.

Example 1

Give the sequence of states visited on input string ababb.
Reading each letter of the input string causes a state transition.
If the path traveled when reading the input string ends in a final state, the string is accepted (recognized).
Example 1 (cont.)

On input string ababb:
1. start in state q0
2. read a, stay in state q0
3. read b, go to q1
4. read a, go to q0
5. read b, go to q1
6. read b, go to q2

Example 1 (cont.)

Describe the language of the machine. If the machine is called M, then we specify the language L(M):

L(M) = {w | w ∈ {a,b}* and w contains substring bb}
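
The trace above can be reproduced with a small table-driven simulator. This is a minimal sketch: the transitions out of q0 and q1 are the ones listed in the steps, while the self-loops on q2 are an assumption (the slide does not list them) chosen so that the machine matches the stated language.

```python
# Transition function of the Example 1 machine as a dictionary.
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",  # assumed: not shown in the trace
}
START, ACCEPTING = "q0", {"q2"}

def trace(word):
    """Return the list of states visited while reading word."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states

def accepts(word):
    """A word is accepted if the last state visited is accepting."""
    return trace(word)[-1] in ACCEPTING

print(trace("ababb"))    # ['q0', 'q0', 'q1', 'q0', 'q1', 'q2']
print(accepts("ababb"))  # True
```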

FA that recognizes simple identifiers

[Figure: transition diagram with arc labels: start, letter, letter or digit, other character (delimiter)]

Example 2

L(M) = {???}

L(M) = { ε or all strings over alphabet {0,1} that end in 0 }


FA for Newspaper Vending Machine

[Figure: transition diagram with states 0, 5, 10, 15, 20, 25 (start state 0) and arcs labeled nickel, dime, and quarter]

FA for coin flipping

• Four ways of arranging two coins, depending on which is heads (H) and which is tails (T)
– HH, HT, TT, TH
• Two operations:
– Flip the first coin (a)
– Flip the second coin (b)
• Assume initially coins are laid out as HH
• What are all possible ways of applying the operations so that the configuration is TT?

Model Problem as an FA

[Figure: transition diagram with states HH (start), TH, HT, and TT (final state); arcs labeled a flip the first coin and arcs labeled b flip the second coin]

Conventions

• It helps if we can avoid mentioning the type of every name by following some rules:
– Input symbols are a, b, etc., or digits.
– Strings of input symbols are u, v, . . . , z.
– States are q, p, etc.
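
The coin-flipping question can be explored by enumeration. The sketch below encodes each state as a two-letter string and each operation as a or b, as on the slides; the helper names (`flip`, `final_state`, `reaches_tt`) and the length bound of 4 are illustrative choices, not part of the lecture.

```python
from itertools import product

def flip(state, op):
    """Apply operation 'a' (flip first coin) or 'b' (flip second coin)."""
    swap = {"H": "T", "T": "H"}
    first, second = state
    if op == "a":
        return swap[first] + second
    return first + swap[second]

def final_state(ops, start="HH"):
    """Run the FA from HH over a sequence of operations."""
    state = start
    for op in ops:
        state = flip(state, op)
    return state

def reaches_tt(ops):
    return final_state(ops) == "TT"

# All operation sequences of length <= 4 that end in TT.
solutions = ["".join(p) for n in range(5) for p in product("ab", repeat=n)
             if reaches_tt(p)]
print(solutions)
```

Every sequence found this way contains an odd number of a's and an odd number of b's, which suggests the description of the language accepted by the FA.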
Formal Definition of Deterministic Finite Automaton (DFA)

• Finite set of states, typically Q.
• Finite alphabet of input symbols, typically Σ.
• One state is the start/initial state, often called q0.
• Zero or more final/accepting states, the set is typically F.
• A transition function, typically δ. This function:
– Takes a state and input symbol as arguments.
– Is defined for every possible (state, symbol) pair and returns a state ( δ : Q × Σ → Q ) <-- this makes it a DFA
– One "rule" of δ would be written δ(q, a) = p, where q and p are states, and a is an input symbol.
– Intuitively: if the FA is in state q, and input a is received, then the FA goes to state p (note: q = p OK).

Any FA is represented as the five-tuple: A = (Q, Σ, δ, q0, F).

Creating a Finite Automaton

• If you think of the states as "memory", then the memory of an FA is limited to the number of states.
• The consequence of a finite number of states is, if |w| > number of states, some state must be repeated in the execution of the FA over w.
• If the machine can't remember all the symbols it has seen so far in an input string, it has to change state based on other information, e.g., L1 = the set of all strings with an odd number of 1's over the alphabet {0,1}.

Creating a Finite Automaton

• Start by putting yourself in the place of the FA that has to make every transition choice based on a single character, because it can't look ahead or rewind.
1. Define the meaning of the states:
– For L1, we can designate a two state automaton; one state applies to the situation in which the number of ones seen is even (so far), and the other to the situation where the number of ones seen is odd (so far).
2. Determine the transition function:
– In either state, if you read a 0, stay put. Whenever you read a 1, follow a link to the other node.
3. Label the start state and final states.

DONE? Not quite...

Creating a Finite Automaton

4. Test your FA.

A tool like JFLAP can help with testing input strings on complex FAs.

Example: Create an FA that accepts all the strings over alphabet {0,1} such that the string contains the substring 100.

• Define the states.
• Determine the transition function for each state.
• Label the start state and the final states.
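
Step 4 (testing) can also be done programmatically. The following sketch builds the two-state automaton described for L1 (odd number of 1's) and compares it against a direct count on every short string. The state names "even" and "odd" and the bound of 8 symbols are assumptions made for illustration.

```python
from itertools import product

# Two states: "even" and "odd" number of 1's seen so far.
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

def accepts_odd_ones(word):
    state = "even"                 # start state: zero 1's seen so far
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "odd"          # the only accepting state

def check(max_len=8):
    """Compare the DFA with a direct count on every string up to max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            word = "".join(letters)
            if accepts_odd_ones(word) != (word.count("1") % 2 == 1):
                return word        # counterexample found
    return None

print(check())  # None means no disagreement was found
```

Exhaustive testing up to a length bound is not a proof, but it catches most mistakes in small automata quickly.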
Example: Clamping Logic

• A "clamping" circuit waits for a 1 input, and forever after makes a 1 output. However, to avoid clamping on spurious noise, we'll design an FA that waits for two 1's in a row, and "clamps" only then.
• Shows how a state can represent the history of what has been seen on the input so far.

• The states we need are:
– State q0, the start state, says that the most recent input (if there was one) was not a 1, and we have never seen two 1's in a row.
– State q1 says we have never seen 11, but the previous input was 1.
– State q2 is the only accepting state; it says that we have at some time seen 11.
– Thus, A = ({q0, q1, q2}, {0, 1}, δ, q0, {q2}), where δ is given by:

         0     1
 >q0     q0    q1
  q1     q0    q2
 *q2     q2    q2

By marking the start state with > and accepting states with *, the transition table that defines δ also specifies the entire FA.
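
The transition table translates directly into a lookup structure. In this sketch each row of the table becomes a dictionary entry; treating "being in q2" as "output 1" is an assumption about how the circuit's output would be read off the FA.

```python
# Rows of the transition table: state -> (next state on 0, next state on 1).
TABLE = {
    "q0": ("q0", "q1"),
    "q1": ("q0", "q2"),
    "q2": ("q2", "q2"),
}
START, ACCEPTING = "q0", {"q2"}

def clamp_outputs(bits):
    """Return one output bit per input bit: 1 once q2 has been reached."""
    state, outputs = START, []
    for bit in bits:
        state = TABLE[state][int(bit)]
        outputs.append(1 if state in ACCEPTING else 0)
    return outputs

print(clamp_outputs("0101100"))  # [0, 0, 0, 0, 1, 1, 1]
```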

Transition Graph for Clamping Circuit

[Figure: transition graph of the clamping circuit]

Proving a given language is regular

To show a language L is regular, produce a DFA that recognizes L. For example, if L = { awa | w ∈ {a,b}* }

Proving a given language is regular

L = { w : |w| mod 3 = 0 for w ∈ {a,b}* }

Extension of δ to Paths

Intuitively, a FA accepts a string w = a1a2…an if there is a path in the transition diagram that:
1. Begins at the start state,
2. Ends at an accepting state, and
3. Has sequence of labels a1, a2, …, an.
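
The three path conditions can be checked mechanically by recording the state sequence. The sketch below does so for a hypothetical three-state DFA for L = { w : |w| mod 3 = 0 }, where state i is assumed to mean "the length read so far is i mod 3"; the slides do not give this machine explicitly.

```python
def run(delta, start, word):
    """Return the state sequence r0, r1, ..., rn for the given word."""
    states = [start]
    for symbol in word:
        states.append(delta[(states[-1], symbol)])
    return states

def accepts(delta, start, accepting, word):
    """Accept if the path ends in an accepting state."""
    return run(delta, start, word)[-1] in accepting

# Hypothetical DFA for |w| mod 3 = 0 over {a, b}.
MOD3 = {(i, c): (i + 1) % 3 for i in range(3) for c in "ab"}

print(run(MOD3, 0, "abba"))          # [0, 1, 2, 0, 1]
print(accepts(MOD3, 0, {0}, "abb"))  # True
```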

Regular Languages

More formally, an FA M accepts a string w = a1a2...an if there exists a sequence of states r0, r1, r2, ..., rn in Q such that
1. r0 = q0,
2. δ(ri, ai+1) = ri+1, for i = 0, ..., n-1, and
3. rn ∈ F.

A language is called a regular language if some FA accepts it. So to prove a language is regular, we just have to create a FA that accepts it.

Regular Operations

Union

Concatenation

Star Closure
Regular Operations

Union
– DFAs that recognize languages L1 and L2 can be combined to give a DFA that accepts the union of L1 and L2

Theorem: The class of regular languages is closed under union

Proof is by construction:
1. Given regular languages A1 and A2, where A1 is recognized by some DFA M1 and A2 is recognized by DFA M2.
2. Construct M from M1 and M2
a. Make the states of M = Q1 × Q2
b. Alphabet of A1 and A2 can be same or different.
c. Each transition in M is to the state labeled by the pair of states M1 and M2 entered by each machine on each symbol.
d. Start state = (start state of M1, start state of M2)
e. F = {(r1, r2) | r1 ∈ F1 or r2 ∈ F2}
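
Steps a to e can be sketched as a short function. This version assumes both DFAs share one alphabet (the simpler of the two cases in step b) and represents a DFA as a tuple (states, alphabet, delta, start, finals); the two input machines M1 and M2 are hypothetical examples, not taken from the slides.

```python
from itertools import product

def union_dfa(m1, m2):
    """Product construction; delta is a dict keyed by (state, symbol)."""
    q1, sigma, d1, s1, f1 = m1
    q2, _, d2, s2, f2 = m2
    states = set(product(q1, q2))                      # step a: Q1 x Q2
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])  # step c: move in pairs
             for (r1, r2) in states for a in sigma}
    finals = {(r1, r2) for (r1, r2) in states if r1 in f1 or r2 in f2}  # step e
    return states, sigma, delta, (s1, s2), finals       # step d: paired start

def accepts(m, word):
    _, _, delta, state, finals = m
    for a in word:
        state = delta[(state, a)]
    return state in finals

# Hypothetical inputs: M1 accepts strings ending in 0, M2 an odd number of 1's.
M1 = ({"n", "y"}, "01", {("n", "0"): "y", ("n", "1"): "n",
                         ("y", "0"): "y", ("y", "1"): "n"}, "n", {"y"})
M2 = ({"e", "o"}, "01", {("e", "0"): "e", ("e", "1"): "o",
                         ("o", "0"): "o", ("o", "1"): "e"}, "e", {"o"})
M = union_dfa(M1, M2)
print(accepts(M, "10"), accepts(M, "11"))  # True False
```

Replacing `or` with `and` in the definition of `finals` would give a DFA for the intersection instead.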

Regular Operations

Union
– DFAs that recognize languages L1 and L2 can be combined to give a DFA that accepts the union of L1 and L2

Concatenation
– Same can be shown for concatenation, and star closure, with one hitch...

Regular Operations

Concatenation (and Star Closure)

To construct a DFA that accepts A1A2, can we use the same strategy as we did for the union operation?

I.e., can we take the DFA that accepts A1 and make the final state of A1 a non-final state that has a transition to the start state of the DFA that accepts A2?

No, because a DFA cannot "guess" where the string from A1 ends and that of A2 begins... what to do?
Use a Nondeterministic FA!!

In an NFA, there can be 0 or more transitions out of each state on each symbol of the alphabet.

How does an NFA compute?

Instead of a sequential flow of computation, the NFA can branch out from every state where there is more than 1 choice of outgoing arrow and follow that set of arrows.

Imagine that the machine splits into multiple copies of itself and follows all possible paths in parallel.

If there is no arrow out of a particular state on a particular symbol in an NFA, that branch of computation dies in that state.

If any copy of the machine ends in an accepting state after reading an input string, we say the NFA accepts the input string.
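
The "multiple copies in parallel" picture corresponds to tracking the set of states that some copy could currently be in. A minimal sketch, using a hypothetical NFA (not from the slides) that accepts strings over {0,1} whose second-to-last symbol is 1:

```python
def nfa_accepts(delta, start, accepting, word):
    """Track every state some copy of the machine could be in."""
    current = {start}
    for symbol in word:
        # A missing entry means no arrow: that branch of computation dies.
        current = {q for p in current for q in delta.get((p, symbol), set())}
    # Accept if any surviving copy sits in an accepting state.
    return bool(current & accepting)

# Hypothetical NFA: state s loops on everything and "guesses" on a 1
# that this symbol is the second-to-last one.
DELTA = {
    ("s", "0"): {"s"}, ("s", "1"): {"s", "t"},
    ("t", "0"): {"u"}, ("t", "1"): {"u"},
}
print(nfa_accepts(DELTA, "s", {"u"}, "0110"))  # True
print(nfa_accepts(DELTA, "s", {"u"}, "0101"))  # False
```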

Non-deterministic Finite Automata (NFA)

• Important tool for designing string processors, e.g., grep, lexical analyzers.
• But "imaginary," in the sense that it has to be implemented deterministically on each root to leaf branch of computation.

NFA to recognize HTML lists

• The following NFA, call it M, scans HTML documents, looking for a list of what could be title-author pairs, perhaps in a reading list for some literature course.
• M accepts whenever it finds the end of a list.
• In an application, the strings that matched the title (before 'by') and author (after 'by') might be stored in a table of title-author pairs being accumulated.
[Figure: transition diagram of M with arcs labeled: start; <ol>, <ul>; <li>; any non-tag; space; b; y; space; any non-tag; </li>; <li>; </ol>, </ul>]

Formal Definition of Nondeterministic Finite Automaton (NFA)

• Finite set of states, Q.
• Finite alphabet of input symbols, Σ.
• A transition function, δ. This function:
– Takes a state and an input symbol or ε as arguments.
– Is defined for every possible (state, symbol) pair and returns a set of states ( δ : Q × (Σ ∪ {ε}) → P(Q) )
• q0 ∈ Q is the start state.
• F ⊆ Q is the set of final states.

DFAs are a subset of NFAs

Any FA is represented as the five-tuple: A = (Q, Σ, δ, q0, F).

NFAs and DFAs

• It might seem that because there is a degree of choice available in an NFA that it might be more powerful than a DFA
– That is, NFAs might be able to recognize languages a DFA could not
• But this is not the case!

Equivalence of NFAs and DFAs

• NFAs and DFAs recognize the same class of languages
• Useful: easier to specify an NFA for a language, convert to DFA

Two machines are equivalent if they recognize the same language

Theorem

Every non-deterministic finite automaton has an equivalent deterministic finite automaton

Proof Idea

• If a language is recognized by an NFA, show the existence of a DFA that also recognizes it
• Convert NFA to an equivalent DFA that simulates the NFA
– Proof by construction
• Intuitively, can simulate the NFA by keeping track of all the states you can get to on a given input

Proof

• Let N = (Q, Σ, δ, q0, F) be an NFA recognizing some language A
• Construct a DFA recognizing A
– M = (Q', Σ, δ', q0', F')
1. Q' = the set of subsets of Q, i.e., P(Q)
2. For R ∈ Q' and a ∈ Σ let δ'(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}
– If R is a state of M, it is also a set of states of N (because of 1 above). When M reads a symbol a in a state R, it shows where a takes each state in R. Because each state may go to a set of states, we take the union of all these sets. This can be written as:

δ'(R, a) = ⋃_{r ∈ R} δ(r, a)

– q0' = {q0}
• M starts in the state corresponding to the collection containing just the start state of N.
– F' = {R ∈ Q' | R contains an accept state of N}.
• The machine M accepts if one of the possible states that N could be in at this point is an accept state.

Visual version

[Figure: NFA with states 1 through 10 and arcs labeled a and b]

Start state q0' = {1}

δ(1,a) = {2,3}
δ(1,b) = {4}

δ({2,3},a) = δ({2},a) ∪ δ({3},a)
           = ∅ ∪ {7,9}
           = {7,9}

δ({2,3},b) = δ({2},b) ∪ δ({3},b)
           = {5,6} ∪ {8}
           = {5,6,8}

δ({4},a) = ∅
δ({4},b) = {10}

δ({7,9},a) = ∅
δ({7,9},b) = ∅
δ({5,6,8},a) = ∅
δ({5,6,8},b) = ∅
δ({10},a) = ∅
δ({10},b) = ∅

Final states F' = {{5,6,8},{10}}
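
The construction in the proof can be sketched in a few lines. Two caveats: this version only builds the subsets reachable from the start state rather than all of P(Q), and the accepting set {8, 10} passed in the usage example is an assumption, since the slides give F' but not the accepting states of N. The transitions in `N_DELTA` are the ones listed in the visual version.

```python
def subset_construction(delta, start, accepting, alphabet):
    """Build the reachable part of the DFA M from the NFA N.
    delta maps (state, symbol) to a set of states; a missing key means
    the empty set."""
    start_set = frozenset({start})
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        R = todo.pop()
        for a in alphabet:
            # delta'(R, a) is the union of delta(r, a) over all r in R
            target = frozenset(q for r in R for q in delta.get((r, a), ()))
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {R for R in seen if R & accepting}
    return start_set, dfa_delta, finals

# Transitions read off the visual version of the example.
N_DELTA = {
    (1, "a"): {2, 3}, (1, "b"): {4},
    (3, "a"): {7, 9}, (2, "b"): {5, 6}, (3, "b"): {8},
    (4, "b"): {10},
}
# Assumed accepting states of N, consistent with F' = {{5,6,8},{10}}.
start, d, finals = subset_construction(N_DELTA, 1, {8, 10}, "ab")
print(sorted(d[(frozenset({2, 3}), "b")]))  # [5, 6, 8]
print(sorted(sorted(R) for R in finals))    # [[5, 6, 8], [10]]
```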
ε Labels
In an NFA (NFA-ε), arrows can be labeled with the ε symbol, meaning that a transition(s) can be made to the following state(s) after consuming an input, or from the start state without consuming an input symbol.

The convention used in our book is to follow the arrow(s) with label ε out of the state the NFA is in after a symbol has been read and the arrow labeled with that symbol has already been traversed.

The DFA

[Figure: DFA with states {1}, {2,3}, {4}, {7,9}, {5,6,8}, {10}, and {∅}; arcs labeled a and b]

δ(1,a) = {2,3}
δ(1,b) = {4}
δ({2,3},a) = {7,9}
δ({2,3},b) = {5,6,8}
δ({4},a) = ∅
δ({4},b) = {10}
δ({7,9},a) = ∅
δ({7,9},b) = ∅
δ({5,6,8},a) = ∅
δ({5,6,8},b) = ∅
δ({10},a) = ∅
δ({10},b) = ∅

Start state q0' = {1}
Final states F' = {{5,6,8},{10}}

ε Labels

For example, if a b was read in state q1, the machine could transition to either q2 or q3. But if an a was read in state q2, the computation would die.

ε Labels

If the link between q0 and q1 was labeled with an epsilon, the machine could transition to q1 without reading a symbol or it could stay in q0 and read an input symbol and then transition to q1.

NFA With ε-Transitions

• Allow ε to be a label on arcs
– Nothing else changes: acceptance of w is still the existence of a path from the start state to an accepting state with label w.
– But ε can appear on arcs, and means the empty string (i.e., no visible contribution to w)
– When an arc labeled ε is traversed, no input is consumed

Example

[Figure: NFA-ε with states q, r, s and arcs labeled 0, 1, and ε]

001 is accepted by the path q, s, r, q, r, s, with label 0 ε 0 1 ε = 001.

DFAs and NFA-ε's

• ε-transitions are a convenience, but do not increase the power of FA's.
• For any NFA-ε there is an equivalent (i.e., accepts the same language) DFA
• The construction is similar to the NFA-to-DFA construction

Creating a DFA from a NFA-ε (or, eliminating ε-transitions)

1. Compute the ε-closure for each state (set of states reachable from that state on ε-transitions only).
2. Start state is ε-CLOSE(q0) (called E({q0}) in book).
3. Compute δ for each a ∈ Σ and each set S (each of the ε-CLOSE'd sets) as follows:
– If a state p ∈ S can reach state q on input a (not ε), then add a transition on input a from S to ε-CLOSE(q).
4. The set of final states includes those sets that contain at least one accepting state of the NFA-ε.
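
The four steps can be sketched as follows, using the q, r, s machine from the earlier NFA-ε example. The arcs q-0->s, r-0->q, q-1->r and the two ε arcs follow from the accepting path given for 001, and s is accepting because that path ends there. The arc s-1->q is an assumption, since the diagram itself is not reproduced in the text. The function names `eclose` and `to_dfa` are illustrative.

```python
def eclose(states, eps):
    """All states reachable from `states` using only epsilon arcs."""
    closure, stack = set(states), list(states)
    while stack:
        for q in eps.get(stack.pop(), ()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return frozenset(closure)

def to_dfa(delta, eps, start, accepting, alphabet):
    start_set = eclose({start}, eps)                 # step 2
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:                           # step 3
            moved = {q for p in S for q in delta.get((p, a), ())}
            target = eclose(moved, eps)
            dfa_delta[(S, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {S for S in seen if S & accepting}      # step 4
    return start_set, dfa_delta, finals

DELTA = {("q", "0"): {"s"}, ("q", "1"): {"r"},
         ("r", "0"): {"q"}, ("s", "1"): {"q"}}       # s-1->q is assumed
EPS = {"r": {"s"}, "s": {"r"}}
start, d, finals = to_dfa(DELTA, EPS, "q", {"s"}, "01")
print(sorted(eclose({"r"}, EPS)))  # ['r', 's']
print(sorted(d[(start, "0")]))     # ['r', 's']
```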
Example

[Figure: the NFA-ε with states q, r, s and arcs labeled 0, 1, and ε]

1. Compute ε-closure for all states:
ECLOSE(q) = {q}
ECLOSE(r) = {r,s}
ECLOSE(s) = {r,s}

2. Compute δ:
δ({q},0) = ECLOSE({s}) = {r,s}
δ({q},1) = ECLOSE({r}) = {r,s}
δ({r,s},0) = ECLOSE({q}) = {q}
δ({r,s},1) = ECLOSE({q}) = {q}

3. Final states FD = {{r,s}}

RESULTING DFA:

[Figure: DFA with states {q} and {r,s}; an arc labeled 0,1 in each direction]

Regular Operations

With NFAs, showing that regular languages are closed under star closure and concatenation is much easier.

Concatenation

Given 2 NFAs, N1 and N2, where N1 recognizes language A1 and N2 recognizes language A2, A1A2 is formed by making all the final states of N1 non-final and linking them via ε links to the start state of N2.

Regular Operations

Star Closure

Given an NFA N that recognizes language A, the NFA to recognize language A* is formed by creating a new start state that is also a final state, with an ε arc to the old start state. Then make ε arcs from all the old final states back to the start state.

The class of regular languages is closed under the union operation

For two languages R1 and R2, take two NFAs N1 and N2 and combine them into one new NFA N.
N must accept input if either N1 or N2 accepts input.

[Figure: N, with ε arcs leading into N1 and N2]

The new machine guesses non-deterministically which of the two machines accepts the input.
The class of regular languages is closed under the concatenation operation

For two languages R1 and R2, take two NFAs N1 and N2 and combine them sequentially into one new NFA N.

[Figure: N1 and N2, and the combined machine N with ε arcs joining them]

The new machine guesses non-deterministically where to split the input in order to have a first part accepted by N1 and a second part accepted by N2.

The class of regular languages is closed under the star operation

For a language R1, modify N1 to accept (R1)*.

[Figure: N1, and the modified machine N with added ε arcs]

The new machine has the option of jumping back to the start state to read another piece that N1 accepts.

Q: Why not just make the start state of N1 a final state?
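
The concatenation and star constructions can be tried out on small machines. This is a sketch under several assumptions: an NFA-ε is represented as (delta, start, finals) with the empty string standing for ε, the two machines have disjoint state names, the ε arcs in `star` return to the old start state, and the machines A and B are hypothetical examples.

```python
EPS = ""  # label used for epsilon arcs

def accepts(nfa, word):
    delta, start, finals = nfa
    def close(states):
        """Add everything reachable over epsilon arcs."""
        stack, seen = list(states), set(states)
        while stack:
            for q in delta.get((stack.pop(), EPS), ()):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen
    current = close({start})
    for a in word:
        current = close({q for p in current for q in delta.get((p, a), ())})
    return bool(current & finals)

def add_arc(delta, p, label, q):
    delta.setdefault((p, label), set()).add(q)

def concat(n1, n2):
    d1, s1, f1 = n1
    d2, s2, f2 = n2
    delta = {k: set(v) for k, v in {**d1, **d2}.items()}
    for f in f1:                    # finals of N1 become non-final ...
        add_arc(delta, f, EPS, s2)  # ... and link to the start of N2
    return delta, s1, set(f2)

def star(n1, new_start="new"):
    d1, s1, f1 = n1
    delta = {k: set(v) for k, v in d1.items()}
    add_arc(delta, new_start, EPS, s1)
    for f in f1:
        add_arc(delta, f, EPS, s1)  # jump back to read another piece
    return delta, new_start, set(f1) | {new_start}

# Hypothetical machines: A accepts {"ab"}, B accepts {"c"}.
A = ({("a0", "a"): {"a1"}, ("a1", "b"): {"a2"}}, "a0", {"a2"})
B = ({("b0", "c"): {"b1"}}, "b0", {"b1"})
print(accepts(concat(A, B), "abc"))  # True
print(accepts(star(A), "abab"))      # True
print(accepts(star(A), "aba"))       # False
```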
