Regular Languages and Finite-State Automata

Regular Expressions and regular languages
Deterministic FSA
Non-deterministic FSA

Regular Expressions and Regular Languages

Characterizing formal languages

Plain words: the language of all and only those words over Σ = {a,b} of length 2 (aa, ab, ba, bb)
Set abstraction: {w | w ∈ Σ* and |w| = 2}
New way: regular expressions denote languages

Regular Expressions Denote Languages

a* denotes language {a^n | n ≥ 0}
(a).(b) or just ab denotes unit language {ab}
a*b* denotes {a^n b^m | n, m ≥ 0}
a^2 b^3 denotes {aabbb}
a^n is not a regular expression
a+bb denotes {a^n bb | n ≥ 1}
a?bb denotes {a^n bb | 0 ≤ n ≤ 1}
a|b denotes {a,b}
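The expressions above can be tried out with Python's re module, whose syntax happens to cover *, +, ? and |. This is only a sketch: a{2}b{3} is Python's spelling of a^2 b^3, and the helper name denotes_word is made up for illustration.

```python
import re

def denotes_word(pattern: str, word: str) -> bool:
    """True if the whole word belongs to the language denoted by pattern."""
    return re.fullmatch(pattern, word) is not None

# A few member words for each expression on the slide.
checks = {
    "a*": ["", "a", "aaa"],
    "ab": ["ab"],
    "a*b*": ["", "aab", "bbb"],
    "a{2}b{3}": ["aabbb"],
    "a+bb": ["abb", "aaabb"],
    "a?bb": ["bb", "abb"],
    "a|b": ["a", "b"],
}
for pattern, words in checks.items():
    assert all(denotes_word(pattern, w) for w in words)

print(denotes_word("a+bb", "bb"))    # False: a+ needs at least one a
print(denotes_word("a?bb", "aabb"))  # False: a? allows at most one a
```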

Exercise

What does ((a.b)*)|((b.a)*) mean? Give a few examples

The language containing all and only even-length words consisting of alternating a's and b's: {ε, ab, ba, abab, baba, …}
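One way to check such an answer is to enumerate short words over {a,b} and keep those matched by the expression. A minimal sketch using Python's re syntax (the dot for concatenation is simply dropped); the length bound 4 is an arbitrary choice:

```python
import re
from itertools import product

pattern = re.compile(r"(ab)*|(ba)*")

# All words over {a,b} of length 0..4 that the expression denotes.
words = [
    "".join(p)
    for n in range(5)
    for p in product("ab", repeat=n)
    if pattern.fullmatch("".join(p))
]
print(words)  # ['', 'ab', 'ba', 'abab', 'baba']
```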

Kleene-Closure Operator (recap)

Symbol *: a certain unary operation on languages
Given language L:
L* =def {w | for some n ≥ 0, w is the concatenation of n words of L}
L* is the result of concatenating 0 or more words of L
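Since L* is usually infinite, a program can only list it up to some length bound. The sketch below does that for a finite L given as a Python set; the function name kleene_star and the bound max_len are assumptions made for illustration.

```python
def kleene_star(language: set[str], max_len: int) -> set[str]:
    """Words of L* of length <= max_len."""
    result = {""}      # n = 0: concatenating zero words gives the empty word
    frontier = {""}
    while frontier:
        # Append one more word of L to every word found in the last round.
        frontier = {
            u + w
            for u in frontier
            for w in language
            if len(u + w) <= max_len
        } - result
        result |= frontier
    return result

print(sorted(kleene_star({"ab", "c"}, 3), key=lambda w: (len(w), w)))
# ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
```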

Language forming operations (recap)

Binary concatenation operation: .
L1.L2 =def {w1w2 | w1 ∈ L1 & w2 ∈ L2}
The language that results from taking a word from L1 and appending to it a word from L2
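For finite languages the definition translates directly into a set comprehension. A small sketch (the name concat is made up); note that different pairs can yield the same word, so the result may have fewer than |L1|·|L2| elements.

```python
def concat(l1: set[str], l2: set[str]) -> set[str]:
    """L1.L2: every word of L1 followed by every word of L2."""
    return {w1 + w2 for w1 in l1 for w2 in l2}

print(sorted(concat({"a", "ab"}, {"b", ""})))  # ['a', 'ab', 'abb']
```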

Definition
(i) ∅ is a regular expression (over Σ) and denotes language ∅
(ii) ε is a regular expression (over Σ) and denotes language {ε}
(iii) If s is in Σ* then s itself is a regular expression and denotes language {s}
(iv) Suppose r and s are regular expressions that denote languages Lr and Ls, then
(a) (r|s) is a regular expression that denotes Lr ∪ Ls
(b) (r.s) is a regular expression that denotes Lr.Ls
(c) (r*) is a regular expression that denotes (Lr)*
(v) No expression is a regular expression unless it is obtainable from (i) to (iv)
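The inductive definition can be mirrored by a small recursive evaluator: one class per clause, and a function that computes the denoted language. This is a sketch under assumptions: the class names and denote are invented here, and because starred languages are infinite the result is cut off at max_len.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # clause (i): denotes the empty language
    pass

@dataclass(frozen=True)
class Word:             # clauses (ii) and (iii): Word("") plays the role of ε
    s: str

@dataclass(frozen=True)
class Alt:              # clause (iv)(a): (r|s)
    r: object
    s: object

@dataclass(frozen=True)
class Cat:              # clause (iv)(b): (r.s)
    r: object
    s: object

@dataclass(frozen=True)
class Star:             # clause (iv)(c): (r*)
    r: object

def denote(e, max_len: int) -> set[str]:
    """Words of length <= max_len in the language denoted by e."""
    if isinstance(e, Empty):
        return set()
    if isinstance(e, Word):
        return {e.s} if len(e.s) <= max_len else set()
    if isinstance(e, Alt):
        return denote(e.r, max_len) | denote(e.s, max_len)
    if isinstance(e, Cat):
        left, right = denote(e.r, max_len), denote(e.s, max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if isinstance(e, Star):
        base = denote(e.r, max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {e!r}")

# ((a.b)*)|((b.a)*)
expr = Alt(Star(Cat(Word("a"), Word("b"))), Star(Cat(Word("b"), Word("a"))))
print(sorted(denote(expr, 4), key=lambda w: (len(w), w)))
# ['', 'ab', 'ba', 'abab', 'baba']
```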

Definition
What about (r+)?
(r+) = ((r*).r)
What about (r?)?
(r?) = (ε|r)
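These two abbreviations can be spot-checked by comparing languages on all short words. The sketch below takes r = ab, uses Python's re syntax (an empty alternative stands for ε), and only compares words up to an arbitrary length bound, so it is evidence rather than proof; the helper name language is made up.

```python
import re
from itertools import product

def language(pattern: str, alphabet: str = "ab", max_len: int = 6) -> set[str]:
    """All words over alphabet of length <= max_len matched by pattern."""
    return {
        "".join(p)
        for n in range(max_len + 1)
        for p in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(p))
    }

assert language("(ab)+") == language("(ab)*ab")   # (r+) = ((r*).r)
assert language("(ab)?") == language("|ab")       # (r?) = (ε|r)
print(sorted(language("(ab)+"), key=len))  # ['ab', 'abab', 'ababab']
```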

Notation
Usually forget about parentheses:
(ab) = ab
(a|b|c): 3-word language {a,b,c}
Precedence: parentheses > superscript > concatenation > alternation
ab* = a(b*) ≠ (ab)*
a|ba = (a|(b.a)) ≠ ((a|b).a)
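Python's re module follows the same precedence, so the two readings can be told apart by a witness word. A quick sketch:

```python
import re

def matches(pattern: str, word: str) -> bool:
    return re.fullmatch(pattern, word) is not None

# ab* means a(b*), not (ab)*: the word abb separates them.
print(matches("ab*", "abb"), matches("a(b*)", "abb"), matches("(ab)*", "abb"))
# True True False

# a|ba means a|(ba), not (a|b)a: the word a separates them.
print(matches("a|ba", "a"), matches("a|(ba)", "a"), matches("(a|b)a", "a"))
# True True False
```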

Regular Languages
Let L be a language over alphabet Σ, i.e., L ⊆ Σ*. Then L is said to be a regular language if L is denoted by some regular expression over Σ

Let Σ be a finite alphabet and L1 and L2 regular languages over Σ. Then L1 ∪ L2, L1.L2, and L1* are also regular languages

Remarks
If Σ is a finite alphabet and w is any word over Σ, then the unit language {w} is regular.
If Σ is a finite alphabet, then any finite language over Σ is regular.
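The second remark follows from the first: a finite language {w1, …, wk} is denoted by w1|…|wk. A sketch of that construction in Python's re syntax; the function name is invented, and the empty language is excluded because re has no literal for ∅.

```python
import re

def regex_for_finite_language(words: set[str]) -> str:
    """Alternation of the unit languages {w}, one per word."""
    if not words:
        raise ValueError("the empty language needs ∅, which re cannot spell")
    return "|".join(re.escape(w) for w in sorted(words))

pattern = regex_for_finite_language({"ab", "b", "aab"})
print(pattern)  # aab|ab|b
print([w for w in ["ab", "b", "aab", "a", "ba"] if re.fullmatch(pattern, w)])
# ['ab', 'b', 'aab']
```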

Deterministic Finite State Automata

Finite State Automata

New model of computation: analysis of the kind of computation that requires a fixed (finite) amount of memory for arbitrary input
Also called finite-state machines

Deterministic Finite-State Automata

[Figure 9.2.1(a): state diagram]

Σ = {a,b}
Vertices and arcs
Labels of arcs are members of Σ
Input: (possibly empty) word over Σ, e.g. abb

Deterministic Finite-State Automata

[Figure 9.2.1(a): state diagram]
EX9-2-1.FSA in Deus ex Machina

Accepting configuration: FSA halts in state q1
The FSA accepts the word abb
It does not accept, e.g., aba
q2: trap state
L = {ab^n | n ≥ 0}

Determinism
For each state/symbol pair, FSA M has exactly one instruction
Having at least one instruction per pair makes M fully defined; having at most one makes it deterministic
Determinism means that, within any state diagram for an FSA, the path labeled by a given word w is unique: for each word w ∈ Σ*, there is exactly one path starting at q0 and labeled by w
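The "exactly one instruction per state/symbol pair" condition is easy to test mechanically when the instructions are given as (state, symbol, next state) triples. A sketch; the function name and the triple encoding are assumptions, and the sample instructions are those of the machine of Figure 9.2.1(a).

```python
from collections import Counter
from itertools import product

def is_deterministic_and_total(states, alphabet, instructions) -> bool:
    """True if every state/symbol pair has exactly one instruction."""
    counts = Counter((q, a) for q, a, _ in instructions)
    return all(counts[(q, a)] == 1 for q, a in product(states, alphabet))

states, alphabet = {"q0", "q1", "q2"}, {"a", "b"}
instructions = [
    ("q0", "a", "q1"), ("q0", "b", "q2"),
    ("q1", "b", "q1"), ("q1", "a", "q2"),
    ("q2", "a", "q2"), ("q2", "b", "q2"),
]
print(is_deterministic_and_total(states, alphabet, instructions))       # True
# A second instruction for (q0, a) breaks determinism.
print(is_deterministic_and_total(states, alphabet,
                                 instructions + [("q0", "a", "q2")]))   # False
# Dropping an instruction leaves M not fully defined.
print(is_deterministic_and_total(states, alphabet, instructions[:-1]))  # False
```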

Exercise

[Figure 9.2.1(b): state diagram]

Which regular language is accepted by this FSA?
What are the accepting states?
Is ε an accepted word?
What is the trap state?
Is the trap state a sink?
Is the language finite?

Exercise

[Figure 9.2.1(b): state diagram]

Which regular language is accepted by this FSA? a(a|b)a
What are the accepting states? q2 and q3
Is ε an accepted word? no
What is the trap state? q4
Is the trap state a sink? no
Is the language finite? yes

Exercise

[Figure 9.2.1(c): state diagram]

Which regular language is accepted by this FSA?
What are the accepting states?
Is ε an accepted word?
What is the trap state?
Is the trap state a sink?
Is the language finite?

Exercise

[Figure 9.2.1(c): state diagram]

Which regular language is accepted by this FSA? (aba)*
What are the accepting states? q0
Is ε an accepted word? yes
What is the trap state? q3
Is the trap state a sink? no
Is the language finite? no

Alternate description
δM(q0,a) = q1
δM(q0,b) = q2
δM(q1,b) = q1
δM(q1,a) = q2
δM(q2,a) = q2
δM(q2,b) = q2

[Figure 9.2.1(a): state diagram]
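Written as a table, the machine is straightforward to run. The sketch below stores the six transitions in a dictionary and records the path taken on a word; the names delta and run are illustrative only.

```python
# Transition table of the machine of Figure 9.2.1(a).
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "b"): "q1", ("q1", "a"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}

def run(word: str, state: str = "q0") -> list[str]:
    """Sequence of states visited while reading word."""
    path = [state]
    for symbol in word:
        state = delta[(state, symbol)]
        path.append(state)
    return path

print(run("abb"))  # ['q0', 'q1', 'q1', 'q1']  ends in accepting state q1
print(run("aba"))  # ['q0', 'q1', 'q1', 'q2']  ends in trap state q2
```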

Formal Definition
A deterministic FSA is a quintuple ⟨Σ, Q, qinit, F, δ⟩
- Σ is the input alphabet
- Q is a finite, nonempty set of states
- qinit ∈ Q is the initial state or start state
- F ⊆ Q is a (possibly empty) set of accepting or terminal states
- δ: Q × Σ → Q is the transition function (total and single valued)
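The quintuple maps naturally onto a small record type, and the side conditions of the definition become checks. A sketch only: the field names and is_well_formed are invented, and a dictionary keyed by (state, symbol) is one of several reasonable encodings of δ (it is single valued by construction).

```python
from typing import NamedTuple

class DFA(NamedTuple):
    alphabet: frozenset   # Σ
    states: frozenset     # Q
    q_init: str           # qinit
    accepting: frozenset  # F
    delta: dict           # δ as {(state, symbol): state}

def is_well_formed(m: DFA) -> bool:
    """Check the conditions stated in the definition."""
    all_pairs = {(q, a) for q in m.states for a in m.alphabet}
    return (
        len(m.states) > 0                         # Q nonempty
        and m.q_init in m.states                  # qinit ∈ Q
        and m.accepting <= m.states               # F ⊆ Q
        and set(m.delta) == all_pairs             # δ total on Q × Σ
        and set(m.delta.values()) <= m.states     # δ maps into Q
    )

m = DFA(
    alphabet=frozenset("ab"),
    states=frozenset({"q0", "q1", "q2"}),
    q_init="q0",
    accepting=frozenset({"q1"}),
    delta={("q0", "a"): "q1", ("q0", "b"): "q2",
           ("q1", "b"): "q1", ("q1", "a"): "q2",
           ("q2", "a"): "q2", ("q2", "b"): "q2"},
)
print(is_well_formed(m))  # True
```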

Word Acceptance
A deterministic finite-state automaton M accepts word w ∈ Σ* if the unique path starting at qinit and labeled by w leads to some member of F

Language Acceptance
The language accepted by M is the set of all and only those words over Σ that are accepted by M
We write L(M) for the language accepted by M.
FSAs are language acceptors only
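L(M) can be listed up to a length bound by running M on every short word. The sketch below does this for the machine of Figure 9.2.1(a) (accepting state q1) and compares the result with the regular expression ab*; the bound of 4 and the helper names are arbitrary choices.

```python
import re
from itertools import product

DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "b"): "q1", ("q1", "a"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
ACCEPTING = {"q1"}

def accepts(word: str) -> bool:
    state = "q0"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING

all_words = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
accepted = [w for w in all_words if accepts(w)]
print(accepted)  # ['a', 'ab', 'abb', 'abbb']

# On these words M agrees with the regular expression ab*.
assert all(accepts(w) == bool(re.fullmatch("ab*", w)) for w in all_words)
```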

Example
Deus ex Machina: Odd.fsa

Nondeterministic Finite State Automata

A Nondeterministic Machine

[Figure 9.3.3: state diagrams over states q0 to q3; recoverable labels: L = (ab)* and L = (ab)* ∪ {a} = (ab)*|a]

Nondeterminism
Nondeterministic FSA are usually easier to design but run the risk of accepting unintended words
δ: Q × Σ → Q is a transition mapping
Assumed to be total but permitted to be multivalued
Cf. difference between function and mapping!

Formal Definition
A nondeterministic FSA is a quintuple ⟨Σ, Q, qinit, F, δ⟩
- Σ is the input alphabet
- Q is a finite, nonempty set of states
- qinit ∈ Q is the initial state or start state
- F ⊆ Q is a (possibly empty) set of accepting or terminal states
- δ: Q × Σ → Q is the transition mapping (total and possibly multi-valued)

Word Acceptance
Word w ∈ Σ* is accepted by FSA M provided there exists some path, labeled by w, in the state diagram of M leading from qinit to a terminal state
Cf. deterministic definition of word acceptance: unique path
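"There exists some path" can be decided without trying paths one by one: track the set of all states reachable on the input read so far. The sketch below does this for a hypothetical four-state machine for (ab)*|a (not necessarily the one in the figure). As a simplification, a missing table entry means "no successor", rather than routing to an explicit trap state.

```python
import re
from itertools import product

# Hypothetical NFA for (ab)*|a: in q0, reading a may go to q1 or to q3.
NFA_DELTA = {
    ("q0", "a"): {"q1", "q3"},
    ("q1", "b"): {"q2"},
    ("q2", "a"): {"q1"},
}
NFA_ACCEPTING = {"q0", "q2", "q3"}

def nfa_accepts(word: str, start: str = "q0") -> bool:
    """True if some path labeled by word ends in an accepting state."""
    current = {start}
    for symbol in word:
        current = {t for q in current
                   for t in NFA_DELTA.get((q, symbol), set())}
    return bool(current & NFA_ACCEPTING)

print([w for w in ["", "a", "ab", "aba", "abab", "b"] if nfa_accepts(w)])
# ['', 'a', 'ab', 'abab']
```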

Language Acceptance
The language accepted by a nondeterministic FSA M is the set of words accepted by M.

Nondeterminism → determinism
Nondeterministic FSA are easier to design
For every nondeterministic FSA, there exists an equivalent deterministic FSA
We can automatically convert the nondeterministic FSA to an equivalent deterministic FSA through subset construction
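A minimal sketch of the subset construction, assuming an NFA without ε-moves whose table maps (state, symbol) to a set of states (missing entries mean no successor). Each DFA state is a frozenset of NFA states; the empty set serves as the trap state. The sample NFA for (ab)*|a is hypothetical.

```python
def subset_construction(alphabet, delta, start, accepting):
    """Return (states, dfa_delta, dfa_start, dfa_accepting)."""
    dfa_start = frozenset({start})
    dfa_delta, todo, seen = {}, [dfa_start], {dfa_start}
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # All NFA states reachable from the current set on this symbol.
            target = frozenset(t for q in current
                               for t in delta.get((q, symbol), ()))
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return seen, dfa_delta, dfa_start, dfa_accepting

nfa_delta = {("q0", "a"): {"q1", "q3"}, ("q1", "b"): {"q2"}, ("q2", "a"): {"q1"}}
states, dfa_delta, dfa_start, dfa_accepting = subset_construction(
    "ab", nfa_delta, "q0", {"q0", "q2", "q3"})

def dfa_accepts(word: str) -> bool:
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting

print(len(states))                               # 5
print(dfa_accepts("abab"), dfa_accepts("aba"))   # True False
```

The resulting table is total and single valued, so the constructed machine satisfies the deterministic definition.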

Nondeterminism
ε-moves do not necessarily imply nondeterminism

Exercise
Design an FSA that accepts the laughter language {ha!, haha!, hahaha!, …} that includes at least 1 ε-move. Then create an equivalent FSA without ε-move(s)

[Figure: state diagrams of two alternative solutions]
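One possible pair of solutions, written as tables and checked by simulation. This is a sketch under assumptions, not necessarily the machines in the figure: state names q0 to q3 are invented, the empty string is used as the label of an ε-move, and missing table entries mean no successor.

```python
import re
from itertools import product

EPS = ""  # label used here for ε-moves

# With an ε-move: after "ha" (state q2), jump back to q0 without reading input.
WITH_EPS = {("q0", "h"): {"q1"}, ("q1", "a"): {"q2"},
            ("q2", EPS): {"q0"}, ("q2", "!"): {"q3"}}
# Without ε-move: q2 reads the next h itself.
WITHOUT_EPS = {("q0", "h"): {"q1"}, ("q1", "a"): {"q2"},
               ("q2", "h"): {"q1"}, ("q2", "!"): {"q3"}}

def closure(states, delta):
    """States reachable from the given ones using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def accepts(word, delta, start="q0", accepting=frozenset({"q3"})):
    current = closure({start}, delta)
    for symbol in word:
        step = {t for q in current for t in delta.get((q, symbol), ())}
        current = closure(step, delta)
    return bool(current & accepting)

for word in ["ha!", "haha!", "ha", "!", "hah!"]:
    print(word, accepts(word, WITH_EPS), accepts(word, WITHOUT_EPS))
# ha! True True / haha! True True / ha False False / ! False False / hah! False False
```

Both machines agree with the expression (ha)+! on every word over {h, a, !} up to length 7, which supports (but does not prove) their equivalence.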
