
Lecture 3 Graph Representation for Regular Expressions


digraph (directed graph)
A digraph is a pair of sets (V, E) such that
each element of E is an ordered pair of
elements in V.
A path is an alternating sequence of
vertices and edges such that all edges are
in the same direction.
string-labeled digraph
A string-labeled digraph is a digraph in
which each edge is labeled by a string.
In a string-labeled digraph, every path is
associated with a string which is obtained
by concatenating all strings on the path.
This string is called the label of the path.
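
As a small illustration of the definition above, the following Python sketch computes the label of a path. The edge set `EDGES`, the vertex names, and the helper `path_label` are hypothetical and chosen only for this example; edges are stored as (source, label, target) triples.

```python
# Hypothetical string-labeled digraph: each edge is (source, label, target).
EDGES = {
    ("v0", "ab", "v1"),
    ("v1", "c", "v2"),
    ("v2", "ba", "v0"),
}

def path_label(path, edges):
    """Return the label of a path given as a list of consecutive edges."""
    for edge in path:
        if edge not in edges:
            raise ValueError(f"{edge} is not an edge of the digraph")
    # All edges must point in the same direction: each edge starts
    # where the previous one ended.
    for (_, _, target), (source, _, _) in zip(path, path[1:]):
        if target != source:
            raise ValueError("consecutive edges do not share a vertex")
    # The label of the path is the concatenation of its edge labels.
    return "".join(label for _, label, _ in path)

print(path_label([("v0", "ab", "v1"), ("v1", "c", "v2")], EDGES))  # abc
```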
G(r)
For each regular expression r, we can
construct a digraph G(r) with edges
labeled by symbols and ε as follows.

If r = ∅, then [figure not recovered]

If r ≠ ∅, then [figure not recovered]


Theorem 1
G(r) has a property that a string x belongs
to r if and only if x is the label of a path
from the initial vertex to the final vertex.

Proof is done by induction on r.


Graph Representation
A graph representation of a regular
expression r is a string-labeled digraph with
an initial vertex s and a final vertex f such
that a string x belongs to r if and only if x is
associated with a path from s to f.
Corollary 2
For any regular expression r, there exists a
string-labeled digraph with two special
vertices, an initial vertex s and a final vertex
f, such that a string x belongs to r if and
only if x is associated with a path from s to
f.
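
One way to test the condition in the definition (is x associated with a path from s to f?) is a breadth-first search over pairs (vertex, number of characters of x already matched). The sketch below assumes edges are (source, label, target) triples and that the empty string `""` stands for ε. The graph `EDGES` is hand-built for the expression (ab)*c and is not claimed to be the G(r) produced by the construction in the lecture.

```python
from collections import deque

# Hypothetical graph representation of (ab)*c; "" stands for an ε label.
EDGES = [("s", "ab", "s"), ("s", "", "m"), ("m", "c", "f")]

def accepts(edges, s, f, x):
    """Return True if x is the label of some path from s to f."""
    start = (s, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        vertex, pos = queue.popleft()
        if vertex == f and pos == len(x):
            return True
        for source, label, target in edges:
            # Follow an edge only if its label matches x at position pos.
            if source == vertex and x.startswith(label, pos):
                nxt = (target, pos + len(label))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

print(accepts(EDGES, "s", "f", "ababc"))  # True
print(accepts(EDGES, "s", "f", "abab"))   # False
```

The search terminates because there are only finitely many (vertex, position) pairs, even when the graph contains cycles of ε-edges.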
Puzzle: If a regular expression r contains u
``+''s, v ``·''s, and w ``*''s, how many
ε-edges does G(r) contain?

Question: How can we reduce the number of ε-edges?
Theorem 3
An ε-edge (u,v) in G(r) which is a unique
out-edge from a nonfinal vertex u or a
unique in-edge to a noninitial vertex v can
be shrunk to a single vertex. (If one of u
and v is the initial vertex or the final vertex,
so is the resulting vertex.)
Remark: Shrinking should be done one by
one.
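
The shrinking rule can be sketched in Python as follows. This is only an illustration under stated assumptions: edges are (source, label, target) triples, `""` stands for ε, the merged vertex keeps the name of u, and the rule is applied to a small hand-built graph rather than to a G(r) produced by the lecture's construction. In line with the remark, `shrink_once` shrinks a single edge and re-examines the graph before the next step.

```python
def shrink_once(edges, initial, final):
    """Shrink one eligible ε-edge; return (edges, initial, final, changed)."""
    for u, label, v in sorted(edges):
        if label != "" or u == v:
            continue
        out_degree_u = sum(1 for e in edges if e[0] == u)
        in_degree_v = sum(1 for e in edges if e[2] == v)
        unique_out = u != final and out_degree_u == 1
        unique_in = v != initial and in_degree_v == 1
        if unique_out or unique_in:
            def rename(w):
                # The merged vertex keeps the name u.
                return u if w == v else w
            merged = {(rename(a), l, rename(b))
                      for (a, l, b) in edges if (a, l, b) != (u, label, v)}
            # If v was the initial or final vertex, so is the merged vertex.
            return merged, rename(initial), rename(final), True
    return edges, initial, final, False

def shrink_all(edges, initial, final):
    """Apply shrink_once repeatedly, one edge at a time."""
    changed = True
    while changed:
        edges, initial, final, changed = shrink_once(edges, initial, final)
    return edges, initial, final

# Hypothetical graph: s --ε--> 1 --a--> 2 --ε--> f
EDGES = {("s", "", "1"), ("1", "a", "2"), ("2", "", "f")}
print(shrink_all(EDGES, "s", "f"))  # ({('s', 'a', '2')}, 's', '2')
```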
Lecture 4 Deterministic Finite Automata (DFA)
DFA

[Figure: a DFA, with a tape whose cells hold the letters a l p h a b e t, a head, and a finite control]

The tape is divided into finitely many cells.
Each cell contains a symbol in an alphabet Σ.

The head scans a cell on the tape and
can read the symbol in that cell. In each
move, the head moves one cell to the right.
The finite control has finitely many states
which form a set Q. For each move, the
state is changed according to the
evaluation of a transition function
δ: Q × Σ → Q.
[Figure: the head reads symbol a while the finite control changes from state q to state p]

δ(q, a) = p means that if the head reads
symbol a and the finite control is in the
state q, then the next state should be p,
and the head moves one cell to the right.
There are some special states: an initial
state s and a set F of final states.
Initially, the DFA is in the initial state s and
the head scans the leftmost cell. The tape
holds an input string.
When the head gets off the tape, the DFA
stops. An input string x is accepted by the
DFA if the DFA stops at a final state.
Otherwise, the input string is rejected.
The DFA can be represented by
M = (Q, Σ, δ, s, F)
where Σ is the alphabet of input symbols.
The set of all strings accepted by a DFA M
is denoted by L(M). We also say that the
language L(M) is accepted by M.
The transition diagram of a DFA is an alternative
way to represent the DFA.
For M = (Q, Σ, δ, s, F), the transition diagram of
M is a symbol-labeled digraph G = (V, E)
satisfying the following:

V = Q (s is marked as the initial vertex, and each f ∈ F is marked as a final vertex)

E = { q --a--> p | δ(q, a) = p }.
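 δ | 0  1
---+------
 s | p  s
 p | q  s
 q | q  q

[Transition diagram: s loops on 1 and goes to p on 0; p goes to q on 0 and back to s on 1; q loops on 0 and 1.]
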
L(M) = (0+1)*00(0+1)*.
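
The example can be checked with a short simulation. The sketch below encodes the transition table as a Python dictionary and runs the DFA one symbol at a time. The set of final states is not legible in the table itself; F = {q} is an assumption inferred from the stated language, since q is the only state reached after reading 00. The names `DELTA`, `run`, and `dfa_accepts` are chosen for this example only.

```python
import itertools
import re

# Transition function of the example DFA: DELTA[(state, symbol)] = next state.
DELTA = {
    ("s", "0"): "p", ("s", "1"): "s",
    ("p", "0"): "q", ("p", "1"): "s",
    ("q", "0"): "q", ("q", "1"): "q",
}
INITIAL = "s"
FINAL = {"q"}  # assumption: inferred from L(M) = (0+1)*00(0+1)*

def run(delta, state, x):
    """Return the state in which the DFA stops after reading x."""
    for symbol in x:
        state = delta[(state, symbol)]
    return state

def dfa_accepts(delta, initial, final, x):
    """x is accepted if the DFA stops at a final state."""
    return run(delta, initial, x) in final

print(dfa_accepts(DELTA, INITIAL, FINAL, "10011"))  # True
print(dfa_accepts(DELTA, INITIAL, FINAL, "0101"))   # False

# Compare with the regular expression on all strings of length at most 6.
for n in range(7):
    for letters in itertools.product("01", repeat=n):
        x = "".join(letters)
        expected = re.fullmatch(r"[01]*00[01]*", x) is not None
        assert dfa_accepts(DELTA, INITIAL, FINAL, x) == expected
```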
The transition diagram of the DFA M has the
following properties:
For every vertex q and every symbol a,
there exists exactly one edge with label a from q.
For each string x, there exists exactly one
path starting from the initial state s
associated with x.
A string x is accepted by M if and only if
this path ends at a final state.
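
These properties can be checked mechanically. The sketch below builds the edge set E from a transition function, verifies that every vertex has exactly one out-edge per symbol, and follows the single path associated with a string. It reuses the three-state example with states s, p, q; the function names and the choice F = {q} are assumptions made for illustration.

```python
STATES = {"s", "p", "q"}
ALPHABET = {"0", "1"}
DELTA = {
    ("s", "0"): "p", ("s", "1"): "s",
    ("p", "0"): "q", ("p", "1"): "s",
    ("q", "0"): "q", ("q", "1"): "q",
}
FINAL = {"q"}  # assumption: q is the only final state

def transition_diagram(states, alphabet, delta):
    """E = {(q, a, p) | delta(q, a) = p}, an edge from q to p labeled a."""
    return {(q, a, delta[(q, a)]) for q in states for a in alphabet}

def one_edge_per_symbol(states, alphabet, edges):
    """True if every vertex has exactly one out-edge for every symbol."""
    return all(
        sum(1 for (src, label, _) in edges if src == q and label == a) == 1
        for q in states for a in alphabet
    )

def unique_path(edges, start, x):
    """Return the vertices on the single path from start associated with x."""
    path = [start]
    for a in x:
        # Exactly one candidate exists when one_edge_per_symbol holds.
        (nxt,) = [tgt for (src, label, tgt) in edges
                  if src == path[-1] and label == a]
        path.append(nxt)
    return path

EDGES = transition_diagram(STATES, ALPHABET, DELTA)
print(one_edge_per_symbol(STATES, ALPHABET, EDGES))  # True
path = unique_path(EDGES, "s", "1001")
print(path)               # ['s', 's', 'p', 'q', 'q']
print(path[-1] in FINAL)  # True
```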
