Lecture Notes by Leslie and Nickson 2005

Collected Lecture Slides for COMP 202
Formal Methods of Computer Science

Neil Leslie and Ray Nickson
11 October, 2005
This document was created by concatenating the lecture slides into a single
sequence: one chapter per topic, and one section per titled slide.
There is no additional material apart from what was distributed in lectures.
ii
Neil Leslie and Ray Nickson assert their moral right
to be identied as the authors of this work.
c _ Neil Leslie and Ray Nickson, 2005
Contents
Problems, Algorithms, and Programs 1
1 Introduction 3
1.1 COMP202 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Course information . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Tutorial and Marking Groups . . . . . . . . . . . . . . . . . . . . 4
1.5 What is Computer Science? . . . . . . . . . . . . . . . . . . . . . 4
1.6 Some Powerful Ideas . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7 Related Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.8 Lecture schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Problems and Algorithms 7
2.1 Problems and Algorithms . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Problems, Programs and Proofs . . . . . . . . . . . . . . . . . . . 7
2.3 Describing problems . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Example: Comparing Strings . . . . . . . . . . . . . . . . . . . . 8
2.5 Understanding the Problem . . . . . . . . . . . . . . . . . . . . . 9
2.6 Designing an Algorithm . . . . . . . . . . . . . . . . . . . . . . . 9
2.7 Dening the Algorithm Iteratively . . . . . . . . . . . . . . . . . 10
2.8 Dening the Algorithm Recursively . . . . . . . . . . . . . . . . . 10
3 Two Simple Programming Languages 11
3.1 Imperative and Applicative Languages . . . . . . . . . . . . . . . 11
3.2 The Language of While-Programs . . . . . . . . . . . . . . . . . . 11
3.3 Comparing strings again . . . . . . . . . . . . . . . . . . . . . . . 12
3.4 The Applicative Language . . . . . . . . . . . . . . . . . . . . . . 12
3.5 Those strings again ... . . . . . . . . . . . . . . . . . . . . . . . . 13
3.6 Using head and tail . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4 Reasoning about Programs 15
4.1 Some Questions about Languages . . . . . . . . . . . . . . . . . . 15
4.2 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
iii
iv CONTENTS
4.4 Comparing Strings One More Time . . . . . . . . . . . . . . . . . 16
4.5 Laws for Program Verication . . . . . . . . . . . . . . . . . . . . 18
4.6 Reasoning with Assertions . . . . . . . . . . . . . . . . . . . . . . 18
I Regular Languages 21
5 Preliminaries 23
5.1 Part I: Regular Languages . . . . . . . . . . . . . . . . . . . . . . 23
5.2 Formal languages . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.3 Alphabet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.4 String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.5 Language, Word . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.6 Operations on Strings and Languages . . . . . . . . . . . . . . . 25
5.7 A proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.8 Statement of the conjecture . . . . . . . . . . . . . . . . . . . . . 26
5.9 Base case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.10 Induction step . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.11 A theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
6 Dening languages using regular expressions 29
6.1 Regular expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 29
6.2 Arithmetic expressions . . . . . . . . . . . . . . . . . . . . . . . . 29
6.3 Simplifying conventions . . . . . . . . . . . . . . . . . . . . . . . 30
6.4 Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . 30
6.5 Operator associativity . . . . . . . . . . . . . . . . . . . . . . . . 30
6.6 Care! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.7 Dening regular expressions . . . . . . . . . . . . . . . . . . . . . 31
6.8 Simplifying conventions . . . . . . . . . . . . . . . . . . . . . . . 31
6.9 What the regular expressions describe . . . . . . . . . . . . . . . 32
6.10 Derived forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
7 Regular languages 35
7.1 Regular Languages . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.2 Example languages . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.3 Finite languages are regular . . . . . . . . . . . . . . . . . . . . . 37
7.4 An algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.5 EVEN-EVEN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.6 Deciding membership of EVEN-EVEN . . . . . . . . . . . . . . . 38
7.7 Using fewer states . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.8 Using even fewer states . . . . . . . . . . . . . . . . . . . . . . . 38
7.9 Uses of regular expressions . . . . . . . . . . . . . . . . . . . . . . 38
CONTENTS v
8 Finite automata 41
8.1 Finite Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
8.2 Formal denition . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
8.3 Explanation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
8.4 An automaton . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
8.5 The language accepted by M
1
. . . . . . . . . . . . . . . . . . . . 43
8.6 Another automaton . . . . . . . . . . . . . . . . . . . . . . . . . . 43
8.7 A pictorial representation of FA . . . . . . . . . . . . . . . . . . . 43
8.8 Examples of constructing an automaton . . . . . . . . . . . . . . 44
8.9 Constructing M
L
. . . . . . . . . . . . . . . . . . . . . . . . . . . 44
8.10 A machine to accept Language(11(0 +1)
) . . . . . . . . . . . . 45
8.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
9 Non-deterministic Finite automata 47
9.1 Non-deterministic Finite Automata . . . . . . . . . . . . . . . . . 47
9.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
9.3 Formal denition . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
9.4 Just like before. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
9.5 Example NFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
9.6 M
5
accepts 010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
9.7 M
5
accepts 01 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
9.8 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
9.9 NFAs are as powerful as FAs . . . . . . . . . . . . . . . . . . . . 50
9.10 Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
9.11 Using the graphical representation . . . . . . . . . . . . . . . . . 50
9.12 FAs are as powerful as NFAs . . . . . . . . . . . . . . . . . . . . 51
9.13 Useful observations . . . . . . . . . . . . . . . . . . . . . . . . . . 51
9.14 Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
9.15 Constructing
and Q
. . . . . . . . . . . . . . . . . . . . . . . . 52
9.16 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
9.17 NFA with transitions . . . . . . . . . . . . . . . . . . . . . . . 53
9.18 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
9.19 Formal denition . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
9.20 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
9.21 Graphical representation of M
6
. . . . . . . . . . . . . . . . . . . 54
9.22 Easy theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
9.23 Harder theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
9.24
the new transition function . . . . . . . . . . . . . . . . . . . . 55

9.25 F
the new set of accepting states . . . . . . . . . . . . . . . . . . 56

9.26 Example: an NFA equivalent to M
6
. . . . . . . . . . . . . . . . 56
12
. . . . . . . . . . . . . . . . . . 56
9.28 Regular expression to NFA with transitions . . . . . . . . . . . 56
9.29 Proof outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
9.30 Base cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
9.31 Induction steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
9.32 Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
vi CONTENTS
9.33 Proof summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.34 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
10 Kleenes theorem 63
10.1 A Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
10.2 Kleenes theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
11 Closure Properties of Regular Languages 65
11.1 Closure properties of regular language . . . . . . . . . . . . . . . 65
11.2 Formally . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
11.3 Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
11.4 Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
11.5 Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
11.6 Kleene closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
11.7 Intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
11.8 Summary of the proofs . . . . . . . . . . . . . . . . . . . . . . . . 67
12 Non-regular languages 69
12.1 Non-regular Languages . . . . . . . . . . . . . . . . . . . . . . . . 69
12.2 Pumping lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
12.3 Pumping lemma informally . . . . . . . . . . . . . . . . . . . . . 70
12.4 Proving the pumping lemma . . . . . . . . . . . . . . . . . . . . . 71
12.5 A non-regular language . . . . . . . . . . . . . . . . . . . . . . . 72
12.6 Pumping lemma re-cap . . . . . . . . . . . . . . . . . . . . . . . . 72
12.7 Another non-regular language . . . . . . . . . . . . . . . . . . . . 72
12.8 And another . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
12.9 Comment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
13 Regular languages: summary 75
13.1 Regular languages: summary . . . . . . . . . . . . . . . . . . . . 75
13.2 Kleenes theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
II Context Free Languages 77
14 Introducing Context Free Grammars 79
14.1 Beyond Regular Languages . . . . . . . . . . . . . . . . . . . . . 79
14.2 Sentences with Nested Structure . . . . . . . . . . . . . . . . . . 79
14.3 A Simple English Grammar . . . . . . . . . . . . . . . . . . . . . 80
14.4 Parse Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
14.5 Context Free Grammars . . . . . . . . . . . . . . . . . . . . . . . 81
14.6 Formal denition of CFG . . . . . . . . . . . . . . . . . . . . . . 82
CONTENTS vii
15 Regular and Context Free Languages 83
15.1 Regular and Context Free Languages . . . . . . . . . . . . . . . . 83
15.2 CF Reg ,= . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
15.3 CF , Reg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
15.4 CF Reg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
15.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
15.6 Regular Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . 85
15.7 Parsing using Regular Grammars . . . . . . . . . . . . . . . . . . 85
15.8 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
15.9 Derivations in Regular Grammars . . . . . . . . . . . . . . . . . . 86
15.10Derivations in Arbitrary CFGs . . . . . . . . . . . . . . . . . . . 86
15.11Parse Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
15.12Derivations and parse trees . . . . . . . . . . . . . . . . . . . . . 88
15.13Ambiguity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
16 Normal Forms 89
16.1 Lambda Productions . . . . . . . . . . . . . . . . . . . . . . . . . 89
16.2 Eliminating Productions . . . . . . . . . . . . . . . . . . . . . 89
16.3 Circularities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
16.4 Productions on the Start Symbol . . . . . . . . . . . . . . . . . 91
16.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
16.6 Unit Productions . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
16.7 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
17 Recursive Descent Parsing 93
17.1 Recognising CFLs (Parsing) . . . . . . . . . . . . . . . . . . . . . 93
17.2 Top-down Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . 93
17.3 Top-down Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . 94
17.4 Recursive Descent Parsing . . . . . . . . . . . . . . . . . . . . . . 94
17.5 Building a Parser for Nested Lists . . . . . . . . . . . . . . . . . 94
17.6 Parser for Nested Lists . . . . . . . . . . . . . . . . . . . . . . . . 95
17.7 Building a Parse Tree . . . . . . . . . . . . . . . . . . . . . . . . 96
17.8 Parser for Nested Lists . . . . . . . . . . . . . . . . . . . . . . . . 96
17.9 LL(1) Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
17.10First and Follow sets . . . . . . . . . . . . . . . . . . . . . . . . . 98
17.11Transforming CFGs to LL(1) form . . . . . . . . . . . . . . . . . 98
18 Pushdown Automata 101
18.1 Finite and Innite Automata . . . . . . . . . . . . . . . . . . . . 101
18.2 Pushdown Automata . . . . . . . . . . . . . . . . . . . . . . . . . 102
18.3 Formal Denition of PDA . . . . . . . . . . . . . . . . . . . . . . 103
18.4 Deterministic and Nondeterministic PDAs . . . . . . . . . . . . . 104
18.5 CFG PDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
18.6 Top-Down construction . . . . . . . . . . . . . . . . . . . . . . . 105
18.7 S aS [ T T b [ bT . . . . . . . . . . . . . . . . . . . . . . . 105
18.8 Bottom-Up construction . . . . . . . . . . . . . . . . . . . . . . . 106
viii CONTENTS
18.9 S aS [ T; T b [ bT . . . . . . . . . . . . . . . . . . . . . . 106
18.10PDA CFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
18.11PDA to CFG Formally . . . . . . . . . . . . . . . . . . . . . . . . 107
18.12Example (L
6
from slide 69) . . . . . . . . . . . . . . . . . . . . . 107
19 Non-CF Languages 109
19.1 Not All Languages are Context Free . . . . . . . . . . . . . . . . 109
19.2 Context Sensitive Grammars . . . . . . . . . . . . . . . . . . . . 109
19.3 Example (1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
19.4 Example (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
19.5 Generating the empty string . . . . . . . . . . . . . . . . . . . . . 110
19.6 CFL CSL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
20 Closure Properties 113
20.1 Closure Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 113
20.2 Union of Context Free Languages is Context Free . . . . . . . . . 113
20.3 Concatenation of CF Languages is Context Free . . . . . . . . . 114
20.4 Kleene Star of a CF Language is CF . . . . . . . . . . . . . . . . 115
20.5 Intersections and Complements of CF Languages . . . . . . . . . 115
21 Summary of CF Languages 117
21.1 Why context-free? . . . . . . . . . . . . . . . . . . . . . . . . . . 117
21.2 Phrase-structure grammars . . . . . . . . . . . . . . . . . . . . . 117
21.3 Special cases of Phrase-structure grammars . . . . . . . . . . . . 118
21.4 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
21.5 Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
21.6 Recursive descent and LL(1) grammars . . . . . . . . . . . . . . 119
21.7 Pushdown automata . . . . . . . . . . . . . . . . . . . . . . . . . 119
21.8 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
21.9 Constructions on CFLs . . . . . . . . . . . . . . . . . . . . . . . . 120
III Turing Machines 121
22 Turing Machines I 123
22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
22.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
22.3 A crisis in the foundations of mathematics . . . . . . . . . . . . . 124
22.4 The computable functions: Turings approach . . . . . . . . . . . 124
22.5 What a computer does . . . . . . . . . . . . . . . . . . . . . . . . 124
22.6 What have we achieved? . . . . . . . . . . . . . . . . . . . . . . . 125
22.7 The computable functions: Churchs approach . . . . . . . . . . . 125
22.8 First remarkable fact . . . . . . . . . . . . . . . . . . . . . . . . . 126
22.9 Church-Turing thesis . . . . . . . . . . . . . . . . . . . . . . . . . 127
22.10Second important fact . . . . . . . . . . . . . . . . . . . . . . . . 127
22.11The universal machine . . . . . . . . . . . . . . . . . . . . . . . . 127
CONTENTS ix
22.12Third important fact . . . . . . . . . . . . . . . . . . . . . . . . . 128
23 Turing machines II 129
23.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
23.2 Informal description . . . . . . . . . . . . . . . . . . . . . . . . . 130
23.3 How Turing machines behave: a trichotomy . . . . . . . . . . . . 130
23.4 Informal example . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
23.5 Towards a formal denition . . . . . . . . . . . . . . . . . . . . . 131
23.6 Alphabets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
23.7 The head and the tape . . . . . . . . . . . . . . . . . . . . . . . . 132
23.8 The states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
23.9 The transition function . . . . . . . . . . . . . . . . . . . . . . . 132
23.10Conguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
23.11Formal denition . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
23.12Representing the computation . . . . . . . . . . . . . . . . . . . . 133
23.13A simple machine . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
23.14Graphical representation of M
1
. . . . . . . . . . . . . . . . . . . 134
23.15Some traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
23.16Turing machines can accept regular languages . . . . . . . . . . . 134
23.17Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
23.18Another machine . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
23.19Graphical representation of M
2
. . . . . . . . . . . . . . . . . . . 135
23.20A trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
24 Turing Machines III 137
24.1 An example machine . . . . . . . . . . . . . . . . . . . . . . . . . 137
24.2 Some denitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
24.3 Computable and computably enumerable languages . . . . . . . . 138
24.4 Deciders and recognizers . . . . . . . . . . . . . . . . . . . . . . . 138
24.5 A decidable language . . . . . . . . . . . . . . . . . . . . . . . . . 139
24.6 Another decidable language . . . . . . . . . . . . . . . . . . . . . 139
24.7 A corollary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
24.8 Even more decidable sets . . . . . . . . . . . . . . . . . . . . . . 140
24.9 An undecidable set . . . . . . . . . . . . . . . . . . . . . . . . . . 140
25 Turing Machines IV 141
25.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
25.2 A
FA
is decidable . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
25.3 A
PDA
is decidable . . . . . . . . . . . . . . . . . . . . . . . . . . 142
25.4 The halting problem . . . . . . . . . . . . . . . . . . . . . . . . . 142
25.5 Other undecidable problems . . . . . . . . . . . . . . . . . . . . . 143
25.6 Closure properties . . . . . . . . . . . . . . . . . . . . . . . . . . 143
25.7 Computable and computably enumerable languages . . . . . . . . 144
25.8 A language which is not c.e. . . . . . . . . . . . . . . . . . . . . . 145
x CONTENTS
26 Turing Machines V 147
26.1 A hierarchy of classes of language . . . . . . . . . . . . . . . . . . 147
26.2 A hierarchy of classes of grammar . . . . . . . . . . . . . . . . . . 147
26.3 Grammars for natural languages . . . . . . . . . . . . . . . . . . 148
26.4 A hierarchy of classes of automaton . . . . . . . . . . . . . . . . . 148
26.5 Deterministic and nondeterministic automata . . . . . . . . . . . 149
26.6 Nondeterminstic Turing Machines . . . . . . . . . . . . . . . . . . 149
26.7 NTM=TM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
26.8 More variations on Turing Machines . . . . . . . . . . . . . . . . 150
27 Summary of the course 151
27.1 Part 0 Algorithms, and Programs . . . . . . . . . . . . . . . . . 151
27.2 Part I Formal languages and automata . . . . . . . . . . . . . . 151
27.3 Part II Context-Free Languages . . . . . . . . . . . . . . . . . . 152
27.4 Part III Turing Machines . . . . . . . . . . . . . . . . . . . . . 152
27.5 COMP 202 exam . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
27.6 What next? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Part
Problems, Algorithms, and
Programs
1
Chapter 1
Introduction
1.1 COMP202
Formal Methods of Computer Science
COMP202 introduces a selection of topics, focusing on the use of formal no-
tations and formal models in the specication, design and analysis of programs,
languages, and machines.
The focus is on language: syntax, semantics, and translation.
Covers fundamental aspects of Computer Science which have many impor-
tant applications and are essential to advanced study in computer science.
1.2 Course information
Lecturer
Ray Nickson, CO 442, Phone 4635657,
Email nickson@mcs.vuw.ac.nz
Tutors
Alan Taalolo, CO 242, Phone 4636750,
Email alan@mcs.vuw.ac.nz
others to be advised
Web page
http://www.mcs.vuw.ac.nz/Courses/COMP202/
Lectures
Monday, Wednesday and Thursday, 4.10-5pm in Hunter 323.
3
4 CHAPTER 1. INTRODUCTION
Tutorials
One hour per week.
Text book
Introduction to Computer Theory (2nd Edition), by Daniel Cohen, Pub-
lished by Wiley in 1997 (available from the University bookshop, approx.
cost $135).
1.3 Assessment
Problem sets (10%)
The basis for tutorial discussion, and containing questions to write up for
marking.
Programming projects (20%)
Due in weeks 6 and 11.
Test (15%)
Two hours, on Thursday 1 September. Exact time to be conrmed.
Covers material from the rst half of the course.
Exam (55%)
Three hours.
To pass COMP 202 you must achieve at least 40% in the nal
exam, and gain a total mark of at least 50%.
1.4 Tutorial and Marking Groups
Tutorials start NEXT WEEK (on Tuesday 12 July).
There will be ve groups for tutorials and marking; we expect all tutorials
to be held in Cotton 245 (subject to conrmation).
Group Time
1 Tuesday 12-12.50pm
2 Tuesday 2.10-3pm
3 Thursday 2.10-3pm
4 Friday 11-11.50am
5 Friday 12-12.50pm DELETED
Please sign up for a group on the sheets posted outside Cotton 245.
You need to sign up for a group even if you dont intend to attend
tutorials, as these will also be your marking groups.
1.5 What is Computer Science?
Computer Science involves (amongst other things):
Describing complex systems, structures and processes.
1.6. SOME POWERFUL IDEAS 5
Reasoning about such descriptions, to establish desired properties.
Animating/executing descriptions to obtain resulting behaviour.
Transforming one description into another.
These are common threads in much of Computer Science.
Understanding them will form the main aim of this course.
1.6 Some Powerful Ideas
Computer Science has produced several powerful ideas to address these kinds
of problems.
Languages and notations: programming languages, command languages, data
denition languages, class diagrams, . . .
Mathematical models: graphs to model networks, trees to model program
structures, . . .
COMP202 concentrates on the idea of language.
We will study techniques for describing and reasoning about dierent lan-
guages, from simple to complicated.
We will keep in mind the idea that understanding language and performing
computation are closely related activities.
1.7 Related Areas
The course material is drawn mainly from, and used in:
Programming language design, denition and implementation.
Software specication, construction and verication.
Formal languages and automata theory.
It also draws on mathematics (especially algebra and discrete maths),
logic, and linguistics.
It has applications in areas such as Problem solving, User interface
design, Networking protocols, Databases, Programming language de-
sign, and indeed in most computer applications.
6 CHAPTER 1. INTRODUCTION
1.8 Lecture schedule
0. Problems and programs [Weeks 12]
Describing problems, Describing algorithms and programs, Proving prop-
erties of algorithms and programs.
1. Regular Languages [Weeks 36]
Dening and recognising nite languages, Properties of strings and lan-
guages, Regular expressions, Finite automata, Properties of regular lan-
guages.
Text book: Part I.
2. Context Free Languages [Weeks 710]
Context free grammars, Push down automata, Parsing.
Text book: Part II.
3. Computability Theory [Weeks 11-12]
Recursively enumerable languages, Turing Machines, Computable func-
tions.
Text book: Part III.
1.9 Conclusion
COMP 202 will look at various techniques for dening, understanding,
and processing languages.
It will be a mixture of theory and practice.
Mastery of concepts and techniques will be assessed by tests and exams;
problem sets (tutorials and assignments) will test your ability to explore
more deeply what you have learned; and projects will give you the oppor-
tunity to apply knowledge in practice.
The Course Requirements document provides denitive information, and
has links to various rules and policies about which you should be aware.
READ IT!
Chapter 2
Problems and Algorithms
2.1 Problems and Algorithms
Recall that our focus in this course will be on languages.
Before we start studying how to precisely dene languages, let us agree on
some language and notation that we will use to talk about those denitions
(metalanguage).
In particular, let us describe the (semiformal) languages that we will use to:
dene problems
express algorithms
prove properties.
2.2 Problems, Programs and Proofs
How can we specify a problem (independently of its solution)?
How shall we describe a solution to a problem?
program
machine
What does it mean to say that a given program/machine solves a given
problem?
How can we convince ourselves (and others) that a given program/machine
solves a given problem?
2.3 Describing problems
How can we describe a problem for which we wish to write a computer pro-
gram?
7
8 CHAPTER 2. PROBLEMS AND ALGORITHMS
E.g. P
1
: add two natural numbers.
P
2
: sort a list of integers into ascending order.
P
3
: nd the position of an integer x in a list l.
P
4
: what is the shortest route from Wellington to Auckland?
P
5
: is the text in le f a valid Java program?
P
6
: translate a C++ program into machine code.
In each case, we describe a mapping from inputs to outputs.
+----------+
| |
Input ---->| P |----> Output
| |
+----------+
To be more precise, we need to specify:
1. How many inputs, and what kinds of values they can have.
2. How many outputs, and what kinds of values they can have.
3. Which of the possible inputs are actually allowed.
4. What output is required/acceptable for each allowable input.
1,2. To dene number and kinds of inputs and outputs, we need to dene types.
Need basic types, eg integer, character, Boolean.
Combine these to give tuples, sequences/lists, sets, ...
Use mathematical types, rather that arrays & records,
Input and output domains = signature.
3. To dene what inputs are allowed, specify a subset of the input domain
by placing a precondition on the input values.
4. To dene the required/acceptable output, specify a mapping from inputs
to outputs.
Maybe a function; in general, a relation.
It is formalized by giving a postcondition linking inputs and outputs.
2.4 Example: Comparing Strings
Consider the following simple problem:
Determine whether two given character strings are identical.
2.5. UNDERSTANDING THE PROBLEM 9
If you need such an operation in a program youre writing, you might:
Use a built-in operation (e.g. s.equal).
Look for existing code, or a published algorithm.
Design and code an algorithm from scratch.
Before doing any of these, you should make sure you know exactly what
problem youre trying to solve, by dening the signature, precondition, and
postcondition.
2.5 Understanding the Problem
Dene the interface via a signature:
input Strings s and t
output Boolean r
or
Equal : String String Bool
Formalise the output constraint:
r s = t
where
s = t is dened by [s[ = [t[ and (i 1 .. [s[)s[i] = t[i]
Notation:
s[i] is the ith element of s (starting from 0)
[s[ is the length of s
2.6 Designing an Algorithm
In designing an algorithm to determine whether two strings are identical, we
need to consider:
What style of algorithm to write:
Iterative:
while ... do
...
or recursive:
Equal(s, t)

=
...
Equal(..., ...)
...
10 CHAPTER 2. PROBLEMS AND ALGORITHMS
Which condition to test rst:
if [s[ = [t[ then
while ... do
...
else
...
for each index position i in s do
if s[i] = t[i] then
...
else
...
...
What operations to use in accessing the strings.
Indexing/length
s[i] Return ith element of s (starting from 0).
[s[ Return length of s.
Head/tail
head(s) Return rst element of s.
tail(s) Return all but rst element of s.
empty(s) Return true i s is an empty string.
2.7 Dening the Algorithm Iteratively
Algorithm Equal;
input String s, String t;
output Boolean r where r s = t
r := true;
if [s[ = [t[ then
for k from 0 to [s[ 1 do
if s[k] ,= t[k] then
r := false
else
r := false
2.8 Dening the Algorithm Recursively
Equal(s, t)

=
if empty(s) and empty(t) then true
elsif empty(s) or empty(t) then false
elsif head(s) ,= head(t) then false
else Equal(tail(s), tail(t))
Chapter 3
Two Simple Programming
Languages
3.1 Imperative and Applicative Languages
We will dene two simple programming languages: one imperative, the other
applicative.
The language of while-programs
imperative: has assignments and explicit ow of control
similar in style to Pascal, Ada, C++, Java
corresponds to the iterative algorithm style
The language of applicative programs
applicative: concerned with applying functions to arguments and
evaluating expressions
similar in style to Lisp, Scheme, and to functional languages such as
Haskell (COMP304).
corresponds to the recursive algorithm style
3.2 The Language of While-Programs
assignment statements x := e
x is a variable, e is an expression
We (usually) dont worry about declarations
no-operation statement skip
Does nothing at all: just like ; by itself in C++
11
12 CHAPTER 3. TWO SIMPLE PROGRAMMING LANGUAGES
sequence S
1
; S
2
Note that ; is a separator (like Pascal, unlike C++)
selection if cond then S
1
else S
2
Usually no need for
Will omit else skip
Can chain conditions:
if C
1
then S
1
elsif C
2
then S
2
else S
n
iteration while cond do S
procedures: (a bit like static methods in Java)
procedure name(parameters);
begin
S
end
The heading names the program, and lists its inputs and outputs.
Can declare local variables (with types) after the heading: usually we
wont bother.
The program can be invoked from other programs simply by naming it: if
we have dened
procedure A(in x; out y); begin y:=x end
we can then call
A(2, z)
with the same eect as z := 2.
3.3 Comparing strings again
procedure Equal(in s, t; out r);
begin
if [s[ = [t[ then
k := 0; r := true;
while k < [s[ do
if s[k] ,= t[k] then
r := false;
k := k + 1
else
r := false
end
3.4 The Applicative Language
Purely applicative (or functional ) languages have no assignable variables.
3.5. THOSE STRINGS AGAIN ... 13
They are more like mathematical functions.
Programming in this style may seem unfamiliar, but it is much easier to get
right than imperative programming.
The basic constructs are function denition and function call. For ex-
ample:
add(x, y)

= x +y
double(x)

= add(x, x)
Instead of a selection statement, we have a conditional expression:
if cond then E
1
else E
2
is an expression whose value is E
1
or E
2
according to the value of cond.
For example:
abs(x)

=
if x 0 then
x
else
x
Instead of a looping construct, we use recursion.
The denition of a function f can include calls on f itself (mathematically a
no-no, but familiar from recursive procedures/methods in imperative program-
ming).
For example:
mul(x, y)

=
if x < 0 then
mul(abs(x), y)
elsif x = 0 then
0
else
add(y, mul(x 1, y))
3.5 Those strings again ...
Equal(s, t)

=
if [s[ = [t[ then Equal
(s, t, 0) else false

Equal
(s, t, k)

=
if k [s[ then true
elsif s[k] ,= t[k] then false
14 CHAPTER 3. TWO SIMPLE PROGRAMMING LANGUAGES
else Equal
(s, t, k + 1)
3.6 Using head and tail
Equal(s, t)

=
if isempty(s) and isempty(t) then true
elsif isempty(s) or isempty(t) then false
elsif head(s) ,= head(t) then false
else Equal(tail(s), tail(t))
Chapter 4
Reasoning about Programs
4.1 Some Questions about Languages
1. Is program P a valid program in language L?
2. How many valid sentences are there in language L?
3. What sentences are in the intersection of languages L
1
and L
2
?
4. How hard is the decision question for language L?
5. What does program P mean?
6. Does program P terminate when given input x?
7. What output does it produce?
8. Do programs P and Q do the same thing?
9. Is there a program in language L
1
that does the same thing as program
P in language L
2
?
10. Are languages L
1
and L
2
equally expressive?
4.2 Syntax
The rst four questions are about syntax.
They concern the form of sentences in a language, not the meanings.
In Parts I and II (weeks 310) of the course, we will look at issues of syntax:
What is a language?
How can we dene the set of all valid sentences?
How can we (mechanically) decide whether a sentence is valid in a lan-
guage?
15
16 CHAPTER 4. REASONING ABOUT PROGRAMS
What relationships are there between dierent models of languages?
4.3 Semantics
Semantics tells us what a (syntactically valid) programmeans: what will happen
if it is run with any given input.
Dening semantics is harder than dening syntax: we need a suitable (math-
ematical) model of program execution.
Such a model may be operational, denotational, or axiomatic:
an operational model denes some machine and a procedure for translating
programs into instructions for that machine (Part III of the course)
a denotational model denes mathematical functions that directly capture
the behaviour of the program (COMP 304)
an axiomatic model provides rules for reasoning about the specications
that a program satises.
We will develop a simple axiomatic model for while-programs.
4.4 Comparing Strings One More Time
begin
0 if |s| = |t| then
1 k := 0;
2 r := true;
3 while k < |s| do
4 if s[k] = t[k] then
5 r := false;
7 k := k + 1
8 else
9 r := false
10 end
How can we convince ourselves that it is correct?
Testing.
Mathematical reasoning.
We can reason about dierent input cases (case analysis) and the corre-
sponding execution sequences:
If [s[ , = [t[, the test at line 0 fails, and the program sets r to false (line 9),
which is the correct output in this case.
If [s[ = [t[, the test succeeds, and the program sets r to true (line 2), then
goes into the loop (lines 37).
4.4. COMPARING STRINGS ONE MORE TIME 17
Now this kind of reasoning breaks down, because we dont know how many
times the program will go round the loop. We have to reason in a way that is
independent of the number of iterations performed.
This means we need induction.
We need to identify some property that:
holds every time execution reaches the top of the loop (line 3)
guarantees the required result when the loop exits (after line 7).
When execution reaches the top of the loop, we know:
[s[ = [t[
0 k [s[
r = true i s[0 .. k 1] = t[0 .. k 1]
i.e. r (i 0 .. k 1)s[i] = t[i]
This is called a loop invariant.
We need to show that:
1. The loop invariant holds on entry to the loop.
2. If the loop invariant holds at the top of the loop, and the loop body is
executed, it will hold again at the top of the loop.
3. If the loop invariant holds at the top of the loop, and the loop exits, the
required property (the postcondition) holds afterwards.
Together, these constitute a proof by induction that the loop does what is
required.
1. Invariant holds on entry:
[s[ = [t[ by the if condition.
0 k [s[, since k = 0 (line 1) and 0 [s[ by denition.
k = 0 means that (i 0 .. k 1)s[i] = t[i] is trivially true; and
r = true.
2. Invariant is maintained by the loop:
[s[ = [t[ is unchanged, because s and t dont change.
0 k < [s[ is true initially (induction hypothesis), and k < [s[ (loop
condition), so 0 k [s[ 1.
r s[0 .. k 1] = t[0 .. k 1] initially (induction hypothesis).
Now, if s[k] = t[k] (line 4), r remains unchanged, and s[0..k] = t[0..k] i
s[0..k 1] = t[0..k 1].
On the other hand, if s[k] ,= t[k], r becomes false (line 5), and so does
s[0..k] = t[0..k].
In either case, 0 k [s[ 1 r s[0..k] = t[0..k].
Now, the next iteration of the loop will increment k, so whatever is true
of k now will be true of k 1 at the start of the next iteration.
So the loop invariant holds again, as required.
3. Postcondition holds at loop exit:
When [s[ = [t[,
0 k [s[,
r (s[0 .. k 1] = t[0 .. k 1]), and
k ,< [s[
all hold,
k = [s[, so r s[0 .. [s[ = 1] = t[0 .. [s[ 1].
But s[0 .. [s[ = 1] = s and t[0 .. [s[ 1] = t (remember [s[ = [t[), hence
r s = t.
Thus, the program will set r to true if s = t, and to false if s ,= t.
4.5 Laws for Program Verication
In verifying the above program, we used several kinds of knowledge and reason-
ing:
Standard laws of mathematics/logic.
Properties of operations used in the specication and program.
E.g. indexing, string length, string equality.
Laws for reasoning about program execution: assignments and control
structures.
Let us know make our use of the laws about execution a little more precise.
4.6 Reasoning with Assertions
Our reasoning was based on intuition about what must be true at particular
points in the execution.
For example, before line 3 we knew that [s[ = [t[, because we were inside the
then part of the if statement at line 0.
We also knew that k = 0 and r = true because of lines 1 and 2.
We can formalize that intuition by annotating the lines of the program with
assertions: logical statements that are known to be true immediately before the
line is executed.
Verication then proceeds by showing that assertions are indeed satised.
4.6. REASONING WITH ASSERTIONS 19
begin
{I0} if |s| = |t| then
{I1} k := 0;
{I2} r := true;
{I3} while k < |s| do
{I4} if s[k] = t[k] then
{I5} r := false
{I6} else skip;
{I7} k := k + 1
{I8} else
{I9} r := false
{I10} end
I
0
is just the precondition: in this case,
I
0
= true.
I
1
is true when we are inside the then part of the if.
A law for if tells us I
1
= I
0
[s[ = [t[; that is,
I
1
= [s[ = [t[.
Law for if Pif C then P CS
1
else P CS
2
I
3
is the loop invariant. There is no easy way to nd a loop invariant: the
only way is to understand how the loop works. In this case, we know that
0 k [s[, and that the purpose of the loop is to decide whether s = t up to
but not including element k. We also know that [s[ = [t[, from I
1
. Hence:
I
3
= 0 k [s[ (r s[0 .. k 1] = t[0 .. k 1]) [s[ = [t[.

We must show that the loop invariant holds initially; that is, I
1
k :=
0; I
2
r := trueI
3
is correctly annotated.
This requires reasoning about a sequence (;)
of assignment statements (:=).
Law for ; PS
1
Q QS
2
R
PS
1
; S
2
R
Our S
1
is k := 0 and our S
2
is r := true.
Our P, Q, and R are respectively I
1
, I
2
, and I
3
.
We must nd I
2
such that I
1
k := 0I
2
and I
2
r := trueI
3
.
Law for := Q[e/x]x := eQ
For I
2
r := trueI
3
, we need I
2
to be I
3
[true/r]: that is, I
2
= 0 k
[s[ (true s[0 .. k 1] = t[0 .. k 1]) [s[ = [t[.
Now, I
1
k := 0I
2
follows as long as I
1
is I
2
[0/k]:
indeed, 0 0 [s[ (true s[0 .. 0 1] = t[0 .. 0 1]) [s[ = [t[ is the same as
I
1
by simplication.
Immediately inside the loop, the loop invariant (I
3
) still holds, and k
[s[ 1, by the loop condition. (We are using a Law for while here, but lets
skip the formality.)
Hence:
I
4
= (r s[0 .. k 1] = t[0 .. k 1]) [s[ = [t[ k [s[ 1.

The if law tells us that s[k] ,= t[k] at I
5
, so:
I
5
= I
4
s[k] ,= t[k].
The assignment law justies adding r = false (check it!):
I
6
= I
5
r = false.
Before line 7, either the then part was taken, in which case I
6
holds; or, it
was not taken, in which case, I
4
still holds (skip law). So:
I
7
= (I
6
) (s[k] = t[k] I
4
).
After line 7, we require I
3
to hold again, to maintain the loop invariant.
Hence, we must prove: I
7
k := k + 1I
3
.
The assignment law will justify this (check it).
After the loop exits, we know that the loop invariant I
3
still holds, and that
k = [s[. Thus:
I
8
= 0 [s[ [s[ r s[0..[s[ 1] = t[0..[s[ 1] [s[ = [t[.

In the else branch of the main if, we get:
I
9
= [s[ , = [t[.
In that case, we assign r := false.
Finally, at end the two branches of the outer if come together:
I
10
= (I
8
) (I
9
r = false)
which implies r s = t (exercise).
Part I
Regular Languages
21
Chapter 5
Preliminaries
5.1 Part I: Regular Languages
In this part of COMP 202 we will begin to look at formal languages, and at how
they can be dened.
In this lecture we will introduce some of the basic notions required. In
subsequent lectures we will look at:
dening languages using regular expressions
dening languages using nite automata
We will also explore the relationship between these ways of dening lan-
guages.
5.2 Formal languages
What we say formal we mean that we are concerned simply with the form (or
the syntax) of the language.
We are not concerned with the meaning (or semantics) of the symbols.
The theory of syntax is very much more straightforward and better under-
stood than the theory of semantics.
We are not concerned with trying to understand natural languages like
English or M aori, nor even with the question of whether the tools we develop to
deal with formal languages are appropriate for dealing with natural languages.
5.3 Alphabet
Denition 1 (Alphabet) An alphabet is a nite set of symbols.
Example: Alphabets
23
24 CHAPTER 5. PRELIMINARIES
1. 1, 0
2. true, false, pencil, &&&&, ???
3. ), k, , , f
4. enter, leave, left-down, right-down, left-up, right-up
In the textbook, Cohen separates the elements of a set with space rather
than a comma.
The symbols themselves are meaningless, so we may as well pick a convenient
alphabet: Cohen uses a b in his examples.
Conventionally we use and (occasionally) as names for alphabets.
5.4 String
Denition 2 (String) A string over an alphabet is a nite sequence of symbols
from that alphabet.
Example: Some strings
Examples: if we have an alphabet = ), k, , , f then
1. )
2. k
3. kkf
4.
5. k))k)f
6. k)
are strings over .
String number 4 is the empty string. We need a better notation for the
empty string than simply not writing anything, so we use a meta-symbol for it.
Dierent authors use dierent meta-symbols, the most common ones being
, , and . Cohen chooses , so we might as well, too.
Note that is a string over any alphabet, but is not a symbol in the
alphabet.
5.5 Language, Word
Denition 3 (Language) A language is a set of strings.
Denition 4 (Word) A word is a string in a language.
5.6. OPERATIONS ON STRINGS AND LANGUAGES 25
Much of the rest of this course is devoted to investigating the related prob-
lems of:
how we can dene languages, and
how we can show whether or not a given string is a word in a language.
5.6 Operations on Strings and Languages
Denition 5 (Concatenation) If s and t are strings then the concatenation
of s and t, written st, is a string.
st consists of the symbols of s followed immediately by the symbols of t.
We freely allow ourselves to concatenate single symbols onto strings.
We can concatenate two languages. If L
1
and L
2
are languages then:
L
1
L
2
= st[s L
1
, t L
2
Denition 6 (Length of a string) The length of a string is the number of

occurrences of symbols in it.
If s is a string we write [s[ for the length of s.
We can give a recursive algorithm for [ [:
[[ = 0
if x is a single symbol and y a string, then [xy[ = 1 +[y[
Denition 7 (Kleene closure (Kleene *)) If S is a set then S
is the set
consisting of all the sequences of elements of S
Note: is in S
.
We can dene S
inductively:
S
if x S and y S
then xy S
If is an alphabet then
is a language over .
If L is a language then L
is a language.
Example: Kleene closure
if = 1, 0 then
= , 1, 0, 11, 10, 01, 00, . . .

if L = kk, k then L
= , kk, k, kkkk, kkk, kkk,

kk, . . .
Question: what is (
?
Denition 8 (Kleene +) If S is a set then S
+
is the set consisting of all the
non-empty sequences of elements of S
Note: is not in S
+
, unless it was in S.
We can dene S
+
inductively:
if x S then x S
+
if x S and y S
+
then xy S
+
We can also dene S
+
as SS
5.7 A proof
Throughout this course we will perform proofs by induction, so we begin here
with a simple one.
When we say that a string is a sequence of symbols over an alphabet we
mean that a string is either:
the empty string , or
a symbol from followed by a string over
So, if we want to prove properties of strings over some alphabet we can
proceed as follows:
prove the property holds of the empty string , and
prove the property holds of a string xy where x is an arbitrary element of
and y an arbitrary string over assuming the property holds of y.
More formally, in order to show P(s) for an arbitrary string s over an alpha-
bet , show:
Base case P()
Induction step P(xy), given P(y), where x is an arbitrary element of , y an
arbitrary string over
This sort of pattern of inductive proof occurs in very many situations in the
mathematics required for computer science.
5.8 Statement of the conjecture
The conjecture that we will prove by induction is:
Conjecture 1 [st[ = [s[ +[t[
In English: the length of s concatenated with t is the sum of the lengths of
s and t.
We proceed by induction on s, where P(s) is [st[ = [s[ +[t[.
5.9. BASE CASE 27
5.9 Base case
P() is [t[ = [[ +[t[.
We show that the LHS and the RHS of this equality are just the same.
On the LHS we use the fact that t = t, to give us [t[
On the RHS we observe that [[ = 0 and 0 +n = n to give us [t[.
Thus LHS = RHS, and the base case is established.
5.10 Induction step
P(xy) is [(xy)t[ = [xy[ + [t[.
We must show [(xy)t[ = [xy[ +[t[, given [yt[ = [y[ +[t[.
On the LHS we use the associativity of concatenation to observe that (xy)t =
x(yt).
Next we use the denition of [ [ to see that the LHS is 1 +[yt[.
The RHS is [xy[ + [t[ We use the denition of [ [ to see that the RHS is
1 +[y[ +[t[.
Now we use the induction hypothesis [yt[ = [y[ + [t[ to show that the RHS
is 1 +[yt[.
So we have shown that LHS = RHS in the induction step.
5.11 A theorem
We have established both the base case and the induction step, so we have
completed the proof.
Now our conjecture can be upgraded to a theorem:
Theorem 1 [st[ = [s[ +[t[
Chapter 6
Dening languages using
regular expressions
6.1 Regular expressions
In this lecture we will see how we can dene a language using regular expressions.
The rst step is to give an inductive denition of what a regular expression
is.
Before we dene regular expressions we will give a similar denition for
expressions of arithmetic, to illustrate the method used.
6.2 Arithmetic expressions
We will now give a formal inductive description of the expressions we have in
arithmetic.
if n N then n is an arithmetic expression;
if e
1
and e
2
are expressions then so are
(e
1
)
(e
1
+e
2
),
(e
1
e
2
),
(e
1
e
2
),
(e
1
/e
2
),
(e
e2
1
) (we could have chosen to use a symbol, such as ^, rather than
just use layout)
This denition forces us to include lots of brackets:
((1+2)+3)
29
30CHAPTER 6. DEFINING LANGUAGES USING REGULAR EXPRESSIONS
(1+(2+3))
are expressions, but
1+(2+3)
1+2+3
are not.
6.3 Simplifying conventions
We do not need to use all these brackets if we adopt some simplifying conven-
tions.
First we allow ourselves to drop the outermost brackets, so we can write:
(1+2)+3
rather than:
((1+2)+3)
We also adopt some conventions about the precedence and associativity of
the operators.
6.4 Operator precedence
+ and are of lowest precedence
and / are next
^ is highest
So we can write:
2 + 3 4
5
rather than:
2 + (3 (4
5
))
6.5 Operator associativity
We also adopt a convention that all the operators are left associative. This
means that if we have a sequence of operators of the same precedence we ll in
the brackets to the left.
So we can write:
1 + 2 + 3
rather than:
(1 + 2) + 3
We can have right associative operators, or non-associative operators.
6.6. CARE! 31
6.6 Care!
Be careful not to confuse the two statements:
1. # is a left associative operator
2. # is an associative operation
1 means that we can write x#y#z instead of (x#y)#z.
2 means that (xyz).(x#y)#z = x#(y#z).
In the examples above + and / are both left associative. However, addition
is associative, but division is not:
60/6/2 = 5
60/(6/2) = 20
6.7 Dening regular expressions
If is an alphabet then:
the empty language and the empty string are regular expressions
if r then r is a regular expression
if r is a regular expression then so is (r
)
if r and s are regular expressions then so are
(r +s)
(rs)
6.8 Simplifying conventions
We allow ourselves to dispense with outermost brackets.
has the highest precedence and + the lowest.

We take all operators to be left-associative, so we write:
r +s +t instead of (r +s) +t
rst instead of (rs)t
6.9 What the regular expressions describe
So far we have said what the form of a regular expression is, without explaining
what it describes. This is as if we had explained the form of the arithmetic
expressions without explaining what the operations of addition, subtraction and
so on were.
The regular expressions describe sets of strings over , that is languages, so
we must explain what languages the regular expressions describe.
We write Language(r) for the language described by regular expression r.
When we are being sloppy we use r for both the regular expression and the
language it describes.
Language()
The language described by is the empty language .
Language()
The language described by is .
Note , =
Language(r), r
The language described by r, r is r
Example:
If = 0, 1 then:
Language(0) = 0
Language(1) = 1
Language(r
)
The language described by r
is the Kleene closure of the language de-

scribed by r.
Example:
If = 0, 1 then:
Language(0
) = , 0, 00, 000, . . .
Language(1
) = , 1, 11, 111, . . .
Language(r +s)
The language described by r + s is the union of the languages described
by r and s.
Example:
If = 0, 1 then:
Language(1 +0) = 1, 0
Language(1
+0
) = , 0, 1, 00, 11, 000, 111, . . .

6.10. DERIVED FORMS 33
Language(rs)
The language described by rs is the concatenation of the languages de-
scribed by r and s.
Example:
If = 0, 1 then:
Language(10) = 10
Language((1 +0)(1 +0)) = 11, 10, 01, 00
Language((1 +0)0
) = 1, 0, 10, 00, 100, 000, 1000, . . .

6.10 Derived forms
Sometimes we use these derived forms:
r
+
=
def
rr
r
n
=
def
rr . . . rr

n rs
Chapter 7
Regular languages
7.1 Regular Languages
In the last lecture we introduced the regular expressions and the operations that
the regular expressions correspond to.
The regular expressions allow us to describe languages. A language which
can be described by a regular expression is called a regular language.
If is an alphabet, r is a symbol, and r is a regular expression, then:
Language() is
Language() is
Language(r), r is r
Language(r
) is (Language(r))
Language(r +s) is Language(r) Language(s)

Language(rs) is Language(r)Language(s)
7.2 Example languages
In this lecture we will look at the languages which we can describe with regular
expressions. We will usually restrict attention to = 0, 1
Example: Language(0 +1)
Language(0 +1)
= Language(0) Language(1)
= 0 1
= 0, 1
35
36 CHAPTER 7. REGULAR LANGUAGES
Example: Language(1 +0)
Language(1 +0)
= Language(1) Language(0)
= 1 0
= 1, 0
We see that 1 +0 and 0 +1 describe the same language
Two regular expressions which describe the same language are equal regular
expressions, so we can write:
1 +0 = 0 +1
We can be more general, however. For all regular expressions, r, s, we have:
r +s = s + r
because for all sets S
1
, S
2
we have S
1
S
2
= S
2
S
1
The dierence between 1 +0 = 0 +1 and r +s = r +s is just the same as
the dierence between 3 + 2 = 2 + 3 and x +y = y +x.
You would not infer x +x = x x from 2 +2 = 2 2 in ordinary arithmetic.
Example: Language((0 +1)
)
Language((0 +1)
)
= (Language(0 +1))
= (Language(0) Language(1))
= (0 1)
= 0, 1
= , 0, 1, 00, 01, 10, 11, . . .

Example: Language((0
)
Language((0
)
= (Language(0
))
= (Language(0
)Language(1
))
= (c
1
. . . c
n
[c
i
0, n N, 1, 11, . . .)
= , 1, 11, . . . , 0, 01, 011, . . .
= , 0, 1, 00, 01, 10, 11, . . .

The reasoning above shows that: (0 +1)
= (0
How would you generalise the reasoning given above to show for all regular
expressions, r, s:
(r +s)
= (r
?
7.3. FINITE LANGUAGES ARE REGULAR 37
7.3 Finite languages are regular
A nite language is one which has a nite number of words.
Theorem 2 Every nite language is a regular language.
Proof
A language is regular if it can be described by a regular expression.
A nite language can be written as: w
1
, . . . , w
n
for some n N. This
means we can write it as: w
1
. . . w
n
.
Note: Recall that if n = 0 in the above then we are dealing with the empty
language, which is certainly regular.
If we can represent each of the w
i
as a regular expression w
i
, then the
language is described by w
1
+. . . +w
n
.
Each word is just a string of symbols s
1
. . . s
m
for some m N. So each
word is the only word in the language s
1
. . . s
m
.
Note: Recall that if m = 0 in the above then we are dealing with the empty
string, which is certainly a regular expression.
Hence, every nite language can be described by a regular expression.
Example: Finite languages
1, 11, 111, 1111 is Language(1 +11 +111 +1111)
00, 01, 10, 11 is Language(00 +01 +10 +11)
Notice that 00, 01, 10, 11 is also Language((1 + 0)(1 + 0)), but to show
that the language is regular we only need to provide one regular expression
describing it.
7.4 An algorithm
The proof that every nite language is regular provides us with an algorithm
which takes a nite language and returns a regular expression which describes
the language.
Many of the proofs that we see in this course have an algorithmic character.
Such algorithmic proofs are often called constructive, and provide the basis for
a program.
Conversely every program is a constructive proof (even if we are usually not
interested in what it is a proof of).
7.5 EVEN-EVEN
We now describe a language which Cohen introduces, which he calls EVEN-
EVEN.
EVEN-EVEN = Language((00 +11 + (01 +10)(00 +11)
(01 +10))
)
It may not be immediately obvious what language this is (or rather, if there
is a simple description of this language).
However every word of EVEN-EVEN contains an even number of 0s and an
even number of 1s.
Furthermore EVEN-EVEN contains every string with an even number of 0s
and an even number of 1s.
7.6 Deciding membership of EVEN-EVEN
Suppose we are given the task of writing a program to decide whether a given
string is in EVEN-EVEN. How would we go about this task?
One method would be to use two counters, n
0
and n
1
, and to go through
the string counting all the 0s and all 1s. If n
0
and n
1
are both even at the end
of the string then the string is in EVEN-EVEN.
If we call the pair n
0
, n
1
) a state of the program then our program will
need to be able to go through as many states as there are symbols in a string
to decide membership.
Is there a program which uses fewer states?
7.7 Using fewer states
We dont actually care now many 0s and 1s there are, so we could use two
boolean ags b
0
and b
1
. As we go through the string we ip the appropriate ag
as we read each symbol. If both the ags end up in the same state that they
started in then the string is in EVEN-EVEN.
If we call the pair b
0
, b
1
) a state of the program then our program will need
to be able to go through at most four distinct states to decide membership.
7.8 Using even fewer states
Suppose, instead of reading the symbols one by one, we read them two by two.
Now we need only use one boolean ag, and we do not have to ip it all the
time.
If both the symbols we read are the same then we leave the ag alone. If
they dier then we ip the ag. Two dierent symbols means we have just read
and odd number of 0s and 1s. If we had read an even number before we have
now read an odd number, if we had read an odd number before we have now
read an even number.
If the ag ends up in the same state it began in then the string is in EVEN-
EVEN.
The program need go through at most two distinct states to decide mem-
bership.
7.9 Uses of regular expressions
Regular expressions turn out to have practical uses in a variety of places.
7.9. USES OF REGULAR EXPRESSIONS 39
We can informally think of a regular expression as describing the most typical
string in a language.
The tokens of a programming language can usually be described by regular
expressions.
For example, a type name might be an uppercase letter followed by a se-
quence of uppercase letters, lowercase letters or underscores, and so on.
The task of identifying the tokens in a program is called lexical analysis,
and is the rst step in compiling the program. The UNIX utility lex is a
lexical-analyser generator: given regular expressions describing the tokens of
a language it generates a program which will perform lexical analysis. Since
regular expressions describe typical strings they are the basis for many searching
and matching utilities.
The UNIX utility grep allows the user to specify a regular expression to
search for in a le.
Many text editors provide a facility similar to grep for searching for strings.
Programming languages which are oriented towards text processing usually
allow us to describe strings using regular expressions.
Chapter 8
Finite automata
8.1 Finite Automata
We leave regular expressions and introduce nite automata.
Finite automata provide us with another way to describe languages.
Note Cohen uses the term nite automaton (FA) where many other au-
thors use the term deterministic nite automaton (DFA). After we have intro-
duced deterministic nite automata we will introduce non-deterministic nite
automata. In common with everybody else, Cohen calls these NFAs.
8.2 Formal denition
Denition 9 (Finite automaton) A nite automaton is a 5-tuple (Q, , , q
0
, F)
where:
Q is a nite set: the states
is a nite set: the alphabet
is a function from Q to Q: the transition function
q
0
Q: the start state
F Q: the nal or accepting states.
Note: F may be Q, or F may be the empty set.
8.3 Explanation
While this is all very well it does not help us see what nite automata do, or
how.
Suppose we have a string made up of symbols from . We begin in the start
state q
0
. We read the rst symbol s from the string, and then enter the state
41
42 CHAPTER 8. FINITE AUTOMATA
given by (q
0
, s). We then read the next symbol from the string and use the
transition function to move to a new state, repeating this process until we reach
the end of the string.
If we have read all the string, and are in one of the accepting states we say
that the automaton accepts the string.
The language accepted by the automaton is the set of all the strings it
accepts.
So automata give us a way to describe languages.
8.4 An automaton
Suppose we have an automaton M
1
= (Q, , , q
0
, F), where:
Q = S
1
, S
2
, S
3
= 0, 1
(S
1
, 1) = S
2
(S
2
, 1) = S
3
q
0
= S
1
F = S
3
Lets see if M
1
accepts 1.
We begin in state S
1
and we read 1. (S
1
, 1) = S
2
so we enter S
2
.
Our string is empty, but S
2
is not a nal state, so M
1
does not accept 1.
Lets see if M
1
accepts 11. We begin in state S
1
and we read 1. (S
1
, 1) = S
2
so we enter S
2
.
Now we read 1. (S
2
, 1) = S
3
so we enter S
3
.
Our string is empty, and S
3
is an accepting state so M
1
accepts 11.
Lets see if M
1
accepts 0. We begin in state S
1
and we read 0. is a partial
function, and there is no value given for (S
1
, 0). We cannot make any progress
here, and so 0 is not in the language accepted by M
1
. We can think of the
machine as crashing on this string.
Note: here we dier from Cohen. He insists (initially, at least) that be
total, and adds a black hole state to all his machines, whereas we allow to
be partial.
Cohen would have an new state S
4
, and would extend with:
(S
1
, 0) = S
4
(S
2
, 0) = S
4
(S
4
, 0) = S
4
(S
4
, 1) = S
4
Since S
4
is not an accepting state, and since once we enter it there is no way
to leave, this extension does not change the language accepted by the machine.
8.5. THE LANGUAGE ACCEPTED BY M
1
43
8.5 The language accepted by M
1
A moments reection will show that the only string which M
1
accepts is 11,
and so the language M
1
accepts is Language(11).
8.6 Another automaton
Suppose we have an automaton M
2
= (Q, , , q
0
, F), where:
Q = S
1
, S
2
, S
3
= 0, 1
(S
1
, 1) = S
2
(S
2
, 1) = S
3
(S
3
, 0) = S
3
(S
3
, 1) = S
3
q
0
= S
1
F = S
3
M
2
is nearly the same as M
1
, but now we have transitions from S
3
to itself on
reading either a 0 or a 1.
What language does M
2
accept?
Clearly the only way to get from S
1
to S
3
is to begin with two 1s. If the
string is just 11 it will be accepted.
What about 111? This will be accepted too, as (S
3
, 1) = S
3
.
What about 110? This will be accepted too, as (S
3
, 0) = S
3
.
In fact any extension of 11 will be accepted, so we see that M
2
accepts
Language(11(0 +1)
).
8.7 A pictorial representation of FA
It is often easier to see what an FA does if we draw a picture of it.
We can draw a nite automaton out as a labelled, directed graph. Each state
of the machine is a node, and the transition function tells us which nodes are
connected by which edges. We mark the start state, and any accepting states
in some special way.
M
1
can be represented as:
` _
1
1
\
2
1
` _
3
+
M
2
can be represented as:
` _
1
1
\
2
1
` _
3
+
1,0

Note: Sometimes the start state is pointed to by an arrow, and the accepting
states are drawn as double circles or squares, e.g.:
\
1
1
\
2
1
\
3
1,0
or
\
1
1
\
2
1
3
1,0
8.8 Examples of constructing an automaton

If L is a language over some alphabet , then L, the complement of L, is
s[s
&s , L.
If we have some automaton M
L
, which accepts L, then we can construct an
automaton M
L
which accepts L.
M
L
will accept just the strings which M
L
does not, and will not accept just
the strings which M
L
does.
To recognise the complement of a language, we can think of ourselves as
going through the same steps as we would take to recognise the language, but
making the opposite decision at each state.
We then expect M
L
and M
L
, to have the same states, but an accepting state
of M
L
will not be an accepting state of M
L
and vice versa.
8.9 Constructing M
L
M
2
from above accepts Language(11(0 +1)
).
Our rst suggestion for a machine, M
3
to accept Language(11(0 +1)
) is
to have:
the same set of states,
the same transition function,
the complement of the set of accepting states of M
2
.
M
3
= (Q, , , q
0
, F), where:
Q = S
1
, S
2
, S
3
= 0, 1
(S
1
, 1) = S
2
(S
2
, 1) = S
3
(S
3
, 0) = S
3
(S
3
, 1) = S
3
8.10. A MACHINE TO ACCEPT LANGUAGE(11(0 +1)
) 45
q
0
= S
1
F = S
1
, S
2
F(M
3
) is QF(M
2
), as you would expect. Graphically:
` _
1
,+
1
` _
2
+
1
\
3
1,0
This automaton correctly accepts and the string 1.

What about the string 0?
The machine crashes, so, close, but no cigar.
The problem is that M
2
and M
3
both crash on the same strings: M
3
should
accept the strings that M
2
crashes on, and vice versa.
What we have to do is to convert our initial machine into a machine which
accepts the same language and whose transition function is total. We can always
do this. (How?)
Then we construct a third machine whose set of accepting states is the
complement of that of the second machine.
8.10 A machine to accept Language(11(0 +1)
)
M
4
= (Q, , , q
0
, F), where:
Q = S
1
, S
2
, S
3
, S
4
= 0, 1
q
0
= S
1
F = S
1
, S
2
, S
4
(S
1
, 0) = S
4
(S
1
, 1) = S
2
(S
2
, 0) = S
4
(S
2
, 1) = S
3
(S
3
, 0) = S
3
(S
3
, 1) = S
3
(S
4
, 0) = S
4
(S
4
, 1) = S
4
Graphically:
` _
1
,+
`
0
1
` _
2
+
1
_
0
\
3
1,0
` _
4
+
1,0

8.11 Summary
We have outlined a method which allows us to construct a machine which accepts
L from a machine which accepts L.
We can think of this construction a proof of the theorem:
Theorem 3 If a language L can be dened using an FA, then so can the lan-
guage L
Chapter 9
Non-deterministic Finite
automata
9.1 Non-deterministic Finite Automata
We now introduce a variant on the Finite Automaton, the Non-deterministic
nite automaton (NFA).
The dierence between an NFA and an FA is that, in an NFA more than
one arc leading from a state may be labelled by the same symbol.
Formally, the transition function of an NFA takes a state and a symbol and
returns a set of states.
A string can now label more than one path through the automaton.
The string itself does not determine which state we will end up when we
read it: there is some non-determinism built into the machine.
9.2 Preliminaries
Powerset
If A is a set then 2
A
is the powerset of A,
i.e. the set of all subsets of A.
2
A
A 2
A
Partial and total functions
A partial function is not dened for some values of its domain. If f is a
partial function and x is in the domain of f we write f(x) if f(x) is
dened.
47
48 CHAPTER 9. NON-DETERMINISTIC FINITE AUTOMATA
9.3 Formal denition
Denition 10 (Non-deterministic nite automaton) A non-deterministic
nite automaton is a 5-tuple (Q, , , q
0
, F) where:
is a function from Q to 2
Q
: the transition function
q
0
Q: the start state
Note: is always total.
9.4 Just like before. . .
An NFA accepts a string if there is a path from the start state to an accepting
state labelled by the string.
The set of strings accepted by a NFA is the language it accepts.
9.5 Example NFA
M
5
= (Q, , , q
0
, F), where
Q = S
1
, S
2
, S
3
, S
4
= 0, 1
q
0
= S
1
F = S
2
, S
3
(S
1
, 0) = S
2
, S
3
(S
1
, 1) =
(S
2
, 0) = S
4
(S
2
, 1) = S
4
(S
3
, 0) =
(S
3
, 1) = S
3
(S
4
, 0) = S
2
(S
4
, 1) = S
2
For clarity and conciseness, we sometimes choose to show as a table:

0 1
S
1
S
2
, S
3

S
2
S
4
S
4
S
3
S
3
S
4
S
2
S
2
And, of course we can give a pictorial representation. M

5
can be drawn:
9.6. M
5
ACCEPTS 010 49
` _
3
+
1
` _
1
0
` _
2
+
0,1
\
4
`
0,1
9.6 M
5
accepts 010
Now lets see whether some strings are in the language dened by M
5
. We will
try to nd a path from the start state to an accepting state.
We begin with 010.
We start in S
1
, and the rst symbol in the string is 0. Now we have a choice
as there are two transitions out of S
1
labelled by 0.
We choose to go to S
2
. The 1 takes us to S
4
, and the nal 0 brings us back
to S
2
. We have exhausted our string, and we are in an accepting state, so M
5
accepts 010.
9.7 M
5
accepts 01
Next we try 01
We start in S
1
, and the rst symbol in the string is 0. Now we have a choice
as there are two transitions out of S
1
labelled by 0.
As before we choose to go to S
2
. The next transition takes us to S
4
. Our
string is exhausted, but we are not in an accepting state.
If this were a deterministic automaton then we would know that the string
was not in the language. However this is a non-deterministic machine, and there
may be another path from the start state to an accepting state labelled by 01.
We backtrack to the place where we made a choice, and pick the other arc
labelled by 0. We see that this string is accepted by M
5
.
9.8 Comments
A little thought shows that M
5
accepts Language(01
+0((1 +0)(1 +0))
).
The following FA, M
6
= (Q, , , q
0
, F), accepts the same language:
Q = S
1
, S
2
, S
3
, S
4
, S
5
, S
6
= 0, 1
q
0
= S
1
F = S
3
, S
5
, S
6
0 1
S
1
S
3
S
2
S
2
S
3
S
4
S
5
S
4
S
6
S
6
S
5
S
6
S
3
S
6
S
4
S
4
9.9 NFAs are as powerful as FAs
Now we move on to prove two theorems about the languages denable by FAs
and NFAs.
Theorem 4 Every language denable by an FA is denable by a NFA.
The proof of this consists of an algorithm which takes an arbitrary FA, and
constructs an NFA which accepts the same language.
9.10 Proof
We have an FA, M
FA
= (Q, , , q
0
, F), and we will construct an NFA M
NFA
=
(Q
, q
0
, F
), which accepts the same language.

Clearly
= is the only sensible choice. It also seems reasonable to keep

the same structure for the machine and take:
Q
= Q
q
0
= q
0
F
= F
The only real dierence is in the two transition functions:
one may be partial, the other is total;
the range of one is states, the range of the other is sets of states.
The solution is simple:
(S, ) = if (S, ) then (S, ) else

The denition that we have given (Q
, q
0
, F
) is certainly an NFA. It
should also be clear that any string which labels a path from the start state to
an accepting state in the FA will do so in the NFA, and only strings which label
such a path in the FA will do so in the NFA.
Hence the two languages are the same.
Thus we have shown that the descriptive power of NFAs is at least as strong
as that of FAs.
This should not be a surprise, as NFAs were introduced as a generalisation
of FAs.
9.11 Using the graphical representation
Given that the graphical representation of an FA is the graphical representation
of an NFA which accepts the same language we could have just drawn the picture
and said Look!
That is too easy to be a real proof.
9.12. FAS ARE AS POWERFUL AS NFAS 51
9.12 FAs are as powerful as NFAs
Theorem 5 Every language denable by an NFA is denable by a FA.
The proof of this consists of an algorithm which takes an arbitrary NFA,
and constructs a FA which accepts the same language.
This is a more remarkable result as NFAs were introduced as a generalisation
of FAs.
Alas we cannot look at a picture and go Ha! We are forced to make a
careful construction.
The key idea is that the states of the FA we construct will be sets of states
of the original NFA.
When we tried to see if M
5
would accept strings we used a strategy rather
like depth-rst search. We pursued a path until we were either successful or
stymied, and then retraced our steps to make alternate choices.
Suppose instead we had kept a record of all the possible states we could have
reached as we worked our way along the string. This is rather like breadth-rst
search.
For the string 010 and M
5
we could have reached either of the states S
2
, S
3
after we had read 0, any state reachable from S

2
or S
3
on 1 next and so on.
9.13 Useful observations
If T is a set of states of some machine then T 2
Q
.
If Q if nite then so is 2
Q
Given some T 2
Q
then the set of all states reachable on some symbol
is
(t, )[t T.
For each T 2
Q
, there is just one such
(t, )[t T
Because there is just one set of states reachable on a given symbol from a
given set of states we have got determinism back.
We still have to sort out what the start states and accepting states are.
The singleton set of the start state of the NFA is the obvious candidate for
the start state for our FA.
A string is accepted by the NFA if there is any path from the start state to
an accepting state labelled by that string.
Therefore we will take every set which contains an accepting state of the
NFA to be an accepting state of the FA we are constructing.
Where are we? We now know:
what the alphabet will be
what the states of our FA look like,
what the start state of our FA will be
what the accepting states will be
what the transition function will look like.
We still have to give a full description of the states and the transition func-
tion.
9.14 Construction
After all that we now give a method to construct an FA
M
FA
= (Q
, q
0
, F
)
which accepts the same language as an NFA
M
NFA
= (Q, , , q
0
, F)
As expected:
= , and q
0
= q
0
.
We will have Q
2
Q
. We dont take Q
= 2
Q
, as we need only concern
ourselves with the states we can actually reach.
Hence we construct
and Q
in tandem.
9.15 Constructing
and Q
Start with q
0
= q
0
.
(q
0
, ) = (q
0
, ) where
This will generate new states of the FA, and we continue this process, con-
structing
using until no new states are created.

Any state of the FA which contains an accepting state of the NFA is an
accepting state of the FA.
Finally we may tidy things up by giving nice names to the states of the FA!
9.16 Example
Suppose we have an NFA = (Q, , , q
0
, F), where:
Q = S
1
, S
2
, S
3
, S
4
, S
5
, S
6
= 0, 1
q
0
= S
1
F = S
2
, S
3
, S
6
0 1
S
1
S
2
, S
5
S
2
S
3
, S
4

S
3
S
3
S
3
S
4
S
2
S
2
S
5
S
6
S
6
S
6

9.17. NFA WITH TRANSITIONS 53
We dene an FA (Q
, q
0
, F
), beginning with
=
q
0
= S
1
We construct
0 1
[=1]S
1
[=2] [=4]S
2
, S
5
[=3]
[=5]S
2
, S
5
[=6]S
3
, S
4
[=8]S
6
[=7]S
3
, S
4
[=10]S
2
, S
3
[=12]S
2
, S
3
[=9]S
6
[=13]S
6
[=14]
[=11]S
2
, S
3
[=15]S
3
, S
4
[=16]S
3
[=17]S
3
[=18]S
3
[=19]S
3
Then:
F
= S
2
, S
5
, S
3
, S
4
, S
6
, S
2
, S
3
, S
3
= S
1
, , S
2
, S
5
, S
3
, S
4
, S
6
, S
2
, S
3
, S
3
9.17 NFA with transitions

We have seen various ways of dening languages:
1. regular expressions
2. nite automata
3. non-deterministic nite automata
We have shown that, although 3 look like an extension of 2 they can accept
exactly the same languages.
Now we will introduce an extension of NFAs, NFA with transitions.
9.18 Motivation
We will show that NFAs with transitions can be used to accept exactly the
same languages as NFAs.
Since we have already shown that that NFAs and FAs can be used to accept
exactly the same languages it follows that NFAs with transitions and FAs
can be used to accept exactly the same languages.
We introduce NFAs with transitions because it is easy to construct an NFA
with transitions which accepts the language described by a regular expression.
An FA is an easily implementable recogniser for a language, so we have a
way to go from a description of typical strings in the language to a recogniser.
We can extend the notion of an NFA to include automata with transitions,
that is with transitions that may occur on reading no symbols from the string.
The main dierence is in the transition function: is now a function from
Q to 2
Q
.
9.19 Formal denition
Denition 11 (Non-deterministic nite automaton with transitions)
A non-deterministic nite automaton with transitions is a 5-tuple (Q, , , q
0
, F)
where:
is a function from Q to 2
Q
: the transition function
q
0
Q: the start state
9.20 Example
M
6
= (Q, , , q
0
, F) where:
Q = S
1
, S
2
, S
3
, S
4
, S
5
= 0, 1
q
0
= S
1
F = S
5
0 1
S
1
S
2
S
3
S
2
S
2
S
4
S
3
, S
4

S
3
S
4
, S
5
S
5
S
4
S
4
, S
5
S
4
, S
5

S
5
S
1

6
\
2
a,b
\
4
a,b
a,b
` _
1
a,
n
n
n
n
n
n
n
n
n
n
b
\
3
b
}
}
}
}
}
}
}
}
}
}
}
}
}
}
b,
` _
5
+
9.22. EASY THEOREM 55

9.22 Easy theorem
Theorem 6 Every language denable by an NFA is denable by a NFA with
transitions.
The proof is easy, as the machines are almost identical.
If M
8
= (Q, , , q
0
, F) is an NFA, then M
9
= (Q
, q
0
, F
) is an NFA
with transitions which accepts the same language, if:
Q
= Q
= q
0
= q
0
F
= F
And for S Q, :
(S, ) = (S, )
(S, ) =
Once again we could have just drawn the diagram representing the NFA,
and announced that it also represented an NFA with transitions.
9.23 Harder theorem
Theorem 7 Every language denable by an NFA with transitions is denable
by a NFA.
As you might expect most of the hard work lies in constructing the transition
function.
If M
10
= (Q, , , q
0
, F) is an NFA with then we will construct M
11
=
(Q
, q
0
, F
), an NFA which accepts the same language.

We begin with:
Q
= Q
=
q
0
= q
0
9.24
the new transition function
(S, ) must now give us the set of states which are reachable from S on a
transition and also on any combination of transitions and a transition.
Some care is needed, as we may be able to make several transitions before
we make the transition, and we may be able to make make several transitions
after we make the transition.
We need to nd every state which is reachable from S by:
( transition)
( transition)( transition)
We can call this the closure of (S, ).

Another way to think of what we are doing is that we are removing the arcs
labelled with , and adding direct arcs labelled with a symbol.
9.25 F
the new set of accepting states

We might initially guess that F
= F, but this is wrong.

Suppose q
0
is not an accepting state, but there is some accepting state which
can be reached by a sequence of transitions from q
0
i.e. the initial state is not
an accepting state but the NFA with transitions accepts the empty string.
If this is the case then F
= F q
0
, otherwise F
= F.
9.26 Example: an NFA equivalent to M
6
M
6
= (Q, , , q
0
, F) as above.
M
12
= (Q
, q
0
, F
) where Q
= Q,
= , q
0
= q
0
.
There is no accepting state reachable on transitions from S
1
, so F
= F.
We construct
0 1
S
1
S
2
, S
4
S
1
, S
2
, S
3
, S
4
, S
5
S
2
S
4
S
1
, S
2
, S
3
, S
4
, S
5
S
3
S
2
, S
4
S
1
, S
2
, S
3
, S
4
, S
5
S
4
S
1
, S
2
, S
4
, S
5
S
1
, S
2
, S
4
, S
5
S
5
S
2
, S
4
S
1
, S
2
, S
3
, S
4
, S
5

12
9.28 Regular expression to NFA with transi-
tions
Now we will show:
any language which is describable by a regular expression is describable
by a non-deterministic nite automaton with moves,
hence any language which is describable by a regular expression is describ-
able by a non-deterministic nite automaton,
hence any language which is describable by a regular expression is describ-
able by a nite automaton.
9.29 Proof outline
The proof is by induction, and its algorithmic content allows us to write a
program which takes a regular expression and constructs an automaton which
recognises the same language.
Recall that the regular expressions are dened as follows: If is an alphabet
then:
9.30. BASE CASES 57
the empty language and the empty string are regular expressions
if r then r is a regular expression
if r and s are regular expressions then so are r +s, rs, r
9.30 Base cases

The base cases are:
the empty language
the empty string
if r then r
In these cases we just present an appropriate automaton.
9.31 Induction steps
The induction steps are:
r +s
rs
r
In these cases we assume that we have automata to accept r and s and we

make use of these machines.
9.32 Proof
Base case:
The following NFA with transitions accepts no strings.
M
= (Q, , , q
0
, F) where:
Q = S
1
= q
0
= S
1
(S
1
, ) = F =
Informally: a machine with no accepting states (and not much else).
Base case:
The following machine accepts the empty string. M
= (Q, , , q
0
, F)
where:
Q = S
1
= q
0
= S
1
(S
1
, ) = F = S
1
Informally: a machine whose start state is the only accepting state.

Base case: r, r
The following machine accepts Language(r), for some r .
M
r
= (Q, , , q
0
, F) where:
Q = S
1
, S
2
q
0
= S
1
F = S
2
r ,= r
S
1
S
2

S
2

Informally: a machine with one accepting state, which can only be arrived
at on a r.
Induction step: r
Assume that we have an automaton M

r
which accepts Language(r) and
show how to make an automaton which accepts Language(r
).
Informally: we need a way to allow ourselves to go through M
r
0, 1, 2, 3,
. . . times. We make the initial state of M
r
an accepting state, and add a
transition from each accepting state to the initial state.
Let M
r
= (Q, , , q
0
, F) be an NFA with transitions which accepts
Language(r). Then M
r
= (Q
, q
0
, F
), where:
Q
= Q
= q
0
= q
0
F
= F q
0
(S, ) = (S, )
(S, ) = (S, ), S , F
(S, ) = (S, ) q
0
, S F
is an NFA with transitions which accepts Language(r
).
Induction step: r +s
Assume that we have automata M
r
and M
s
which accept Language(r)
and Language(s) and show how to combine them into a machine to accept
Language(r +s).
Informally: we need a way to allow ourselves to go through either M
r
or
M
s
. We can do this if we add a new start state with transitions to the
start states of M
r
and M
s
. The accepting states of the new machine will
be the union of the accepting states of M
r
and M
s
. The two machines
may make use of dierent alphabets, and we need to take care over this.
Let M
r
= (Q
r
,
r
,
r
, q
0r
, F
r
) and M
s
= (Q
s
,
s
,
s
, q
0s
, F
s
).
9.33. PROOF SUMMARY 59
Then M
r+s
= (Q
, q
0
, F
), where:
Q
= Q
r
Q
s
S
0
, S
0
, Q
r
Q
s
=
r

s
q
0
= S
0
F
= F
r
F
s
(S
0
, ) = q
0r
, q
0s
(S
0
, ) =
(S, ) =
r
(S, ), S Q
r
,
r
(S, ) = , S Q
r
, ,
r
(S, ) =
s
(S, ), S Q
s
,
s
(S, ) = , S Q
s
, ,
s
(S, ) =
r
(S, ), S Q
r
(S, ) =
s
(S, ), S Q
s
is an NFA with transitions which accepts Language(r +s).
Induction step: rs
Assume that we have automata M
r
and M
s
which accept Language(r)
and Language(s) and show how to combine them into a machine to accept
Language(rs).
Informally: we need a way to allow ourselves to go through M
r
and then
M
s
. We start at the start state of M
r
. From each accepting state of M
r
we add a transition to the start state of M
s
. The accepting states of
the new machine will be the accepting states of M
s
. The two machines
may make use of dierent alphabets, and we need to take care over this.
Let M
r
= (Q
r
,
r
,
r
, q
0r
, F
r
) and M
s
= (Q
s
,
s
,
s
, q
0s
, F
s
).
Then M
r+s
= (Q
, q
0
, F
), where:
Q
= Q
r
Q
s

=
r

s
q
0
= q
0r
F
= F
s
(S, ) =
r
(S, ), S Q
r
,
r
(S, ) = , S Q
r
, ,
r
(S, ) =
r
(S, ) q
0s
, S F
r
(S, ) =
r
(S, ), S Q
r
, S , F
r
(S, ) =
s
(S, ), S Q
s
,
s
(S, ) = , S Q
s
, ,
s
(S, ) =
s
(S, ), S Q
s
is an NFA with transitions which accepts Language(rs).
9.33 Proof summary
We have shown that we can perform the construction in each of the base cases
and in each of the inductive steps. Hence the proof is nished.
We can use this proof to give us an algorithm to recursively construct an
NFA with transitions, given a regular expression.
9.34 Example
We take the regular expression a
+ba through NFA, NFA, to FA.

a
\
a
\
+
b
\
b
\
+
a
` _
+
a
\
+
ba
\
b

\
+
a
\
+
a
+ba
\
+
a
\
+

m
m
m
m
m
m

b

\
+
a
\
+
\
+
a
\
+

m
m
m
m
m
m

b

\
+
a
\
+
Eliminate :

a
\
+
a
.
a
.
` _
+
a
y
y
y
y
y
y
y
y
a

b

a

a
\
+
\
2
a
` _
3
+
a
a
.
` _
1
+
a
x
x
x
x
x
x
x
x
a
\
4
b
\
5
a
\
6
a
` _
7
+
9.34. EXAMPLE 61
To FA:
` _
23
+
a
.
` _
1
+
a
p
p
p
p
p
p
p
b

x
x
x
x
x
x
x
x
\
4
b
` _
56
a
` _
7
+
` _
23
+
a
.
` _
1
+
a
p
p
p
p
p
p
p
b

x
x
x
x
x
x
x
x
\
4
b
` _
56
a
` _
7
+
Simplify:
` _
23
+
a
.
` _
1
+
a
p
p
p
p
p
p
p
b

x
x
x
x
x
x
x
x
` _
56
a
` _
7
+
Chapter 10
Kleenes theorem
10.1 A Theorem
Theorem:
Every language accepted by an FA is generated by a regular expression.
Proof:
omitted, due to industrial action.
Once again, the proof is constructive: it gives us an algorithm which, given a
FA, constructs a regular expression that generates the same language.
We have now established that, in terms of the languages that they can be
used to describe, all of the following are equivalent:
regular expressions
nite automata
non-deterministic nite automata
non-deterministic nite automata with transitions
10.2 Kleenes theorem
Previously we gave a denition of a regular language as one which was described
by a regular expressions.
We can re-formulate the equivalence of regular expressions, FAs and NFAs
as:
Theorem 8 (Kleene)
A language is regular i it is accepted by a FA.
A language is regular i it is accepted by an NFA.
A language is regular i it is generated by a regular grammar.
63
64 CHAPTER 10. KLEENES THEOREM
Note: i is an abbreviation of if, and only if.
Cohen characterises Kleenes theorem as:
the most important and fundamental theorem in the theory of nite
automata
Chapter 11
Closure Properties of
Regular Languages
11.1 Closure properties of regular language
We will now show some closure properties of the set of regular languages. We
will show that
the complement of a regular language is regular
the union of two regular languages is regular
the concatenation of two regular languages is regular
the Kleene closure of a regular language is regular
the intersection of two regular languages is regular
11.2 Formally
If L
1
and L
2
are regular languages then so are:
L
1
L
1
L
2
L
1
L
2
L
1
L
1
L
2
We use Kleenes theorem to prove these.
65
66 CHAPTER 11. CLOSURE PROPERTIES OF REGULAR LANGUAGES
11.3 Complement
Theorem 9 If L
1
is a regular language, then so is L
1
.
We have already shown this, in a previous lecture, where we showed that we
could construct a FA to accept L
1
, given a FA which accepted L
1
.
If L
1
is regular then there is a FA which accepts it. If there is an FA which
accepts L
1
then L
1
is regular.
11.4 Union
Theorem 10 If L
1
and L
2
are regular languages then so is L
1
L
2
.
The language L
1
L
2
is the set of all strings in either L
1
or L
2
.
If L
1
is regular then there is a regular expression r which describes it.
If L
2
is regular then there is a regular expression s which describes it.
Then the regular expression r +s describes L
1
L
2
.
Since L
1
L
2
is described by a regular expression it is regular.
11.5 Concatenation
Theorem 11 If L
1
and L
2
1
L
2
.
The language L
1
L
2
is the set of all strings which consist of a string from L
1
follwed by a string from L
2
.
If L
1
If L
2
is regular then there is a regular expression s which describes it.
Then the regular expression rs describes L
1
L
2
.
Since L
1
L
2
11.6 Kleene closure
Theorem 12 If L
1
is a regular language then so is L
1
.
The language L
1
is the set of all strings which are (possibly empty) sequences
of strings in L
1
.
If L
1
Then the regular expression r
describes L
1
.
Since L
1
11.7 Intersection
Theorem 13 If L
1
and L
2
1
L
2
.
11.8. SUMMARY OF THE PROOFS 67
Note: for any sets A and B: A B = A B
Since L
1
is regular so is L
1
Since L
2
is regular so is L
2
Since L
1
and L
2
are regular so is L
1
L
2
Since L
1
L
2
is regular so is L
1
L
2
11.8 Summary of the proofs
We could have performed all these proofs via FAs, NFAs or regular grammars
if we had wanted to.
We have now shown that what appear to be very dierent ways to dene
languages are all equivalent, and moreover that the class of languages that they
dene is closed under various operations. We have also shown that all nite
languages are regular.
The next obvious question is:
are there any languages which are not regular?
We will answer this question next.
68 CHAPTER 11. CLOSURE PROPERTIES OF REGULAR LANGUAGES
Chapter 12
Non-regular languages
12.1 Non-regular Languages
So far we have seen one sort of abstract machine, the nite automaton, and the
sort of language that this sort of machine can accept, the regular languages.
We will now show that there are languages which cannot be accepted by
nite automata.
Outline of Proof:
Suppose we have a language L and we want to show it is non-regular.
A language is non-regular just when it is not regular.
As a general rule of logic if we wish to show P we assume P and derive a
contradiction.
Hence, to show L is not regular we must show that a contradiction follows
if we assume that L is regular.
When we are trying to derive a contradiction from the assumption that L is
regular we usually make use of Kleenes theorem.
If L is regular then there is a FA which accepts L.
We have already shown that all nite languages are regular, so if we are to
show that L is non-regular then L had better be innite.
Note: there are lots of innite regular languages:e.g. Language((1 +0)
).
Not all innite languages are non-regular, but all non-regular languages are
innite.
The general technique is to show that there is some string which must be
accepted by the FA, but which is not in L.
A FA has, by denition, a nite number of states.
Any suciently long string which is in L will trace a path through any FA
which accepts L which visits some state more than once.
We then attempt to use this fact to give examples of other strings which any
FA which accepts L will accept, but which are not in L.
However, if there are such strings, then no FA accepts L. The contradicts
our assumption that L was regular.
69
70 CHAPTER 12. NON-REGULAR LANGUAGES
Hence L is not regular, i.e. L is non-regular.
Comments on this argument
This argument is perfectly good, but it glosses over the fact that we may have
to do some thinking to show that there are strings which do the trick for us.
And, of course, we may be attempting to show that some regular language
is not regular. In this case we will fail!
12.2 Pumping lemma
Theorem 14 (Pumping lemma) If L is a regular language then there is a
number p, such that (s L)([s[ p s = xyz), where:
1. (i 0)xy
i
z L
2. [y[ > 0
3. [xy[ p
We call p the pumping length.
(s L)([s[ p s = xyz) reads for every string s in L, if s is at least as
long as the pumping length, then s can be written as xyz.
12.3 Pumping lemma informally
The pumping lemma tells us, that for long enough strings s L, s can be
written as xyz such that:
y is not and
xz,
xyz,
xyyz,
xyyyz,
xyyyyz,. . . are all in L.
We say that we can pump s and still get strings in L.
12.4. PROVING THE PUMPING LEMMA 71
12.4 Proving the pumping lemma
We now give a formal proof of the pumping lemma.
We have three things to prove corresponding to conditions 1, 2 and 3 in the
pumping lemma. Let:
M = (Q, , , q
0
, F) be a FA which accepts L,
p be the number of states in M (i.e. p = Cardinality(Q))
s = s
1
s
2
. . . s
n1
s
n
be a string in L such that n p
r
1
= q
0
r
i+1
= (r
i
, s
i
), 1 i n
Then the sequence r
1
r
2
. . . r
n
r
n+1
is the sequence of states that the machine
goes through to accept s. The last state r
n+1
is an accepting state.
This sequence has length n + 1, which is greater than p. The pigeonhole
principle tells us that in the rst p + 1 items in r
1
r
2
. . . r
n
r
n+1
one state must
occur twice.
We suppose it occurs rst as r
j
and second as r
l
.
Notice: l ,= j, and l p + 1.
Now let:
x = s
1
. . . s
j1
y = s
j
. . . s
l1
z = s
l
. . . s
n
So:
x takes M from r
1
to r
j
y takes M from r
j
to r
j
z takes M from r
j
to r
n+1
Hence M accepts xy
i
z, i 0. Thus we have shown that condition 1 of the
pumping lemma holds.
Because l ,= j we know that [y[ , = 0. Thus we have shown that condition 2
of the pumping lemma holds.
Because l p+1 we know that [xy[ p. Thus we have shown that condition
3 of the pumping lemma holds.
Hence the pumping lemma holds.
12.5 A non-regular language
As an example we will show that 0
n
1
n
[n 0 is not a regular language.
We begin the proof by assuming that this is a regular language, so there is
some machine N which accepts it.
Hence, by the pumping lemma, there must be some integer k, such that the
string 0
k
1
k
, can be pumped to give a string which is also accepted by N.
We let xyz = 0
k
1
k
, and show that xyyz is not in 0
n
1
n
[n 0
There are three cases to consider:
1. y is a sequence of 0s
2. y is a sequence of 0s followed by a sequence of 1s
3. y is a sequence of 1s
In case 1 xyyz will have more 0s than 1s, and so xyyz , L.
In case 3 xyyz will have more 1s than 0s, and so xyyz , L.
In case 2 xyyz will have two occurrences of the substring 01, and so xyyz , L.
So in each case the assumption that 0
n
1
n
[n 0 is regular leads to a
contradiction.
So 0
n
1
n
[n 0 is not regular.
Comment
Clearly we can write an algorithm to decide whether a string is in 0
n
1
n
[n 0.
This algorithm cannot be represented by a nite state machine. Hence there
must be more powerful abstract machines than FAs.
12.6 Pumping lemma re-cap
If L is a regular language then there is a number p, such that (s L)([s[
p s = xyz), where:
1. (i 0)xy
i
z L
2. [y[ > 0
3. [xy[ p
12.7 Another non-regular language
We will show that the language
L
1
= w[w has the same number of 0
s as 1
s
is non-regular.
12.8. AND ANOTHER . . . 73
We start by assuming that L is regular. Then we will show that there is a
string in L which cannot be pumped. We let p be the pumping length.
Which string to choose?
One candiate is 0
p
1
p
, where p is the pumping length. A similar string worked
for us before. However, it appears that this string can be pumped. Suppose we
take x and z to be , and y to be 0
p
1
p
.
Then:
xz =
xyz = 0
p
1
p
xyyz = 0
p
1
p
0
p
1
p
xyyyz = 0
p
1
p
0
p
1
p
0
p
1
p
xy . . . yz = . . .
So, no contradiction here.
All is not lost, however. Condition 3 in the pumping lemma tells us that
we can restrict attention to the case where [xy[ p, where p is the pumping
length.
If we split 0
p
1
p
under this condition then y must consist of only 0s.
Now xy
i
z, i 0 will only have the same number of 0s and 1s when i = 1.
So the pumping lemma has told us that we must be able to split 0
p
1
p
in a
way which leads to a contradiction.
So L
1
is not regular.
12.8 And another . . .
We will show that the language L
2
= ww[w 0, 1
is non-regular.
Once again we assume that L
2
is regular and use the pumping lemma to
obtain a contradition.
As usual we let p be the pumping length.
We cant use the string 0
p
1
p
because it is not in the language. Why not try
0
p
0
p
? It is in the language, and it looks similar.
Notice that 0
p
0
p
= 0
2p
.
Alas however we can nd a way to pump 0
p
0
p
and stay in L
2
, even taking
into account condition 3.
Suppose we take x to be , and y to be a string of 0s such that [y[ p and
[y[ is even.
Then xy
i
z, i 0 will always be in L
2
.
So, no contradiction.
Note that every word in L
2
has two equal left and right parts.
What is happening when we pump the string is that we are adding symbols
to the left part.
Then, however, we are allowing ourselves to move half of them into the right
part.
This results in a string with two equal left and right parts once again.
What we need to do is to make sure that we cant do this rearrangement.
Consider the string 10
p
10
p
.
The pumping lemma tells us that we should be able to split this string up
into xyz, such that y is just 0s and xy
i
z, i 0 will be in L
2
.
Now we have our contradiction: xy
i
z is only in L
2
for i = 1.
12.9 Comment
The moral of this story is that we will sometimes have to think a bit to nd a
string which will allow us to nd a contradiction.
Chapter 13
Regular languages:
summary
13.1 Regular languages: summary
We have covered quite a lot of material in this section of the paper, and most
of it has been done carefully, and in some detail.
This material needs to be presented with some care: all the pieces t together
rather neatly, and much of the understanding depends on seeing how the delicate
mechanism works.
As we have worked through the material we have taken a bottom-up ap-
proach, now we will take a top down approach.
13.2 Kleenes theorem
Kleenes thoerem is the most important result. Why?
First, Kleenes theorem relates regular grammars (or regular expressions)
and nite automata.
A grammar is a way to generate strings in a language.
An automaton provides us with a way to recognise strings in a language.
Kleenes theorem neatly relates the strength of the machine with the form
of the grammar. It is not at all obvious that such a neat relationship should
hold.
Second, Kleenes theorem tells us that the deterministic and the non-deterministic
variants of nite automata have just the same power.
Again it is not at all obvious that this should be the case.
It is important not just to know Kleenes theorem as a fact, but also to know
why Kleenes theorem holds. In other words we need to know how the proofs
go. Why?
75
76 CHAPTER 13. REGULAR LANGUAGES: SUMMARY
First, the regular languages are only one class of language. There are similar
results about other classes of language. Understanding how we showed Kleenes
theorem helps us understand the properties of these classes too.
Second, the proofs actually let us construct recognisers from generators,
which turns out to be useful in itself.
Third, much of the mathematics that is used in these proofs is used elsewhere
in computer science.
In fact the bulk of this section of the course was devoted to setting up the
mechanism required to prove Kleenes theorem.
Part II
Context Free Languages
77
Chapter 14
Introducing Context Free
Grammars
14.1 Beyond Regular Languages
Many languages of interest are not regular.
a
n
b
n
[ a N is not regular.
FA cannot count higher than the number of states.
R
[ A
is not regular when [A[ > 1.

FA cant match symbols in with those in
R
Arithmetic expressions with brackets is not regular.
FA cant check that the brackets match
Now consider Context Free Languages,
dened using Context Free Grammars.
What is a Context Free Grammar (CFG)?
How can we dene languages using CFGs?
How can we recognise Context Free Languages?
What is the relationship between Context Free Languages and Regular
Languages?
14.2 Sentences with Nested Structure
2 + 3 ( 4 + 5 7 )
the boy hit the big ball
79
80 CHAPTER 14. INTRODUCING CONTEXT FREE GRAMMARS
while c do while d do S
if (c) if (d) S; else T;
if (c) if (d) S; else T;
14.3 A Simple English Grammar
A sentence is a noun phrase followed by a verb
phrase.
S NP VP
A noun phrase is a determiner followed by either a
noun or an adjective phrase.
NP D N
NP D AP
An adjective phrase is an adjective foll. by a noun. AP A N
A verb phrase is a verb followed by a noun phrase. VP V NP
A determiner is an article, e.g. the. D the
A noun is a word denoting an object, e.g. ball or
boy.
N boy
N ball
An adjective is a word denoting a property, e.g.
big.
A big
A verb is a word denoting an action, e.g. hit . V hit
14.4. PARSE TREES 81
14.4 Parse Trees
the boy hit the big ball
D N V D A N
AP
NP
NP VP
S
`
`
`
`

/
/
/
/
/
/
`
`
`
`
`
`
14.5 Context Free Grammars
Components of a grammar:
Terminal symbols: the, boy, ball, etc.
The words which actually appear in sentences.
Nonterminal symbols: S, NP, VP, D, N etc.
Names for components of sentences.
Never appear in sentences.
Distinguished nonterminal (S) identies the language being dened.
A nite set of Productions.
A production has the form:
nonterminal denition
where denition is a string (possibly empty) of terminal and/or nonterminal
symbols.
is a metasymbol.
It is part of the notation (metalanguage).
82 CHAPTER 14. INTRODUCING CONTEXT FREE GRAMMARS
14.6 Formal denition of CFG
A Context Free Grammar (CFG) is a 4-tuple G = (, N, S, P) where:
is a nite set of terminal symbols (the alphabet).
N is a nite set of nonterminal symbols, disjoint from .
S is a distinguished member of N, called the start symbol.
P is a nite set of production rules of the form , where:
is a nonterminal symbol: N,
is a (possibly empty) string of terminal and/or nonterminal sym-
bols: (N )
.
Note: Some presentations of context free grammars do not allow rules with
empty right-hand sides. We will discuss this restriction later.
Chapter 15
Regular and Context Free
Languages
15.1 Regular and Context Free Languages
Kleenes theorem tells us that all the following formalisms are equivalent in
power:
Regular expressions
(Deterministic) Finite Automata
Nondeterministic Finite Automata
Nondeterministic Finite Automata with Transitions
Now we have a new formalism: the Context Free Grammar.
How does its power compare with those above?
How is the class of Context Free Languages related to the class of regular
languages?
CF = Reg? CF Reg? CF Reg? CF Reg = ?
15.2 CF Reg ,=
The language EVEN-EVEN contains every string over the alphabet a, b with
an even number of as and an even number of bs.
We saw earlier that it may be described by a r.e.:
EVEN-EVEN = Language((aa +bb + (ab +ba)(aa +bb)
(ab +ba))
)
83
84 CHAPTER 15. REGULAR AND CONTEXT FREE LANGUAGES
Here is a context-free grammar for EVEN-EVEN:
S S S B aa
S B S B bb
S S B U ab
S U ba
S U S U
(Proof : Cohen, p236)
So, at least one language is both context free and regular.
15.3 CF , Reg
The language EQUAL contains every string over the alphabet a, b with an
equal number of as and bs.
We proved earlier that EQUAL is not regular (pumping lemma).
Here is a context-free grammar for EQUAL:
S aB A a B b
S bA A aS B bS
A bAA B aBB
(Proof : Cohen, p239)
So, at least one language is context free but is not regular.
15.4 CF Reg
In fact, every regular language is context free. To prove this, we show how to
convert any FA into a context free grammar.
(We could have chosen to convert regular expressions, or NFAs into CFGs
instead, since Kleene showed us they were all equivalent.)
The alphabet of the CFG is the same as the alphabet of the FA.
The set N of nonterminals of the CFG is the set Q of states of the FA.
The start symbol S of the CFG is the start state q
0
of the FA.
For every X, Y Q, a , there is a production X aY in the CFG if
and only if there is a transition X
a
Y in the FA.
For every X F in the FA there is a production X in the CFG.
(Proof : Cohen p260)
15.5. EXAMPLE 85
15.5 Example
` _
A
,+
a
b
` _
B
+
b
\
C
b,a
` _
D
+
b,a
_
The following CFG accepts the same language as the above FA:
= a, b
N = A, B, C, D
S = A
P =
A aD B aD C bC D aD
A bB B bC C aC D bD
A B D
15.6 Regular Grammars

Grammars produced by the above process are called regular grammars.
Every production of a regular grammar has one of the forms:
N
1
T N
2
N
1
T
(where N
1
, N
2
N, and T
).
That is, the RHS of every production consists of a (possibly empty) sequence
of terminals, possibly followed by a single nonterminal.
To prove that the class of languages accepted by regular grammars is exactly
the class of regular languages, we need to show how to transform any regular
grammar into an equivalent FA.
The transformation is similar to the one (which we omitted) turning FAs
into regular expressions. See Cohen (p263) if you are interested in it.
15.7 Parsing using Regular Grammars
Because regular grammars are so much like nite automata, it is easy to generate
(or parse) a sentence using a regular grammar.
S aS (1) T bT (3) U aU (5)
S T (2) T bU (4) U a (6)
To generate the string aabba, we can go through the following steps:
S
1
aS
1
aaS
2
aaT
3
aabT
4
aabbU
6
aabba
This is called a derivation.
15.8 Derivations
Consider the CFG (, N, S, P).
A sentential form is a (possibly empty) string made up of nonterminals and
terminals: that is, a string of type ( N)
.
A derivation is a sequence
0

n
in which:
Each
i
is a sentential form.

0
is the start symbol, S.

n
is a string of type
(i.e., there are no nonterminals left)

For each pair
i

i+1
, we have:

i
has the form n for some nonterminal n N and sentential
forms and ;
there is a production (n ) P; and

i+1
has the form .
15.9 Derivations in Regular Grammars
A semiword is a sentential form of the restricted form
N: that is, a (possibly

empty) string of terminals followed by a single nonterminal.
In a regular grammar, every production has as its right-hand side either a
semiword or a word.
In any derivation,
0
is a semiword: it is a single nonterminal.
If some sentential form
i
is a semiword, and we use a production from a
regular grammar,
i+1
will also be either a semiword or a word.
Thus, to nd a derivation using a regular grammar, we simply select a pro-
duction whose left-hand-side matches the (single) nonterminal in our sentential
form, and repeat until a rule with no nonterminals on its right-hand side is used,
at which stage the result is a word.
15.10 Derivations in Arbitrary CFGs
With regular grammars, every sentential form in a derivation is a semiword.
With arbitrary CFGs, productions can have multiple nonterminals on the
right-hand side, and so sentential forms in derivations have multiple nontermi-
nals too. Further, any terminals need not occur rst in the sentential forms.
It is no longer a matter of selecting a production to match our nonterminal;
we must rst decide which nonterminal to match.
15.11. PARSE TREES 87
(1) (2) (3) (4) (5) (6)
S XS Y S a X bX X c Y d Y e
S
1
XSY S
1
XSY
3
bXSY
1
XXSY Y
4
bcSY
5
XXSdY
1
bcXSY Y
3
XbXSdY
3
bcbXSY Y
3
bXbXSdY
3
bcbbXSY Y
2
bXbXadY
4
bcbbcSY Y
3
bXbbXadY
2
bcbbcaY Y
4
bcbbXadY
5
bcbbcadY
6
bcbbXade
6
bcbbcade
4
bcbbcade
The leftmost nonterminal in a sentential form is the rst nonterminal
that we encounter when scanning left to right.
A leftmost derivation of word w from a CFG is a derivation in which, at
each step, a production is applied to the leftmost nonterminal in the current
sentential form.
Of the two derivations of the word bcbbcade on the previous slide, the rst
is a leftmost derivation; the second is not.
Theorem: Any word that can be generated from a given CFG by some deriva-
tion can be generated by a leftmost derivation.
15.11 Parse Trees
We saw parse trees informally in our rst lecture.
We now want to dene parse trees more formally, by looking at the relation-
ship to derivations.
Given G = (, N, S, P) and w V
T
, a parse tree for w from G is an ordered,
labelled tree such that:
Each leaf node is labelled with an element of .
Each non-leaf node is labelled with an element of N.
The root is labelled with the start symbol, S.
For each non-leaf node, n, if is the label on n and
1
, ,
k
are the
labels on its children, then
1

k
is a rule in P.
The fringe of the tree is w.
15.12 Derivations and parse trees
A partial parse tree is an ordered, labelled tree which is like a parse tree,
except that it may have nonterminals or terminals at its leaves.
The fringe of a partial parse tree is a sentential form.
A derivation
0

n
corresponds to a sequence of partial parse trees
t
0
, . . . , t
n
such that the fringe of t
i
is
i
, and each t
i+1
is obtained from t by
replacing a single leaf node (labelled with a nonterminal symbol n) by a tree
whose root is n.
Since
n

, the nal partial parse tree t

n
is a parse tree for the word
n
,
corresponding to the derivation
0
, . . . ,
n
.
15.13 Ambiguity
A word may have several dierent parse trees, each corresponding to a dierent
leftmost derivation.
(1) (2) (3) (4) (5)
S XY X b X Xa Y b Y aY
S
1
XY
2
bY
5
baY
5
baaY
4
baab
S
1
XY
3
XaY
2
baY
5
baaY
4
baab
S
1
XY
3
XaY
3
XaaY
2
baaY
4
baab
A grammar in which there are sentences with multiple parse trees is called
ambiguous.
(1) (2) (3, 4, 5)
E E+E E EE E 1[2[3
1 + 2 3
E E E

E
1 + 2 3
E E E

E
>
>
`
`
`
E
\
\
\
\
\
It is often possible to transform an ambiguous grammar so that it unam-
biguously denes the same language.
(1) (2) (3) (4) (5, 6, 7)
E E+T E T T TF T F F 1[2[3
1 + 2 3
F F F
>
`
` T T
E T
E
'
'
'
'
'
1 + 2 3
F F F
T T
E
>
>
`
`
E
??????
Chapter 16
Normal Forms
16.1 Lambda Productions
It is sometimes convenient to include productions of the form A in a CFG.
For example, here are two CFGs, both dening the language described by
the regular expression a
b
+
:
S AB S AB
A S B
A aA A aA
B b A a
B bB B b
B bB
The grammar on the left is shorter and easier to understand; however, the
lambda production A causes problems for parsing.
16.2 Eliminating Productions
Suppose a CFG has a production A , which we wish to remove.
Suppose that A is not the start symbol.
Any derivation that uses this production must also have used some produc-
tion B A , for some nonterminal B (possibly A itself).
If we add to the grammar the single production B , this instance of
the Lambda production is unnecessary, but the language accepted is the same.
If we carry out that process for every occurrence of A on the right-hand side
of any production, we may eliminate A altogether.
For example, we take:
S AB, A , A aA, B b, B bB
and, noting that A occurs on the right-hand side of two productions, we add:
S B, A a
89
90 CHAPTER 16. NORMAL FORMS
Now, A is redundant, and we may remove it; the result is as given
before.
However, the process may run into some problems.
16.3 Circularities
First, the process of eliminating a production may introduce new ones. Con-
sider:
S aTU, T , T U, U T
This is a (rather long-winded) grammar for the language consisting of just the
string a.
To remove T , we must add S aU and U ; the result is:
S aTU, S aU, T U, U T, U
Then, to remove U , we must add S aT and T : and so we are
no better o than where we started.
The solution is to remove the potential Lambda productions for T and U
concurrently.
That is, we note that U, though not directly dened by a Lambda produc-
tion, is nullable: there is a derivation
U
T is trivially nullable, because it has a Lambda production of its own.
We must add new productions for every possible nullable nonterminal on the
right-hand side of any production.
S aTU, T , T U, U T
T and U are both nullable, so we must add:
S aU, accounting for T ;
S aT, accounting for U ; and
S a, accounting for TU .
Now, we no longer need the Lambda productions, so we get
S aTU, S aU, S aT, S a, T U, U T
Aside: At this point, we could note that T and U are useless, since no derivation
involving either of them can terminate.
They can thus be removed, leaving just S a.
16.4. PRODUCTIONS ON THE START SYMBOL 91
16.4 Productions on the Start Symbol
Consider the CFG:
S , S aS
It accepts the language described by the r.e. a
.
If we attempt to remove S , we note that S is nullable, and add S a,
resulting in:
S a, S aS
However, this CFG now accepts Language(a
+
): the empty string is no longer
allowed.
The same problem will arise any time S is nullable, not just when it is
directly involved in a Lambda production.
We can make sure that the start symbol is never subjected to Lambda re-
moval by rst transforming our grammar so that the start symbol occurs exactly
once, on the left-hand side of the rst production.
Let G = (, N, S, P) be a CFG, with
(S
1
), (S
2
), . . . , (S
n
) P
being all the productions for S; let S
, N be a brand new nonterminal.

The CFG: G
= (, N S
, S
, P S
S) is equivalent to G, and may

be safely subjected to Lambda removal.
If S was nullable, the additional production S
will be generated; of
course, this Lambda production must not be removed.
16.5 Example
S cS, S TU, T , T aT, U , U bTU
accepts the language L = Language(c
(ba
). Note that L.
First introduce a new start symbol S
:
S
S, S cS, S TU, T , T aT, U , U bTU

Now, note that S, T, and U are all nullable, so add:
S
, S c, S T, S U,
T a, U bT, U bU, U b
The result (with productions removed) also accepts L.
92 CHAPTER 16. NORMAL FORMS
16.6 Unit Productions
Denition: A unit production is a production n
1
n
2
, where n
1
, n
2
N.
For example, consider S A B, A B, B b.
The grammar accepts the language bb; the A nonterminal, with its unit
production A B, is merely a distraction, and the grammar could more infor-
matively be written S A A, A b.
Any leftmost derivation using a unit production n
1
n
2
must include sen-
tential forms
1
= n
1
and
2
= n
2
(
, ( N)
).
Since n
2
is itself a nonterminal, the derivation must also include
3
= ,
where (n
2
) P.
If we add the production n
1
, the unit production becomes unnecessary,
and the derivation can go directly from
1
to
3
.
However, once again we can get circularities.
For example, if P contains both n
1
n
2
and n
2
n
1
, we will indenitely
replace one by the other.
Instead, we use the following rule:
For every pair of nonterminals n
1
and n
2
such that n
1
n
2
,
and for every nonunit production n
2
, add the production n
1

.
As long as all such replacements are done simultaneously, the unit produc-
tions may safely be eliminated.
See Cohen (pp273) for discussion and an example.
16.7 Example
We continue with the example from Slide 36.
S
S [ S cS [ TU [ c [ T [ U
T aT [ a U bTU [ bT [ bU [ b
Direct unit productions are S
S, S T, S U.
Indirectly, we also have S
T, S
U.
From S
S with S cS [ TU [ c we get S
cS [ TU [ c.
From S
T with T aT [ a we get S
aT [ a.
From S
U with U . . . we get S
bTU [ bT [ bU [ b.
From S T with T aT [ a we get S aT [ a.
From S U with U . . . we get S bTU [ bT [ bU [ b.
In summary:
S
[ cS [ TU [ c [ aT [ a [ bTU [ bT [ bU [ b
S cS [ TU [ c [ aT [ a [ bTU [ bT [ bU [ b
T aT [ a
U bTU [ bT [ bU [ b
Chapter 17
Recursive Descent Parsing
17.1 Recognising CFLs (Parsing)
How can we determine whether a string w
is in the language generated by

a CFG, G = (, N, S, P)?
We know that w L(G) i there is a derivation for w from S equivalently,
if there is a parse tree with root S and fringe w.
To determine whether w L(G), we try to build a parse tree for w.
Top-down: Build parse tree starting from the root. At each step, choose:
A nonterminal, N, in the fringe to expand.
A rule with N as its lhs to apply.
Bottom-up: Build parse tree starting from the leaves and working upwards to
the root. At each step, choose:
A substring, , of the current string to reduce.
A rule with as its rhs to apply.
17.2 Top-down Parsing
Consider the following grammar for nested lists of numbers:
L ( ) [ ( B ) (1,2)
B E [ E , B (3,4)
E L [ 1 [ 2 [ 3 (5,6,7,8)
Lets try to parse the input (1,()).
L
( 1 , ( ) )
93
94 CHAPTER 17. RECURSIVE DESCENT PARSING
17.3 Top-down Parsing
Here is the leftmost derivation:
L
( B )
( E , B )
( 1 , B )
( 1 , E )
( 1 , L )
( 1 , ( ) )
Working left to right: At each step, either expand the leftmost non-terminal,
or match the leftmost terminal with an input symbol.
This is called LL(1) parsing.
17.4 Recursive Descent Parsing
Recursive Descent is a technique for building one-o LL(1) parsers.
Parser is a set of mutually recursive procedures, with one procedure cor-
responding to each non-terminal.
Each procedure decides which rule to use and looks for the symbols in the
rhs of that rule.
Matches the rhs from left to right.
Must be able to choose rule by looking at next input.
17.5 Building a Parser for Nested Lists
LL(1) grammar for nested lists:
List ( RestList (1)
RestList ) [ ListBody ) (2), (3)
ListBody ListElt RestBody (4)
RestBody [ , ListBody (5), (6)
ListElt number [ List (7), (8)
Parser will have one procedure for each nonterminal: ParseList , ParseRestList ,
etc.
Input is a sequence of symbols, ss, recognised by a scanner, which also
removes white space.
The symbols used are:
17.6. PARSER FOR NESTED LISTS 95
Symbol Kind Symbol
lparensym (
rparensym )
commasym ,
numbersym (0[1[2[3[4[5[6[7[8[9)
+
17.6 Parser for Nested Lists
Rule: List ( RestList (1)
procedure ParseList (in out ss);
begin
if head(ss) = lparensym then
ss := tail(ss); ParseRestList(ss)
else Error
end ParseList
Rule: RestList ) [ ListBody ) (2), (3)
procedure ParseRestList (in out ss);
begin
if head(ss) = rparensym then
ss := tail(ss)
else
ParseListBody(ss);
if head(ss) = rparensym then ss := tail(ss) else Error
end ParseList
Rule: ListBody ListElt RestBody (4)
procedure ParseListBody(in out ss);
begin
ParseListElt(ss);
ParseRestBody(ss)
end ParseListBody
Rule: RestBody [ , ListBody (5), (6)
procedure ParseRestBody(in out ss);
begin
if head(ss) = commasym then
ss := tail(ss);
ParseListBody(ss)
else
skip
end ParseRestBody
Rule: ListElt number [ List (7), (8)
procedure ParseListElt(in out ss);
begin
if head(ss) = numbersym then
ss := tail(ss);
else
ParseList(ss)
end ParseListElt
17.7 Building a Parse Tree
How can we construct the parse tree?
Insert code to collect the components corresponding to the RHS of the
rule applied.
Add code to apply rule at end of code that checks each rule.
Scanner returns symbol value as well as symbol kind.
Parser procedure returns tree as well as consuming input.
17.8 Parser for Nested Lists
procedure ParseList(in out ss; out t);
begin
if head(ss) = lparensym then
ss := tail(ss);
ParseRestList(ss, u);
t := Tree(List, Leaf(lparensym), u))
else
Error
end ParseList
procedure ParseRestList(in out ss; out t);
begin
ss := tail(ss);
t := Tree(RestList, Leaf(rparensym)))
else
ParseListBody(ss, u);
ss := tail(ss);
t := Tree(RestList, u, Leaf(rparensym))
else Error
end ParseRestList
17.9. LL(1) GRAMMARS 97
procedure ParseListBody(in out ss; out t);
begin
ParseListElt(ss, u);
ParseRestBody(ss, v);
t := Tree(ListBody, u,v))
end ParseListBody
procedure ParseRestBody(in out ss; out t);
begin
if head(ss) = commasym then
ss := tail(ss);
ParseListBody(ss, u);
t := Tree(RestBody, Leaf(commasym), u))
else
skip;
t := Tree(RestBody, ))
end ParseRestBody
procedure ParseListElt(in out ss; out t);
begin
if head(ss).type = numbersym then
t := Tree(ListElt, Leaf(head(ss).value)));
ss := tail(ss)
else
ParseList(ss, u);
t := Tree(ListElt, u))
end ParseListElt
17.9 LL(1) Grammars
Recursive Descent is an LL(1) parsing technique:
Leftmost derivations
Left-ro-right scanning of input
1 symbol lookahead.
This only works for some grammars:
Requirement 1:
Two productions for the same nonterminal cannot produce strings that
start with the same terminal.
Requirement 2:
If a nonterminal can produce , it cannot start with any terminal that
can also follow it.
17.10 First and Follow sets
For any grammar (, N, S, P) and sentential form ( N)
:
rst() = x [ x for some
follow() = x [ S x for some , ( N)
That is, rst() is all those terminals that can appear rst in any string
derived from , and
follow() is all those terminals that can appear immediately after in any
sentential form derived from the start symbol.
Now, the LL(1) requirements are:
Requirement 1: If N and N , then rst() rst() = .
Requirement 2: If N , then rst(N) follow(N) = .
Consider the grammar for arithmetic expressions:
E T [ E +T [ E T (1), (2), (3)
T F [ T F [ T/F (4), (5), (6)
F id [ (E) (7), (8)
rst(id) = id
rst((E)) = (
first(F) = first(T F) = first(T/F) = id, (
first(T) = first(E +T) = first(E T) = id, (
Requirement 1 is satised for productions 7,8, but violated for productions
1,2,3 and 4,5,6.
Requirement 2 is trivially satised (no productions).
17.11 Transforming CFGs to LL(1) form
Turn left-recursion into right recursion.
For example:
E T [ E +T [ E T
T F [ T F [ T/F
F id [ (E)
;
E T [ T +E [ T E
T F [ F T [ F/T
F id [ (E)
Left factor.
N
N
N
;
N N
N
N
[
where N
is a new nonterminal.
17.11. TRANSFORMING CFGS TO LL(1) FORM 99
E T [ T +E [ T E
T F [ F T [ F/T
F id [ (E)
;
E TE
[ +E [ E
T FT
[ T [ /T
F id [ (E)
Now Requirement 1 is satised by all nonterminals.
But now E
and T
are nullable, so we need to check Requirement 2:

rst(E
) = +, rst(T
) = , /
follow(E
) = follow(E) = )
follow(T
) = follow(T) = rst(E
) = +,
rst (E
) follow(E
) = rst (T
) follow(T
) = OK!
Cant always turn a CFG into LL(1) form.
Some CFLs cant be parsed deterministically.
E.g. ww
R
[ w a, b
: S [ a S a [ b S b
Breaks LL(1) requirement (2), because S but rst(S) = follow(S) =
a, b.
If parser sees a as next symbol, cannot decide whether to do Sa a = a
or Sa aSaa.
Some CFLs can be parsed deterministically bottom up, but not top down.
E.g. a
x+y
b
x
c
y
[ x, y 0 : S [ a T b [ a S c T [ a T b
Breaks LL(1) condition (1). Cant be left-factored. Top down parser cant
tell whether initial a will eventually match a b or a c.
Chapter 18
Pushdown Automata
18.1 Finite and Innite Automata
L
0
= a
m
b
n
[ m, n 0 is regular: a
is its regular expression.

` _
S
` _
S
+
1
b
L
1
= a
n
b
n
[ n 0 is not regular (pumping lemma), but it is context free:
S aSb [ is its CFG.
If we augment L
0
s NFA with a counter, a super-NFA can recognize
L
1
:
` _
S
a; c:=c+1
` _
S
c=0
1
b; c:=c1
L
2
= w
1
w
2
[ w
1
, w
2
a, b
is regular: (a +b)
(a +b)
is its regular
expression.
` _
S
` _
S
+
1
a
L
3
= ww
R
[ w a, b
is not regular, but it is context free: S

aSa [ bSb [ is its CFG.
If we augment L
2
s NFA with a stack, the machine can recognize L
3
:
` _
S
a; s.push(a)
b; s.push(b)
` _
S
s.isempty()
1
a; s.pop()==a
b; s.pop()==b
101
102 CHAPTER 18. PUSHDOWN AUTOMATA
A Pushdown Automaton is simply a NFA augmented with a stack.
18.2 Pushdown Automata
A PDA is a NFA with a stack.
At each transition, we can:
read a symbol x;
pop a symbol y from the stack; and
push a symbol z onto the stack.
We draw this as follows:
\
x;y;z
\
Any of x, y, z may be , indicating that nothing is read, popped, or pushed
at that transition.
We label start states with and nal states with +, as before; but this time,
a nal state is accepting only if the stack is empty.
Our NPDA for L
3
= ww
R
is now written:
` _
S
0
;;
a;;a
b;;b
` _
S
+
1
a;a;
b;b;
A stack may also serve as a counter, by stacking and matching some arbitrary
symbol (say #); so L
1
= a
n
b
n
is:
` _
S
0
;;
a;;#
` _
S
+
1
b;#;
L
4
= a
n
b
2n
[ n 0 (S aSbb [ )
` _
S
0
a;;#
;;
` _
S
1
;;#
.
` _
S
+
2
b;#;
18.3. FORMAL DEFINITION OF PDA 103

L
5
= a
n
b
n/2
[ n 0 (S aT [ T T aaTb [ )
` _
S
0
;;
a;;#
` _
S
+
1
;#;
` _
S
+
2
b;#;
.
L
6
= a
m
b
n
c
m+n
[ m, n 0 (S aSc [ T T bTc [ )
` _
S
0
;;
a;;#
` _
S
1
;;
b;;#
` _
S
+
2
c;#;
L
7
= a
m
b
n
c
mn
[ m n 0 (S aSc [ T T aTb [ )
` _
S
0
;;
a;;#
` _
S
+
1
;;
b;#;
` _
S
+
2
c;#;
18.3 Formal Denition of PDA

P = (, , Q, q
0
, F, )
is the alphabet of input symbols, which can appear in sentences recog-
nized by the PDA.
is the alphabet of symbols that can appear on the stack: may or may
not be the same as .
Q is the set of states.
q
0
Q is the start state.
F Q is the set of nal states.
is the transition function, which will (nondeterministically) map a state,
an input symbol (or ), and a stack symbol (or ) to a new state and a
new sequence of stack symbols:
: Q( ) ( ) 2
Q

18.4 Deterministic and Nondeterministic PDAs
PDAs may be deterministic or nondeterministic, much like nite acceptors.
A deterministic PDA is one in which every input string has a unique path
through the machine.
This means that at each state, it must be possible to deterministically decide
whether to take a transition that:
reads a symbol from the input, but pops nothing from the stack (x; ; ?)
reads no input, but pops a symbol from the stack (; y; ?);
both reads and pops simultaneously (x; y; ?); or
neither reads nor pops (; ; ?).
Consider again the PDA for L
6
= a
m
b
n
c
m+n
:
` _
S
0
;;
a;;#
` _
S
1
;;
b;;#
` _
S
+
2
c;#;
Applying the same algorithm as we used for NFA FA:

` _
S
+
0
a;;#
b;;#
c;#;
` _
S
+
012
a;;#
b;;#
c;#;
,~
~
~
~
~
~
~
~
~
~
` _
S
+
12
b;;#
c;#;
_ n
n
n
n
n
n
n
n
n
n
n
n
n
n
n
n
n
n
n
` _
S
+
2
c;#;
The resulting PDA is deterministic, and generates L

6
.
However, this does not happen with L
3
= ww
R
:
` _
S
0
;;
a;;a
b;;b
` _
S
+
1
a;a;
b;b;
\ __
S
,+
0
a;;a
a;a;
b;;b
b;b;
a;;a
b;;b
` _
S
+
1
a;a;
b;b;
` _
S
+
0
a;;a
b;;b
a;a;
b;b;
f
f
f
f
f
f
f
f
f
f
` _
S
+
01
a;a;
b;b; ,}
}
}
}
}
}
}
}
}
a;;a
b;;b
` _
S
+
1
a;a;

b;b;
.
18.5. CFG PDA 105
This language cannot be parsed deterministically.
18.5 CFG PDA
Every language generated by a CFG may be accepted by a PDA.
The proof is by construction. We will in fact show how to build two dierent
PDAs (corresponding to top-down and bottom-up parsers) for every CFG.
For both constructions, we suppose we have a CFG G = (, N, S, P) and we
will construct a PDA P = (, , Q, q
0
, F, ).
18.6 Top-Down construction
= = N Q = 0, 1 q
0
= 0 F = 1
(0, , ) (1, S)
For each x , (1, x, x) (1, ) (match)
For each (X ) P, (1, , X) (1, ) (expand)
` _
0
;;S
` _
1
+
x;x; (match)
.
;X; (expand)
18.7 S aS [ T T b [ bT
` _
0
;;S
'` __
1
+
a;a;
b;b;
;S;aS

;S;T
;T;b
_
;T;bT
.
state input stack (top to left)
0 abb
1 abb S expand
1 abb aS match
1 bb S expand
1 bb T expand
1 bb bT match
1 b T expand
1 b b match
1 accept
18.8 Bottom-Up construction
= = N Q q
p
, q
f
q
0
= q
p
F = q
f
For each x , (q
p
, x, ) (q
p
, x) (shift)
For each (X ) P, where =
0
, . . . ,
n
:
create new states q
1
, . . . , q
n
;
(q
p
, ,
n
) (q
n
, )
(q
n
, ,
n1
) (q
n1
, )

(q
1
, ,
0
) (q
p
, X) (reduce)
(q
p
, , S) (q
f
, )
18.9 S aS [ T; T b [ bT
` _
q
+
f

;a;S
'` __
q
p
a;;a
b;;b
;S;
_
;T;S
;T;
;b;T
;S;

;b;T
state input stack (top to left)

q
p
abb shift
q
p
bb a shift
q
p
b ba shift
q
p
bba reduce
q
p
Tba reduce
q
p
Ta reduce
q
p
Sa reduce
q
p
S accept
q
f

18.10 PDA CFG
Every language accepted by a PDA may be generated by a CFG.
We must show how to construct a CFG G = (, N, S, P) from an arbitrary
PDA P = (, , Q, q
0
, F, ).
To make the construction simpler, we suppose:
[F[ = 1, i.e. P has only one accepting state. If [F[ > 1, add new states
q
, q
f
/ Q, put F = q
f
, for some new stack symbol y
/ , and for each

q F, add transitions
\
q
;;y
` _
q
;y
;
` _ q
f .
Every transition either pushes one stack symbol, or pops one stack symbol,
but not both.
18.11. PDA TO CFG FORMALLY 107
Replace any transition
\
q
1
x;y;z
\
q
3 that has both y and z not
(x ) by the transitions
\
q
1
x;y;
\
q
2
;;z
\
q
3 for some
new state q
2
/ Q.
Replace any transition
\
q
1
x;;
\
q
3 by the transitions
\
q
1
x;y
;
\
q
2
;;y
\
q
3
for some new state q
2
/ Q and stack symbol y
/ .
Each nonterminal A
pq
in the CFG represents a sequence of transitions from
state p to state q, with no net change to the stack.
Note that the rst transition in the sequence must be a push, and the last
must be a pop:
\
p
x;;y
\
r
\
s
x
;z;
\
q
.
Case 1: y = z
Put A
pq
xA
rs
x
.
Case 2: y ,= z There must be some intermediate transition which pops the y
pushed by the rst transition:
\
p
x;;y
\
r
\
r
;y;
\
s
\
s
x
;z;
\
q
.
Put A
pq
A
ps
A
s
q
.
18.11 PDA to CFG Formally
We transform P = (, , Q, q
0
, q
f
, ) to G = (, N, S, P).
Let N = Q Q: nonterminals of the grammar are pairs of states of the
automaton. For convenience, we write A
pq
for the pair (p, q).
Let S = A
q0q
f
.
1. For each p Q, put A
pp
in P.
2. For each p, q, r Q, put A
pq
A
pr
A
rq
in P.
3. For each p, q, r, s Q, y , and x, x
,
if contains transitions
\
p
x;;y
\
r and
\
s
x
;y;
\
q
,
put A
pq
xA
rs
x
in P.
18.12 Example (L
6
from slide 69)
` _
0
;;
a;;#
\
1
;;
b;;#
` _
2
+
c;#;
We have just one nal state.

The ; ; transitions are not allowed. Choose $ / , and put instead:
` _
0
;;$
a;;#
\
3
;$;
\
1
;;$
b;;#
\
4
;$;
` _
2
+
c;#;
1. A
00
A
11
A
22
A
33
A
44

2. A
02
A
01
A
12
A
03
A
01
A
13
[ A
02
A
23
A
04
A
01
A
14
[ A
02
A
24
[ A
03
A
34
(N.B. should also have e.g. A
01
A
03
A
31
, but can see from the shape of
the PDA that this will be useless.)
3. (a) Consider the transition
\
0
a;;#
. Its pair is
\
2
c;#;
.
So: A
02
aA
02
c.
(b) Consider the transition
\
1
b;;#
. Its pair is
\
2
c;#;
.
So: A
12
bA
12
c.
(c) Consider the transition
\
0
;;$
\
3 . Its pair is
\
3
;$;
\
1 . So:
A
01
A
33
, i.e. A
01
A
33
.
(d) Consider the transition
\
1
;;$
\
4 . Its pair is
\
4
;$;
\
2 . So:
A
12
A
44
, i.e. A
12
A
44
.
Summarising:
A
02
A
01
A
12
[ aA
02
c
A
01
A
33
A
12
bA
12
c [ A
44
A
33

A
44

and simplifying (A
02
becomes S, A
12
becomes T, everything else is ):
S T [ aSc
T bTc [
which indeed generates the language a
m
b
n
c
m+n
[ m, n 0.
Chapter 19
Non-CF Languages
19.1 Not All Languages are Context Free
We know Reg CFL: all regular languages are context free, but there are
context free languages that are not regular.
Are there languages that are not context free?
Yes, there are many such languages!
The following languages cannot be generated by any context free grammar,
nor can they be recognized by any push-down automaton.
a
n
b
n
c
n
[ n 0 (would need two counters)
ww [ w a, b
(would need queue, not stack)

next weeks lotto numbers (would need a miracle)
Many of the constraints on programming languages cannot be expressed
(easily) using CFGs. For example:
all the identiers in a list of declarations must be distinct
identiers must be declared before they are used
procedure calls must have arguments consistent with their declarations.
These are constraints on the context in which a particular piece of otherwise
context-free syntax may occur. Approaches to dealing with them include:
ad hoc approaches: write a CFG to dene the CF parts of the language
(sometimes called a covering grammar), build a parser, then augment
the parser with code to check the context constraints;
attribute grammars: CFGs annotated to show the extra relationships be-
tween nonterminals;
context-sensitive grammars.
19.2 Context Sensitive Grammars
A phrase-structure grammar is a structure G = (, N, S, P), where , N,
and S are as we have seen already.
109
110 CHAPTER 19. NON-CF LANGUAGES
Dierent classes of languages arise by placing dierent constraints on the
productions, P.
Let X, Y N;
; and , ( N)
.
Regular grammar: X Y or X
Context free grammar: X
Context sensitive grammar: , where [[ [[.
For example, a CSG may include a production abSd abcTd, which says
in eect that S cT, but only if it is preceded by ab and followed by d.
19.3 Example (1)
The following CSG generates the language described by the regular expression
(a+b)
(ac+bd).
S aS
S bS
aS aT
bS bU
T c
U d
19.4 Example (2)
The following CSG generates the language wcw [ w a, b
.
S c S aTS S bUS
Ta aT Tb bT Tc ca
Ua aU Ub bU Uc cb
19.5 Generating the empty string
Our denition for CSG has productions where [[ [[. This doesnt
permit productions ([[ = 0), so we must also allow S if the language to
be generated contains .
The following CSG generates the language a
n
b
n
c
n
[ n 0.
S [ aTbc
aTb aaTbbU [ ab
Ub bU
Uc cc
19.6. CFL CSL 111
19.6 CFL CSL
We just saw an example of a language (a
n
b
n
c
n
) that is context sensitive but not
context free (proof: pumping lemma for context free languages not covered in
this course).
To show CFL CSL, we need only show that every context free language
is context sensitive.
Let G = (, N, S, P) be a CFG for a language L. We will construct a CSG
G
= (, N, S, P
).
Without loss of generality, suppose P has no productions, except perhaps
on S.
For each production X in P, put (where = X) in P
.
Now every production has the form where , ( N)
,
[[ = 1, and (with perhaps one permitted exception), [[ 1,
so G
is a CSG.
112 CHAPTER 19. NON-CF LANGUAGES
Chapter 20
Closure Properties
20.1 Closure Properties
We now turn our attention to closure properties.
Recall that regular languages are closed under union, concatenation, Kleene
closure, complementation, and intersection. That is, if L
1
and L
2
are regular:
L
1
L
2
is regular;
L
1
L
2
is regular;
L
1
is regular;
L
1
is regular; and
L
1
L
2
is regular.
Do the corresponding closure properties hold for context-free languages?o
20.2 Union of Context Free Languages is Con-
text Free
Theorem If L
1
and L
2
are context free languages, their union L
1
L
2
is also
context free.
Proof (using grammars) Let G
1
= (
1
, N
1
, S
1
, P
1
) and G
2
= (
2
, N
2
, S
2
, P
2
)
be CFGs for L
1
and L
2
respectively.
Without loss of generality, let N
1
N
2
= (if not, systematically rename
all nonterminals in one of the grammars). Also, let S , N
1
N
2
be a fresh
nonterminal symbol.
G
3
= (
1
2
, N
1
N
2
S, S, P
1
P
2
S S
1
, S S
2
) is a
CFG for L
1
L
2
.
113
114 CHAPTER 20. CLOSURE PROPERTIES
Example
Language(G
1
) = a
m
b
m
Language(G
2
) = b
n
c
n
S aSb S bSc
S S
First, rename the nonterminals of G
1
and G
2
apart, by adding subscripts:
G
1
= (a, b, c, S
1
, S
1
, S
1
aS
1
b, S
1
)
G
2
= (a, b, c, S
2
, S
2
, S
2
bS
2
c, S
2
)
Now form the union as above; the resulting grammar:
S S
1
[ S
2
S
1
aS
1
b [ S
2
bS
2
c [
is a CFG for a
m
b
m
b
n
c
n
.
Alternative proof (using PDAs) Let P
1
= (
1
,
1
, Q
1
, q
1
, F
1
,
1
) and P
2
=
(
2
,
2
, Q
2
, q
2
, F
2
,
2
) be PDAs for L
1
and L
2
respectively.
Without loss of generality, let Q
1
Q
2
= , and let q
0
, Q
1
Q
2
be a
fresh state.
P
3
= (
1
2
,
1
2
, Q
1
Q
2
q
0
, q
0
, F
1
F
2
,
3
) is a PDA
for L
1
L
2
, where
3
is
1

2
together with transitions
3
(q
0
, , )
(q
1
, ) and
3
(q
0
, , ) (q
2
, ).
Example
P
1
:
` _
q
1
;;
a;;#
` _
q
+
2
b;#;
P
2
:
` _
q
3
;;
b;;#
` _
q
+
4
c;#;
P
3
:
\
q
1
;;
a;;#
` _
q
+
2
b;#;
` _
q
0
;;
;;
b
b
b
b
b
b
b
b
b
\
q
3
;;
b;;#
` _
q
+
4
c;#;
20.3 Concatenation of CF Languages is Context

Free
Theorem If L
1
and L
2
are context free languages, their concatenation L
1
L
2
is also context free.
20.4. KLEENE STAR OF A CF LANGUAGE IS CF 115
Proof (using grammars) Let G
1
and G
2
be as before, with no shared non-
terminals, and S a fresh nonterminal.
G
3
= (
1
2
, N
1
N
2
S, S, P
1
P
2
S S
1
S
2
) is a CFG
for L
1
L
2
.
Example: L
1
= a
m
b
m
L
2
= b
n
c
n
L
3
= a
m
b
m
b
n
c
n
= a
m
b
m+n
c
n
G
1
= (a, b, S
1
, S
1
, S
1
aS
1
b, S
1
)
G
2
= (b, c, S
2
, S
2
, S
2
bS
2
c, S
2
)
G
3
= (a, b, c, S, S
1
, S
2
, S,
S S
1
S
2
, S
1
aS
1
b, S
1
, S
2
bS
2
c, S
2
)
20.4 Kleene Star of a CF Language is CF
Theorem If L is a context free language, so is L
.
Proof (using grammars) Let G = (, N, S, P), and let S
, N be a fresh
nonterminal.
G
= (, N S
, S
, P S
, S
SS
) is a CFG for L
.
Example L = a
m
b
m
L
= a
m
b
m
G = (a, b, S, S, S aSb, S )
G
= (a, b, S, S
, S
,
S
SS
, S
, S aSb, S )
Exercise Eliminate the productions from G
.
20.5 Intersections and Complements of CF Lan-
guages
Theorem The intersection L
1
L
2
of CF languages L
1
and L
2
may be CF, or
it may not.
Proof We already know that intersections of regular languages are regular, so
if L
1
and L
2
are regular (and hence CF), L
1
L
2
is also regular (and hence
CF).
However, if we take L
1
= a
n
b
n
c
m
and L
2
= a
n
b
m
c
m
, which are both
CF (see Cohen, p385), the intersection L
3
= a
n
b
n
c
n
, which we have
already seen is not CF.
Theorem The complement L
of a CF language L may or may not be CF.

Proof Suppose L
1
and L
2
are CF; we know L
1
L
2
may be non-CF.
However, if complements were CF, (L
1
L
2
)
would be CF: contradic-

tion!
116 CHAPTER 20. CLOSURE PROPERTIES
Chapter 21
Summary of CF Languages
21.1 Why context-free?
There are non-regular languages
Proof: the pumping lemma
Sentences of many languages have a natural nested or recursive structure
contrast with regular languages, whose structure is essentially se-
quence/selection/repetition
the nested structure can be exposed by writing context free grammars,
and by drawing parse trees
terminals appear in sentences, and at the leaves of parse trees
nonterminals name categories of fragments of sentences, and on the
interior nodes of parse trees.
productions describe how the nonterminals relate to one another, and
to the terminals.
21.2 Phrase-structure grammars
A phrase-structure grammar G = (, N, S, P) where:
is a nite set of terminals;
N is a nite set of nonterminals, N = ;
S is the start symbol, S N; and
P is a set of productions.
A string (or sentence) is a sequence of terminals
.
A sentential form is a sequence of terminals and/or nonterminals ( N)
.
A production is a pair of sentential forms, written (, (N)
).
117
118 CHAPTER 21. SUMMARY OF CF LANGUAGES
21.3 Special cases of Phrase-structure grammars
A Context Free Grammar (CFG) is a phrase-structure grammar in which
every production has N. That is, the left-hand side of every
production is a single nonterminal.
We call the class of languages generated by CFGs context free languages
(CFL).
A Regular Grammar (RG) is a CFG with the additional property that, for
every production ,
or
N. That is, the right-hand side

of every production is a sequence (perhaps empty) of terminals, optionally
followed by a single nonterminal.
The class of languages generated by RGs is the same as the class accepted
by nite automata, so by Kleenes theorem is the same as the class of
regular languages.
Every RG is a CFG, so every regular language is a context free language:
Reg CFL.
A Context Sensitive Grammar (CSG) is a phrase-structure grammar in
which every production has either [[ [[, or = S and = .
That is, the left- and right-hand sides are arbitrary sentential forms, with
the only restriction being that the right-hand side is no shorter than the
left-hand side, except that there may be a single production for the start
symbol. They are sometimes called non-reducing grammars.
We call the class of languages generated by CSGs context sensitive lan-
guages (CSL).
Every CFG may be transformed to an equivalent CFG whose only pro-
duction is on the start symbol, so every CFG is equivalent to a CSG.
Hence, every context-free language is a context sensitive language: CFL
CSL.
21.4 Derivations
Given G = (, N, S, P) and w
, a derivation in G of w is a sequence
0
. . .
n
of sentential forms, where
0
= S, alpha
n
= w, and for each 0 i < n,
i+1
is derived from
i
by replacing some nonterminal n in
i
by , where
(n ) P.
G generates w i there is a derivation in G of w.
A leftmost derivation is one in which the leftmost nonterminal of
i
is always
replaced.
If there is a derivation in G of w, there is a leftmost derivation.
String w is ambiguous for G if there is more than one leftmost derivation in
G of w.
G is ambiguous if at least one w is ambiguous for G. There may or may not
be an equivalent unambiguous grammar.
21.5. PARSING 119
21.5 Parsing
Parsing is the process of nding derivations.
The derivation may be summarised by a parse tree.
Parsing in regular grammars essentially simulates the operation of a NFA.
Parsing in arbitrary grammars is, in general, more dicult.
21.6 Recursive descent and LL(1) grammars
An LL(1) grammar is a CFG that may be parsed top-down and deterministically.
A grammar is LL(1) if every nonterminal satises the two requirements.
If a grammar is LL(1), there may or may not be an equivalent LL(1) gram-
mar.
A recursive descent parser is a one-o recogniser for an LL(1) grammar.
21.7 Pushdown automata
A PDA P = (, , Q, q
0
, F, ) is an automaton that can accept a string w if
there is a path from the initial state q
0
to some nal state in F such that:
the stack is empty initially and nally;
the sequence of read symbols along the path is w;
each pop symbol along the path matches the symbol currently at the top
of the stack.
The class of languages accepted by PDAs is exactly same as the class of
languages generated by CFGs (i.e., the context-free languages).
A PDA is deterministic (DPDA) if, for every q Q, x , and y , there
is at most one enabled transition.
There may or may not be a DPDA that accepts some CFL.
21.8 Closure
There are non-CF languages: for example, a
n
b
n
c
n
[ n 0.
Proof : the pumping lemma for CF languages (NOT DONE).
The class of CFLs is closed under:
union
concatenation
Kleene closure
The class of CFLs is not closed under:
120 CHAPTER 21. SUMMARY OF CF LANGUAGES
intersection
complementation
21.9 Constructions on CFLs
FA to regular grammar
Regular grammar to FA (NOT DONE)
Ambiguous to unambiguous grammar (perhaps)
Remove Lambda productions
Remove unit productions
CFG to LL(1) form (perhaps)
CFG to PDA (top-down and bottom-up)
PDA to CFG
Part III
Turing Machines
121
Chapter 22
Turing Machines I
22.1 Introduction
So far in COMP 202 you have seen:
nite automata
pushdown automata
In this part of COMP 202 we will look at Turing machines.
Turing machines are named after the English logician Alan Turing. They
were introduced in 1936 in his paper On computable numbers with an applica-
tion to the Entscheidungsproblem.
We can think of nite automata, pushdown automata and Turing machines
as models of computing devices. A nite automaton has
a nite set of states
no memory.
A pushdown automaton has
unlimited memory with restricted access.
A Turing machine has
unlimited memory with unrestricted access
Dates:
Finite automata were rst described in 1943
Pushdown automata were rst described in 1961
Turing machines were rst described in 1936
123
124 CHAPTER 22. TURING MACHINES I
22.2 Motivation
While thinking of nite automata, pushdown automata, and Turing machines
as machines of increasing power is quite useful it does not give any insight into
why or how Turing invented his abstract machines. This story is worth telling.
Nowadays, it is impossible to say exactly how many computing devices there
are in the world, we can only give some vague estimate in terms of hundreds of
thousands, or millions. We can say with great precision how many computers
there were in the world in the early 1930s: none.
How then did Turing come to invent Turing machines?
22.3 A crisis in the foundations of mathematics
In the beginning of the 20th century mathematics was facing a crisis in its
foundations.
David Hilbert was the leading mathematician of the early 1900s. He believed
that mathematics would escape intact from the crisis it faced.
Hilbert believed that mathematics was decidable: that is, for every mathe-
matical problem, there is an algorithm which either solves it or shows that no
solution is possible.
The problem of showing that mathematics was decidable came to be known
as the Entscheidungsproblem: German for decision problem.
In order to make progress on this problem it was necessary to get a clearer
idea about
algorithms, and
the class of computatble functions.
22.4 The computable functions: Turings approach
The question that Alan Turing was really trying to investigate was What class
of functions can be computed by a person?
In the 1930s a computer was a person who performed calculations. The
calculations must be algorithmic in nature: that is they must be the sorts of
thing one could, in principle, build a machine to perform.
In On computable numbers with an application to the Entscheidungsprob-
lem, Turing imagines what actions a computer could perform, and tries to
abstract away the details.
22.5 What a computer does
Imagine a person sitting at a desk performing calculations on paper.
The person can:
read the symbols that have been written on the paper,
22.6. WHAT HAVE WE ACHIEVED? 125
write symbols on the paper,
erase what has been written, and
perform actions dependent on what symbols were read.
We abstract away some of the details.
We assume that the paper is in the form of a tape of individual squares, each
of which can either be blank or hold just one symbol.
The computer can focus attention on only one cell of the tape at a time.
We assume that the tape is not limited.
The actions that the abstract computer can perform are then
reading a symbol,
writing a symbol,
erasing a symbol, and
focussing attention on the next or the previous cell.
We can give a formal description of these abstract machines.
Turing then asserts that these actions are the sorts of things one could, in
principle, build a machine to perform.
Turing did, in fact, become involved with the early eorts to build real,
physical machines, but in the 1930s his focus was on purely abstract machines.
22.6 What have we achieved?
Now we have a formal model of an abstract computer we can study the func-
tions that it can compute. We can call such functions the Turing machine
computable functions.
There is no certainty that the Turing machine computable functions are all
and only the computable functions, because we may have made a mistake when
analysing the actions of the computer.
22.7 The computable functions: Churchs ap-
proach
At the same time that Turing was thinking about abstract machines the Ameri-
can logician Alonzo Church was taking a dierent approach to dening the class
of computable functions.
Church had developed a notation for describing functions, called the -
calculus. The language of the -calculus is very simple. A term of the -calculus
is:
a variable, or
an application of two -terms, or
the abstraction of a variable over a -term
More formally:
TERM VAR
[ TERM TERM
[ VAR . TERM
VAR x
1
, x
2
, x
3
, . . .
We have one rule, called -reduction:
(x.M)N
[N/x]M
where [N/x]M is read as substitute N for x in M.
Substitution is algorithmic. Now, the -calculus looks nothing like Turing
machines, and it was constructed on a completely dierent basis.
Nonetheless, the -calculus lets us dene functions. We can call such func-
tions the -calculus computable functions.
Just as before, there is no certainty that the -calculus computable functions
are all and only the computable functions.
22.8 First remarkable fact
Although the -calculus and Turings abstract machines look completely dier-
ent it turns out that they both dene exactly the same class of functions.
So the -calculus computable functions are just the same functions as the
Turing machine computable functions.
Many other approaches to dening the class of computable functions have
been proposed, e.g.
the -recursive functions
Post systems
unlimited register machines (URMs)
Minsky systems
. . .
All have been shown to dene exactly the same class of functions.
Furthermore no-one has come up with a function which is obviously com-
putable and which is not in this class.
22.9. CHURCH-TURING THESIS 127
22.9 Church-Turing thesis
The assertion that the class of Turing machine computable functions is the class
of computable functions is called the Church-Turing thesis.
The Church-Turing thesis not not something that can be formally proven.
This is not because we are stupid, or lack cunning, but because it relates our
informal notion of computable with a formal system.
It is, of course, possible to give a formal proof that the -calculus computable
functions are just the same functions as the Turing machine computable func-
tions.
22.10 Second important fact
Every Turing machine embodies an algorithm. We can think of the initial
conguration of the tape for a Turing machine as the data (or input) which the
machine is supplied with.
We can describe one Turing machine to another by using symbols on a
tape.
We can construct a Turing machine T
U
which takes as input a description
of any Turing machine T
1
, and behaves just like T
1
.
22.11 The universal machine
A machine like T
U
is called a universal Turing machine.
A universal Turing machine can be made to behave like any Turing machine,
just by supplying it with appropriate data.
The existence of universal Turing machines is quite remarkable.
Physical calculating machines had been constructed prior to the 1930s but
these were all special purpose machines.
Turing had shown that special purpose machines are pointless:
if you want a machine to add up tables of nancial data, build a universal
machine and then describe the appropriate special machine to it;
if you want a machine to nd numeric solutions to dierential equations,
build a universal machine and then describe the appropriate special ma-
chine to it;
if you want a machine to play music backwards, build a universal machine
and then describe the appropriate special machine to it;
if you want a machine to do anything (algorithmic): build a universal
machine and then describe the appropriate special machine to it.
In modern parlance we call a universal Turing machine a computer, and we
call the process of supplying it with appropriate data to mimic another Turing
machine programming.
This is the sense in which a computer is a general purpose machine.
Of course in 1936 it was not possible to build a practical physical approxi-
mation to a universal Turing machine.
22.12 Third important fact
We have still not seen what Turing machines have to do with the Entschei-
dungsproblem.
We can use the idea of a universal Turing machine to show that there are
indeed undecidable problems in mathematics.
Because a universal Turing machine can be used to encode any Turing ma-
chine we can use universal machines to ask questions about Turing machines
themselves. We can use this technique to construct a purely mathematical
problem which is undecidable.
Chapter 23
Turing machines II
23.1 Introduction
In the last lecture we looked at how Turing machines came to be developed, and
gave an informal description of
the Church-Turing thesis
Universal Turing machines
formally undecidable problems
Now we will proceed with a formal development of the theory of Turing
machines.
Recall that we stated that we can think of nite automata, pushdown au-
tomata and Turing machines as models of computing devices. A nite automa-
ton has
no memory.
A pushdown automaton has
unlimited memory with restricted access.
A Turing machine has
unlimited memory with unrestricted access
129
130 CHAPTER 23. TURING MACHINES II
23.2 Informal description
Recall that our informal description of a Turing machine was that there was:
an innite tape
a head, which can:
read a symbol
write a symbol
erase a symbol
move left
move right
23.3 How Turing machines behave: a trichotomy
Consider the following program (adapted from Cohen):
read x;
if x < 0 then halt;
if x = 0 then x := 1/x;
while x > 0 do x := x + 1
This program can do 3 things:
it can halt;
it can crash;
it can run forever.
Turing machines exhibit the same behaviour. They can:
halt;
crash;
run forever.
23.4 Informal example
Suppose we want to dene a Turing machine to accept the language
w#w[w 0, 1
Our input string will be presented to us on a tape and initially the head is
located at the leftmost end of the tape.
For example the input might be 10011#10011.
How could we decide whether the string was in the language?
Clearly the head is going to have to move back and forth along the string.
23.5. TOWARDS A FORMAL DEFINITION 131
Lets see how we might go about this.
What we do rst depends on whether we are looking at a 0, 1 or a #.
If we are looking at a # then we must move right and check that the next
cell is empty.
If is is a 0 or a 1 then what should we do?
we should mark the cell as visited
we should go o and nd the #
then we should nd a 0 or a 1 as appropriate in the next cell
then we should mark this cell
and then we should repeat this until we are nished
We can mark a cell by writing a new symbol, say x in it.
But of course now we have a tape with 0s, 1s, #s and xs in it so our
method will have to change a bit.
23.5 Towards a formal denition
The formal denition of a Turing machine follows a similar pattern to the formal
denitions that we have given of nite automata and pushdown automata.
Dierent textbooks give slightly dierent denitions. For example
John Martins Introduction to languages and the theory of computation
denes a Turing machine as a 5-tuple,
Michael Sipsers Introduction to the theory of computation denes a
Turing machine as a 7-tuple, and
Daniel Cohens Introduction to computer theory splits the dierence and
says we have to give 6 things to dene a Turing machine.
So, care must be taken when reading from more than one source. We shall
use Cohens denition, with slight adaptations.
23.6 Alphabets
In a Turing machine we need two alphabets,
, the input alphabet
, the tape alphabet
We use for the blank symbol, much as we use for the empty string and
for the empty language.
Cohen stipulates that , and , .
Often we will have .
23.7 The head and the tape
We have an innite tape, and a head which is located at one of the cells.
We supply our Turing machine with input w = w
1
w
2
. . . w
n1
w
n
by entering
the symbols w
1
, w
2
. . . w
n1
, w
n
in the rst n cells of the tape.
All the other cells in the tape initially have in them.
Initially the head is located at the rst cell in the tape.
Because we can write on the tape we can use it as a memory.
23.8 The states
We have a nite set of states: Q.
One of the states is the start state: q
0
Q.
Some subset of the states are the halt states: F Q.
23.9 The transition function
We have a transition function, , which depends on:
which state we are in
what symbol is in the cell the head is at
and which can tell us
what state to go to
what symbol to write in the cell the head is at
whether to move the head left or right.
23.10 Conguration
So, as we compute we (usually) move through the states of the machine, and
(usually) the head moves along the tape.
We can represent the conguration of the machine as a triple, constisting of:
the state the machine is in
the contents of the tape
the location of the head
23.11. FORMAL DEFINITION 133
23.11 Formal denition
Denition 12 (Turing machine) A Turing machine is a 6-tuple (Q, , , , q
0
, F)
where:
is a nite set: the input alphabet
is a nite set: the tape alphabet
is a function from Q ( ) to Q ( ) L, R: the
transition function
q
0
Q: the start state
23.12 Representing the computation
We represent the conguration of a Turing machine like:
s
1
. . .
k
. . .
n
where:
s is the state the machine is in

1
. . .
k
. . .
n
is the meaningful part of the tape

k
is the symbol about to be read
For example, if the start state is S
1
, and the input is babba, then the initial
conguration will be:
S1
babba
We can use a sequence of congurations to trace the computation that the
machine performs.
23.13 A simple machine
Let M
1
= (Q, , , , q
0
, F), where
Q = S
1
, S
2
, S
3
, S
4
= a, b = a, b q
0
= S
1
F = S
4
and is given by the table:

State Reading State Writing Moving
S
1
a S
2
a R
S
1
b S
2
b R
S
2
b S
3
b R
S
3
a S
3
a R
S
3
b S
3
b R
S
3
S
4
R
1
` _
1
a,a,R
b,b,R
\
2
b,b,R
\
3
a,a,R
b,b,R
_
,,R
` _
4
+
Every time we read a symbol we move one step to the right along the tape
Writing the symbol we have just read leaves the cell unaected by our
visit.
23.15 Some traces
If we give this machine abb we will get:
S1
abb
S2
abb
S3
abb
S3
abb
S4
abb
If we give this machine bab we will get:
S1
bab
S2
bab crash
A little thought should show that this machine accepts Language((a + b)b(a +
b)
)
We should not be surprised that this machine accepts a regular language
as it just traversed the input string from left to right and did not change the
contents of the tape.
23.16 Turing machines can accept regular lan-
guages
This is a general property, which we can express as a theorem.
Theorem 15 Every regular language is can be accepted by a Turing machine.
The proof consists of taking an FA which accepts a regular language and turning
it into a Turing machine which accepts the same language. Basically, we add a
new halt state and adjust the transition function.
23.17 Proof
Let L be a regular language. Then L is accepted by an FA M
L
= (Q, , , q
0
, F).
Then T
L
= (Q
, ,
, q
0
, F
) where:
23.18. ANOTHER MACHINE 135
F
= S
HALT
= Q F
=
=
if (s, ) is dened then
(s, ) = ((s, ), , R)
f F
(f, ) = (S
HALT
, , R)
is a Turing machine which accepts L.
23.18 Another machine
Let M
2
= (Q, , , , q
0
, F), where Q = S
1
, S
2
, S
3
, S
4
, S
5
, S
6
, = a, b,
= a, A, B, q
0
= S
1
, F = S
6
, and is given by:
S
1
a S
2
A R
S
2
a S
2
a R
S
2
B S
2
B R
S
2
b S
3
B L
S
3
B S
3
B L
S
3
A S
5
A R
S
3
a S
4
a L
S
4
a S
4
a L
S
4
A S
1
A R
S
5
B S
5
B R
S
5
S
6
R
2
` _
1
a,A,R
\
2
a,a,R
B,B,R
_
b,B,L
\
3
B,B,L
A,A,R
a,a,L
z
z
z
z
z
z
z
z
z
z
\
5
B,B,R
,,R
` _
6
+
\
4
a,a,L
_
A,A,R
p
p
p
p
p
p
p
p
p
p
This machine accepts the language a
n
b
n
, n > 0.
This machine does write on the tape, and the head moves both left and
right.
23.20 A trace
If we give this machine aabb we will get:
S1
aabb
S2
Aabb
S2
Aabb
S3
AaBb
S4
AaBb
S1
AaBb
S2
AABb
S2
AABb
S3
AABB
S3
AABB
S5
AABB
S5
AABB
S5
AABB
S6
AABB
To get a clearer idea of what is going on here try tracing the computation
on a longer string like aaaabbbb.
Chapter 24
Turing Machines III
24.1 An example machine
Consider the following machine T
1
= (Q, , , , q
0
, F), where Q = S
1
, S
2
, S
3
= a, b = a, b q
0
= S
1
F = S
3
and is given by the table:
S
1
S
1
R
S
1
b S
1
b R
S
1
a S
2
a R
S
2
a S
3
a R
S
2
b S
1
b R
` _
1
,,R
b,b,R
_
a,a,R
\
2
b,b,R
.
a,a,R
` _
3
+
Now, T
1
behaves as follows:
if the string contains a aa we reach the halt state, so the string is accepted
by the TM
if the string does not contain a aa and ends in an a then the machine
crashes
if the string does not contain a aa and ends in a b then the machine loops
24.2 Some denitions
Any Turing machine can exhibit this trichotomy, so, for every Turing machine
T we dene
137
138 CHAPTER 24. TURING MACHINES III
accept(T) to be the set of strings on which T halts
reject(T) to be the set of strings on which T crashes
loop(T) to be the set of strings on which T loops
24.3 Computable and computably enumerable
languages
Denition 13 (Computable language) A language L is computable if there
is some Turing machine T such that:
1. accept(T) = L
2. loop(T) =
3. reject(T) = L
Condition 2 tells us that the Turing machine must either halt gracefully or
crash: it cannot go on forever.
Denition 14 (Computably enumerable language) A language L is com-
putably enumerable if there is some Turing machine T such that:
1. accept(T) = L
2. loop(T) reject(T) = L
Some authors (Turing and Cohen included) use the terms recursive and
recursively enumerable.
Theorem 16 Every computable language is computably enumerable
Proof Suppose L is computable. Then by denition there is a Turing ma-
chine such that
accept(T) = L
loop(T) =
reject(T) = L
Now, loop(T) reject(T) = L = L.
Hence every computable language is computably enumerable.
24.4 Deciders and recognizers
Informally, we can think of a computable language as being one for which we can
write a decider, a program whose behaviour is sure to tell us whether a string
is in the language or not.
24.5. A DECIDABLE LANGUAGE 139
We can think of a computably enumerable language as one for which we can
write a recognizer. A recognizer is a program which will tell us if a string is in
the language, but which may loop if the string is not in the language.
A language for which we can write a decider is called a decidable language.
A language for which we can write a recognizer is called a semi-decidable
language.
24.5 A decidable language
Recall that a language is just a set of strings, so we can think of this problem
in terms of whether membership of some arbitrary set is decidable.
As long as we can encode the elements of the set as strings, the set is a
language.
Let B be a nite automaton, and w a string over the alphabet of B. Consider
the set:
A
FA
= (B, w)[w Language(B)
Is A
FA
decidable, i.e. is it decidable whether a string is in the language of a
nite automaton?
Theorem 17 If B is a nite automaton, and w a string over the the alphabet
of B then (B, w)[w Language(B) is decidable.
Proof We already know that, for any FA B, there is a TM B
that accepts
the same language. By inspection of the construction, we can easily tell that
loop(B
) = .
We must construct a Turing machine which takes as input a description of
B
and the string w, and which halts if w accept, and crashes otherwise.
A universal TM can do this.
24.6 Another decidable language
If A is an FA, then the set:
A[Language(A) =
is also decidable.
We must write a program which takes (a description of) A and checks
whether any accepting state can be reached from the start state.
24.7 A corollary
Let Language(C) = (Language(A)Language(B))(Language(A)Language(B))
Language(C) is the symmetric dierence of Language(A) and Language(B).
Since the set of regular languages is closed under complementing, intersection
and union, then Language(C) is regular if Language(A) and Language(B) are.
140 CHAPTER 24. TURING MACHINES III
Now, Language(C) = i Language(A) = Language(B).
So, in order to check whether two FA are equal we only have to write a
program which takes their descriptions, constructs the FA which accepts their
symmetric dierence and checks whether this is .
24.8 Even more decidable sets
We have discussed FA and regular languages. What about the set:
A
PDA
= (P, w)[w Language(P)
where P is a pushdown automaton, and w a word of the input alphabet of
P. Is this set decidable?
It turns out that this set also is decidable.
This is why we like context-free grammars: we can be sure we can write
parsers for them. In fact, the proof that A
PDA
is decidable allows us to construct
a parser from any CFG.
24.9 An undecidable set
What about the set:
A
TM
= (T, w)[w accept(T)
where T is a Turing machine and w is a word over the input alphabet of T.
This set is not decidable. We will see why later.
Chapter 25
Turing Machines IV
25.1 Introduction
We are looking at how we can construct an undecidable set. Recall:
Denition 15 (Computable language) A language L is computable if there
is some Turing machine T such that:
1. accept(T) = L
2. loop(T) =
3. reject(T) = L
Informally, we can think of a computable language as being one for which
we can write a decider, a program whose behaviour is sure to tell us whether a
string is in the language or not.
A language for which we can write a decider is called a decidable language.
Denition 16 (Computably enumerable language) A language L is com-
putably enumerable if there is some Turing machine T such that:
1. accept(T) = L
2. loop(T) reject(T) = L
25.2 A
FA
is decidable
Let B be a nite automaton, and w a string over the the alphabet of B, then
A
FA
= (B, w)[w Language(B)
is decidable. We can construct a Turing machine which takes as input (a
description of) B and the string w, and which halts if w Language(B), and
crashes otherwise.
141
142 CHAPTER 25. TURING MACHINES IV
25.3 A
PDA
is decidable
Let P be a pushdown automaton, and w a string over the the alphabet of P,
then
A
PDA
= (P, w)[w Language(P)
is decidable. We can construct a Turing machine which takes as input (a
description of) P and the string w, and which halts if w Language(P), and
crashes otherwise.
25.4 The halting problem
Let T be a Turing machine, and w a string over the the input alphabet of T,
then
A
TM
is not decidable. This problem is often called the halting problem, because
it asks about the halting behaviour of Turing machines. Proof
Since we are trying to show that A
TM
is not decidable we assume that A
TM
is
decidable and derive a contradiction.
If A
TM
is decidable then there is a Turing machine H such that:
1. accept(H) = A
TM
2. loop(H) =
3. reject(H) = A
TM
We can write this as:
w accept(T) implies (T, w) accept(H)
w , accept(T) implies (T, w) reject(H)
or we could treat H as a little program:
H(T, w)

= if T(w) halts
then yes
else no
Now, suppose we dene a new machine D, which accepts machines which do not
accept themselves as input.
There is no reason why we cannot give a Turing machine itself as input, any
more than there is a reason why we cannot give a program itself as input.
We can write D as a little program:
D(T)

= if H(T, T) = yes
then crash
else halt
Now, what happens if we give D itself as input?
25.5. OTHER UNDECIDABLE PROBLEMS 143
D(D) = if H(D, D) = yes
then crash
else halt
But our denition of H requires that H(D, D) = yes precisely when D(D)
halts.
So, D(D) = if D(D) halts then crash else halt.
That is, D reject(D) if and only if D accept(D).
Now, it is not possible for D accept(D) and D reject(D) both to hold,
so D loop(D).
Hence, D(D) loops; therefore H(D, D) loops.
This is the contradiction that we sought.
Hence, if T is a Turing machine, and w a string over the the input alphabet
of T, then
A
TM
is not decidable.
Our undecidable problem really turns on the existence of universal Turing
machines, i.e. on the fact that Turing machines are powerful enough to describe
themselves and their own behaviour.
25.5 Other undecidable problems
Are there real problems which are undecidable, or are they merely mathe-
matical curiosities?
First order logic It is not possible to write a theorem prover which, given a
logical expression, is certain to be able to say whether the expression can
be proved or not.
It is possible to write a semi-decision procedure: given a provable expres-
sion it is possible to say that it is provable.
Verication It is not possible to write a program which, given a specication
S and a program P, will determine whether the program P meets the
specication S.
Programming It is not possible to write a programwhich, given a specication
S, will construct a program P that meets it.
25.6 Closure properties
Theorem 18 The class of computable languages is closed under
1. union,
2. intersection, and
3. complementation.
(It is closed under concatenation and Kleene closure as well, but we wont
prove these results.)
Proof: Suppose L
1
and L
2
are computable. Then there exist Turing ma-
chines T
1
and T
2
such that:
accept(T
1
) = L
1
reject(T
1
) = L
1
loop(T
1
) =
accept(T
2
) = L
2
reject(T
2
) = L
2
loop(T
2
) =
1. (Union) Let T
3
be a TM that simulates T
1
and T
2
simultaneously. For
example, it might perform steps of T
1
and T
2
in turn, on disjoint parts of
the tape. Let T
3
halt if T
1
OR T
2
halts; otherwise, both T
1
and T
2
crash,
so T
3
crashes.
accept(T
3
) = accept(T
1
) accept(T
2
) = L
1
L
2
reject(T
3
) = reject(T
1
) reject(T
2
) = L
1
L
2
= L
1
L
2
loop(T
3
) =
2. (Intersection) Let T
4
simulate T
1
and T
2
simultaneously. Let T
4
crash if
T
1
OR T
2
crashes; otherwise, both halt, so T
4
halts.
accept(T
4
) = accept(T
1
) accept(T
2
) = L
1
L
2
reject(T
4
) = reject(T
1
) reject(T
2
) = L
1
L
2
= L
1
L
2
loop(T
4
) =
3. (Complementation) Let T
5
simulate T
1
. Let T
5
crash if T
1
halts, and halt
if T
1
crashes (one of these must eventually happen!).
accept(T
5
) = reject(T
1
) = L
1
reject(T
4
) = accept(T
1
) = L
1
loop(T
4
) =
25.7 Computable and computably enumerable
languages
Theorem 19 A language L is computable i L is c.e. and L is c.e.
Proof: If is easy: L is computable, so by closure, L is computable. All
computable languages are c.e.
Only if is harder. Let T
1
and T
2
be TMs such that:
accept(T
1
) = L reject(T
1
) loop(T
1
) = L
accept(T
2
) = L reject(T
2
) loop(T
2
) = L
Construct a TM T that simulates T
1
and T
2
simultaneously. If T
1
halts, T
halts. If T
2
halts, T crashes. One of these must happen, since every string w
belongs either to L or to L.
accept(T) = accept(T
1
) = L
reject(T) = accept(T
2
) = L
loop(T) =
25.8. A LANGUAGE WHICH IS NOT C.E. 145
25.8 A language which is not c.e.
Now that we have an undecidable language we can go further and dene a
language which is not even computably enumerable.
Theorem 20 A
TM
is not c.e.
Proof: A
TM
is c.e., by universal Turing machine.
If A
TM
were also c.e. then A
TM
would be computable (Theorem 2).
But A
TM
is not computable, so A
TM
is not c.e.
Chapter 26
Turing Machines V
26.1 A hierarchy of classes of language
We have now seen in COMP 202 a whole collection of classes of languages:
all possible languages
c.e. languages
computable languages
context-sensitive languages
context-free languages
regular languages
nite languages
Each of these is a proper subset of the one above it.
26.2 A hierarchy of classes of grammar
In the 1950s the linguist Noam Chomsky produced a hierarchy of classes of
grammars, corresponding exactly to the four classes above:
Type Grammar Language
Type 0 phrase structure grammars computably enumerable languages
Type 1 context sensitive grammars context sensitive languages
Type 2 context free grammars context free languages
Type 3 regular grammars regular languages
147
148 CHAPTER 26. TURING MACHINES V
26.3 Grammars for natural languages
Chomskys hierarchy is important, although it does not include every possible
class of language. We can look for a ner structure than the one we have
presented (LL(1) grammars, for example), but this is outside the scope of this
course.
Chomsky was (is) a linguist, and he was really concerned with what sorts of
grammars are required to describe natural languages, like English, M aori, Urdu,
Swahili and so on.
Consider the following examples (adapted from Gazdar and Mellishs Natural
Language processing in PROLOG):
A doctor hired another doctor.
A doctor whom a doctor hired hired another doctor.
A doctor whom a doctor whom a doctor hired hired hired another doctor.
A doctor whom a doctor whom a doctor whom a doctor hired hired hired
hired another doctor.
. . .
This sentence is of the form:
A doctor (whom a doctor)
n
(hired)
n
hired another doctor.
so it is context free but not regular. Are there any phenomena in English
which require us to go beyond context free?
Surprisingly, the answer is no!
In fact there is only one natural language which requires that we use a
context-sensitive grammar.
There is a structure which can occur in the dialect of Swiss-German spoken
around Z urich which make use of strings of the form:
a
m
b
n
c
m
d
n
This apparent lack of complexity in natural languages seems surprising: sur-
prising enough that the authors of books on articial intelligence and on formal
language theory regularly make false pronouncements on this issue. Moral: look
to linguists for facts about natural languages.
26.4 A hierarchy of classes of automaton
For four of these classes of language, there is a corresponding class of automaton:
Type Automaton Language
Type 0 Turing machine computably enumerable languages
Type 1 Linear-bounded automaton context sensitive languages
Type 2 Pushdown automaton context free languages
Type 3 Finite automaton regular languages
We have studied all except linear-bounded automata.
26.5. DETERMINISTIC AND NONDETERMINISTIC AUTOMATA 149
26.5 Deterministic and nondeterministic automata
We know that for Type 3 automata, nondeterminism makes no dierence:
Kleenes theorem tells us that the class of languages accepted by nondetermin-
istic nite automata (NFAs) is the same as the class accepted by deterministic
nite automata (FAs).
We know that for Type 2 automata, nondeterminism does make a dier-
ence: there are languages that can be accepted by nondeterministic pushdown
automata (PDAs) that cannot be accepted by any deterministic pushdown au-
tomaton (DPDA).
We ignore Type 1 automata.
What about Type 0?
26.6 Nondeterminstic Turing Machines
Our denition of Turing Machines was deterministic: given a state and a symbol
on the tape, tells us exactly which state to go to, what to write on the tape,
and in which direction to move the tape:
: Q( ) Q( ) L, R
This can easily be modied to allownondeterministic Turing Machines (NTMs):
: Q( ) 2
Q({}){L,R}
A string w is accepted by a NTM N = (Q, , ,
, q
0
, F) if there is some
path from q
0
to some q F on a tape loaded initially with w.
We must be more careful about looping NTMs (what do we say if, for some
NTM N and some string w, there is a path through N that rejects, and another
that loops?).
If we consider only accepting paths, and do not distinguish rejecting from
looping (so we cannot distinguish computable from computably enumerable lan-
guages), we get the surprising result that nondeterminism makes no dierence.
26.7 NTM=TM
Clearly any deterministic Turing Machine can be described by a NTM: nonde-
terminism is not compulsory! So TM NTM.
We must show NTM TM: that is, that any NTM T may be simulated
by a TM T
.
An NTM has only nitely many edges: label the edges of T with unique
natural numbers. Now, for any string w accept(T), there is at least one nite
sequence of labels corresponding to T accepting w.
We design a Universal Turing Machine, T
, that enumerates all paths in turn,

checks whether the path is valid for w and T, and accepts when it nds one
that is (Cohen has the details). Hence, accept(T) accept(T
).
150 CHAPTER 26. TURING MACHINES V
If w / accept(T), T
will try longer and longer paths and will never halt.
Hence, loop(T
) accept(T).
Together with reject(T) = (obvious), we have accept(T
) = accept(T).
26.8 More variations on Turing Machines
Our Turing Machines have a distinguished set of nal states, and at each step
read and write one tape symbol and move left or right.
It makes no dierence to the class of languages accepted if, instead of a set
of nal states, we add a halt instruction (H):
: Q( ) Q( ) L, R, H
It makes no dierence to the class of languages accepted if, instead of insist-
ing on a tape move at each step, we allow a stay instruction (S):
: Q( ) Q( ) L, R, S
Our Turing Machines have a single tape that is innite in both directions,
a distinguished set of nal states, and at each step read and write one tape
symbol and move left or right.
It makes no dierence to the class of languages accepted if we restrict the
tape so that it is innite in only one direction (in fact, that is Cohens rst
denition).
It makes no dierence to the class of languages accepted if we allow multiple
innite tapes: the TM decides what to do based on the values beneath the head
on all k tapes, and writes all k tapes simultaneously. Of course, k > 0.
Chapter 27
Summary of the course
This lecture is a summary of the whole course.
27.1 Part 0 Algorithms, and Programs
Specications
signature
preconditions
postconditions
Imperative languages and applicative languages
Program verication
assertions
invariants
27.2 Part I Formal languages and automata
Denitions of alphabet, word, language . . .
Regular expressions and regular languages
Finite automata:
Deterministic nite automata
Nondeterministic nite automata
Nondeterministic nite automata with
Kleenes Theorem
Pumping Lemma
Closure properties
151
152 CHAPTER 27. SUMMARY OF THE COURSE
27.3 Part II Context-Free Languages
Regular grammars
Context-free grammars
Normal forms
Recursive descent parsing
LL(1) grammars
Pushdown automata
Deterministic and nondeterministic PDAs
Top-down and bottom-up parsers
Non-CF languages
Closure properties
27.4 Part III Turing Machines
Origins and denition of Turing machines
Universal machines
Computable and computably enumerable languages
Undecidable problems
Chomsky hierarchy
Variations on Turing machines
27.5 COMP 202 exam
According to the Universitys www page
http://www.vuw.ac.nz/timetables/exam-timetable.aspx
the three-hour nal exam is:
on Monday 31 October,
in HMLT206,
starting at 9:30am
The exam will cover the whole course but will have slightly more emphasis
on the second half, as the mid-term test examined to rst half.
The format will be similar, though not necessarily identical, to the last two
years exams, which may be found on the course web site.
In particular, you will be asked to:
27.6. WHAT NEXT? 153
give denitions;
state theorems;
outline proofs;
describe constructions;
carry out constructions; and
discuss the signicance of results.
27.6 What next?
Some obvious directions to look in are:
How can we use specications and verication in a constructive way, to
help us design correct programs?
How can we write grammars which are convenient to use?
Dierent models of computation give us dierent formalisms for writing
programs: what are the advantages and disadvantages of each?
Given a decidable problem: can we say how hard it is to solve it?
Problems which are undecidable in general may be decidable in special
cases: how can we characterise these?

Lecture Notes by Leslie and Nickson 2005

Caricato da

Informazioni sul documento

Descrizione originale:

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Lecture Notes by Leslie and Nickson 2005

Caricato da

Copyright:

Formati disponibili

Collected Lecture Slides for COMP 202

Formal Methods of Computer Science

the new transition function . . . . . . . . . . . . . . . . . . . . 55

the new set of accepting states . . . . . . . . . . . . . . . . . . 56

(s, t, 0) else false

= 0 k [s[ (r s[0 .. k 1] = t[0 .. k 1]) [s[ = [t[.

= (r s[0 .. k 1] = t[0 .. k 1]) [s[ = [t[ k [s[ 1.

= 0 [s[ [s[ r s[0..[s[ 1] = t[0..[s[ 1] [s[ = [t[.

Denition 6 (Length of a string) The length of a string is the number of

= , 1, 0, 11, 10, 01, 00, . . .

= , kk, k, kkkk, kkk, kkk,

has the highest precedence and + the lowest.

is the Kleene closure of the language de-

) = , 0, 1, 00, 11, 000, 111, . . .

) = 1, 0, 10, 00, 100, 000, 1000, . . .

Language(r +s) is Language(r) Language(s)

= , 0, 1, 00, 01, 10, 11, . . .

= , 1, 11, . . . , 0, 01, 011, . . .

= , 0, 1, 00, 01, 10, 11, . . .

44 CHAPTER 8. FINITE AUTOMATA

8.8 Examples of constructing an automaton

This automaton correctly accepts and the string 1.

46 CHAPTER 8. FINITE AUTOMATA

For clarity and conciseness, we sometimes choose to show as a table:

And, of course we can give a pictorial representation. M

+0((1 +0)(1 +0))

), which accepts the same language.

= is the only sensible choice. It also seems reasonable to keep

(S, ) = if (S, ) then (S, ) else

after we had read 0, any state reachable from S

using until no new states are created.

9.17 NFA with transitions

9.21 Graphical representation of M

9.22. EASY THEOREM 55

), an NFA which accepts the same language.

the new transition function

We can call this the closure of (S, ).

the new set of accepting states

= F, but this is wrong.

9.27 Graphical representation of M

9.30 Base cases

In these cases we assume that we have automata to accept r and s and we

Informally: a machine whose start state is the only accepting state.

Assume that we have an automaton M

+ba through NFA, NFA, to FA.

is not regular when [A[ > 1.

15.6 Regular Grammars

(i.e., there are no nonterminals left)

N: that is, a (possibly

, the nal partial parse tree t

, N be a brand new nonterminal.

S) is equivalent to G, and may

S, S cS, S TU, T , T aT, U , U bTU

is in the language generated by

follow() = x [ S x for some , ( N)

are nullable, so we need to check Requirement 2:

is its regular expression.

is not regular, but it is context free: S

18.3. FORMAL DEFINITION OF PDA 103

18.3 Formal Denition of PDA

104 CHAPTER 18. PUSHDOWN AUTOMATA

Applying the same algorithm as we used for NFA FA:

The resulting PDA is deterministic, and generates L