
Context-free grammar

A context-free grammar (CFG) is a term used in formal language theory to describe a certain type of formal grammar. A context-free grammar is a set of production rules that describe all possible strings in a given formal language. Production rules are simple replacements. For example, the rule

A → α

replaces A with α. There can be multiple replacement rules for any given value. For example,

A → α
A → β

means that A can be replaced with either α or β.

In context-free grammars, all rules are one-to-one, one-to-many, or one-to-none. These rules can be applied regardless of context. The left-hand side of the production rule is also always a nonterminal symbol. This means that the symbol does not appear in the resulting formal language. So in our case, our language contains the letters α and β but not A.[1]

Here is an example context-free grammar which describes all two-letter strings containing the letters α and β:

S → AA
A → α
A → β

If we start with the nonterminal symbol S, then we can use the rule S → AA to turn S into AA. We can then apply one of the two later rules. For example, if we apply A → β to the first A, we get βA. If we then apply A → α to the second A, we get βα. Since both α and β are terminal symbols, and in context-free grammars terminal symbols never appear on the left-hand side of a production rule, there are no more rules that can be applied. This same process can be used, applying the last two rules in different orders, to get all possible strings within our simple context-free grammar.

Languages generated by context-free grammars are known as context-free languages (CFL). Different context-free grammars can generate the same context-free language. It is important to distinguish properties of the language (intrinsic properties) from properties of a particular grammar (extrinsic properties). The language equality question (do two given context-free grammars generate the same language?) is undecidable.

Rules can also be applied in reverse to check if a string is grammatically correct according to the grammar.

1 Background

Context-free grammars arise in linguistics, where they are used to describe the structure of sentences and words in natural language. They were in fact invented by the linguist Noam Chomsky for this purpose, but have not really lived up to their original expectation. By contrast, in computer science, as the use of recursively defined concepts increased, they were used more and more. In an early application, grammars were used to describe the structure of programming languages. In a newer application, they are used in an essential part of the Extensible Markup Language (XML) called the Document Type Definition.[2]

In linguistics, some authors use the term phrase structure grammar to refer to context-free grammars, whereby phrase-structure grammars are distinct from dependency grammars. In computer science, a popular notation for context-free grammars is Backus–Naur form, or BNF.

Since the time of Pāṇini, at least, linguists have described the grammars of languages in terms of their block structure, and described how sentences are recursively built up from smaller phrases, and eventually individual words or word elements. An essential property of these block structures is that logical units never overlap. For example, the sentence:

John, whose blue car was in the garage, walked to the grocery store.

can be logically parenthesized as follows:

(John, ((whose blue car) (was (in the garage))), (walked (to (the grocery store)))).

A context-free grammar provides a simple and mathematically precise mechanism for describing the methods by which phrases in some natural language are built from smaller blocks, capturing the "block structure" of sentences in a natural way. Its simplicity makes the formalism amenable to rigorous mathematical study. Important features of natural language syntax such as agreement and reference are not part of the context-free grammar, but the basic recursive structure of sentences, the way in which clauses nest inside other clauses, and the way in which lists of adjectives and adverbs are swallowed by nouns and verbs, is described exactly.
The formalism of context-free grammars was developed in the mid-1950s by Noam Chomsky,[3] and also their classification as a special type of formal grammar (which he called phrase-structure grammars).[4] What Chomsky called a phrase structure grammar is also known now as a constituency grammar, whereby constituency grammars stand in contrast to dependency grammars. In Chomsky's generative grammar framework, the syntax of natural language was described by context-free rules combined with transformation rules.

Block structure was introduced into computer programming languages by the Algol project (1957–1960), which, as a consequence, also featured a context-free grammar to describe the resulting Algol syntax. This became a standard feature of computer languages, and the notation for grammars used in concrete descriptions of computer languages came to be known as Backus–Naur form, after two members of the Algol language design committee.[3] The "block structure" aspect that context-free grammars capture is so fundamental to grammar that the terms syntax and grammar are often identified with context-free grammar rules, especially in computer science. Formal constraints not captured by the grammar are then considered to be part of the "semantics" of the language.

Context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar. An Earley parser is an example of such an algorithm, while the widely used LR and LL parsers are simpler algorithms that deal only with more restrictive subsets of context-free grammars.

2 Formal definitions

A context-free grammar G is defined by the 4-tuple[5] G = (V, Σ, R, S), where

1. V is a finite set; each element v ∈ V is called a nonterminal character or a variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by G.

2. Σ is a finite set of terminals, disjoint from V, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G.

3. R is a finite relation from V to (V ∪ Σ)*, where the asterisk represents the Kleene star operation. The members of R are called the (rewrite) rules or productions of the grammar (also commonly symbolized by a P).

4. S is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V.

2.1 Production rule notation

A production rule in R is formalized mathematically as a pair (α, β) ∈ R, where α ∈ V is a nonterminal and β ∈ (V ∪ Σ)* is a string of variables and/or terminals; rather than using ordered pair notation, production rules are usually written using an arrow operator with α as its left-hand side and β as its right-hand side: α → β.

It is allowed for β to be the empty string, and in this case it is customary to denote it by ε. The form α → ε is called an ε-production.[6]

It is common to list all right-hand sides for the same left-hand side on the same line, using | (the pipe symbol) to separate them. Rules α → β1 and α → β2 can hence be written as α → β1 | β2. In this case, β1 and β2 are called the first and second alternative, respectively.

2.2 Rule application

For any strings u, v ∈ (V ∪ Σ)*, we say u directly yields v, written as u ⇒ v, if there exist (α, β) ∈ R with α ∈ V and u1, u2 ∈ (V ∪ Σ)* such that u = u1 α u2 and v = u1 β u2. Thus, v is a result of applying the rule (α, β) to u.

2.3 Repetitive rule application

For any strings u, v ∈ (V ∪ Σ)*, we say u yields v, written as u ⇒* v (or u ⇒⇒ v in some textbooks), if there exist k ≥ 1 and u1, …, uk ∈ (V ∪ Σ)* such that u = u1 ⇒ u2 ⇒ ⋯ ⇒ uk = v. In this case, if k ≥ 2 (i.e., u ≠ v), the relation u ⇒+ v holds. In other words, (⇒*) and (⇒+) are the reflexive transitive closure (allowing a word to yield itself) and the transitive closure (requiring at least one step) of (⇒), respectively.

2.4 Context-free language

The language of a grammar G = (V, Σ, R, S) is the set

L(G) = { w ∈ Σ* : S ⇒* w }

A language L is said to be a context-free language (CFL) if there exists a CFG G such that L = L(G).
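The definitions above can be made concrete with a small sketch. Assuming a Python representation in which rules are (nonterminal, replacement) pairs, with 'a' and 'b' standing in for the terminals α and β of the lead example, the hypothetical helpers below enumerate the "directly yields" relation of section 2.2 and iterate it to collect members of L(G):

```python
# A sketch of rule application (the "directly yields" relation) and
# of generating members of L(G); the helper names are illustrative.
def direct_yields(u, rules):
    """All strings v with u => v: replace one occurrence of a
    left-hand side in u by the corresponding right-hand side."""
    results = []
    for lhs, rhs in rules:
        for i, symbol in enumerate(u):
            if symbol == lhs:
                results.append(u[:i] + rhs + u[i + 1:])
    return results

def language(start, rules, nonterminals, max_steps=6):
    """Iterate => from the start symbol, collecting sentential
    forms that consist of terminals only."""
    forms, words = {(start,)}, set()
    for _ in range(max_steps):
        forms = {tuple(v) for u in forms for v in direct_yields(list(u), rules)}
        words |= {"".join(f) for f in forms if not set(f) & nonterminals}
    return words

# The two-letter grammar from the lead: S -> AA, A -> a, A -> b
rules = [("S", ["A", "A"]), ("A", ["a"]), ("A", ["b"])]
print(sorted(language("S", rules, {"S", "A"})))  # -> ['aa', 'ab', 'ba', 'bb']
```

The bound on the number of steps is needed because, for a general CFG, the set of derivable strings may be infinite.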

2.5 Proper CFGs

A context-free grammar is said to be proper[7] if it has

- no unreachable symbols: ∀N ∈ V : ∃α, β ∈ (V ∪ Σ)* : S ⇒* αNβ
- no unproductive symbols: ∀N ∈ V : ∃w ∈ Σ* : N ⇒* w
- no ε-productions: ¬∃N ∈ V : (N, ε) ∈ R
- no cycles: ¬∃N ∈ V : N ⇒+ N

Every context-free grammar can be effectively transformed into a weakly equivalent one without unreachable symbols,[8] a weakly equivalent one without unproductive symbols,[9] and a weakly equivalent one without cycles.[10] Every context-free grammar not producing ε can be effectively transformed into a weakly equivalent one without ε-productions;[11] altogether, every such grammar can be effectively transformed into a weakly equivalent proper CFG.

2.6 Example

The grammar G = ({S}, {a, b}, P, S), with productions

S → aSa,
S → bSb,
S → ε,

is context-free. It is not proper since it includes an ε-production. A typical derivation in this grammar is

S → aSa → aaSaa → aabSbaa → aabbaa.

This makes it clear that L(G) = { ww^R : w ∈ {a, b}* }, where w^R is the reverse of w. The language is context-free; however, it can be proved that it is not regular.

3 Examples

3.1 Well-formed parentheses

The canonical example of a context-free grammar is parenthesis matching, which is representative of the general case. There are two terminal symbols "(" and ")" and one nonterminal symbol S. The production rules are

S → SS
S → (S)
S → ()

The first rule allows the S symbol to multiply; the second rule allows the S symbol to become enclosed by matching parentheses; and the third rule terminates the recursion.

3.2 Well-formed nested parentheses and square brackets

A second canonical example is two different kinds of matching nested parentheses, described by the productions:

S → SS
S → ()
S → (S)
S → []
S → [S]

with terminal symbols [ ] ( ) and nonterminal S.

The following sequence can be derived in that grammar:

([ [ [ ()() [ ][ ] ] ]([ ]) ])

However, there is no context-free grammar for generating all sequences of two different types of parentheses, each separately balanced disregarding the other, but where the two types need not nest inside one another, for example:

[(])

or

[ [ [ [(((( ] ] ] ]))))(([ ))(([ ))([ )( ])( ])( ])

3.3 A regular grammar

Every regular grammar is context-free, but not all context-free grammars are regular. The following context-free grammar, however, is also regular:

S → a
S → aS
S → bS

The terminals here are a and b, while the only nonterminal is S. The language described is all nonempty strings of a's and b's that end in a.

This grammar is regular: no rule has more than one nonterminal in its right-hand side, and each of these nonterminals is at the same end of the right-hand side.

Every regular grammar corresponds directly to a nondeterministic finite automaton, so we know that this is a regular language.

Using pipe symbols, the grammar above can be described more tersely as follows:

S → a | aS | bS
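The correspondence with finite automata and regular expressions can be illustrated in Python's re module: the pattern [ab]*a, read off from the grammar above (an assumption made for this sketch, not part of the original text), matches exactly the nonempty strings over {a, b} that end in a.

```python
# Sketch: the regular grammar S -> a | aS | bS describes nonempty
# strings over {a, b} ending in a; the regular expression [ab]*a
# performs the same membership test via a finite automaton.
import re

pattern = re.compile(r"[ab]*a")

def in_language(s):
    """Membership test for the language of S -> a | aS | bS."""
    return pattern.fullmatch(s) is not None

for s in ["a", "ba", "abab", "ababa", "b", ""]:
    print(s, in_language(s))
```

For instance, "ababa" is accepted while "abab" and the empty string are rejected, exactly as derivations from S would predict.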

3.4 Matching pairs

In a context-free grammar, we can pair up characters the way we do with brackets. The simplest example:

S → aSb
S → ab

This grammar generates the language { a^n b^n : n ≥ 1 }, which is not regular (according to the pumping lemma for regular languages).

The special character ε stands for the empty string. By changing the above grammar to

S → aSb | ε

we obtain a grammar generating the language { a^n b^n : n ≥ 0 } instead. This differs only in that it contains the empty string while the original grammar did not.

3.5 Algebraic expressions

Here is a context-free grammar for syntactically correct infix algebraic expressions in the variables x, y and z:

1. S → x
2. S → y
3. S → z
4. S → S + S
5. S → S - S
6. S → S * S
7. S → S / S
8. S → ( S )

This grammar can, for example, generate the string

( x + y ) * x - z * y / ( x + x )

as follows:

S (the start symbol)
→ S - S (by rule 5)
→ S * S - S (by rule 6, applied to the leftmost S)
→ S * S - S / S (by rule 7, applied to the rightmost S)
→ ( S ) * S - S / S (by rule 8, applied to the leftmost S)
→ ( S ) * S - S / ( S ) (by rule 8, applied to the rightmost S)
→ ( S + S ) * S - S / ( S ) (etc.)
→ ( S + S ) * S - S * S / ( S )
→ ( S + S ) * S - S * S / ( S + S )
→ ( x + S ) * S - S * S / ( S + S )
→ ( x + y ) * S - S * S / ( S + S )
→ ( x + y ) * x - S * S / ( S + S )
→ ( x + y ) * x - S * y / ( S + S )
→ ( x + y ) * x - S * y / ( x + S )
→ ( x + y ) * x - z * y / ( x + S )
→ ( x + y ) * x - z * y / ( x + x )

Note that many choices were made underway as to which rewrite was going to be performed next. These choices look quite arbitrary. As a matter of fact, they are, in the sense that the string finally generated is always the same. For example, the second and third rewrites

→ S * S - S (by rule 6, applied to the leftmost S)
→ S * S - S / S (by rule 7, applied to the rightmost S)

could be done in the opposite order:

→ S - S / S (by rule 7, applied to the rightmost S)
→ S * S - S / S (by rule 6, applied to the leftmost S)

Also, many choices were made on which rule to apply to each selected S. Changing the choices made, and not only the order they were made in, usually affects which terminal string comes out at the end.

Let's look at this in more detail. Consider the parse tree of this derivation:

[parse tree of ( x + y ) * x - z * y / ( x + x ), with the top-level rule S → S - S at the root]

Starting at the top, step by step, an S in the tree is expanded, until no more unexpanded S's (nonterminals) remain. Picking a different order of expansion will produce a different derivation, but the same parse tree. The parse tree will only change if we pick a different rule to apply at some position in the tree.

But can a different parse tree still produce the same terminal string, which is ( x + y ) * x - z * y / ( x + x ) in this case? Yes, for this particular grammar, this is possible. Grammars with this property are called ambiguous.

For example, x + y * z can be produced with these two different parse trees:

[two parse trees: one with root S * S, grouping the string as (x + y) * z; the other with root S + S, grouping it as x + (y * z)]

However, the language described by this grammar is not inherently ambiguous: an alternative, unambiguous grammar can be given for the language, for example:

T → x
T → y
T → z
S → S + T
S → S - T
S → S * T
S → S / T
T → ( S )
S → T

(once again picking S as the start symbol). This alternative grammar will produce x + y * z with a parse tree similar to the left one above, i.e. implicitly assuming the association (x + y) * z, which is not according to standard operator precedence. More elaborate, unambiguous context-free grammars can be constructed that produce parse trees that obey all desired operator precedence and associativity rules.

3.6 Further examples

3.6.1 Example 1

A context-free grammar for the language consisting of all strings over {a,b} containing an unequal number of a's and b's:

S → U | V
U → TaU | TaT | UaT
V → TbV | TbT | VbT
T → aTbT | bTaT | ε

Here, the nonterminal T can generate all strings with the same number of a's as b's, the nonterminal U generates all strings with more a's than b's, and the nonterminal V generates all strings with fewer a's than b's. Omitting the third alternative in the rules for U and V doesn't restrict the grammar's language.

3.6.2 Example 2

Another example of a non-regular language is { b^n a^m b^(2n) : n ≥ 0, m ≥ 0 }. It is context-free, as it can be generated by the following context-free grammar:

S → bSbb | A
A → aA | ε

3.6.3 Other examples

The formation rules for the terms and formulas of formal logic fit the definition of context-free grammar, except that the set of symbols may be infinite and there may be more than one start symbol.

3.7 Derivations and syntax trees

A derivation of a string for a grammar is a sequence of grammar rule applications that transform the start symbol into the string. A derivation proves that the string belongs to the grammar's language.

A derivation is fully determined by giving, for each step:

- the rule applied in that step
- the occurrence of its left-hand side to which it is applied

For clarity, the intermediate string is usually given as well. For instance, with the grammar:

(1) S → S + S
(2) S → 1
(3) S → a

the string

1 + 1 + a

can be derived with the derivation:

S → (rule 1 on the first S) S+S → (rule 1 on the second S) S+S+S → (rule 2 on the second S) S+1+S → (rule 3 on the third S) S+1+a → (rule 2 on the first S) 1+1+a

Often, a strategy is followed that deterministically determines the next nonterminal to rewrite:

- in a leftmost derivation, it is always the leftmost nonterminal;
- in a rightmost derivation, it is always the rightmost nonterminal.

Given such a strategy, a derivation is completely determined by the sequence of rules applied. For instance, the leftmost derivation

S → (rule 1 on the first S) S+S → (rule 2 on the first S) 1+S → (rule 1 on the first S) 1+S+S → (rule 2 on the first S) 1+1+S → (rule 3 on the first S) 1+1+a

can be summarized as

rule 1, rule 2, rule 1, rule 2, rule 3
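Replaying such a rule sequence is mechanical, since each step rewrites the leftmost nonterminal. A minimal Python sketch for the grammar above (the helper name is illustrative):

```python
# Replay a leftmost derivation for the grammar
# (1) S -> S + S   (2) S -> 1   (3) S -> a
# Each step rewrites the leftmost nonterminal with the chosen rule.
RULES = {1: ["S", "+", "S"], 2: ["1"], 3: ["a"]}

def leftmost_derivation(rule_sequence, start="S"):
    form = [start]
    steps = ["".join(form)]
    for r in rule_sequence:
        i = form.index("S")                    # leftmost nonterminal
        form = form[:i] + RULES[r] + form[i + 1:]
        steps.append("".join(form))
    return steps

print(leftmost_derivation([1, 2, 1, 2, 3]))
# -> ['S', 'S+S', '1+S', '1+S+S', '1+1+S', '1+1+a']
```

The summary "rule 1, rule 2, rule 1, rule 2, rule 3" thus reconstructs exactly the intermediate strings listed above.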

The distinction between leftmost derivation and rightmost derivation is important because in most parsers the transformation of the input is defined by giving a piece of code for every grammar rule that is executed whenever the rule is applied. Therefore, it is important to know whether the parser determines a leftmost or a rightmost derivation, because this determines the order in which the pieces of code will be executed. See LL parsers and LR parsers for examples.

A derivation also imposes in some sense a hierarchical structure on the string that is derived. For example, if the string "1 + 1 + a" is derived according to the leftmost derivation:

S → S + S (1)
→ 1 + S (2)
→ 1 + S + S (1)
→ 1 + 1 + S (2)
→ 1 + 1 + a (3)

the structure of the string would be:

{ { 1 }S + { { 1 }S + { a }S }S }S

where { ... }S indicates a substring recognized as belonging to S. This hierarchy can also be seen as a tree:

[parse tree with root S expanding to S '+' S; the left S yields '1', and the right S expands to S '+' S, yielding '1' and 'a']

This tree is called a parse tree or "concrete syntax tree" of the string, by contrast with the abstract syntax tree. In this case the presented leftmost and the rightmost derivations define the same parse tree; however, there is another (rightmost) derivation of the same string:

S → S + S (1)
→ S + a (3)
→ S + S + a (1)
→ S + 1 + a (2)
→ 1 + 1 + a (2)

and this defines the following parse tree:

[parse tree with root S expanding to S '+' S; the left S expands to S '+' S, yielding '1' and '1', and the right S yields 'a']

If, for certain strings in the language of the grammar, there is more than one parsing tree, then the grammar is said to be an ambiguous grammar. Such grammars are usually hard to parse because the parser cannot always decide which grammar rule it has to apply. Usually, ambiguity is a feature of the grammar, not the language, and an unambiguous grammar can be found that generates the same context-free language. However, there are certain languages that can only be generated by ambiguous grammars; such languages are called inherently ambiguous languages.

4 Normal forms

Every context-free grammar that does not generate the empty string can be transformed into one in which there is no ε-production (that is, a rule that has the empty string as a product). If a grammar does generate the empty string, it will be necessary to include the rule S → ε, but there need be no other ε-rule. Every context-free grammar with no ε-production has an equivalent grammar in Chomsky normal form or Greibach normal form. "Equivalent" here means that the two grammars generate the same language.

The especially simple form of production rules in Chomsky normal form grammars has both theoretical and practical implications. For instance, given a context-free grammar, one can use the Chomsky normal form to construct a polynomial-time algorithm that decides whether a given string is in the language represented by that grammar or not (the CYK algorithm).

5 Closure properties

Context-free languages are closed under union, concatenation, Kleene star,[12] substitution (in particular homomorphism),[13] inverse homomorphism,[14] and intersection with a regular language.[15] They are not closed under general intersection (hence neither under complementation) and set difference.[16]

6 Decidable problems

There are algorithms to decide whether a context-free language is empty, and whether it is finite.[17]

7 Undecidable problems

Some questions that are undecidable for wider classes of grammars become decidable for context-free grammars; e.g. the emptiness problem (whether the grammar generates any terminal strings at all) is undecidable for context-sensitive grammars, but decidable for context-free grammars.

However, many problems are undecidable even for context-free grammars. Examples are:

7.1 Universality

Given a CFG, does it generate the language of all strings over the alphabet of terminal symbols used in its rules?[18][19]

A reduction can be demonstrated to this problem from the well-known undecidable problem of determining whether a Turing machine accepts a particular input (the halting problem). The reduction uses the concept of a computation history, a string describing an entire computation of a Turing machine. A CFG can be constructed that generates all strings that are not accepting computation histories for a particular Turing machine on a particular input, and thus it will accept all strings only if the machine doesn't accept that input.

7.2 Language equality

Given two CFGs, do they generate the same language?[19][20]

The undecidability of this problem is a direct consequence of the previous: it is impossible to even decide whether a CFG is equivalent to the trivial CFG defining the language of all strings.

7.3 Language inclusion

Given two CFGs, can the first one generate all strings that the second one can generate?[19][20]

If this problem were decidable, then language equality could be decided too: two CFGs G1 and G2 generate the same language if L(G1) is a subset of L(G2) and L(G2) is a subset of L(G1).

7.4 Being in a lower or higher level of the Chomsky hierarchy

Using Greibach's theorem, it can be shown that the two following problems are undecidable:

- Given a context-sensitive grammar, does it describe a context-free language?
- Given a context-free grammar, does it describe a regular language?[19][20]

7.5 Grammar ambiguity

Given a CFG, is it ambiguous?

The undecidability of this problem follows from the fact that if an algorithm to determine ambiguity existed, the Post correspondence problem could be decided, which is known to be undecidable.

7.6 Language disjointness

Given two CFGs, is there any string derivable from both grammars?

If this problem were decidable, the undecidable Post correspondence problem could be decided too: given strings α1, …, αN, β1, …, βN over some alphabet {a1, …, ak}, let grammar G consist of the rule

S → a1 S α1^rev | … | aN S αN^rev | b

where αi^rev denotes the reversed string αi and b doesn't occur among the ai; and let grammar G' consist of the rule

T → a1 T β1^rev | … | aN T βN^rev | b

Then the Post problem given by α1, …, αN, β1, …, βN has a solution if and only if L(G) and L(G') share a derivable string.

8 Extensions

An obvious way to extend the context-free grammar formalism is to allow nonterminals to have arguments, the values of which are passed along within the rules. This allows natural language features such as agreement and reference, and programming language analogs such as the correct use and definition of identifiers, to be expressed in a natural way. E.g. we can now easily express that in English sentences, the subject and verb must agree in number. In computer science, examples of this approach include affix grammars, attribute grammars, indexed grammars, and Van Wijngaarden two-level grammars. Similar extensions exist in linguistics.

An extended context-free grammar (or regular right part grammar) is one in which the right-hand side of the production rules is allowed to be a regular expression over the grammar's terminals and nonterminals. Extended context-free grammars describe exactly the context-free languages.[21]

Another extension is to allow additional terminal symbols to appear at the left-hand side of rules, constraining their application. This produces the formalism of context-sensitive grammars.
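That extended context-free grammars add no expressive power can be illustrated for the simplest regular operator, repetition: a starred symbol X* on a right-hand side can be replaced by a fresh nonterminal with two ordinary rules. The sketch below (the rule representation and helper name are illustrative assumptions, not from the original text) performs this one transformation:

```python
# Sketch: eliminate Kleene stars from extended-CFG rules.
# A rule is (lhs, rhs) with rhs a list of symbols; a trailing "*"
# marks a starred symbol. Each star is replaced by a fresh
# nonterminal Rep with rules  Rep -> X Rep  and  Rep -> ε
# (the empty list), preserving the generated language.
from itertools import count

_fresh = count()

def eliminate_stars(rules):
    plain = []
    for lhs, rhs in rules:
        new_rhs = []
        for sym in rhs:
            if sym.endswith("*"):
                rep = f"Rep{next(_fresh)}"            # fresh nonterminal
                plain.append((rep, [sym[:-1], rep]))  # Rep -> X Rep
                plain.append((rep, []))               # Rep -> ε
                new_rhs.append(rep)
            else:
                new_rhs.append(sym)
        plain.append((lhs, new_rhs))
    return plain

# S -> a* b   becomes   Rep -> a Rep,  Rep -> ε,  S -> Rep b
print(eliminate_stars([("S", ["a*", "b"])]))
```

Handling a full regular expression on the right-hand side would additionally require cases for alternation and grouping, but each reduces to ordinary productions in the same way.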

9 Subclasses

There are a number of important subclasses of the context-free grammars:

- LR(k) grammars (also known as deterministic context-free grammars) allow parsing (string recognition) with deterministic pushdown automata (PDA), but they can only describe deterministic context-free languages.
- Simple LR and Look-Ahead LR grammars are subclasses that allow further simplification of parsing. SLR and LALR are recognized using the same PDA as LR, but with simpler tables, in most cases.
- LL(k) and LL(*) grammars allow parsing by direct construction of a leftmost derivation as described above, and describe even fewer languages.
- Simple grammars are a subclass of the LL(1) grammars, mostly interesting for the theoretical property that language equality of simple grammars is decidable, while language inclusion is not.
- Bracketed grammars have the property that the terminal symbols are divided into left and right bracket pairs that always match up in rules.
- Linear grammars have no rules with more than one nonterminal in the right-hand side.
- Regular grammars are a subclass of the linear grammars and describe the regular languages, i.e. they correspond to finite automata and regular expressions.

LR parsing extends LL parsing to support a larger range of grammars; in turn, generalized LR parsing extends LR parsing to support arbitrary context-free grammars. On LL grammars and LR grammars it essentially performs LL parsing and LR parsing, respectively, while on nondeterministic grammars it is as efficient as can be expected. Although GLR parsing was developed in the 1980s, many new language definitions and parser generators continue to be based on LL, LALR or LR parsing up to the present day.

10 Linguistic applications

Chomsky initially hoped to overcome the limitations of context-free grammars by adding transformation rules.[4] Such rules are another standard device in traditional linguistics; e.g. passivization in English. Much of generative grammar has been devoted to finding ways of refining the descriptive mechanisms of phrase-structure grammar and transformation rules such that exactly the kinds of things can be expressed that natural language actually allows. Allowing arbitrary transformations doesn't meet that goal: they are much too powerful, being Turing complete unless significant restrictions are added (e.g. no transformations that introduce and then rewrite symbols in a context-free fashion).

Chomsky's general position regarding the non-context-freeness of natural language has held up since then,[22] although his specific examples regarding the inadequacy of context-free grammars in terms of their weak generative capacity were later disproved.[23] Gerald Gazdar and Geoffrey Pullum have argued that despite a few non-context-free constructions in natural language (such as cross-serial dependencies in Swiss German[22] and reduplication in Bambara[24]), the vast majority of forms in natural language are indeed context-free.[23]

11 See also

- Parsing expression grammar
- Stochastic context-free grammar
- Algorithms for context-free grammar generation
- Pumping lemma for context-free languages

11.1 Parsing algorithms

- CYK algorithm
- GLR parser
- LL parser
- Earley algorithm

12 Notes

[1] Stephen Scheinberg, "Note on the Boolean Properties of Context Free Languages", Information and Control, 3, 372–375 (1960).
[2] Introduction to Automata Theory, Languages, and Computation, John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, Addison Wesley, 2001, p. 191.
[3] Hopcroft & Ullman (1979), p. 106.
[4] Chomsky, Noam (Sep 1956), "Three models for the description of language" (PDF), Information Theory, IEEE Transactions, 2 (3): 113–124, doi:10.1109/TIT.1956.1056813, archived from the original (PDF) on 2013-10-18, retrieved 2007-06-18.
[5] The notation here is that of Sipser (1997), p. 94. Hopcroft & Ullman (1979) (p. 79) define context-free grammars as 4-tuples in the same way, but with different variable names.
[6] Hopcroft & Ullman (1979), pp. 90–92.
[7] Nijholt, Anton (1980), Context-Free Grammars: Covers, Normal Forms, and Parsing, Lecture Notes in Computer Science, 93, Springer, p. 8, ISBN 3-540-10245-0, MR 590047.
[8] Hopcroft & Ullman (1979), p. 88, Lemma 4.1.
[9] Hopcroft & Ullman (1979), p. 89, Lemma 4.2.
[10] This is a consequence of the unit-production elimination theorem in Hopcroft & Ullman (1979), p. 91, Theorem 4.4.
[11] Hopcroft & Ullman (1979), p. 91, Theorem 4.4.
[12] Hopcroft & Ullman (1979), p. 131, Theorem 6.1.
[13] Hopcroft & Ullman (1979), pp. 131–132, Theorem 6.2.
[14] Hopcroft & Ullman (1979), pp. 132–134, Theorem 6.3.
[15] Hopcroft & Ullman (1979), pp. 135–136, Theorem 6.5.
[16] Hopcroft & Ullman (1979), pp. 134–135, Theorem 6.4.
[17] Hopcroft & Ullman (1979), pp. 137–138, Theorem 6.6.
[18] Sipser (1997), Theorem 5.10, p. 181.
[19] Hopcroft & Ullman (1979), p. 281.
[20] Hazewinkel, Michiel (1994), Encyclopaedia of Mathematics: An Updated and Annotated Translation of the Soviet Mathematical Encyclopaedia, Springer, Vol. IV, p. 56, ISBN 978-1-55608-003-6.
[21] Norvell, Theodore. "A Short Introduction to Regular Expressions and Context-Free Grammars" (PDF). p. 4. Retrieved August 24, 2012.
[22] Shieber, Stuart (1985), "Evidence against the context-freeness of natural language" (PDF), Linguistics and Philosophy, 8 (3): 333–343, doi:10.1007/BF00630917.
[23] Pullum, Geoffrey K.; Gerald Gazdar (1982), "Natural languages and context-free languages", Linguistics and Philosophy, 4 (4): 471–504, doi:10.1007/BF00360802.
[24] Culy, Christopher (1985), "The Complexity of the Vocabulary of Bambara", Linguistics and Philosophy, 8 (3): 345–351, doi:10.1007/BF00630918.

13 References

- Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Addison-Wesley. Chapter 4: Context-Free Grammars, pp. 77–106; Chapter 6: Properties of Context-Free Languages, pp. 125–137.
- Sipser, Michael (1997), Introduction to the Theory of Computation, PWS Publishing, ISBN 0-534-94728-X. Chapter 2: Context-Free Grammars, pp. 91–122; Section 4.1.2: Decidable problems concerning context-free languages, pp. 156–159; Section 5.1.1: Reductions via computation histories, pp. 176–183.
- Berstel, J.; Boasson, L. (1990), "Context-Free Languages", in Jan van Leeuwen, ed., Handbook of Theoretical Computer Science, vol. B, Elsevier, pp. 59–102.

14 External links

- Computer programmers may find the Stack Exchange answer to be useful.
- Non-computer programmers will find more academic introductory materials to be enlightening.

15 Text and image sources, contributors, and licenses

15.1 Text

Context-free grammar. Source: https://en.wikipedia.org/wiki/Context-free_grammar?oldid=747787908

15.2 Images

File:Question_book-new.svg. Source: https://upload.wikimedia.org/wikipedia/en/9/99/Question_book-new.svg. License: Cc-by-sa-3.0. Original artist: Tkgd2007

15.3 Content license

Creative Commons Attribution-Share Alike 3.0