In formal language theory, a context-free grammar is said to be in Chomsky normal form if all of its production rules are of the form:

$A \to BC$ or $A \to \alpha$ or $S \to \varepsilon$

where A, B and C are nonterminal symbols, $\alpha$ is a terminal symbol (a symbol that represents a constant value), S is the start symbol, and $\varepsilon$ is the empty string. Also, neither B nor C may be the start symbol.

Every grammar in Chomsky normal form is context-free, and conversely, every context-free grammar can be transformed into an equivalent one which is in Chomsky normal form. Several algorithms for performing such a transformation are known; they are described in most textbooks on automata theory, such as (Hopcroft and Ullman, 1979). As pointed out by Lange and Leiß, the drawback of these transformations is that they can lead to an undesirable bloat in grammar size. The size of a grammar is the sum of the sizes of its production rules, where the size of a rule is one plus the length of its right-hand side. Using $|G|$ to denote the size of the original grammar $G$, the size blow-up in the worst case may range from $|G|^2$ to $2^{2|G|}$, depending on the transformation algorithm used (Lange and Leiß, 2009).

Another way to define Chomsky normal form (e.g., Hopcroft and Ullman 1979, and Hopcroft et al. 2006) is: a formal grammar is in Chomsky reduced form if all of its production rules are of the form $A \to BC$ or $A \to \alpha$

where A, B and C are nonterminal symbols, and $\alpha$ is a terminal symbol. When using this definition, B or C may be the start symbol. Only those context-free grammars which do not generate the empty string can be transformed into Chomsky reduced form.

Converting a grammar to Chomsky Normal Form
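Before converting, it can be useful to check whether a grammar already meets the first definition above and to measure its size. The following is a minimal sketch rather than a reference implementation. It assumes a grammar is given as a collection of (left-hand side, right-hand side) pairs, with each right-hand side stored as a tuple of symbols; the names grammar_size and is_cnf are illustrative and do not come from any library.

```python
def grammar_size(rules):
    """Size of a grammar: sum over all rules of 1 + len(right-hand side)."""
    return sum(1 + len(rhs) for _lhs, rhs in rules)


def is_cnf(rules, nonterminals, start):
    """Check the forms A -> BC, A -> a, S -> epsilon (B, C not the start)."""
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        if len(rhs) == 0:
            # Only the start symbol may derive the empty string.
            if lhs != start:
                return False
        elif len(rhs) == 1:
            # A single symbol on the right must be a terminal.
            if rhs[0] in nonterminals:
                return False
        elif len(rhs) == 2:
            # Two symbols must both be nonterminals other than the start.
            if any(s not in nonterminals or s == start for s in rhs):
                return False
        else:
            return False
    return True


# Hypothetical example grammar: S -> AB, A -> a, B -> b
example = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
print(grammar_size(example))                   # 7
print(is_cnf(example, {"S", "A", "B"}, "S"))   # True
```

Treating the right-hand side as a tuple keeps the empty string unambiguous: it is simply the empty tuple.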

1. Introduce S0
Introduce a new start variable, $S_0$, and a new rule $S_0 \to S$, where S is the previous start variable.
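As a small illustration of this step, the sketch below adds the new start variable to a grammar stored as a set of (left-hand side, right-hand side-tuple) pairs. The function name add_start_symbol and the default name "S0" are assumptions made for the example; the check for a name clash is an extra safeguard not mentioned in the text.

```python
def add_start_symbol(rules, start, new_start="S0"):
    """Return (new_rules, new_start) with the extra rule new_start -> start."""
    # Refuse to reuse a symbol that already occurs in the grammar.
    for lhs, rhs in rules:
        if lhs == new_start or new_start in rhs:
            raise ValueError(f"{new_start!r} is already used in the grammar")
    return set(rules) | {(new_start, (start,))}, new_start


# Hypothetical grammar: S -> aS | epsilon
grammar = {("S", ("a", "S")), ("S", ())}
with_start, s0 = add_start_symbol(grammar, "S")
print(s0)                            # S0
print(("S0", ("S",)) in with_start)  # True
```

After this step the start variable never appears on a right-hand side, which is what the definition requires of B and C.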

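2. Eliminate all $\varepsilon$ rules

$\varepsilon$ rules are rules of the form $A \to \varepsilon$, where $A \neq S_0$ and $A \in V$, where V is the CFG's variable alphabet.

Remove every rule with $\varepsilon$ on its right hand side (RHS). For each rule with A in its RHS, add a set of new rules consisting of the different possible combinations of A replaced or not replaced with $\varepsilon$. If a rule $B \to A$ has A as a singleton on its RHS, add a new rule $R = B \to \varepsilon$ unless R has already been removed through this process. For example, examine the following grammar G:

$S \to AbA \mid B$
$B \to b \mid c$
$A \to \varepsilon$

G has one $\varepsilon$ rule. When the rule $A \to \varepsilon$ is removed, we get the following:

$S \to AbA \mid Ab \mid bA \mid b \mid B$
$B \to b \mid c$

Notice that we have to account for all possibilities of $A \to \varepsilon$, and so we actually end up adding 3 rules.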
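The step above can be sketched in code. Note that this sketch does not remove the $\varepsilon$ rules one at a time as described; it uses the common nullable-set formulation instead, which first finds every variable that can derive the empty string and then expands each rule once. The grammar encoding (a set of pairs with tuple right-hand sides) and the name eliminate_epsilon are assumptions made for illustration.

```python
from itertools import product


def eliminate_epsilon(rules, start):
    """Remove epsilon rules, keeping one only for the start symbol."""
    # A variable is nullable if some rule for it has an all-nullable RHS
    # (an empty RHS counts, since all() of nothing is True).
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True

    new_rules = set()
    for lhs, rhs in rules:
        # Each nullable symbol is either kept or dropped.
        options = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
        for choice in product(*options):
            new_rhs = tuple(sym for part in choice for sym in part)
            if new_rhs or lhs == start:
                new_rules.add((lhs, new_rhs))
    return new_rules


# The grammar G from the example: S -> AbA | B, B -> b | c, A -> epsilon
g = {
    ("S", ("A", "b", "A")), ("S", ("B",)),
    ("B", ("b",)), ("B", ("c",)),
    ("A", ()),
}
result = eliminate_epsilon(g, "S")
for lhs, rhs in sorted(result):
    print(lhs, "->", " ".join(rhs))
```

On G this yields the seven rules shown in the example: the original S and B rules plus the three new alternatives Ab, bA and b.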

3. Eliminate all unit rules

After all the $\varepsilon$ rules have been removed, you can begin removing unit rules, or rules whose RHS contains one variable and no terminals (which is inconsistent with CNF).

To remove a unit rule $A \to B$: for every rule $B \to U$, where U is any right-hand side, add the rule $A \to U$ unless this is a unit rule which has already been removed.
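One way to sketch unit-rule removal is shown below. Instead of removing unit rules one by one, it computes for each variable the set of variables reachable through unit rules and then copies over their non-unit rules; this is a standard reformulation of the same idea, not necessarily the procedure the text has in mind. The name eliminate_unit_rules and the grammar encoding are assumptions.

```python
def eliminate_unit_rules(rules, nonterminals):
    """Replace unit rules A -> B by the non-unit rules of B."""
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in nonterminals

    # reach[a] holds every variable derivable from a using unit rules only.
    reach = {a: {a} for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if is_unit(rhs):
                for a in nonterminals:
                    if lhs in reach[a] and rhs[0] not in reach[a]:
                        reach[a].add(rhs[0])
                        changed = True

    # Give each variable the non-unit rules of everything it reaches.
    new_rules = set()
    for a in nonterminals:
        for lhs, rhs in rules:
            if lhs in reach[a] and not is_unit(rhs):
                new_rules.add((a, rhs))
    return new_rules


# Hypothetical input: S -> AbA | Ab | bA | b | B, B -> b | c
g = {
    ("S", ("A", "b", "A")), ("S", ("A", "b")), ("S", ("b", "A")),
    ("S", ("b",)), ("S", ("B",)),
    ("B", ("b",)), ("B", ("c",)),
}
no_units = eliminate_unit_rules(g, {"S", "A", "B"})
for lhs, rhs in sorted(no_units):
    print(lhs, "->", " ".join(rhs))
```

Here the unit rule S -> B disappears and S gains the rule S -> c; the rule S -> b was already present.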

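4. Clean up remaining rules that are not in Chomsky normal form.

Replace $A \to u_1 u_2 \ldots u_k$, $k \geq 3$, $u_i \in V \cup \Sigma$ (where $\Sigma$ is the terminal alphabet), with $A \to u_1 A_1$, $A_1 \to u_2 A_2$, ..., $A_{k-2} \to u_{k-1} u_k$, where $A_i$ are new variables.

If $u_i \in \Sigma$, replace $u_i$ in above rules with some new variable $v_i$ and add rule $v_i \to u_i$. The same replacement applies to terminals in rules whose right-hand side already has exactly two symbols.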
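The clean-up step can be sketched as follows. The code assumes that $\varepsilon$ rules and unit rules have already been removed, and that the grammar is a set of (left-hand side, right-hand side-tuple) pairs. The function name clean_up and the generated names with prefixes "T" and "X" are hypothetical choices for this example.

```python
def clean_up(rules, nonterminals):
    """Split long rules into binary ones and lift terminals out of them."""
    # Every symbol already in use, so fresh names cannot clash.
    used = set(nonterminals) | {s for _lhs, rhs in rules for s in rhs}
    new_rules = set()
    terminal_var = {}
    counter = 0

    def fresh(prefix):
        nonlocal counter
        while True:
            counter += 1
            name = f"{prefix}{counter}"
            if name not in used:
                used.add(name)
                return name

    def as_variable(sym):
        # Terminals inside long rules get their own variable v with v -> sym.
        if sym in nonterminals:
            return sym
        if sym not in terminal_var:
            terminal_var[sym] = fresh("T")
            new_rules.add((terminal_var[sym], (sym,)))
        return terminal_var[sym]

    for lhs, rhs in sorted(rules):  # sorted only to make names reproducible
        if len(rhs) < 2:
            new_rules.add((lhs, rhs))
            continue
        symbols = [as_variable(s) for s in rhs]
        head = lhs
        while len(symbols) > 2:
            rest = fresh("X")
            new_rules.add((head, (symbols[0], rest)))
            head, symbols = rest, symbols[1:]
        new_rules.add((head, tuple(symbols)))
    return new_rules


# Hypothetical input: S -> AbA, A -> a
g = {("S", ("A", "b", "A")), ("A", ("a",))}
cnf = clean_up(g, {"S", "A"})
for lhs, rhs in sorted(cnf):
    print(lhs, "->", " ".join(rhs))
```

For this input the result has four rules: A -> a, a new rule T1 -> b for the terminal, and the long rule split into S -> A X2 and X2 -> T1 A.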