UNIT- II
CONTEXT-FREE LANGUAGES
AND
NORMAL FORMS
SYLLABUS
• Context-free grammars
• More examples
• Union, concatenations, and *’s of CFLs
• Derivation trees and ambiguity
• Unambiguous CFG for algebraic expressions
• Normal Forms – CNF – GNF
Need for CFG
• C++ Program
What is Context-free grammar?
• It is defined as a formal notation or a model for
describing natural language or programming
language.
• Application:
– Compilers – Parser Phase
Example-Production Rule
Formal Definition of CFG
• Definition − A context-free grammar (CFG) consisting of a
finite set of grammar rules is a quadruple
G= (N, T, P, S)
Where,
– N is a set of non-terminal symbols.
– T is a set of terminals where N ∩ T = NULL.
– P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of the
production rule P does have any right context or left context.
– S is the start symbol.
Example
The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.
The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε
Construction of CFG
1. Construct the CFG for the language having any
number of a's over the set ∑= {a}
RE=a* Derive a string "aaaaaa", we can
start with start symbols.
P2: S => aS
=>aaS rule 1
Q → aQ =>aaaS rule 1
=>aaaaS rule 1
Q→ε =>aaaaaS rule 1
=>aaaaaaS rule 1
G=({S},{a, ε },P2,Q} =>aaaaaaε rule 2
=>aaaaaa
N={S}
T={a, ε }
S= Start Symbol
2. Construct a CFG for a language
L = {wcwR | where w € (a, b)*}.
Strings= {aacaa, bcb, abcba, bacab, abbcbba,...}
P1: Derive: abbcbba
Derivation:
S → aSa rule 1
S =>aSa
S → bSb rule 2 => abSba from rule 2
=> abbSbba from rule 2
S→c => abbcbba from rule 3
G= ( {S} ,{a,b,c} , P, {S}) =>abbcbba
[Link] a CFG for the language L = anb2n
where n>=1.
String = {abb, aabbbb, aaabbbbbb....}.
P:
S → aSbb | abb
Tuples: G =({S},{a,b},P, {S})
Derive: aabbbb
Derivation
S =>aSbb
=>aabbbb
4. Design a CFG for a language accepting balanced
parenthesis. Derive: ({[]})
Derivation:
T={ (,),{,},[,] } S=> ( S)
=>( { S } )
=>( { [ s ] } )
w= { (),{},[],(()),({[]}), …….} =>( { [ ] } )
P:
S->(S) /[S]/{S} / SS
S-> ε
5. Design a CFG for a language accepting Arithmetic
Expression Drive- a+a-a
Derivation:
E=>E+E
T={ +, -, *, /, a } =>E+E-E
=>a+E-E
P: =>a+a-E
=>a+a-a
E->E+E/E-E/E*E/ E/E / (E) / a
Given a CFG, find the Language generated by G
i. G=(N,T,P,S) where N={S}, T={a,b} P={S→aSb, S→ab}
S=>ab
S=>aSb
=>aaSbb
=>aaabbb
S=>aSb
=>aabb
W={ab,aabb,aaabbb,.....}
Ans: L={ an bn / n>=1 }
ii. G=(P,{Ꜫ,0,1},P,P}, P:P→0 | 1 |Ꜫ |0P0 |1P1
P=>0
P=>1
P=>Ꜫ
P=>0P0
=>000
P =>1P1
=>10P01
=>101P101
=>1010101
W={Ꜫ,0,1,010,101,11011,…}
Ans: L={Language of Palindromes}
Context Free Language and Regular Grammar
• CFL:
– A language L is said to be a CFL if there is a CGF , G such that
L=L(G).
– Let G =(N,T,P,S ) be a CFG, The Language Generated by G is
• Regular Grammar:
– A regular grammar is a mathematical object, G, with four
components,
G = (N, Σ, P, S),
where P is in any of the following form
• A → aB, A → a , A → ε
for A, B ∈ N, a ∈ Σ, and ε the empty string
Union, Concatenation and * of CFL’s
Statement: The context-free languages are closed under union, concatenation
and Kleene closure.
Theorem: IF L1 and L2 are CFL, the languages L1UL2, L1.L2 and L1* are also
CFL’s.
• Proof: If we start with CFG’s
G1=(N1,T1,P1,s1), G2=(N2,T2,p2,S2)
G1, and G2 generates L1 and L2 respectively,
Case 1: L(G) ⊆ L1UL2
A Grammar, Gu=(N, T, P,S) generating L1UL2
N=N1UN2U{S}
where, S is the new symbol not in N1 or N2
Then , P= P1 UP2U{S->S1|S2}
Hence, if x ∈ L(G1) then
And we start the derivation with S->S1
Similarly, if x ∈L(G2) then
and we start the derivation with S->S2
Therefore, L(G) ⊆ L1UL2
Case 2: L(G) ⊆ L1.L2
A Grammar, Gu=(N, T, P,S) generating L1UL2
N=N1UN2U{S}
where, S is the new symbol not in N1 or N2
Then , P= P1 UP2U{S->S1.S2}
Hence, if x ∈ L1L2 then x=x12 where x1 ∈L1 and x2 ∈L2
And we start the derivation with
S=>S1S2=>*x1S2 =>*x1x2=x
Hence L(G) ⊆ L1.L2
Case 3: L(G*) ⊆ L1*
A grammar G=(N,T,P,S) generated by L1*
Where,
N=N1 U {S}
P = P1U{ S->S1S|ε}
Hence L(G*) ⊆ L1*
Hence:
The context-free languages are closed under union, concatenation
CFG for RE
Obtain the CFG for the RE (011+1)*.(01)*
(011+1) :
A->011|1
(011+1)* :
B->AB |ε
(01) :
C->01
(01)* :
D->CD| ε
(011+1)*.(01)*
S->BD
B->AB / ε
A->011/1
D->CD/ ε
C->01
Derivation
Process of deriving a string using the grammar
Types :
Left Most Derivation (LMD)
Right Most Derivation (RMD)
Example :
E->E+E|E-E|E*E|E/E|(E)|id
String : id+id*id
LMD
Derivation trees and Ambiguity
• Derivation tree is a graphical representation for
the derivation.
• Called as Parse Tree
Properties:
• The root node is always a node indicating start
symbols.
• The leaf node is always terminal nodes.
• The interior nodes are always the non-terminal
nodes.
Example
1. Construct derivation tree for the following Production
Derivation Tree
P: E = E + E
E=E*E
E=a|b|c
String= a * b + c
E=>E+E
=>E*E+E
=>a*E+E
=>a*b+E
=> a*b+c
2. Draw a derivation tree for the string "bab" from
the CFG given by
P: S → bSb | a | b
Derivation Tree
String : bbabb
S=>bSb
=>bbSbb
=>bbabb
3. Draw a Parse Tree for the string " aabbabba " from
the CFG given by
P:S → aB | bA Derivation tree
A → a | aS | bAA
B → b | bS | aBB
String=aabbabba
S=>aB
=>aaBB
=>aabSB
=>aabbAB
=>aabbaB
=>aabbabS
=>aabbabbA
=>aabbabba
Parse Tree
• Yield of a Tree:
– It is the string obtained by reading the leaf's from left
to right without considering null symbol.
• Sentinel Form : Every step in the derivation
Ambiguity in Grammar
• A grammar is said to be ambiguous if there
exists more than one LMD or more than one
RMD.
• More than one parse tree for the given input
string.
• If the grammar is not ambiguous, then it is
called unambiguous.
• Drawback:
– If the grammar has ambiguity, then it is not good
for compiler construction.
Example
1. Check the given Grammar is ambiguous or not
P: E → I
E→E+E DT-1 DT-2
E→E*E
E → (E)
I → ε | 0 | 1 | 2 | ... | 9
String: 3 * 2 + 5
Hence G is ambiguous
2. Check whether the given grammar G is ambiguous
or not.
• P: S → aSb | SS DT1 DT2
S→ε
• String =aabb
Hence G is Ambiguous Grammar
Unambiguous Grammar
• A grammar can be unambiguous if the
grammar does not contain ambiguity.
Example
• P: S → S + A
A→A*B|B
B→C^B|C
C→a
Chomsky's Normal Form (CNF)
• A CFG is in CNF if all production rules satisfy
one of the following conditions:
– A non-terminal generating two non-terminals.
example, S → AB.
– A non-terminal generating a terminal. For
example, S → a.
Example G:
G1 = {S → AB, S → c, A → a, B → b} --CNF
G2 = {S → aA, A → a, B → c} ----- not in CNF
Example
1. G= {S → aA, A → a, B → c} Convert to CNF
Solution:
Step 1: Eliminate start symbol from the RHS.
S1 → S Where S1 is the new start symbol.
Step 2: In the grammar, remove the null, unit and
useless productions.
No Null Production
Step 3: Eliminate terminals from the RHS of the
production if they exist with other non-terminals or
terminals.
For example, production
S → aA can be decomposed as:
S → RA
R→a
Step 4: Eliminate RHS with more than two non-terminals.
For example,
S → ASB
can be decomposed as:
• S → RS
• R → AS
1. Convert the given CFG to CNF. Consider the given grammar G1:
S → a | aA | B
A → aBB | ε
B → Aa | b
Solution:
Step 1: We will create a new production S1 → S, as the start symbol S appears on
the RHS.
The grammar will be:
S1 → S
S → a | aA | B
A → aBB | ε
B → Aa | b
Step 2: Removal of null productions. As grammar G1 contains A → ε null
production
S1 → S
S → a | aA | B
A → aBB
B → Aa | b | a
Now, as grammar G1 contains Unit production S → B, its removal
yield:
S1 → S
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Also remove the unit production S1 → S, its removal from the
grammar yields:
S1 → a | aA | Aa | b
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
• Step 3: In the production rule
S0 → aA | Aa,
S → aA | Aa,
A → aBB
B → Aa,
terminal a exists on RHS with non-terminals. So we will replace
terminal a with X:
S0 → a | XA | AX | b
S → a | XA | AX | b
A → XBB
B → AX | b | a
X→a
S0 → a | XA | AX | b
S → a | XA | AX | b
A → XBB
B → AX | b | a
X→a
Step 4: In the production rule
A → XBB,
RHS has more than two symbols, removing it from grammar yield:
S0 → a | XA | AX | b
S → a | XA | AX | b
A → RB
B → AX | b | a
X→a
R → XB
Hence the given grammar is converted to CNF
Greibach Normal Form (GNF)
• A CFG(context free grammar) is in GNF(Greibach
normal form) if all the production rules satisfy one of
the following conditions:
– A non-terminal generating a terminal.
For example, A → a.
– A non-terminal generating a terminal which is
followed by any number of non-terminals.
For example, S → aASB.
Example :
G1 = {S → aAB | aB, A → aA| a, B → bB | b} --GNF
G2 = {S → aAB | aB, A → aA | ε, B → bB | ε} --not GNF
Steps for converting CFG into GNF
Step 1: Convert the grammar into CNF.
Step 2: If the grammar exists left recursion,
eliminate it.
Step 3: In the grammar, convert the given
production rule into GNF form.
1. Convert the given Grammar to GNF
S→ XB | AA
A → a | SA
B→b
X→a
– As the given grammar G is already in CNF and there is
no left recursion,
– Skip step 1 and step 2 and directly go to step 3.
S→ XB | AA
A → a | SA
B→b
X→a
Step 3: The production rule A → SA is not in GNF,
S → XB | AA
A → a | XBA | AAA
B→b
X→a
The production rule S → XB and B → XBA is not in GNF
S → aB | AA
A → a | aBA | AAA
B→b
X→a
• Now we will remove left recursion (A → AAA),
we get:
S → aB | AA
A → aC | aBAC
C → AAC | ε
B→b
X→a
• Now we will remove null production C → ε, we get:
S → aB | AA
A → aC | aBAC | a | aBA
C → AAC | AA
B→b
X→a
• The production rule S → AA is not in GNF,
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → AAC
C → aCA | aBACA | aA | aBAA
B→b
X→a
• The production rule C → AAC is not in GNF,
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → aCAC | aBACAC | aAC | aBAAC
C → aCA | aBACA | aA | aBAA
B→b
X→a
Hence the given grammar is in GNF