Predictive Parser
{ Grammar and Algorithm }
LL(1) Grammar
• Predictive parsers can be constructed for
a class of grammars called LL(1).
• L->Left
L->Leftmost derivation
1->One input symbol at each step
• No left recursive or ambiguous grammar
can be LL(1)
Conditions of LL(1)
Grammar G is called LL(1) if and only if
whenever,
If A->α|β are two distinct productions of G,
the following conditions hold :-
1. (a) FIRST(α) , FIRST(β) must be disjoint.
This is to be able to deterministically guess
the production.
(b) At most one of the strings α or β can
derive ε (Since FIRST(α), FIRST(β) are
disjoint.
2. If α -> ε then FIRST(β) and FOLLOW(A)
must be disjoint
Algorithm for construction of parsing table
INPUT :- Grammar G
OUTPUT:- Parsing table M
For each production A -> α , do the following :
1. For each terminal ‘a’ in FIRST(A), add
A-> α to M[A,α].
2. If ε is in FIRST(α) then for each terminal b
in FOLLOW(A). A -> α to M[A,b]. If b is $
then also add A -> α to M[A,$].
3. If there is no production in M[A,a] , then
set M[A,a] to error.
Example of LL(1) grammar
E -> TE’
E -> +TE’|ε
T -> FT’
T’ -> *FT’|ε
F -> (E)|id
Generated Parser Table
For String id + id * id
Non INPUT
Terminal SYMBOLS
id + * ( ) $
E E -> TE’ E -> TE’
E’ E -> +TE’ E’ -> ε E’ -> ε
T T -> FT’ T -> FT’
T’ T’ -> ε T’ -> *FT’ T’ -> ε T’ -> ε
F F -> id F -> (E)
Table driven predictive parsing
• A predictive parser can be built by maintaining a
stack explicitly.
• The table driven parser has an input buffer, stack
containing sequence of grammar symbols, parsing
table and an output stream.
• The input buffer contains the string to be parsed
followed by $.
• Initially the stack contains start symbol of the
grammar on the top followed by $.
• The parsing table deterministically guesses the
correct production to be used.
Flow Chart
Table
Parser
Input Buffer
Stack
Procedure of predictive parser
• The current symbol of the input string is
maintained by a pointer say ‘ip’.
• In every step consider the set {a,α} where ‘a’ is
the symbol pointed by the ‘ip’ and ‘α’ is the top
of the stack.
• If ‘α’ is a Non Terminal ,then see the table cell
M{α,a} for the production.
[Link] M{α,a} is a valid production then pop the
stack , push the production into the stack.
[Link] M{α,a} is error or blank then report an error
• .If ‘α’ is a terminal then pop it from the stack and
also increment the input pointer ‘ip’ to point the
next symbol in the input string.
• The output will be the set of productions
• The following example illustrates the top-down
predictive parser using parser table .
String: id + id * id
Grammar: Mentioned LL(1) Grammar in
Previous slides
MATCHED STACK INPUT ACTION
E$ id+id * id$
TE’$ id+id * id$ E->TE’
FT’E’$ id+id * id$ T->FT’
id T’E’$ id+id * id$ F->id
id T’E’$ +id * id$ Match id
id E’$ +id * id$ T’->Є
id +TE’$ +id * id$ E’-> +TE’
id+ TE’$ id * id$ Match +
id+ FT’E’$ id * id$ T-> FT’
id+ idT’E’$ id * id$ F-> id
id+id T’E’$ * id$ Match id
id+id * FT’E’$ * id$ T’-> *FT’
id+id * FT’E’$ id$ Match *
id+id * idT’E’$ id$ F-> id
id+id * id T’E’$ $ Match id
id+id * id E’$ $ T’-> Є
id+id * id $ $ E’-> Є
• Parsing methods:
– Top-down parsing
– Bottom-up parsing
– Universal
• Non recursive predictive parsing
– Predictive parser can be implemented by recursive-descent parsing
(may need to manipulate the grammar, e.g eliminating left
recursion and left factoring).
• Requirement: by looking at the first terminal symbol that a
nonterminal symbol can derive, we should be able to choose the right
production to expand the nonterminal symbol.
– If the requirement is met, the parser easily be implemented using a
non-recursive scheme by building a parsing table.
• A parsing table example
(1) E->TE’
(2) E’->+TE’ id + * ( ) $
(3) E’->e E (1) (1)
(4) T->FT’ E’ (2) (3) (3)
(5) T’->*FT’
(6) T’->e
T (4) (4)
(7) F->(E) T’ (6) (5) (6) (6)
(8) F->id F (8) (7)
• Using the parsing table, the
predictive parsing program works
like this:
– A stack of grammar symbols ($ on the bottom)
– A string of input tokens ($ at the end)
– A parsing table, M[NT, T] of productions
– Algorithm:
– put ‘$ Start’ on the stack ($ is the end of input
string).
1) if top == input == $ then accept
2) if top == input then
pop top of the stack; advance to next input
symbol; goto 1;
3) If top is nonterminal
if M[top, input] is a production then
replace top with the production; goto 1
else error
4) else error
– Example:
id + * ( ) $
(1) E->TE’ E (1) (1)
(2) E’->+TE’
(3) E’->e
E’ (2) (3) (3)
(4) T->FT’ T (4) (4)
(5) T’->*FT’ T’ (6) (5) (6) (6)
(6) T’->e F (8) (7)
(7) F->(E)
(8) F->id
Stack input production
$E id+id*id$
$E’T id+id*id$ E->TE’
$E’T’F id+id*id$ T->FT’
$E’T’id id+id*id$ F->id
$E’T’ +id*id$
…...
This produces leftmost derivation:
E=>TE’=>FT’E’=>idT’E’=>….=>id+id*id
• How to construct the parsing table?
– First(a): Here, a is a string of symbols. The set of
terminals that begin strings derived from a. If a is
empty string or generates empty string, then empty
string is in First(a).
– Follow(A): Here, A is a nonterminal symbol. Follow(A)
is the set of terminals that can immediately follow A in
a sentential form.
– Example:
S->iEtS | iEtSeS|a
E->b
First(a) = ?, First(iEtS) = ?, First(S) = ?
Follow(E) = ? Follow(S) = ?
• How to construct the parsing table?
– With first(a) and follow(A), we can build the parsing
table. For each production A->a:
• Add A->a to M[A, t] for each t in First(a).
• If First(a) contains empty string
– Add A->a to M[A, t] for each t in Follow(A)
– if $ is in Follow(A), add A->a to M[A, $]
• Make each undefined entry of M error.
– See the example 4.18 (page 191).
• Compute FIRST(X)
– If X is a terminal then FIRST(X) = {X}
– If X->e, add e to FIRST(X)
– if X->Y1 Y2 … Yk and Y1 Y2 … Yi-1==>e, where I<= k, add
every none e in FIRST(Yi) to FIRST(X). If Y1…Yk=>e, add e to
FIRST(X).
– FIRST(Y1 Y2 … Yk): similar to the third step.
E->TE’ FIRST(E) = {(, id}
E’->+TE’|e FIRST(E’)={+, e}
T->FT’ FIRST(T) = {(, id}
T’->*FT’ | e FIRST(T’) = {*, e}
F->(E) | id FIRST(F) = {(, id}
• Compute Follow(A).
– If S is the start symbol, add $ to Follow(S).
– If A->aBb, add Frist(b)-{e} to Follow(B).
– If A->aB or A->aBb and b=>e, add Follow(A) to Follow(B).
E->TE’ First(E) = {(, id}, Follow(E)={), $}
E’->+TE’|e First(E’)={+, e}, Follow(E’) = {), $}
T->FT’ First(T) = {(, id}, Follow(T) = {+, ), $}
T’->*FT’ | e First(T’) = {*, e}, Follow(T’) = {+, ), $}
F->(E) | id First(F) = {(, id}, Follow(F) = {*, +, ), $}
• LL(1) grammar:
– First L: scans input from left to right
– Second L: produces a leftmost derivation
– 1: uses one input symbol of lookahead at each step to make a
parsing decision.
– A grammar whose parsing table has no multiply-defined entries is
a LL(1) grammar.
– No ambiguous or left-recursive grammar can be LL(1)
– A grammar is LL(1) iff for each set of A productions, where
– A | | ... | The following conditions hold:
1 2 n
First ( i ) First ( j ) {}, when 1 i n and 1 j n and i j
if i , then
(a) no, j e, when i j
(b) First( j ) Follow(A) {}, when i j.
• Example, build LL(1) parsing table for the
following grammar:
S-> i E t S e S | i E t S | a
E -> b