SYNTAX ANALYSIS
OR
PARSING
Lecture 08
1
LR(1) ITEM
To avoid some of invalid reductions, the states need to carry
more information.
Extra information is put into a state by including a terminal
symbol as a second component in an item.
A LR(1) item is:
.
A → α β,a where a is the look-head of the LR(1)
item
(a is a terminal or end-marker.)
2
LR (1) ITEMS
3
INTUITION BEHIND LR (1) ITEMS
4
INTUITION BEHIND LR (1) ITEMS
5
THE CLOSURE FUNCTION
6
THE CLOSURE FUNCTION
7
LR (1) EXAMPLE
S -> CC
C -> cC
C -> d
8
THE GOTO FUNCTION
9
CONSTRUCTION OF LR(1) PARSING TABLES
1. Construct the canonical collection of sets of LR(1) items for G’.
C←{I0,...,In}
.
2. Create the parsing action table as follows
• If a is a terminal, A→α aβ,b in Ii and goto(Ii,a)=Ij then action[i,a]
.
is shift j.
.
• If A→α ,a is in Ii , then action[i,a] is reduce A→α where A≠S’.
• If S’→S ,$ is in Ii , then action[i,$] is accept.
• If any conflicting actions generated by these rules, the grammar is
not LR(1).
3. Create the parsing goto table
• for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j
4. All entries not defined by (2) and (3) are errors.
10
5. Initial state of the parser contains S’→.S,$
LALR Parsing Tables
• LALR stands for LookAhead LR.
• LALR parsers are often used in practice because
LALR parsing tables are smaller than LR(1) parsing
tables.
• The number of states in SLR and LALR parsing tables
for a grammar G are equal.
• But LALR parsers recognize more grammars than
SLR parsers.
• A state of LALR parser will be again a set of LR(1)
items.
CREATING LALR PARSING TABLES
Canonical LR(1) Parser € LALR Parser
shrink # of states
• This shrink process may introduce a reduce/reduce
conflict in the resulting LALR parser (so the grammar
is NOT LALR)
• But, this shrink process does not produce a
shift/reduce conflict.
The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of its
first component.
Ex: S → L =R,$
R → L ,$
.. S → L =R
R→L
.. Core
• We will find the states (sets of LR(1) items) in a canonical LR(1)
parser with same cores. Then we will merge them as a single
state.
I1:L → .
id ,= Same Core ..
A new state:I12: L → id ,=
L → id ,$
.
I2:L → id ,$ Merge Them
• We will do this for all states of a canonical LR(1) parser to get
the states of the LALR parser.
• In fact, the number of the states of the LALR parser for a
grammar will be equal to the number of states of the SLR
CREATION OF LALR PARSING TABLES
• Create the canonical LR(1) collection of the sets of LR(1) items
for the given grammar.
• Find each core; find all sets having that same core; replace
those sets having same cores with a single set which is their
union.
C={I0,...,In} € C’={J1,...,Jm} where m ≤ n
• Create the parsing tables (action and goto tables) same as the
construction of the parsing tables of LR(1) parser.
– Note that: If J=I1 ∪ ... ∪ Iksince I1,...,Ik have same cores
€ cores of goto(I1,X),...,goto(I2,X) must be same.
– So, goto(J,X)=K where K is the union of all sets of items
having same cores as goto(I1,X).
• If no conflict is introduced, the grammar is LALR(1) grammar.
(We may only introduce reduce/reduce conflicts; we cannot
introduce a shift/reduce conflict)
SHIFT/REDUCE CONFLICT
• We say that we cannot introduce a shift/reduce conflict during
the shrink process for the creation of the states of a LALR
parser.
• Assume that we can introduce a shift/reduce conflict. In this
case, a state of LALR parser must have:
.
A → α ,a and .
B → β aγ,b
• This means that a state of the canonical LR(1) parser must
have:
.
A → α ,a and .
B → β aγ,c
But, this state has also a shift/reduce conflict. i.e. The original
canonical LR(1) parser has a conflict.
(Reason for this, the shift operation does not depend on
lookaheads)
Reduce/Reduce Conflict
• But, we may introduce a reduce/reduce conflict during the
shrink process for the creation of the states of a LALR parser.
.
I1 : A → α ,a .
I2: A → α ,b
.
B → α ,b .
B → α ,c
⇓
.
I12: A → α ,a/b € reduce/reduce conflict
.
B → α ,b/c
Any Questions ?
17