CD Unit-2.notes

Unit-2 covers Syntax Analysis in compilers, detailing the phases of lexical and syntax analysis, the importance of syntax checking, and various parsing techniques including top-down and bottom-up parsing. It explains derivation processes, parse trees, and the distinction between ambiguous and unambiguous grammars, along with examples and limitations of syntax analysis. The unit also discusses context-free grammar, types of parsers, and the elimination of left recursion in grammar.


[CD] Unit-2

Unit-2 syllabus
Syntax Analysis
Introduction, Context-Free Grammars, Writing a Grammar, Top-Down Parsing, Bottom-
Up Parsing, Introduction to LR Parsing: Simple LR, More Powerful LR Parsers, Using
Ambiguous Grammars and Parser Generators.

Syntax analysis:

 An input string is passed through several phases in a compiler. The first phase is
the lexical analysis, where the input is scanned and is divided into tokens. Syntax
analysis is the second phase of a compiler. The output of syntax analysis is used as
input to the semantic analyzer.
 In syntax analysis, the compiler checks the syntactic structure of the input string,
i.e., whether the given string follows the grammar. To do this it builds a data
structure called a parse tree or syntax tree, formed by matching the input string
against the pre-defined grammar. If parsing succeeds, the given string can be
generated by the grammar; otherwise an error is reported.
Importance of Syntax Analysis
 It is used to check if the code is grammatically correct or not.
 It helps us to detect all types of syntax errors.
 It gives an exact description of the error.
 It rejects invalid code before actual compiling.

Parsing Techniques
The parsing techniques can be divided into two types:

Top-down parsing: The parse tree is constructed from the root to the leaves in top-down
parsing. Some most common top-down parsers are Recursive Descent Parser and LL
parser.

Bottom-up parsing: The parse tree is constructed from the leaves to the tree’s root in
bottom-up parsing. Some examples of bottom-up parsers are the LR parser, SLR parser,
CLR parser, etc.

Derivation
The derivation is the process of using the production rules (grammar) to derive the input
string. There are two decisions that the parser must make to form the input string:
Deciding which non-terminal is to be replaced. There are two options to do this:

a) Left-most Derivation: When the leftmost non-terminal is replaced at each step, it is
called left-most derivation.

D.SANJEEVA REDDY

b) Right-most Derivation: When the rightmost non-terminal is replaced at each step, it is
called right-most derivation.

Deciding the production rule using which the non-terminal will be replaced.
Parse Tree
Parse tree is a graphical representation of the derivation. It is used to see how the given
string is derived from the start symbol. The start symbol is the root of the parse tree, and
the characters of the input string become the leaves.

Example

Consider the following set of production rules where ‘E’ is a non-terminal and ‘id’ is a
terminal:
E -> E + E (This means that E can be replaced with E + E)
E -> E * E (This means that E can be replaced with E * E)
E -> id (This means that E can be replaced with ‘id’. Since ‘id’ is a terminal, it will not
be replaced further; thus, it forms a leaf.)
We will construct a parse tree using the left-most derivation of “id + id * id”.

Steps:

Step-1: Replace E with E * E. Result: E * E

Step-2: Replace leftmost E with E + E. Result: E + E * E


Step-3,4,5: Replace all E’s with id. Result: id + id * id

Thus, we can generate a given string by following the production rules using the parse
trees.
Ambiguity: A grammar is ambiguous if there is more than one parse tree for some string it
generates. For example, for the above string and grammar, we can construct two parse trees:

Ambiguous grammar is not considered suitable for compiler design. There is no general
algorithm that can detect or remove ambiguity. If there is ambiguity in the grammar, one
has to remove it either by rewriting the whole grammar or by imposing associativity and
precedence constraints.
Limitations of Syntax Analysis
It cannot determine whether a token is valid (that is the job of the lexical analyzer).
It cannot determine whether a variable is declared before it is used.
It cannot determine whether an operation performed on the tokens is valid for their types.
It cannot tell whether a variable is initialized before it is used.

Parse Tree-

The process of deriving a string is called derivation.


The graphical representation of a derivation is called a parse tree or derivation
tree.


1. Leftmost Derivation-

The process of deriving a string by expanding the leftmost non-terminal at each step is
called leftmost derivation.
The graphical representation of a leftmost derivation is called a leftmost derivation
tree.

Example-

Consider the following grammar-


S → aB / bA
A → aS / bAA / a
B → bS / aBB / b
(Unambiguous Grammar)

Let us consider a string w = aaabbabbba


Now, let us derive the string w using leftmost derivation.

Leftmost Derivation-

S → aB
→ aaBB (Using B → aBB)
→ aaaBBB (Using B → aBB)
→ aaabBB (Using B → b)
→ aaabbB (Using B → b)
→ aaabbaBB (Using B → aBB)
→ aaabbabB (Using B → b)
→ aaabbabbS (Using B → bS)
→ aaabbabbbA (Using S → bA)
→ aaabbabbba (Using A → a)


2. Rightmost Derivation-

The process of deriving a string by expanding the rightmost non-terminal at each step is
called rightmost derivation.
The graphical representation of a rightmost derivation is called a rightmost
derivation tree.

Example-

Consider the following grammar-


S → aB / bA
A → aS / bAA / a
B → bS / aBB / b
(Unambiguous Grammar)

Let us consider a string w = aaabbabbba


Now, let us derive the string w using rightmost derivation.

Right most Derivation-

S → aB
→ aaBB (Using B → aBB)
→ aaBaBB (Using B → aBB)
→ aaBaBbS (Using B → bS)
→ aaBaBbbA (Using S → bA)
→ aaBaBbba (Using A → a)
→ aaBabbba (Using B → b)
→ aaaBBabbba (Using B → aBB)
→ aaaBbabbba (Using B → b)
→ aaabbabbba (Using B → b)


For unambiguous grammars, the leftmost derivation and the rightmost derivation represent
the same parse tree.
For ambiguous grammars, the leftmost derivation and the rightmost derivation may represent
different parse trees.

Here, the given grammar is unambiguous.
That is why the leftmost derivation and the rightmost derivation represent the same parse
tree.

Leftmost Derivation Tree = Rightmost Derivation Tree


Grammars are classified as:
Ambiguous Grammar
Unambiguous Grammar

1. Ambiguous Grammar-

A grammar is said to be ambiguous if, for at least one string generated by it, it produces
more than one-
Parse tree
Or derivation tree
Or syntax tree
Or leftmost derivation
Or rightmost derivation

Example-

Consider the following grammar-


E → E + E / E x E / id


This grammar is an example of ambiguous grammar.


Any of the following reasons can be stated to prove the grammar ambiguous-

Reason-01:

Let us consider a string w generated by the grammar-


w = id + id x id
Now, let us draw the parse trees for this string w.

Since two parse trees exist for string w, the grammar is ambiguous.
2. Unambiguous Grammar-

A grammar is said to be unambiguous if, for every string generated by it, it produces exactly
one-
Parse tree
Or derivation tree
Or syntax tree
Or leftmost derivation
Or rightmost derivation

Example-

Consider the following grammar-


E→E+T/T
T→TxF/F
F → id
Unambiguous Grammar

This grammar is an example of unambiguous grammar.


Context-Free Grammar (CFG)

G = (V, T, P, S)

Explanation
G in the above expression stands for the grammar. It consists of the set of all the production
rules with the help of which we can generate the strings of a language.
V stands for the set of non-terminal symbols (variables), denoted by capital letters.
T stands for the set of terminal symbols, denoted by lower-case letters.
P is the collection of all the production rules, used to replace non-terminal symbols of
a string with terminal or other non-terminal symbols.
S stands for the start symbol, from which derivation of a string begins.
Examples

Example 1

Construct the CFG generating any number of b’s over the set ∑ = {b}.
Solution:
The regular expression for the above language is
r.e. = b*
The production rules are:
S → bS (rule 1)
S → ε (rule 2)
Now, we will derive the string “bbbbbb”:
S
⇒ bS (rule 1)
⇒ bbS (rule 1)
⇒ bbbS (rule 1)
⇒ bbbbS (rule 1)
⇒ bbbbbS (rule 1)
⇒ bbbbbbS (rule 1)
⇒ bbbbbb (rule 2: S → ε)
The r.e. b* generates the set of strings {ε, b, bb, bbb, ...}
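The rule-1/rule-2 derivation above can be mimicked with a short sketch; the function name and return format are illustrative, not part of the notes:

```python
# Sketch: generate the strings of S -> bS (rule 1) | epsilon (rule 2), i.e. b*.
# derive_b_star is an illustrative name, not from the notes.

def derive_b_star(max_len):
    """Collect every string reachable with up to max_len applications of rule 1."""
    strings = []
    sentential = ""                 # the b's produced so far, with S still pending
    for _ in range(max_len + 1):
        strings.append(sentential)  # rule 2: S -> epsilon ends the derivation
        sentential += "b"           # rule 1: S -> bS emits one more b
    return strings

print(derive_b_star(3))  # ['', 'b', 'bb', 'bbb']
```

Applying rule 1 six times and then rule 2 yields exactly the “bbbbbb” derivation shown above.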


Ambiguous Grammar vs. Unambiguous Grammar:

1. In ambiguous grammar, the leftmost and rightmost derivations are not the same; in
unambiguous grammar, they are the same.
2. The number of non-terminals in an ambiguous grammar is less than in the equivalent
unambiguous grammar.
3. The parse tree in ambiguous grammar is comparatively short; in unambiguous grammar it
is comparatively large.
4. Derivation of a tree in ambiguous grammar is faster than in unambiguous grammar.
5. Ambiguous grammar generates more than one parse tree for some string; unambiguous
grammar generates only one parse tree.
6. Ambiguous grammar contains ambiguity; unambiguous grammar does not contain any
ambiguity.

Types of Parsers in Compiler Design

The parser is the phase of the compiler that takes the token stream as input and, with the
help of the grammar, converts it into the corresponding parse tree. The parser is also
known as the Syntax Analyzer.


Types of Parser:

Parser is mainly classified into 2 categories: Top-down Parser, and Bottom-up Parser.
These are explained as following below.

1. Top-down Parser:

Top-down parser is the parser which generates the parse tree for the given input string with
the help of grammar productions by expanding the non-terminals, i.e., it starts from the
start symbol and ends at the terminals. It uses leftmost derivation.

Further, the Top-down parser is classified into 2 types:

1. Recursive descent parser
2. Non-recursive descent parser

Recursive descent parser:

It is also known as the brute-force parser or the with-backtracking parser. It generates
the parse tree by using brute force and backtracking.

Non-recursive descent parser:

It is also known as the LL(1) parser, predictive parser, without-backtracking parser, or
dynamic parser. It uses a parsing table to generate the parse tree instead of backtracking.


2. Bottom-up Parser:

Bottom-up Parser is the parser which generates the parse tree for the given input string
with the help of grammar productions by reducing the input string, i.e., it starts from
the leaves (the input string) and ends at the start symbol. It uses the reverse of the
rightmost derivation.
Further, the Bottom-up parser is classified into 2 types: LR parser and Operator precedence
parser.

(i). LR parser:
LR parser is the bottom-up parser which generates the parse tree for the given string by
using unambiguous grammar. It follows the reverse of the rightmost derivation.
LR parsers are of 4 types:

(a) LR(0)

(b) SLR(1)

(c) LALR(1)

(d) CLR(1)

(ii). Operator precedence parser:


It generates the parse tree form given grammar and string but the only condition is two
consecutive non-terminals and epsilon never appears in the right-hand side of any
production.

Top down Parser:

 In the top-down technique, the parse tree is constructed from the top, and the input is
read from left to right. A top-down parser starts from the start symbol and proceeds to
the string.
 It follows the leftmost derivation.
 The difficulty with a top-down parser is that if a variable has more than one
production, selecting the correct one is difficult.
Working of Top down Parser:
Let’s consider an example where grammar is given and you need to construct a parse
tree by using top down parser technique.

Example –


S -> aABe
A -> Abc | b
B -> d
Now, let’s consider the input to read and to construct a parse tree with top down
approach.
Input –

abbcde$
Now let us see how the top-down approach works, and how the input string is generated
from the grammar.
 First, start with S -> a A B e; the input string begins with a and ends with e.
 Now, you need to generate abbcde.
 Expand A -> Abc and expand B -> d.
 Now you have the string aAbcde, and your input string is abbcde.
 Expand A -> b.
 You get the final string abbcde.
The diagram below illustrates how the input string is generated from the grammar using
the top-down approach.

WRITING A GRAMMAR:

Recursion-

Recursion can be classified into following three types-


 Left Recursion
 Right Recursion
 General Recursion

1. Left Recursion-

A production of a grammar is said to have left recursion if the leftmost variable of its RHS
is the same as the variable on its LHS.
A grammar containing a production having left recursion is called a Left Recursive
Grammar.

Example-

S → Sa / ε
(Left Recursive Grammar)

Left recursion is considered a problematic situation for top-down parsers.


Therefore, left recursion has to be eliminated from the grammar.

Elimination of Left Recursion:

Left recursion is eliminated by converting the grammar into a right recursive grammar.

If we have the left-recursive pair of productions-


A → Aα / β
(Left Recursive Grammar)
where β does not begin with an A.

Then, we can eliminate left recursion by replacing the pair of productions with-
A → βA’
A’ → αA’ / ε
(Right Recursive Grammar)

This right recursive grammar generates the same language as the left recursive grammar.


2. Right Recursion-

A production of a grammar is said to have right recursion if the rightmost variable of its
RHS is the same as the variable on its LHS.
A grammar containing a production having right recursion is called a Right Recursive
Grammar.

Example-

S → aS / ε
(Right Recursive Grammar)

Right recursion does not create any problem for top-down parsers.
Therefore, there is no need to eliminate right recursion from the grammar.

3. General Recursion-

The recursion which is neither left recursion nor right recursion is called general
recursion.

Example-

S → aSb / ε

PRACTICE PROBLEM BASED ON LEFT RECURSION ELIMINATION-

Consider the following grammar and eliminate left recursion-


A → ABd / Aa / a
B → Be / b
Solution-

The grammar after eliminating left recursion is-


A → aA’
A’ → BdA’ / aA’ / ε
B → bB’
B’ → eB’ / ε
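The elimination scheme above (A → Aα / β becomes A → βA’ and A’ → αA’ / ε) can be sketched as a function; the tuple encoding of productions and the "eps" marker for ε are illustrative assumptions, not part of the notes:

```python
# Sketch: eliminate immediate left recursion from the productions of one head.
# Productions are tuples of symbols; ("eps",) stands for the empty string.

def eliminate_left_recursion(head, productions):
    """Split A -> A alpha | beta into A -> beta A' and A' -> alpha A' | eps."""
    alphas = [p[1:] for p in productions if p[0] == head]   # the A-alpha forms
    betas = [p for p in productions if p[0] != head]        # the beta forms
    new_head = head + "'"
    return {
        head: [b + (new_head,) for b in betas],
        new_head: [a + (new_head,) for a in alphas] + [("eps",)],
    }

# A -> ABd / Aa / a   becomes   A -> aA' ,  A' -> BdA' / aA' / eps
result = eliminate_left_recursion("A", [("A", "B", "d"), ("A", "a"), ("a",)])
print(result)
```

Running the same function on B → Be / b gives B → bB’, B’ → eB’ / ε, matching the solution above.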

Grammar With Common Prefixes-

If RHS of more than one production starts with the same symbol,


then such a grammar is called a


Grammar With Common Prefixes.

Example-

A → αβ1 / αβ2 / αβ3


(Grammar with common prefixes)

This kind of grammar creates a problematic situation for top-down parsers: they cannot
decide which production must be chosen to parse the string in hand.
To remove this confusion, we use left factoring.

Left Factoring-

Left factoring is a process by which the grammar with common prefixes is transformed
to make it useful for Top down parsers.

How?

In left factoring,
We make one production for each common prefix.
The common prefix may be a terminal, a non-terminal, or a combination of both.
The rest of the derivation is added by new productions.

The grammar obtained after the process of left factoring is called a Left Factored
Grammar.


PRACTICE PROBLEMS BASED ON LEFT FACTORING-

Problem-01:


Do left factoring in the following grammar-


S → iEtS / iEtSeS / a
E→b
Solution-

The left factored grammar is-


S → iEtSS’ / a
S’ → eS / ε
E → b
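One step of left factoring can be sketched as follows; the encoding (tuples of single-character symbols, "eps" for ε) and the restriction to a single shared prefix are illustrative assumptions that cover the example above:

```python
# Sketch: factor out the longest prefix shared by the alternatives that begin
# with the same first symbol. ("eps",) stands for the empty string.

def left_factor(head, alts):
    first = alts[0][0]
    shared = [a for a in alts if a[0] == first]    # alternatives with the prefix
    rest = [a for a in alts if a[0] != first]
    if len(shared) < 2:
        return {head: alts}                        # nothing to factor
    k = 0                                          # length of the common prefix
    while all(len(a) > k and a[k] == shared[0][k] for a in shared):
        k += 1
    new_head = head + "'"
    return {
        head: [shared[0][:k] + (new_head,)] + rest,
        new_head: [a[k:] if a[k:] else ("eps",) for a in shared],
    }

# S -> iEtS / iEtSeS / a   becomes   S -> iEtSS' / a ,  S' -> eps / eS
g = left_factor("S", [tuple("iEtS"), tuple("iEtSeS"), ("a",)])
print(g)
```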

Recursive Descent Parser:


It is a kind of Top-Down Parser. A top-down parser builds the parse tree from the top
down, starting with the start non-terminal. A Predictive Parser is a special case of
Recursive Descent Parser where no backtracking is required.
By carefully writing the grammar, i.e., eliminating left recursion and applying left
factoring, we obtain a grammar that can be parsed by a recursive descent parser.
Example:

Before removing left recursion:
E –> E + T | T
T –> T * F | F
F –> ( E ) | id

After removing left recursion:
E –> T E’
E’ –> + T E’ | ε
T –> F T’
T’ –> * F T’ | ε
F –> ( E ) | id
For a recursive descent parser, we are going to write one procedure for every variable.

Example:

Grammar: E --> i E'

E' --> + i E' | e
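For the grammar above, one procedure per variable might look like the following sketch; tokens are single characters, and the class and method names are illustrative, not from the notes:

```python
# Sketch of a recursive descent parser for  E -> i E' ,  E' -> + i E' | eps.
# One method per variable: E() and E_prime().

class RDParser:
    def __init__(self, text):
        self.text, self.pos = text, 0

    def look(self):
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def match(self, ch):
        if self.look() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def E(self):                    # E -> i E'
        self.match("i")
        self.E_prime()

    def E_prime(self):              # E' -> + i E' | eps
        if self.look() == "+":
            self.match("+")
            self.match("i")
            self.E_prime()
        # otherwise take E' -> eps and consume nothing

    def parse(self):
        self.E()
        return self.look() == "$"   # accept only if the whole input was consumed

print(RDParser("i+i+i").parse())  # True
print(RDParser("ii").parse())     # False (input left over after one E)
```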


Predictive parser (or) LL(1) PARSER (OR) dynamic parser (or) without back track
parser

LL(1) Parsing: Here the first L represents that the scanning of the input will be done
from left to right, and the second L shows that in this parsing technique we are going
to use the leftmost derivation.
Finally, the 1 represents the number of look-ahead symbols, i.e., how many input symbols
are examined when making a parsing decision.

Essential conditions to check first are as follows:


1. The grammar is free from left recursion.
2. The grammar should not be ambiguous.
3. The grammar has to be left factored so that it is deterministic.

Predictive Parser:

A predictive parser is a recursive descent parser that does not require backtracking. At
each step, the choice of the rule to be expanded is made upon the next terminal symbol.
Predictive Parsers has the following components −
 Input Buffer − The input buffer includes the string to be parsed followed by an
end marker $ to denote the end of the string.



 Stack − It contains a combination of grammar symbols with $ on the bottom of the
stack. At the start of Parsing, the stack contains the start symbol of Grammar
followed by $.


 Parsing Table − It is a two-dimensional array or Matrix M [A, a] where A is non


terminal and 'a' is a terminal symbol. All the terminals are written column-wise,
and all the Non-terminals are written row wise.
 Parsing Program − The parsing program performs some action by comparing the
symbol on top of the stack and the current input symbol to be read on the input
buffer.
 Actions − Parsing program takes various actions depending upon the symbol on
the top of the stack and the current input symbol. Various Actions taken are given
below –


Following are the steps to perform Predictive Parsing


 Elimination of Left Recursion
 Left Factoring
 Computation of FIRST & FOLLOW
 Construction of Predictive Parsing Table
 Parse the Input String
Algorithm to construct LL(1) Parsing Table:
Step 1: First check all the essential conditions mentioned above and go to step 2.
Step 2: Calculate First() and Follow() for all non-terminals.
1. First(): the set of terminal symbols that can appear at the beginning of the strings
derived from a variable.
2. Follow(): the set of terminal symbols that can follow a variable in the process of
derivation.
Step 3: For each production A –> α (A derives α):
1. Find First(α) and for each terminal in First(α), make entry A –> α in the table.
2. If First(α) contains ε (epsilon) as terminal, then find the Follow(A) and for each
terminal in Follow(A), make entry A –> ε in the table.
3. If the First(α) contains ε and Follow(A) contains $ as terminal, then make entry A –
> ε in the table for the $.
In the table, the rows contain the non-terminals and the columns contain the terminal
symbols. All the null productions of the grammar go under the Follow elements, and the
remaining productions lie under the elements of the First set.

Example-1:
Consider the Grammar:

E --> TE'
E' --> +TE' | ε
T --> FT'
T' --> *FT' | ε
F --> id | (E)
*ε denotes epsilon
Find their First and Follow sets:


First Follow

E –> TE’ { id, ( } { $, ) }

E’ –> +TE’/ε { +, ε } { $, ) }

T –> FT’ { id, ( } { +, $, ) }

T’ –> *FT’/ε { *, ε } { +, $, ) }

F –> id/(E) { id, ( } { *, +, $, ) }

Now, the LL(1) Parsing Table is:

        id          +            *             (          )          $

E       E –> TE’                               E –> TE’

E’                  E’ –> +TE’                            E’ –> ε    E’ –> ε

T       T –> FT’                               T –> FT’

T’                  T’ –> ε      T’ –> *FT’               T’ –> ε    T’ –> ε

F       F –> id                                F –> (E)


As you can see, all the null productions are put under the Follow set of the corresponding
non-terminal, and all the remaining productions lie under the First set of that non-terminal.
Note: Not every grammar is feasible for an LL(1) parsing table. It may be possible that
one cell contains more than one production (a conflict).
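Step 2, and the First/Follow sets of Example-1, can be reproduced with a small fixpoint computation; the dictionary encoding of the grammar and the "eps" marker for ε are illustrative assumptions, not from the notes:

```python
# Sketch: iterative computation of FIRST and FOLLOW for the expression grammar.
# "eps" stands for epsilon; keys of the grammar dict are the non-terminals.

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ("eps",)],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ("eps",)],
    "F":  [("id",), ("(", "E", ")")],
}

def first_follow(g, start):
    first = {A: set() for A in g}
    follow = {A: set() for A in g}
    follow[start].add("$")

    def first_of(seq):
        """FIRST of a symbol sequence, using the current first sets."""
        out = set()
        for X in seq:
            if X not in g:                 # terminal symbol
                out.add(X)
                return out
            out |= first[X] - {"eps"}
            if "eps" not in first[X]:
                return out
        out.add("eps")                     # every symbol in seq can vanish
        return out

    changed = True
    while changed:                         # repeat until no set grows
        changed = False
        for A, prods in g.items():
            for p in prods:
                body = () if p == ("eps",) else p
                f = first_of(body)
                if not f <= first[A]:
                    first[A] |= f
                    changed = True
                for i, X in enumerate(body):
                    if X not in g:
                        continue           # FOLLOW is tracked for non-terminals only
                    trail = first_of(body[i + 1:])
                    new = trail - {"eps"}
                    if "eps" in trail:     # X can be last: inherit FOLLOW(A)
                        new |= follow[A]
                    if not new <= follow[X]:
                        follow[X] |= new
                        changed = True
    return first, follow

fi, fo = first_follow(GRAMMAR, "E")
print(fi["E'"], fo["F"])
```

The computed sets agree with the First/Follow table shown in Example-1.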

Bottom Up Parsers / Shift Reduce Parsers

Build the parse tree from leaves to root. Bottom-up parsing can be defined as an attempt
to reduce the input string w to the start symbol of grammar by tracing out the rightmost
derivations of w in reverse.


Classification of bottom up parsers:

Shift reduce parsing


 Shift-reduce parsing is a process of reducing a string to the start symbol of a
grammar.
 Shift-reduce parsing uses a stack to hold grammar symbols and an input buffer to hold
the string to be parsed.


o Shift-reduce parsing performs two actions: shift and reduce. That is why it is
known as shift-reduce parsing.
o In the shift action, the current symbol of the input string is pushed onto the stack.
o In each reduction, a string of symbols is replaced by a non-terminal. The string is
the right side of a production, and the non-terminal is the left side of that production.

Basic Operations –

 Shift: This involves moving symbols from the input buffer onto the stack.
 Reduce: If the handle appears on top of the stack, it is reduced by the
appropriate production rule, i.e., the RHS of the production is popped off the
stack and the LHS of the production is pushed onto the stack.
 Accept: If only the start symbol is present on the stack and the input buffer is empty,
the parsing action is accept; it means parsing completed successfully.
 Error: The situation in which the parser can perform neither a shift action, nor a
reduce action, nor an accept action.
Example 1 – Consider the grammar
S –> S + S
S –> S * S
S –> id
Perform Shift Reduce parsing for input string “id + id + id”.
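The shift/reduce actions on this example can be traced with a naive sketch: it reduces greedily whenever the top of the stack matches a production’s RHS, which happens to find the handles here; a real parser uses precedence relations or an LR table to locate handles. All names are illustrative:

```python
# Naive shift-reduce trace for  S -> S + S | S * S | id  on "id + id + id".
# Greedy reduction at the stack top stands in for real handle detection.

RULES = [("S", ["S", "+", "S"]), ("S", ["S", "*", "S"]), ("S", ["id"])]

def shift_reduce(tokens):
    stack, trace = [], []
    tokens = tokens + ["$"]               # $ marks the end of the input buffer
    i = 0
    while True:
        reduced = True
        while reduced:                    # reduce as long as a RHS is on top
            reduced = False
            for head, rhs in RULES:
                if len(stack) >= len(rhs) and stack[-len(rhs):] == rhs:
                    del stack[-len(rhs):]         # pop the RHS
                    stack.append(head)            # push the LHS
                    trace.append("reduce " + head + " -> " + " ".join(rhs))
                    reduced = True
                    break
        if tokens[i] == "$":
            break
        stack.append(tokens[i])           # shift the next input symbol
        trace.append("shift " + tokens[i])
        i += 1
    trace.append("accept" if stack == ["S"] else "error")
    return trace

for step in shift_reduce(["id", "+", "id", "+", "id"]):
    print(step)
```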


There are two main categories of shift reduce parsing as follows:

1. Operator-Precedence Parsing
2. LR-Parser

Operator precedence parsing

Operator precedence parsing is a kind of shift-reduce parsing. It is applied to a


small class of grammars, called operator grammars.

A grammar is said to be an operator precedence grammar if it has two properties:

o No R.H.S. of any production contains ε.


o No two non-terminals are adjacent in any R.H.S.

Designing Operator Precedence Parser-

In operator precedence parsing,


Firstly, we define precedence relations between every pair of terminal symbols.
Secondly, we construct an operator precedence table.

Defining Precedence Relations-

The precedence relations are defined using the following rules-

Rule-01:
If precedence of b is higher than precedence of a, then we define a < b
If precedence of b is same as precedence of a, then we define a = b
If precedence of b is lower than precedence of a, then we define a > b


Rule-02:

An identifier is always given higher precedence than any other symbol.
The $ symbol is always given the lowest precedence.

Rule-03:
If two operators have the same precedence, then we go by checking their associativity.

Parsing A Given String-

The given input string is parsed using the following steps-


Step-01:

Insert the following-


$ symbol at the beginning and end of the input string.
A precedence relation between every two symbols of the string, by referring to the
operator precedence table.

Step-02:

Start scanning the string from LHS in the forward direction until > symbol is
encountered.
Keep a pointer on that location.

Step-03:

Start scanning the string from RHS in the backward direction until < symbol is
encountered.
Keep a pointer on that location.

Step-04:

Everything that lies in the middle of < and > forms the handle.
Replace the handle with the head of the respective production.

Step-05:

Keep repeating the cycle from Step-02 to Step-04 until the start symbol is reached.

Advantages-

The advantages of operator precedence parsing are-


The implementation is very easy and simple.

The parser is quite powerful for expressions in programming languages.

Disadvantages-

The disadvantages of operator precedence parsing are-


The handling of tokens known to have two different precedences (for example, a minus
sign that is both unary and binary) becomes difficult.
Only a small class of grammars can be parsed using this parser.

PRACTICE PROBLEM BASED ON OPERATOR PRECEDENCE PARSING-

Consider the following grammar-


E → EAE | id
A→+|x
Construct the operator precedence parser and parse the string id + id x id.

Step-01:

We convert the given grammar into operator precedence grammar.


The equivalent operator precedence grammar is-
E → E + E | E x E | id

Step-02:

The terminal symbols in the grammar are { id, + , x , $ }


We construct the operator precedence table as-

      id     +      x      $

id           >      >      >

+     <      >      <      >

x     <      >      >      >

$     <      <      <


Operator Precedence Table

Parsing Given String-

Given string to be parsed is id + id x id.


We follow the following steps to parse the given string-


Step-01:

We insert $ symbol at both ends of the string as-


$ id + id x id $

We insert precedence operators between the string symbols as-


$ < id > + < id > x < id > $

Step-02:

We scan and parse the string as-

$ < id > + < id > x < id > $


$ E + < id > x < id > $ (reduce handle id using E → id)
$ E + E x < id > $ (reduce handle id using E → id)
$ E + E x E $ (reduce handle id using E → id)
Ignoring the non-terminals, the remaining terminal string is $ + x $, with the relations
$ < + < x > $.
$ < + > $ (reduce handle E x E using E → E x E)
$ $ (reduce handle E + E using E → E + E; the string is accepted)

LR parser:

LR parsing is one type of bottom-up parsing. It is used to parse a large class of
grammars.

In LR(k) parsing, "L" stands for left-to-right scanning of the input,

"R" stands for constructing a rightmost derivation in reverse,

and "k" is the number of input symbols of lookahead used to make parsing
decisions.


LR algorithm:

The LR algorithm requires a stack, input, output and a parsing table. In all types of LR
parsing, the input, output and stack are the same, but the parsing table is different.

Fig: Block diagram of LR parser

Input buffer is used to indicate end of input and it contains the string to be parsed
followed by a $ Symbol.

A stack is used to contain a sequence of grammar symbols with a $ at the bottom of the
stack.

Parsing table is a two dimensional array. It contains two parts: Action part and Go To
part.

LR(0) Parser
We need two functions –
Closure()
Goto()

Augmented Grammar
If G is a grammar with start symbol S then G’, the augmented grammar for G, is the
grammar with new start symbol S’ and a production S’ -> S. The purpose of this new
starting production is to indicate to the parser when it should stop parsing and announce
acceptance of input.
Let a grammar be S -> AA
A -> aA | b
The augmented grammar for the above grammar will be
S’ -> S


S -> AA
A -> aA | b

LR (0) Items
An LR(0) item of a grammar G is a production of G with a dot at some position in
the right side.
S -> ABC yields four items
S -> .ABC
S -> A.BC
S -> AB.C
S -> ABC.
The production A -> ε generates only one item, A -> .

Closure Operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed
from I by the two rules:

1. Initially every item in I is added to closure(I).


2. If A -> α.Bβ is in closure(I) and B -> γ is a production, then add the item B -> .γ to
closure(I), if it is not already there. We apply this rule until no more items can be
added to closure(I).

Goto Operation :
Goto(I, X) is computed as follows:
1. Take every item in I with the dot immediately before X, and move the dot after X.
2. Apply closure to the result of the first step.
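The Closure and Goto operations above can be sketched for the example grammar S’ -> S, S -> AA, A -> aA | b used later in this unit; the item encoding (head, RHS tuple, dot position) is an illustrative assumption:

```python
# Sketch: closure() and goto() over LR(0) items.
# An item is a tuple (head, rhs, dot), with the dot given as an index into rhs.

GRAMMAR = {"S'": [("S",)], "S": [("A", "A")], "A": [("a", "A"), ("b",)]}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, rhs, dot in list(items):
            # Rule 2: a dot before non-terminal B pulls in every B -> .gamma
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return items

def goto(items, X):
    # Step 1: move the dot over X;  Step 2: take the closure of the result.
    moved = {(h, r, d + 1) for h, r, d in items if d < len(r) and r[d] == X}
    return closure(moved)

I0 = closure({("S'", ("S",), 0)})
# I0 holds the items  S' -> .S,  S -> .AA,  A -> .aA,  A -> .b
print(sorted(I0))
```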


Construction of GOTO graph-

 State I0 – the closure of the augmented LR(0) item


 Using I0, find all the sets in the canonical collection of LR(0) items with the help of
the DFA
 Convert the DFA to the LR(0) parsing table

Construction of LR (0) parsing table:

 The action function takes as arguments a state i and a terminal a (or $ , the input end
marker). The value of ACTION[i, a] can have one of four forms:
1. Shift j, where j is a state.
2. Reduce A -> β.
3. Accept
4. Error
 We extend the GOTO function, defined on sets of items, to states: if GOTO [Ii, A] =
Ij then GOTO also maps a state i and a non terminal A to state j.

EXAMPLE:
Consider the grammar S ->AA
A -> aA | b
Augmented grammar S’ -> S
S -> AA
A -> aA | b


The LR(0) parsing table for above GOTO graph will be –

 The action part of the table contains all the terminals of the grammar, whereas the
goto part contains all the non-terminals.
 For every state of the goto graph we write all the goto operations in the table. If
goto is applied to a terminal, it is written in the action part; if goto is applied to a
non-terminal, it is written in the goto part. If on applying goto a production is
reduced (i.e., if the dot reaches the end of the production and no further closure can
be applied), it is denoted as Ri; if the production is not reduced (shifted), it is
denoted as Si.
If a production is reduced, it is written under the terminals given by the FOLLOW of
the left side of the reduced production. For example, in I5, S -> AA is reduced, so
R1 is written under the terminals in FOLLOW(S) = { $ }.

If, in a state, the start symbol of the grammar is reduced, it is written under the $
symbol as accept.


LR (1) Parsing

Various steps involved in the LR (1) Parsing:

o For the given input string, write a context-free grammar.


o Check the ambiguity of the grammar.
o Add the augment production to the given grammar.
o Create the canonical collection of LR (0) items.
o Draw the deterministic finite automaton (DFA).
o Construct the LR (1) parsing table.

Augment Grammar

Augmented grammar G` will be generated if we add one more production in the given
grammar G. It helps the parser to identify when to stop the parsing and announce the
acceptance of the input.


Example

Given grammar

1. S → AA
2. A → aA | b

The Augment grammar G` is represented by

1. S`→ S
2. S → AA
3. A → aA | b
Canonical Collection of LR(0) items

An LR (0) item of a grammar G is a production of G with a dot at some position in the
right side of the production.

An LR (0) item indicates how much of the input has been scanned up to a given point in
the process of parsing.

In LR (0) parsing, we place the reduce entry in the entire row.

Example:

Given grammar:

1. S → AA
2. A → aA | b

Add Augment Production and insert '•' symbol at the first position for every production in
G

1. S` → •S
2. S → •AA
3. A → •aA
4. A → •b

I0 State:

Add Augment production to the I0 State and Compute the Closure


I0 = Closure (S` → •S)

Add all productions starting with S in to I0 State because "•" is followed by the non-
terminal. So, the I0 State becomes

I0 = S` → •S
S → •AA

Add all productions starting with "A" in modified I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes.

I0= S` → •S
S → •AA
A → •aA
A → •b

I1= Go to (I0, S) = closure (S` → S•) = S` → S•

Here the dot has reached the end of the production, so the state is closed.

I1= S` → S•

I2= Go to (I0, A) = closure (S → A•A)

Add all productions starting with A in to I2 State because "•" is followed by the non-
terminal. So, the I2 State becomes

I2 =S→A•A
A → •aA
A → •b

Go to (I2,a) = Closure (A → a•A) = (same as I3)

Go to (I2, b) = Closure (A → b•) = (same as I4)

I3= Go to (I0,a) = Closure (A → a•A)

Add productions starting with A in I3.

A→a•A
A→•aA
A → •b


Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)

I4= Go to (I0, b) = Closure (A → b•) = A → b•
I5= Go to (I2, A) = Closure (S → AA•) = S → AA•
I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
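The Closure and Go to computations above can be written directly as set operations. Below is a small Python sketch (an illustration, not from the notes), using the same grammar:

```python
# Grammar as (lhs, rhs) pairs; an LR(0) item is (lhs, rhs, dot_position).
GRAMMAR = [("S'", ('S',)), ('S', ('A', 'A')), ('A', ('a', 'A')), ('A', ('b',))]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Repeatedly add B -> .gamma for every item whose dot is before B."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for l2, r2 in GRAMMAR:
                    if l2 == rhs[dot] and (l2, r2, 0) not in items:
                        items.add((l2, r2, 0))
                        changed = True
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol` in every item where it applies, then close."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved)

I0 = closure({("S'", ('S',), 0)})   # the four items of the worked example
I2 = goto(I0, 'A')                  # S -> A.A plus the closure items for A
```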

Drawing DFA:

The DFA contains the 7 states I0 to I6.

LR (1) Table
o If a state goes to some other state on a terminal, it corresponds to a shift
move.
o If a state goes to some other state on a variable (non-terminal), it corresponds to a
goto move.
o If a state contains a final item, write the reduce entry in that state's entire
row.

Explanation:

 I0 on S is going to I1 so write it as 1.

 I0 on A is going to I2 so write it as 2.
 I2 on A is going to I5 so write it as 5.
 I3 on A is going to I6 so write it as 6.
 I0, I2 and I3 on a are going to I3, so write it as S3, which means shift 3.
 I0, I2 and I3 on b are going to I4, so write it as S4, which means shift 4.
 I4, I5 and I6 all contain a final item, because the • is at the right-most end. So write the reduce entry with the corresponding production number.

Productions are numbered as follows:

S→ AA ... (1)
A→ aA ... (2)
A→ b ... (3)
 I1 contains the final item (S` → S•), so action {I1, $} = Accept.
 I4 contains the final item A → b•, which corresponds to production number 3, so
write r3 in the entire row.
 I5 contains the final item S → AA•, which corresponds to production number 1,
so write r1 in the entire row.
 I6 contains the final item A → aA•, which corresponds to production number 2,
so write r2 in the entire row.

SLR (1) Parsing:

SLR (1) refers to simple LR parsing. It is the same as LR (0) parsing; the only difference is
in the parsing table. To construct the SLR (1) parsing table, we use the canonical collection of
LR (0) items.

In SLR (1) parsing, we place the reduce move only under the terminals in FOLLOW of the left-hand side.

Various steps involved in the SLR (1) Parsing:

o Check the ambiguity of the grammar


o Add Augment production in the given grammar
o Create Canonical collection of LR (0) items
o Draw a deterministic finite automaton (DFA)
o Construct a SLR (1) parsing table
o Find FOLLOW of LHS of production


SLR (1) Table Construction

The steps which use to construct SLR (1) Table is given below:

If a state (Ii) goes to some other state (Ij) on a terminal, it corresponds to a shift
move in the action part.

If a state (Ii) goes to some other state (Ij) on a variable, it corresponds to a goto
move in the goto part.

If a state (Ii) contains a final item like A → ab•, which has no transition to a next
state, the production is known as a reduce production. For every terminal X in FOLLOW
(A), write the reduce entry along with its production number.

Example

1. S -> •Aa
2. A->αβ•
1. Follow(S) = {$}
2. Follow (A) = {a}


EXAMPLE – Construct SLR parsing table for the given context-free grammar
S–>AA
A–>aA|b
Solution:
STEP1 – Find augmented grammar
The augmented grammar of the given grammar is:-
S’–>.S [0th production]
S–>.AA [1st production]
A–>.aA [2nd production]
A–>.b [3rd production]
STEP2 – Find LR(0) collection of items
Below is the figure showing the LR(0) collection of items. We will understand
everything one by one.

The terminals of this grammar are {a,b}.


The non-terminals of this grammar are {S,A}
RULE –
If the ‘ . ’ precedes any non-terminal, we have to add all of that non-terminal's productions, with ‘ . ’ preceding each of them.
RULE –
From each state to the next state, the ‘ . ’ shifts one place to the right.
 In the figure, I0 consists of the augmented grammar.
 I0 goes to I1 when the ‘ . ’ of the 0th production is shifted to the right of S (S’->S.).
This is the accept state. S is seen by the compiler.
 I0 goes to I2 when the ‘ . ’ of the 1st production is shifted to the right (S->A.A). A is
seen by the compiler.
 I0 goes to I3 when the ‘ . ’ of the 2nd production is shifted to the right (A->a.A). a is
seen by the compiler.
 I0 goes to I4 when the ‘ . ’ of the 3rd production is shifted to the right (A->b.). b is
seen by the compiler.


 I2 goes to I5 when ‘ . ‘ of 1st production is shifted towards right (S->AA.) . A is


seen by the compiler
 I2 goes to I4 when ‘ . ‘ of 3rd production is shifted towards right (A->b.) . b is seen
by the compiler.
 I2 goes to I3 when ‘ . ‘ of the 2nd production is shifted towards right (A->a.A) . a is
seen by the compiler.
 I3 goes to I4 when ‘ . ‘ of the 3rd production is shifted towards right (A->b.) . b is
seen by the compiler.
 I3 goes to I6 when ‘ . ‘ of 2nd production is shifted towards the right (A->aA.) . A
is seen by the compiler
 I3 goes to I3 when ‘ . ‘ of the 2nd production is shifted towards right (A->a.A) . a is
seen by the compiler.
STEP3 –
Find FOLLOW of LHS of production
FOLLOW(S) = {$}
FOLLOW(A) = {a, b, $}
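These FOLLOW sets can also be computed mechanically with a fixed-point iteration. Below is a small illustrative Python sketch (not from the notes); since this grammar has no ε-productions, FIRST stays simple:

```python
# Fixed-point computation of FOLLOW for: S -> AA, A -> aA | b.
GRAMMAR = [('S', ['A', 'A']), ('A', ['a', 'A']), ('A', ['b'])]
NT = {'S', 'A'}

def first(symbol):
    """FIRST of a single symbol (no epsilon productions in this grammar)."""
    if symbol not in NT:
        return {symbol}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == symbol and rhs[0] != symbol:   # guard against left recursion
            result |= first(rhs[0])
    return result

def follow_sets():
    follow = {n: set() for n in NT}
    follow['S'].add('$')            # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NT:
                    continue
                # what can follow sym here: FIRST of the next symbol,
                # or FOLLOW(lhs) when sym ends the production
                new = first(rhs[i + 1]) if i + 1 < len(rhs) else follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

print(sorted(follow_sets()['A']))   # ['$', 'a', 'b']
```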
STEP 4-
Defining 2 functions: goto [list of non-terminals] and action[list of terminals] in the
parsing table. Below is the SLR parsing table.

 $ is the input end marker; the accept entry of the table is placed under it.


 0,1,2,3,4,5,6 denotes I0,I1,I2,I3,I4,I5,I6
 I0 gives A in I2, so 2 is added to the A column, row 0.
 I0 gives S in I1, so 1 is added to the S column, row 0.
 Similarly, 5 is written in the A column, row 2, and 6 in the A column, row 3.
 I0 gives a in I3, so S3 (shift 3) is added to the a column, row 0.
 I0 gives b in I4, so S4 (shift 4) is added to the b column, row 0.
 Similarly, S3 (shift 3) is added in the a column, rows 2 and 3, and S4 (shift 4) in the
b column, rows 2 and 3.
 I4 is a reduce state, as the ‘ . ’ is at the end. I4 holds the 3rd production of the
grammar (A–>b.). The LHS of this production is A. FOLLOW(A) = {a, b, $}, so write
r3 (reduce 3) in the a, b and $ columns of row 4.


 I5 is a reduce state, as the ‘ . ’ is at the end. I5 holds the 1st production of the
grammar (S–>AA.). The LHS of this production is S. FOLLOW(S) = {$}, so write
r1 (reduce 1) in the $ column of row 5.
 I6 is a reduce state, as the ‘ . ’ is at the end. I6 holds the 2nd production of the
grammar (A–>aA.). The LHS of this production is A. FOLLOW(A) = {a, b, $}, so
write r2 (reduce 2) in the a, b and $ columns of row 6.
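The reduce and accept placements described in the last bullets can be generated mechanically from the completed items and the FOLLOW sets. A small illustrative Python sketch (state and production numbers are the ones from this example):

```python
# Filling the reduce/accept entries of the SLR(1) ACTION table: for every
# state containing a completed item A -> alpha. (dot at the end), write
# "r n" under each terminal in FOLLOW(A); S' -> S. becomes accept on $.
FOLLOW = {'S': {'$'}, 'A': {'a', 'b', '$'}}
# state -> the completed item it contains, as (lhs, production number)
COMPLETED = {1: ("S'", 0), 4: ('A', 3), 5: ('S', 1), 6: ('A', 2)}

action = {}
for state, (lhs, prod_no) in COMPLETED.items():
    if lhs == "S'":                    # augmented start production: accept
        action[(state, '$')] = 'acc'
    else:
        for terminal in FOLLOW[lhs]:
            action[(state, terminal)] = 'r%d' % prod_no

print(action[(4, 'a')], action[(5, '$')], action[(1, '$')])   # r3 r1 acc
```

Note that state 5 gets r1 only under $, not in its whole row; that restriction is exactly what distinguishes SLR(1) from LR(0).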

CLR (1) Parsing:

CLR refers to canonical LR parsing with lookahead. CLR parsing uses the canonical collection of LR (1)
items to build the CLR (1) parsing table. The CLR (1) parsing table produces more
states as compared to the SLR (1) parsing table.

In CLR (1) parsing, we place the reduce entry only under the lookahead symbols.

Various steps involved in the CLR (1) Parsing:

 For the given input string write a context free grammar


 Check the ambiguity of the grammar
 Add Augment production in the given grammar
 Create the canonical collection of LR (1) items

 Draw a deterministic finite automaton (DFA)


 Construct a CLR (1) parsing table

LR (1) item:

An LR (1) item is an LR (0) item together with a lookahead symbol.

LR (1) item = LR (0) item + lookahead

The lookahead is used to determine where we place the reduce entry for a final item.

The lookahead of the augment production is always the $ symbol.

How to add lookahead with the production?


CASE 1 –
A -> α.BC, a
Suppose this is the 0th production. Since the ‘ . ’ precedes B, we have to write B’s
productions as well.


B -> .D [1st production]

Suppose this is B’s production. Its lookahead is obtained by looking at the previous
(0th) production: we take FIRST of whatever follows B, and that is the lookahead of the
1st production. Here, in the 0th production, C comes after B. Assuming FIRST(C) = d,
the 1st production becomes
B -> .D, d
CASE 2 –
Now suppose the 0th production is
A -> α.B, a
Here there is nothing after B, so the lookahead of the 0th production becomes the
lookahead of the 1st production, i.e.
B -> .D, a
CASE 3 –
Assume a production A -> a | b:
A -> a, $ [0th production]
A -> b, $ [1st production]
Here the 1st production is an alternative of the same non-terminal, so its lookahead is
the same as that of the previous production.
These are the rules of lookahead.
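The cases can be folded into one rule: the new items B -> .γ obtained while closing A -> α.Bβ, a receive lookahead FIRST(βa). A small Python sketch (illustrative; the FIRST sets are precomputed for the example grammar, which has no ε-productions):

```python
# Lookahead rule used in LR(1) closure: from an item  A -> alpha . B beta, a
# each new item  B -> . gamma  gets lookahead FIRST(beta a).  With no epsilon
# productions this is FIRST of the first symbol of beta (CASE 1), or simply
# {a} when beta is empty (CASE 2).
FIRST = {'a': {'a'}, 'b': {'b'}, 'A': {'a', 'b'}, 'S': {'a', 'b'}}

def new_lookaheads(beta, a):
    """beta: the symbols after B in the parent item; a: its lookahead."""
    if not beta:               # CASE 2: nothing after B, inherit the lookahead
        return {a}
    return FIRST[beta[0]]      # CASE 1: FIRST of what follows B

# Closure of S' -> .S, $ : the item S -> .AA keeps lookahead $ (beta empty);
# the items A -> .aA and A -> .b get FIRST(A) = {a, b}, because in S -> .AA
# another A follows the A being expanded.
print(new_lookaheads([], '$'))             # {'$'}
print(sorted(new_lookaheads(['A'], '$')))  # ['a', 'b']
```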

Example:

CLR (1) Grammar

1. S → AA
2. A → aA
3. A → b

Add Augment Production, insert '•' symbol at the first position for every production in G
and also add the look ahead.

1. S` → •S, $
2. S → •AA, $
3. A → •aA, a/b
4. A → •b, a/b

I0 State:

Add Augment production to the I0 State and Compute the Closure

I0 = Closure (S` → •S)


Add all productions starting with S in to I0 State because "." is followed by the non-
terminal. So, the I0 State becomes

I0 = S` → •S, $

S → •AA, $

Add all productions starting with A in modified I0 State because "." is followed by the
non-terminal. So, the I0 State becomes.


I0= S` → •S, $

S → •AA, $

A → •aA, a/b

A → •b, a/b

1. If any non-terminal has ‘ . ‘ preceding it, we have to write all its production and add
‘ . ‘ preceding each of its production.
2. from each state to the next state, the ‘ . ‘ shifts to one place to the right.
3. All the rules of lookahead apply here.
 In the figure, I0 consists of augmented grammar.
 I0 goes to I1 when the ‘ . ’ of the 0th production is shifted to the right of S (S’->S.).
This is the accept state. S is seen by the compiler. Since I1 is a part of the 0th
production, the lookahead is the same, i.e. $.
 I0 goes to I2 when the ‘ . ’ of the 1st production is shifted to the right (S->A.A). A is
seen by the compiler. Since I2 is a part of the 1st production, the lookahead is the
same, i.e. $.
 I0 goes to I3 when ‘ . ‘ of the 2nd production is shifted towards right (A->a.A) . a is
seen by the compiler. Since I3 is a part of the 2nd production, the lookahead is the
same ie a|b.
 I0 goes to I4 when ‘ . ‘ of the 3rd production is shifted towards right (A->b.) . b is
seen by the compiler. Since I4 is a part of the 3rd production, the lookahead is the
same i.e. a | b.
 I2 goes to I5 when ‘ . ‘ of 1st production is shifted towards right (S->AA.) . A is
seen by the compiler. Since I5 is a part of the 1st production, the lookahead is the
same i.e. $.


 I2 goes to I6 when ‘ . ‘ of 2nd production is shifted towards the right (A->a.A) . A


is seen by the compiler. Since I6 is a part of the 2nd production, the lookahead is the
same i.e. $.
 I2 goes to I7 when the ‘ . ’ of the 3rd production is shifted to the right (A->b.). b is seen
by the compiler. Since I7 is a part of the 3rd production, the lookahead is the same,
i.e. $.
 I3 goes to I3 when ‘ . ‘ of the 2nd production is shifted towards right (A->a.A) . a is
seen by the compiler. Since I3 is a part of the 2nd production, the lookahead is the
same i.e. a|b.
 I3 goes to I8 when ‘ . ‘ of 2nd production is shifted towards the right (A->aA.) . A
is seen by the compiler. Since I8 is a part of the 2nd production, the lookahead is the
same i.e. a|b.
 I6 goes to I9 when ‘ . ‘ of 2nd production is shifted towards the right (A->aA.) . A
is seen by the compiler. Since I9 is a part of the 2nd production, the lookahead is the
same i.e. $.
 I6 goes to I6 when ‘ . ‘ of the 2nd production is shifted towards right (A->a.A) . a is
seen by the compiler. Since I6 is a part of the 2nd production, the lookahead is the
same i.e. $.
 I6 goes to I7 when the ‘ . ’ of the 3rd production is shifted to the right (A->b.). b is
seen by the compiler. Since I7 is a part of the 3rd production, the lookahead is the
same, i.e. $.

CLR (1) Parsing table:

Productions are numbered as follows:

1. S → AA ... (1)
2. A → aA ....(2)
3. A → b ... (3)


The placement of the shift entries in the CLR (1) parsing table is the same as in the SLR (1)
parsing table; the only difference is in the placement of the reduce entries.

I4 contains the final item (A → b•, a/b), so action {I4, a} = R3 and action {I4, b} = R3.
I5 contains the final item (S → AA•, $), so action {I5, $} = R1.
I7 contains the final item (A → b•, $), so action {I7, $} = R3.
I8 contains the final item (A → aA•, a/b), so action {I8, a} = R2 and action {I8, b} = R2.
I9 contains the final item (A → aA•, $), so action {I9, $} = R2.

LALR (1) Parsing:

LALR refers to lookahead LR. To construct the LALR (1) parsing table, we use the
canonical collection of LR (1) items.

In LALR (1) parsing, the LR (1) item sets that have the same productions (the same
cores) but different lookaheads are combined into a single set of items.

LALR (1) parsing is the same as CLR (1) parsing; the only difference is in the parsing table.

Example

LALR (1) Grammar

1. S → AA
2. A → aA
3. A → b


Add Augment Production, insert '•' symbol at the first position for every production in G
and also add the look ahead.

1. S` → •S, $
2. S → •AA, $
3. A → •aA, a/b
4. A → •b, a/b
DFA:

Clearly I3 and I6 are the same in their LR (0) items but differ in their lookaheads, so we can
combine them and call the result I36.

I36 = { A → a•A, a/b/$
A → •aA, a/b/$
A → •b, a/b/$ }

I4 and I7 are the same but differ only in their lookaheads, so we can combine them
and call the result I47.

I47 = { A → b•, a/b/$ }

I8 and I9 are the same but differ only in their lookaheads, so we can combine them
and call the result I89.

I89 = { A → aA•, a/b/$ }
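The merging step can be expressed directly: two LR(1) states merge exactly when their LR(0) cores are equal, and the lookahead sets of matching items are unioned. A small illustrative Python sketch using I3 and I6 from above:

```python
# LALR(1) state merging: LR(1) states whose cores (the items without their
# lookaheads) are identical are unioned, combining the lookahead sets of
# matching items.  An item here is (production, dot position, lookahead set).
I3 = {('A->aA', 1, frozenset('ab')), ('A->aA', 0, frozenset('ab')),
      ('A->b', 0, frozenset('ab'))}
I6 = {('A->aA', 1, frozenset('$')), ('A->aA', 0, frozenset('$')),
      ('A->b', 0, frozenset('$'))}

def core(state):
    """The LR(0) core: drop the lookahead component of every item."""
    return frozenset((prod, dot) for prod, dot, _ in state)

def merge(s1, s2):
    assert core(s1) == core(s2), "only states with equal cores may merge"
    las = {}
    for prod, dot, la in s1 | s2:
        las.setdefault((prod, dot), set()).update(la)
    return {(prod, dot, frozenset(la)) for (prod, dot), la in las.items()}

I36 = merge(I3, I6)
# every item of I36 now carries the combined lookahead set {a, b, $}
```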


LALR (1) Parsing table:

Comparison of LL and LR parsers:

o The first L of LL stands for left-to-right scanning of the input and the second L for
leftmost derivation. The L of LR stands for left-to-right scanning and the R for
rightmost derivation (produced in reverse).
o An LL parser follows the leftmost derivation; an LR parser follows the reverse of
the rightmost derivation.
o With an LL parser the parse tree is constructed top-down; with an LR parser it is
constructed bottom-up.
o In an LL parser non-terminals are expanded; in an LR parser terminals are
compressed (reduced).
o An LL parser starts with the start symbol (S); an LR parser ends with the start
symbol (S).
o An LL parser ends when its stack becomes empty; an LR parser starts with an
empty stack.
o LL parsing corresponds to a pre-order traversal of the parse tree; LR parsing to a
post-order traversal.
o In LL a terminal is read after popping it out of the stack; in LR a terminal is read
before pushing it onto the stack.
o LL parsers are easier to write; LR parsers are more difficult to write.
o Examples: LL(0), LL(1) for LL; LR(0), SLR(1), LALR(1), CLR(1) for LR.

Let us see the comparison between SLR, CLR, and LALR Parser.

Comparison of SLR, LALR and CLR parsers:

o Implementation: SLR is very easy and cheap to implement; LALR is also easy and
cheap; CLR is expensive and difficult to implement.
o Size: the SLR parser is the smallest; LALR has the same size as SLR, as both have
fewer states; the CLR parser is the largest, as its number of states is very large.
o Error detection: not immediate in SLR; not immediate in LALR; can be done
immediately in the CLR parser.
o Power: SLR fails to produce a parsing table for a certain class of grammars; LALR
is intermediate in power between SLR and CLR, i.e. SLR ≤ LALR ≤ CLR; CLR is
very powerful and works on a large class of grammars.
o Complexity: SLR requires less time and space; LALR requires more time and
space; CLR also requires more time and space.

PARSER GENERATOR:
Yet Another Compiler Compiler (YACC) is a tool that generates a parser program for
a given LALR(1) grammar. It processes the grammar and outputs a C program. YACC
takes help from the Lex tool (a lexical analyzer generator) to tokenize an input stream.
Parts of the YACC program
A YACC program contains three main sections:


Definitions
Rules
Auxiliary routines
Definitions

Definitions are at the top of the YACC input file. They include header files and any
information about the regular definitions or tokens. We define the tokens using the
%token declaration, whereas we place C-specific code, such as header file includes,
within %{ and %}.
Some examples are as follows:
%token ID
%{ #include <stdio.h> %}

Rules
Rules are placed between the two %% markers. They define the actions taken when we scan
the tokens, and they execute whenever a token in the input stream matches the
grammar. Any C action is placed between curly brackets ({ }).

Auxiliary routines
The auxiliary routines section includes functions that we may require in the rules section,
written in regular C syntax. This section includes the main() function, in which
the yyparse() function is always called.
The yyparse() function reads the tokens, performs the actions, and returns to main
when it reaches the end of the file or when an error occurs. It returns 0 if the parsing is
successful, and 1 if it is unsuccessful.

Example

The following code is an example of a YACC program of a simple calculator taking two
operands. The .y extension file contains the YACC code for the parser generation, which
uses the .l extension file that includes the lexical analyzer.
file.y
%{
#include <ctype.h>
#include <stdio.h>
int yylex();
void yyerror();
int tmp=0;
%}

%token num
%left '+' '-'
%left '*' '/'

%left '(' ')'

%%

line :exp {printf("=%d\n",$1); return 0;};

exp :exp '+' exp {$$ =$1+$3;}


| exp '-' exp {$$ =$1-$3;}
| exp '*' exp {$$ =$1*$3;}
| exp '/' exp {$$ =$1/$3;}
| '(' exp ')' {$$=$2;}
| num {$$=$1;};

%%

void yyerror(){
printf("The arithmetic expression is incorrect\n");
tmp=1;
}
int main(){
printf("Enter an arithmetic expression(can contain +,-,*,/ or parenthesis):\n");
yyparse();
}
lexfile.l
%option noinput nounput noyywrap
%{
#include <stdlib.h>
#include <stdio.h>
#include "y.tab.h"
extern int yylval;
%}

%%

[ \t] ;
[\n] return 0;

[0-9]+ { yylval = atoi(yytext);


return num;
}
. return yytext[0];
%%
Explanation


In the file.y YACC file:

Lines 1–7: We initialize the header files with the function definitions.
Lines 9–12: We initialize the tokens for the grammar.
Lines 16–23: We define the grammar used to build the calculator.
Lines 27–34: We define an error function with the main function.
In lexfile.l Lex file:

Lines 1–7: The header files are initialized along with the YACC file included as a header
file.
Lines 11–18: The regular definitions of the expected tokens are given; for example, the
calculator input will contain the digits 0-9.

Execution

To execute the code, we need to type the following commands in the terminal below:
We type lex lexfile.l and press "Enter" to compile the Lex file.
We type yacc -d file.y and press "Enter" to compile the YACC file.
We type gcc -Wall -o output lex.yy.c y.tab.c and press "Enter" to compile the generated
C files.
We type ./output to execute the program.
