3 Syntax Analysis

Uploaded by

Salam Abdulla

University of Sulaimani

College of Science
Department of Computer Science

Compiler
Second Phase of the Compiler
Syntax Analyzer

2023-2024
Mzhda Hiwa Hama
Syntax Analyzer

The second phase of the compiler is syntax analysis, or parsing. The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation that depicts the grammatical structure of the token stream.
Role of the Syntax Analyzer

• Constructs a tree (called a parse tree) to discover the structure of a document/program. This parse tree is used to guide translation.

• Syntactic error detection – reports to the user where any syntax errors in the source code are.

• Recognizes sentences in a language.

• Represents the structure of the language.

Introduction to Parsing
• Parsing is the process of determining how a string of terminals can be generated by a grammar.

• A parser must be capable of constructing the parse tree in principle, or else the translation cannot be guaranteed correct.

• A parser scans the input string from left to right and uses the production rules to choose an appropriate derivation.

• Two parsing techniques:

• Top-down parsing
• Bottom-up parsing
Top-down parsing

• A parser is top-down if it generates a parse tree starting from the root and proceeds towards the leaves.
• It is easier to understand and program manually.
• A leftmost derivation is applied at each derivation step.

• Two kinds of top-down parsing techniques will be studied:
1. Recursive-descent parser
2. Predictive parser
Bottom-Up Parsing

• A bottom-up parse corresponds to the construction of a parse tree for an input string beginning at the leaves (the bottom) and working up towards the root (the top).

• Bottom-up parsing is more general than top-down parsing.
1- Top-Down Parsing by Recursive-Descent
• It reads tokens from the input stream and matches them with terminals from the grammar.

The operations involved are:

• Start from the start non-terminal and select a rule from the production rules of the CFG.
• If it turns out not to be the correct rule, backtrack and choose another rule.
• If no production is suitable for matching the string, the parse tree cannot be built and a syntax error is reported.
Example1
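• Consider the grammar
S → xPz
P → yw | y

• The token stream is: xyz

1- Select rule S → xPz

      S
    / | \
   x  P  z
Example1…cont’d
2- Select rule P → yw
• Not correct: x and y match, but w does not match the next token z, so the parser backtracks.

      S
    / | \
   x  P  z
     / \
    y   w

3- Select rule P → y
• Correct: x, y and z all match the input.

      S
    / | \
   x  P  z
      |
      y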
Example2

• Consider the grammar

• E → T + E | T
• T → var | var * T

• Token stream is: var*var

• Start with the top-level non-terminal E

• Try the rules for E in order

Try E → T + E
• Then try a rule for T: T → var
• Token matches var
• But + after var does not match the input token *

      E
    / | \
   T  +  E
   |
  var
Example2…cont’d

• Try T → var * T
• Token matches.
• This will match, but + after T will be unmatched
• The choices for E → T + E are exhausted
• Backtrack to the choice for E

        E
      / | \
     T  +  E
   / | \
 var *  T
Example2…cont’d
• Try E → T
• Follow the same steps as before for T
• Try a rule for T: T → var
• Token matches var
• But the rule has nothing after var, while the input still contains * var

      E
      |
      T
      |
     var
• Try T → var * T
• Then try T → var
• Succeed with the following parse tree

      E
      |
      T
    / | \
  var *  T
         |
        var
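The backtracking search walked through in the two examples above can be sketched in Python as follows. This is only an illustrative recognizer, not part of the original slides: the names `derive`, `derive_sequence` and `accepts`, and the dictionary encoding of the grammars, are assumptions made for the sketch. Rules are tried in the order they are listed, and a failed choice is abandoned simply by moving on to the next alternative.

```python
from typing import Dict, Iterator, List, Sequence

# A grammar maps each non-terminal to its alternatives (lists of symbols).
Grammar = Dict[str, List[List[str]]]


def derive(grammar: Grammar, symbol: str, tokens: Sequence[str], pos: int) -> Iterator[int]:
    """Yield every input position reachable after matching `symbol` at `pos`."""
    if symbol not in grammar:
        # Terminal: it must equal the next input token.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    # Non-terminal: try each production in order; exhausting one
    # alternative and moving to the next is the backtracking step.
    for rhs in grammar[symbol]:
        yield from derive_sequence(grammar, rhs, tokens, pos)


def derive_sequence(grammar: Grammar, rhs: List[str], tokens: Sequence[str], pos: int) -> Iterator[int]:
    """Yield every position reachable after matching all symbols of `rhs`."""
    if not rhs:
        yield pos
        return
    for middle in derive(grammar, rhs[0], tokens, pos):
        yield from derive_sequence(grammar, rhs[1:], tokens, middle)


def accepts(grammar: Grammar, start: str, tokens: Sequence[str]) -> bool:
    """True if some sequence of rule choices consumes the whole token stream."""
    return any(end == len(tokens) for end in derive(grammar, start, tokens, 0))


# Grammars from Example1 and Example2.
EXAMPLE1: Grammar = {"S": [["x", "P", "z"]], "P": [["y", "w"], ["y"]]}
EXAMPLE2: Grammar = {"E": [["T", "+", "E"], ["T"]], "T": [["var"], ["var", "*", "T"]]}

print(accepts(EXAMPLE1, "S", ["x", "y", "z"]))      # True
print(accepts(EXAMPLE2, "E", ["var", "*", "var"]))  # True
print(accepts(EXAMPLE2, "E", ["var", "*"]))         # False
```

Note that this sketch would recurse without end on a left-recursive rule such as S → S a, because `derive` would call itself on S at the same input position.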
Notes
• Easy to implement by hand.
• But it does not always work.
• Consider a production S → S a.
• To expand S, the parser first expands S again, giving S ⇒ S a ⇒ S a a ⇒ ... without ever consuming a token, so it gets into an infinite loop.
• This case is called left recursion.
• Recursive descent does not work in such cases.
Left Recursion
• A production of a grammar is said to have left recursion if the leftmost symbol of its RHS is the same as the variable on its LHS.
• A grammar containing a production with left recursion is called a left-recursive grammar.

Elimination of Left Recursion

If we have the left-recursive pair of productions
A → Aα | β
then we can eliminate the left recursion by replacing the pair of productions with
A → βA’
A’ → αA’ | ε
Example of Elimination of Left Recursion

Eliminate immediate left recursion from the grammar
E → E + T | T

• Here α = + T and β = T, so the grammar becomes
E → T E’
E’ → + T E’ | ε

• String: id + id * id
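The replacement rule A → βA’, A’ → αA’ | ε can be sketched as a small Python function. This is a minimal illustration that handles only immediate left recursion for a single non-terminal. The function name `eliminate_immediate_left_recursion` and the convention of naming the new non-terminal by appending `'` are assumptions of the sketch, and it assumes that name is not already used in the grammar.

```python
from typing import Dict, List

EPSILON = "ε"


def eliminate_immediate_left_recursion(
    nonterminal: str, alternatives: List[List[str]]
) -> Dict[str, List[List[str]]]:
    """Rewrite A → Aα | β as A → βA', A' → αA' | ε."""
    # α parts: the tails of the alternatives that start with A itself.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # β parts: the alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to eliminate
    fresh = nonterminal + "'"  # hypothetical name for the new non-terminal
    return {
        nonterminal: [beta + [fresh] for beta in betas],
        fresh: [alpha + [fresh] for alpha in alphas] + [[EPSILON]],
    }


result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
for head, alts in result.items():
    print(head, "→", " | ".join(" ".join(alt) for alt in alts))
# E → T E'
# E' → + T E' | ε
```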
2- Top-Down Parsing by Predictive parser
• Predicts which production to use by looking at the next few tokens (the “lookahead”)
• No backtracking
• Predictive parsers accept LL(k) grammars
• The first L means “Left-to-right” scan of the input
• The second L means “Leftmost derivation”
• k means “predict based on k tokens of lookahead”
• In practice, LL(1) is used
• An LL(k) grammar must be unambiguous
• An LL(k) grammar must not include any left recursion
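As a rough illustration of predicting a production from one token of lookahead, here is a hand-written predictive parser in Python with one method per non-terminal. It is a sketch under assumptions: the grammar is E → T E’, E’ → + T E’ | ε with the simplification T → id, which is chosen only to keep the example short, and the class and method names are hypothetical. No alternative is ever retried, so there is no backtracking.

```python
from typing import Sequence


class PredictiveParser:
    """One method per non-terminal; each choice depends only on the lookahead."""

    def __init__(self, tokens: Sequence[str]) -> None:
        self.tokens = list(tokens) + ["$"]  # append the end marker
        self.pos = 0

    @property
    def lookahead(self) -> str:
        return self.tokens[self.pos]

    def match(self, terminal: str) -> None:
        """Consume the lookahead if it is the expected terminal."""
        if self.lookahead != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_E(self) -> None:  # E → T E'
        self.parse_T()
        self.parse_E_prime()

    def parse_E_prime(self) -> None:  # E' → + T E' | ε
        if self.lookahead == "+":  # predict E' → + T E'
            self.match("+")
            self.parse_T()
            self.parse_E_prime()
        # any other lookahead: predict E' → ε and consume nothing

    def parse_T(self) -> None:  # T → id (simplifying assumption)
        self.match("id")

    def parse(self) -> bool:
        self.parse_E()
        self.match("$")  # the whole input must be consumed
        return True


print(PredictiveParser(["id", "+", "id"]).parse())  # True
```

An input such as `id +` raises `SyntaxError` at the point where `id` is expected but the end marker is found.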
LL(1) Parser
• Input buffer: the string to be parsed, followed by the end marker $.
• Output: a production rule representing a step of the leftmost derivation of the string in the input buffer.
• Stack: holds grammar symbols. Initially it contains $ with the start symbol on top of it.
• The symbols on the RHS of a rule are pushed onto the stack in reverse order, i.e. from right to left.
• Parsing table:
• a two-dimensional array
• each row is a non-terminal symbol
• each column is a terminal symbol or the special symbol $
• each entry holds a production rule.

LL(1) Parser Model

[Figure: the LL(1) parser reads tokens from the input buffer (a + b $), uses a stack of grammar symbols (A, B, C) with $ at the bottom, consults the parsing table, and produces the output.]
Building Predictive Parser
Three steps
1. Compute FIRST and FOLLOW
2. Construct the predictive parsing table
3. Parse the input string

• Note: the grammar must be unambiguous and left recursion must be eliminated.

• FIRST and FOLLOW are used to construct the predictive parsing table.
Computing First and Follow

• FIRST(α) is the set of terminal symbols that occur as first symbols in strings derived from α, where α is any string of grammar symbols (terminals and non-terminals).
• If α derives ε, then ε is also in FIRST(α).

• FOLLOW(A) is the set of terminals that occur immediately after (follow) the non-terminal A in the strings derived from the start symbol.
• $ is in FOLLOW(A) if S ⇒* αA
• a terminal a is in FOLLOW(A) if S ⇒* αAaβ
Computing FIRST
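• FIRST(a) = {a} if a ∈ T (a is a terminal)
• FIRST(ε) = {ε}
• FIRST(X) for a non-terminal X:
If there is a production X → Y1 Y2 ... Yk then
• Add FIRST(Y1) - {ε} to FIRST(X).
• If ε ∈ FIRST(Y1), also add FIRST(Y2) - {ε}.
• If ε ∈ FIRST(Y2) as well, continue with Y3, and so on.
• If ε ∈ FIRST(Yi) for every i = 1..k, add ε to FIRST(X).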
Example 1
• E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → id | ( E )

• FIRST(E) = FIRST(T) = FIRST(F) = { id, ( }

• FIRST(E’) = { +, ε }
• FIRST(T) = FIRST(F) = { id, ( }
• FIRST(T’) = { *, ε }
• FIRST(F) = { id, ( }
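The FIRST rules can be sketched in Python as a loop that keeps applying them until no set grows any further. This is a minimal illustration for the grammar of Example 1. The names `compute_first` and `first_of_string` and the dictionary encoding of the grammar are assumptions of the sketch, with `E'` and `T'` standing for E’ and T’.

```python
from typing import Dict, List, Sequence, Set

EPSILON = "ε"
Grammar = Dict[str, List[List[str]]]

# Grammar of Example 1.
GRAMMAR: Grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F": [["id"], ["(", "E", ")"]],
}


def first_of_string(symbols: Sequence[str], first: Dict[str, Set[str]], grammar: Grammar) -> Set[str]:
    """FIRST of a string Y1 Y2 ... Yk, using the FIRST sets known so far."""
    result: Set[str] = set()
    for symbol in symbols:
        if symbol == EPSILON:
            continue
        # A terminal's FIRST set is the terminal itself.
        symbol_first = first[symbol] if symbol in grammar else {symbol}
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result  # Yi cannot vanish, so later symbols do not matter
    result.add(EPSILON)  # every symbol can derive ε
    return result


def compute_first(grammar: Grammar) -> Dict[str, Set[str]]:
    """Apply the FIRST rules repeatedly until no set changes."""
    first: Dict[str, Set[str]] = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for nonterminal in GRAMMAR:
    print(f"FIRST({nonterminal}) =", sorted(FIRST[nonterminal]))
```

The resulting sets agree with the ones listed above, for example FIRST(E) = { id, ( } and FIRST(E’) = { +, ε }.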
Example 2
• type → simple
| ^ id
| array [ simple ] of type
• simple → integer
| char
| num dot num

• FIRST(simple) = { integer, char, num }

• FIRST(type) = { integer, char, num, ^, array }
Exercise 3

Find FIRST for the following grammars

• S → ACB | CbB | Ba
A → da | BC
B → g | ε
C → h | ε

• S → Aa
A → bdZ | eZ
Z → cZ | adZ | ε
Computing FOLLOW

• If S is the start symbol ⇒ $ is in FOLLOW(S)

• If A → αBβ is a production rule
⇒ everything in FIRST(β) except ε is in FOLLOW(B)
• If (A → αB is a production rule) or
(A → αBβ is a production rule and ε is in FIRST(β))
⇒ everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any FOLLOW set.
Example 1
E → T E’
E’→ + T E’ | ε
T → F T’
T’→ * F T’ | ε
F → ( E ) | id

• FIRST(E) = FIRST(T) = FIRST(F) = {( , id}

FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}

• FOLLOW(E) = {) , $}
FOLLOW(E’) = {) , $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {*, +, ), $}
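The FOLLOW rules can be sketched in Python in the same fixed-point style: keep applying the rules until nothing more can be added. To keep the sketch short, the FIRST sets are copied from the example above rather than recomputed. The names `compute_follow` and `first_of_string` are assumptions of the sketch, with `E'` and `T'` standing for E’ and T’.

```python
from typing import Dict, List, Sequence, Set

EPSILON = "ε"
END = "$"

GRAMMAR: Dict[str, List[List[str]]] = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F": [["(", "E", ")"], ["id"]],
}

# FIRST sets taken from the example above.
FIRST: Dict[str, Set[str]] = {
    "E": {"(", "id"}, "E'": {"+", EPSILON},
    "T": {"(", "id"}, "T'": {"*", EPSILON},
    "F": {"(", "id"},
}


def first_of_string(symbols: Sequence[str]) -> Set[str]:
    """FIRST(β) for a string β; the empty string gives {ε}."""
    result: Set[str] = set()
    for symbol in symbols:
        if symbol == EPSILON:
            continue
        symbol_first = FIRST.get(symbol, {symbol})  # terminals map to themselves
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)
    return result


def compute_follow(grammar: Dict[str, List[List[str]]], start: str) -> Dict[str, Set[str]]:
    follow: Dict[str, Set[str]] = {nonterminal: set() for nonterminal in grammar}
    follow[start].add(END)  # rule 1: $ is in FOLLOW(start symbol)
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                for index, symbol in enumerate(rhs):
                    if symbol not in grammar:
                        continue  # FOLLOW is defined only for non-terminals
                    rest = first_of_string(rhs[index + 1:])
                    new = rest - {EPSILON}   # rule 2: FIRST(β) except ε
                    if EPSILON in rest:      # rule 3: β is empty or nullable
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for nonterminal in GRAMMAR:
    print(f"FOLLOW({nonterminal}) =", sorted(FOLLOW[nonterminal]))
```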
Example 2

E → T E’
E’ → + T E’ | - T E’ | ε
T → F T’
T’ → * F T’ | / F T’ | ε
F → num | id

      First          Follow
E     {num, id}      {$}
E’    {+, -, ε}      {$}
T     {num, id}      {+, -, $}
T’    {*, /, ε}      {+, -, $}
F     {num, id}      {*, /, +, -, $}
Notes

• To compute FIRST(A) you must look for A on a production's left-hand side.
• To compute FOLLOW(A) you must look for A on a production's right-hand side.
• FIRST sets are always sets of terminals (plus, perhaps, ε).
• FOLLOW sets are always sets of terminals (plus, perhaps, $).
• Non-terminals are never in a FIRST or a FOLLOW set.
• ε is never in a FOLLOW set.
Constructing the Parse Table

• The parse table summarizes the applicable RHS for each terminal/non-terminal combination.
• Construct a parsing table T for the CFG:
• For each production X → α
• Add X → α to the X row for each terminal in FIRST(α)
• If α is nullable (ε is in FIRST(α)), add X → α to the X row for each symbol in FOLLOW(X)
• The parser accepts when the stack top and the current input symbol are both $
• All undefined entries of the parsing table are error entries.
Parse table Example (1)

E → T E’
E’→ + T E’ | ε
T → F T’
T’→ * F T’ | ε
F → ( E ) | id

      First        Follow
E     {(, id}      {), $}
E’    {+, ε}       {), $}
T     {(, id}      {+, ), $}
T’    {*, ε}       {+, ), $}
F     {(, id}      {*, +, ), $}
Parse table Example 1

• Create a table with:

• one column for each terminal (and $)
• one row for each non-terminal

     | +           | *           | (         | )      | id       | $
E    |             |             |           |        |          |
E’   |             |             |           |        |          |
T    |             |             |           |        |          |
T’   |             |             |           |        |          |
F    |             |             |           |        |          |
Parse table Example 1

• For production E → T E’
• Add E → T E’ to the E row for each symbol in FIRST(E) = {(, id}

     | +           | *           | (         | )      | id       | $
E    |             |             | E → T E’  |        | E → T E’ |
E’   |             |             |           |        |          |
T    |             |             |           |        |          |
T’   |             |             |           |        |          |
F    |             |             |           |        |          |
Parse table Example 1

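• For production E’ → + T E’ | ε

• E’ → + T E’: add E’ → + T E’ to the E’ row for each symbol in FIRST(+ T E’) = {+}
• E’ → ε: add E’ → ε to the E’ row for each symbol in FOLLOW(E’) = {), $}

     | +           | *           | (         | )      | id       | $
E    |             |             | E → T E’  |        | E → T E’ |
E’   | E’ → + T E’ |             |           | E’ → ε |          | E’ → ε
T    |             |             |           |        |          |
T’   |             |             |           |        |          |
F    |             |             |           |        |          |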
Parse table Example 1

• For production T → F T’
• Add T → F T’ to the T row for each symbol in FIRST(T) = {(, id}

     | +           | *           | (         | )      | id       | $
E    |             |             | E → T E’  |        | E → T E’ |
E’   | E’ → + T E’ |             |           | E’ → ε |          | E’ → ε
T    |             |             | T → F T’  |        | T → F T’ |
T’   |             |             |           |        |          |
F    |             |             |           |        |          |
Parse table Example 1
• For production T’ → * F T’ | ε
• Add T’ → * F T’ to the T’ row for each symbol in FIRST(* F T’) = {*}
• Add T’ → ε to the T’ row for each symbol in FOLLOW(T’) = {+, ), $}

     | +           | *           | (         | )      | id       | $
E    |             |             | E → T E’  |        | E → T E’ |
E’   | E’ → + T E’ |             |           | E’ → ε |          | E’ → ε
T    |             |             | T → F T’  |        | T → F T’ |
T’   | T’ → ε      | T’ → * F T’ |           | T’ → ε |          | T’ → ε
F    |             |             |           |        |          |
Parse table Example 1

• For production F → ( E ) | id
• Add F → ( E ) to the F row under symbol (
• Add F → id to the F row under symbol id

     | +           | *           | (         | )      | id       | $
E    |             |             | E → T E’  |        | E → T E’ |
E’   | E’ → + T E’ |             |           | E’ → ε |          | E’ → ε
T    |             |             | T → F T’  |        | T → F T’ |
T’   | T’ → ε      | T’ → * F T’ |           | T’ → ε |          | T’ → ε
F    |             |             | F → ( E ) |        | F → id   |
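The table-filling steps above can be sketched in Python as follows. This is only an illustration: the FIRST and FOLLOW sets are copied from the example rather than recomputed, the table is stored as a dictionary keyed by (non-terminal, terminal) pairs, and the names `build_table` and `first_of_string` are assumptions of the sketch. A cell that would receive two productions raises an error, which signals that the grammar is not LL(1).

```python
from typing import Dict, List, Sequence, Set, Tuple

EPSILON = "ε"

GRAMMAR: Dict[str, List[List[str]]] = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F": [["(", "E", ")"], ["id"]],
}

# FIRST and FOLLOW sets taken from the example.
FIRST: Dict[str, Set[str]] = {
    "E": {"(", "id"}, "E'": {"+", EPSILON},
    "T": {"(", "id"}, "T'": {"*", EPSILON},
    "F": {"(", "id"},
}
FOLLOW: Dict[str, Set[str]] = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols: Sequence[str]) -> Set[str]:
    """FIRST(α) for the right-hand side α of a production."""
    result: Set[str] = set()
    for symbol in symbols:
        if symbol == EPSILON:
            continue
        symbol_first = FIRST.get(symbol, {symbol})  # terminals map to themselves
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)
    return result


def build_table(grammar: Dict[str, List[List[str]]]) -> Dict[Tuple[str, str], List[str]]:
    table: Dict[Tuple[str, str], List[str]] = {}
    for head, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs)
            lookaheads = rhs_first - {EPSILON}   # each terminal in FIRST(α)
            if EPSILON in rhs_first:             # α is nullable: use FOLLOW(X)
                lookaheads |= FOLLOW[head]
            for terminal in lookaheads:
                if (head, terminal) in table:
                    raise ValueError(f"conflict in cell [{head}, {terminal}]")
                table[(head, terminal)] = rhs
    return table


TABLE = build_table(GRAMMAR)
for (row, column), rhs in sorted(TABLE.items()):
    print(f"M[{row}, {column}] = {row} → {' '.join(rhs)}")
```

Every pair that is missing from the dictionary corresponds to an empty (error) cell of the table.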
Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.

• There are four possible parser actions.

1. If X and a are both $ → the parser halts (successful completion).

2. If X and a are the same terminal symbol (different from $) → the parser pops X from the stack and moves to the next symbol in the input buffer.

3. If X is a non-terminal → the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1 Y2 ... Yk, it pops X from the stack and pushes Yk, Yk-1, ..., Y1 onto the stack. The parser also outputs the production rule X → Y1 Y2 ... Yk to represent a step of the derivation.

4. None of the above → error.

• All empty entries in the parsing table are errors.
• If X is a terminal symbol different from a, this is also an error case.
LL(1) Parser Example

S → aBa
B → bB | ε

stack     input     output

$S        abba$     S → aBa
$aBa      abba$
$aB       bba$      B → bB
$aBb      bba$
$aB       ba$       B → bB
$aBb      ba$
$aB       a$        B → ε
$a        a$
$         $         accept, successful completion
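The four parser actions and the trace above can be sketched as a table-driven loop in Python. This is a minimal illustration: the parsing table for S → aBa, B → bB | ε is written by hand (with B → ε under column a, since FOLLOW(B) = {a}), an ε production is stored as an empty list, and the names `ll1_parse`, `TABLE` and `NONTERMINALS` are assumptions of the sketch.

```python
from typing import Dict, List, Sequence, Set, Tuple

END = "$"

# Hand-written parsing table for S → aBa, B → bB | ε.
NONTERMINALS: Set[str] = {"S", "B"}
TABLE: Dict[Tuple[str, str], List[str]] = {
    ("S", "a"): ["a", "B", "a"],
    ("B", "b"): ["b", "B"],
    ("B", "a"): [],  # B → ε
}


def ll1_parse(tokens: Sequence[str], start: str = "S") -> List[Tuple[str, List[str]]]:
    """Return the productions used, in order, or raise SyntaxError."""
    stack = [END, start]          # $ at the bottom, start symbol on top
    buffer = list(tokens) + [END]
    pos = 0
    output: List[Tuple[str, List[str]]] = []
    while True:
        top, current = stack[-1], buffer[pos]
        if top == END and current == END:
            return output                        # action 1: accept
        if top not in NONTERMINALS:
            if top != current:
                raise SyntaxError(f"expected {top!r}, found {current!r}")
            stack.pop()                          # action 2: match a terminal
            pos += 1
        else:
            rhs = TABLE.get((top, current))
            if rhs is None:                      # action 4: empty table entry
                raise SyntaxError(f"no rule for [{top}, {current}]")
            stack.pop()                          # action 3: expand X
            stack.extend(reversed(rhs))          # push Yk ... Y1, so Y1 is on top
            output.append((top, rhs))


for head, rhs in ll1_parse("abba"):
    print(head, "→", " ".join(rhs) or "ε")
# S → a B a
# B → b B
# B → b B
# B → ε
```

The printed productions are the same ones that appear in the output column of the trace.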
LL(1) Parser – Example2
E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id
LL(1) Parser – Example2…cont’d

Input is id+id
stack       input     output
$E          id+id$    E → T E’
$E’T        id+id$    T → F T’
$E’T’F      id+id$    F → id
$E’T’id     id+id$
$E’T’       +id$      T’ → ε
$E’         +id$      E’ → + T E’
$E’T+       +id$
$E’T        id$       T → F T’
$E’T’F      id$       F → id
$E’T’id     id$
$E’T’       $         T’ → ε
$E’         $         E’ → ε
$           $         accept
