0% found this document useful (0 votes)
2 views

Chapter 4 Syntax Directed Translation

Syntax-directed translation (SDT) integrates syntax analysis and translation in compiler design by associating semantic actions with grammar productions. It involves components such as syntactic structure, syntax-directed definitions, and various parsing techniques, enabling tasks like code generation, type checking, and error detection. SDT facilitates the transformation of source code into target code while ensuring correctness and optimizing performance.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

Chapter 4 Syntax Directed Translation

Syntax-directed translation (SDT) integrates syntax analysis and translation in compiler design by associating semantic actions with grammar productions. It involves components such as syntactic structure, syntax-directed definitions, and various parsing techniques, enabling tasks like code generation, type checking, and error detection. SDT facilitates the transformation of source code into target code while ensuring correctness and optimizing performance.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 37

Chapter 4

Syntax Directed Translation


Introduction
• Syntax-directed translation is a powerful
technique in compiler design that enables the
integration of syntax analysis and translation
tasks in a cohesive and modular manner.
• It associates semantic actions with the
productions of the grammar
• This enables translation process to occur
simultaneously with syntax analysis
• SDT) is driven by the syntactic structure of the
source language. 2
Components of SDT
• SDT is associated with the following
components:
– Syntactic Structure
– Syntax-Directed Definitions (SDDs)
– Parsing and Translation
– Inherited and Synthesized Attributes
– Attribute Grammars
– Translation Schemes

3
Syntactic Structure
• Syntactic structure refers to the arrangement
of symbols, tokens, and expressions in a
programming language that conform to the
rules of its grammar.
• Syntactic structure determines the
hierarchical organization of elements in a
language
• It dictates how valid sentences or expressions
are
4
Syntactic Structure
• To understand syntactic structure we need to
know the following concepts:
– Grammar
– Tokens and Lexical structure
– Syntax and Parsing
– Hierarchical structure
– Terminal and Non-Terminal Symbols
– Ambiguity and Precedence

5
Grammar
• A programming language's syntactic structure
is defined by its grammar
• Grammar specifies the rules for constructing
valid sentences or expressions in the
language.
• The grammar is typically specified using a
context-free grammar (CFG) notation, such as:
– Backus-Naur Form (BNF) or
– Extended Backus-Naur Form (EBNF).
• They describe the syntax in terms of
production rules. 6
Tokens and Lexical Structure
• Syntactic structure begins with the lexical
analysis phase, which breaks down the source
code into meaningful tokens or lexemes.
• Tokens represent the smallest units of syntax
in the language, such as keywords, identifiers,
literals, operators, and punctuation symbols.
• The lexical structure defines how tokens are
formed and recognized within the source
code.
7
Syntax and Parsing
• Syntax - arrangement of tokens and
expressions that constitute valid statements
or declarations in the language.
• Parsing - process of analyzing the syntactic
structure of the source code according to the
grammar rules of the language.
• During parsing, the syntax analyzer (parser)
checks whether the input conforms to the
grammar and constructs a parse tree or
abstract syntax tree (AST) representing the
8
hierarchical structure of the code.
Hierarchical Structure
• Syntactic structure is hierarchical
• Expressions, statements, and declarations
composed of smaller syntactic units and
subexpressions.
• The hierarchical structure reflects the nesting
and composition of language constructs, such
as function calls within expressions,
statements within blocks, and declarations
within scopes.
9
Terminal and Non-Terminal
Symbols
• In the grammar, symbols are classified as non-
terminal or terminal symbols.
• Non-terminal - represent syntactic categories
or placeholders for other symbols and are
used to define production rules.
• Terminal symbols represent actual tokens or
lexemes in the language and appear in the
input code.

10
Context Free Grammars (CFGs)
• Context-free grammars are commonly used to
describe the syntactic structure of
programming languages.
• CFGs consist of a set of production rules that
define how non-terminal symbols can be
expanded into sequences of terminal and
non-terminal symbols.

11
Parsing and Translation
• Parsing - process of analyzing the syntactic
structure of the source code according to the rules
of its grammar.
• The input to the parser is a sequence of tokens
generated by the lexical analyzer from the source
code.
• The parser constructs a parse tree or abstract syntax
tree (AST) that represents the hierarchical structure
of the code according to the grammar rules.
• The parser verifies whether the source code
conforms to the syntax defined by the grammar.
12
Parsing and Translation
• Parsers can be classified into different types based on
their parsing techniques:
– Recursive Descent Parser: A top-down parser that
recursively descends through the grammar rules, typically
implemented using separate parsing functions for each
non-terminal symbol.
– LL Parser: A predictive parser that uses a lookahead token
to predict the next production to apply, commonly used
for parsing LL(k) grammars.
– LR Parser: A bottom-up parser that uses a shift-reduce
parsing technique to construct a parse tree, often
implemented using a state machine and a parsing table.
– LALR Parser: A variant of LR parser that reduces the size of
the parsing table using lookaheads. 13
Error Handling in Parsing
• Parsers must handle syntax errors gracefully
by providing meaningful error messages and
recovering from the error to continue parsing.
• Error recovery strategies include:
– panic mode recovery,
– error synchronization, and
– context-sensitive error handling.

14
Translation
• Translation is the process of transforming the
parsed source code into a form that can be
executed by the target machine.
• Translation involves various tasks such as
– semantic analysis,
– optimization, and
– code generation.

15
Translation
• Semantic analysis
– checks the correctness and meaning of the code beyond
its syntax, performing tasks such as type checking, symbol
resolution, and scope analysis.
• Optimization techniques
– aim to improve the efficiency and performance of the
generated code by applying transformations such as
constant folding, dead code elimination, and loop
optimization.
• Code generation
– produces target code (e.g., machine code, intermediate
code) that can be executed by the target machine, based
on the results of semantic analysis and optimization. 16
Intermediate Representations (IR)
• During translation, compilers often use intermediate
representations (IR) to represent the program in a
structured and platform-independent form.
• IR facilitates optimization and code generation by
providing a common representation that can be
manipulated and analyzed by compiler passes.
• Examples of intermediate representations include:
– abstract syntax trees (ASTs),
– three-address code (TAC), and
– intermediate language (IL).

17
Ambiguity and Precedence
• Syntactic structure aims to eliminate
ambiguity and define the precedence and
associativity of operators and language
constructs.
• Ambiguity - arises when a single string of
tokens can be parsed in more than one way
according to the grammar rules.

18
Syntax-Directed Definitions (SDDs)
• Syntax-directed definitions associate semantic
rules or actions with the productions of the
grammar.
• Each production in the grammar is associated
with a set of semantic actions that specify the
translation or computation to be performed
when that production is recognized during
parsing.
• These semantic rules guide the translation of
the source language into target language
19
constructs or actions.
Annotated Parse Tree
• A parse tree showing the values of attributes
at each node is called an annotated parse
tree.
• Values of attributes in nodes of annotated
parse-tree are either,
– initialized to constant values or by the lexical
analyzer.
– determined by the semantic-rules.
• The process of computing the attributes
values at the nodes is called annotating (or
decorating) of the parse tree. 20
Annotated Parse Tree -- Example
Input: 5+3*4 L

E.val=17 n

E.val=5 + T.val=12

T.val=5 T.val=3 * F.val=4

F.val=5 F.val=3 digit.lexval=4

digit.lexval=5 digit.lexval=3

21
Inherited and Synthesized Attributes
• Syntax-directed translation relies on the
concept of attributes associated with
grammar symbols.
• Inherited attributes - are passed from parent
nodes to child nodes in the parse tree and are
used to carry information downward in the
tree.
• Synthesized attributes - are computed or
derived from child nodes and propagated
upward in the parse tree to compute
22
information at higher levels of abstraction.
Inherited Attributes
• Inherited attributes are associated with non-
terminal symbols in the grammar
• They are used to pass information from parent
nodes to their child nodes in the parse tree.
• Inherited attributes represent context or
information from higher levels of the parse tree that
is needed to evaluate or compute attributes at
lower levels.
• Inherited attributes are typically passed down the
parse tree during the parsing process and are used
to guide attribute evaluation and computation at
each node. 23
Synthesized Attributes
• Synthesized attributes are associated with non-
terminal symbols in the grammar
• They are used to compute or derive information at
each node in the parse tree.
• Synthesized attributes represent information that is
derived solely from the attributes of the node's
children in the parse tree.
• Synthesized attributes are computed or synthesized
at each node based on the values of the attributes of
its child nodes
• They are used to represent information that is
aggregated or derived from lower levels of the parse
24
tree.
Propagation of Attributes
• During attribute evaluation, inherited attributes are
passed down the parse tree from parent nodes to
child nodes, providing context and information
needed for attribute computation.
• At each node in the parse tree, synthesized
attributes are computed based on the values of the
attributes of the node's children, representing
information derived from the subtree rooted at that
node.
• The propagation of inherited and synthesized
attributes continues recursively until all attributes in
the parse tree have been evaluated. 25
Attribute Grammars
• Attribute grammars provide a formal framework for
specifying syntax-directed translation schemes.
• They extend context-free grammars with attributes
and semantic rules
• They enable the definition of complex translation
tasks in a modular and compositional manner.
• AGs provide a powerful mechanism for expressing
complex semantic properties and performing
analysis and transformations during parsing and
translation

26
Translation Schemes
• Translation schemes are a concrete form of
syntax-directed translation
• They specify the translation process using a
combination of grammar productions and
semantic actions.
• Translation schemes define how the input
program is transformed into output code or
representations, such as intermediate code,
machine code, or abstract syntax trees (ASTs).
27
Example of Translation Scheme
• Consider a simple translation scheme for translating
arithmetic expressions into postfix notation:
E → E1 + T { print('+') }
E→T { print(T) }
T → T1 * F { print('*') }
T→F { print(F) }
F→(E) {}
F → id { print(id) }
• In this translation scheme, each production in the
grammar is associated with a print statement that
specifies how to generate part of the postfix
expression.
28
Application of Translation Schemes
• Translation schemes are used in various
stages of the compilation process, including
syntax analysis, semantic analysis, and code
generation.
• They provide :
– a systematic and structured way to define the
translation process and
– specify how source language constructs are
transformed into target language constructs.
29
Example 1 - Arithmetic Expressions to
Postfix Notation:
• Consider a grammar for arithmetic expressions:
E→E+T|T
T→T*F|F
F → (E) | id
• The associated semantic actions for this grammar could
produce postfix notation:
E → E1 + T { print('+') }
E→T { print(T) }
T → T1 * F { print('*') }
T→F { print(F) }
F→(E) {}
F → id { print(id) }
• When parsing the expression "a * (b + c)", the semantic
actions print "a b c + *" to generate the corresponding postfix
30
notation "a b c + *".
Example 2 – Type Checking
• Semantic actions associated with grammar
productions can perform type checking during
parsing.
• For example, in a C-like language, you could
associate semantic actions with assignments
to ensure type compatibility between the
left-hand side and right-hand side
expressions.

31
Example 3 – Generating
Intermediate Code
• Syntax-directed translation can involve
generating intermediate representations of
the source code for further processing.
• For example, you could generate three-
address code (TAC) during parsing, where
each instruction has at most three operands.

32
Example 4 - Error Detection and
Reporting
• Semantic actions can be used to detect errors
during parsing and report them to the user.
• For example, semantic actions associated
with type mismatches or undeclared
identifiers can generate error messages.

33
Example 5 - Optimization
• Semantic actions can perform optimizations
during parsing to improve the efficiency of
the generated code.
• For example, constant folding or algebraic
simplification could be performed during
syntax-directed translation.

34
Example 6 - Symbol Table Construction
• Semantic actions associated with variable
declarations can construct and populate
symbol tables during parsing.
• Symbol tables can store information about
identifiers such as their types, scopes, and
memory locations.

35
Example 7 - Control Flow Analysis
• Semantic actions can analyze control flow
constructs such as loops and conditionals
during parsing.
• This analysis can be used to perform
optimizations or generate control flow
graphs.

36
Summary
• In summary, syntax-directed translation involves
associating semantic actions with grammar
productions to guide the translation process from
source code to target code.
• These semantic actions can perform a variety of tasks
including:
– generating target code,
– performing type checking,
– detecting errors,
– optimizing code,
– constructing symbol tables, and
– analyzing control flow. 37

You might also like