Case Insensitive Regex For SQL
In a case-insensitive language like SQL, a regular expression (regex) to match a keyword (like `SELECT`)
should allow all variations of letter casing (e.g., `select`, `SELECT`, `Select`, `SeLeCT`, etc.).
Regex:

```regex
/select/i
```
Explanation: The `i` flag makes the match case-insensitive.
Matches:
`SELECT`
`select`
`Select`
`SeLeCT`
etc.
Regex:

```regex
[Ss][Ee][Ll][Ee][Cc][Tt]
```

Explanation: `[Ss]` matches `S` or `s`, `[Ee]` matches `E` or `e`, and so on for each letter.
Matches: every casing of `select` (e.g., `SELECT`, `select`, `SeLeCt`).
✅ SQL Example:
If you’re scanning SQL code for the `SELECT` keyword, either approach will help identify the keyword regardless of case. For instance, in Python (the query string is illustrative):

```python
import re

sql = "Select id FROM users"
print(bool(re.search(r"\bselect\b", sql, re.IGNORECASE)))  # True
```

Would you like help building a regex for multiple SQL keywords (like `SELECT`, `FROM`, `WHERE`, etc.)?
Sure! Here's the regular expression for the SQL keyword `SELECT` in a case-insensitive language:
/select/i
Printed using ChatGPT to PDF, powered by PDFCrowd HTML to PDF API. 2/76
Use this if your regex engine supports the `i` (ignore case) flag. It matches `select`, `SELECT`,
`Select`, `SeLeCT`, etc.
[Ss][Ee][Ll][Ee][Cc][Tt]
Use this if your regex engine does not support flags. It manually matches each letter in either
case.
Let me know if you want to match other SQL keywords too, like `FROM`, `WHERE`, etc., or build a pattern
to match full SQL statements.
To write a regular expression for a keyword in a case-insensitive language, you can use either of
two main methods depending on your regex engine:
This is the simplest and cleanest way to match a keyword like `"select"` in a case-insensitive language
(e.g., SQL):
Regular Expression:

```regex
/select/i
```
Explanation:
`i` is the case-insensitive flag, making the regex match any case variation:
`SELECT`
`select`
`SeLeCt`
`sElEcT`, etc.
Supported in:
JavaScript
Python (`re.IGNORECASE`)
Java (`Pattern.CASE_INSENSITIVE`)
Most modern regex engines
Regular Expression:

```regex
[Ss][Ee][Ll][Ee][Cc][Tt]
```

Explanation: Each character class (e.g., `[Ss]`) matches one letter in either case, so no flag is needed.
Also Matches: every casing of `select`.
✅ Summary Table:

| Method | Regex | Use When |
|---|---|---|
| Case-insensitive flag | `/select/i` | Your regex engine supports the `i` flag |
| Manual case matching | `[Ss][Ee][Ll][Ee][Cc][Tt]` | Flags not supported (e.g., some tools) |
Would you like a version that matches multiple SQL keywords case-insensitively (like
`SELECT|FROM|WHERE`)?
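Here is one possible sketch in Python. The keyword list and query string below are illustrative; extend `KEYWORDS` to cover whatever subset of SQL you need.

```python
import re

# Illustrative keyword list; extend as needed.
KEYWORDS = ["select", "from", "where"]
# \b word boundaries keep us from matching keywords inside identifiers.
pattern = re.compile(r"\b(?:" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)

sql = "Select name FROM users WHERE age > 30"
matches = [m.group(0) for m in pattern.finditer(sql)]
print(matches)  # ['Select', 'FROM', 'WHERE']
```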
To show that a given grammar is LALR(1), you need to follow a formal procedure involving parsing
table construction and state merging. Here's a step-by-step explanation:
Build LR(0) items and construct the canonical collection of LR(0) states using the closure and goto
operations.
```
A → α·β
```
This helps verify if the grammar is already SLR(1). If it is, then it is definitely LALR(1) (since SLR(1) ⊆
LALR(1)).
Add lookahead symbols to each LR(0) item, making them LR(1) items:
```
A → α·β , a
```
Identify states in the LR(1) collection that have the same LR(0) core (i.e., the same productions
without the lookahead).
Merge these states only if their lookaheads don’t cause conflicts.
If the resulting LALR(1) parsing table has no shift/reduce or reduce/reduce conflicts, the grammar is
LALR(1).
✅ Summary Criteria:
Grammar is LALR(1) ⇔ Its LALR(1) parsing table has no parsing conflicts.
Construct LR(1) items → Merge compatible states → Check for table conflicts.
Parser tools that use LALR(1) include:
PLY (Python)
Lark (Python, with LALR support)
To check for conflicts in the LALR(1) parsing table, you follow a structured approach. This step is
crucial to determine whether a grammar is truly LALR(1).
✅ What Is a Conflict?
In a parsing table, a conflict happens when a parser doesn't know what action to take for a given state
and lookahead symbol. There are two main types:
1. Shift/Reduce Conflict:
The parser can either shift (read more input) or reduce (apply a grammar rule) — and it’s unclear
which to choose.
2. Reduce/Reduce Conflict:
The parser could reduce using two or more different rules — again, unclear which one to use.
2. Create the LALR(1) Parsing Table
Scan each (state, symbol) cell for multiple actions. Two entries in the same cell signal a conflict, for example a shift/reduce conflict:

| State | Symbol | Action |
|---|---|---|
| 5 | `)` | shift to 9 |
| 5 | `)` | reduce by A → α |
Use precedence and associativity rules (some parser tools allow this).
Consider rewriting grammar for full LR(1) if LALR(1) isn't enough.
Would you like a small grammar example showing how to find a conflict in an actual LALR(1) table?
What do you mean by attributed grammars? Discuss the translation scheme for
Converting an infix expression to its equivalent postfix form.
🔧 Key Concepts:
Attributes: Values associated with terminals and non-terminals (e.g., types, values, or
intermediate code).
Purpose: Attributed grammars enable translation, type checking, code generation, etc., in
compilers.
```
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id
```
The attribute of `id` (its lexeme) represents the actual variable name or token string.
1. Infix: `a + b * c`
2. Parse Tree:
E → E + T
E → T
T → F
F → a
T → T * F
T → F
F → b
F → c
`F → b` → `b`
`F → c` → `c`
`T → F` → `b`
`T → T * F` → `b c *`
`E → T` → `a`
`E → E + T` → `a b c * +`
✅ Final Postfix: `a b c * +`
✅ Summary:

| Concept | Description |
|---|---|
| Attributed grammar | A CFG augmented with attributes and semantic rules |
| `postfix` attribute | Synthesized bottom-up by concatenating the children's postfix strings |
Would you like a full implementation of this in Python or a parser generator like ANTLR or Lark?
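As a sketch, the translation scheme above can be run with a small recursive-descent translator (with left recursion removed); each parsing function returns the synthesized `postfix` attribute. Identifiers are single letters — an assumption made here for brevity.

```python
# Recursive-descent translator for E -> E + T, T -> T * F, F -> (E) | id,
# rewritten without left recursion; each function returns the postfix string.
def to_postfix(expr):
    tokens = list(expr.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def F():
        if peek() == "(":
            eat()              # '('
            code = E()
            eat()              # ')'
            return code
        return eat()           # single-letter id

    def T():
        code = F()
        while peek() == "*":
            eat()
            code = code + " " + F() + " *"   # T.postfix = T1.postfix F.postfix '*'
        return code

    def E():
        code = T()
        while peek() == "+":
            eat()
            code = code + " " + T() + " +"   # E.postfix = E1.postfix T.postfix '+'
        return code

    return E()

print(to_postfix("a + b * c"))  # a b c * +
```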
how t1 value is taken here
```
104 : T1 = i * 4
```
This assumes:
The array `A` is an array of elements of size 4 bytes (commonly the size of an integer).
To access `A[i]` in memory, we calculate the byte offset:
```
offset = i * size_of_element = i * 4
```
`T1` stores the memory offset of the `i-th` element in array `A`.
This line retrieves the value from array `A` at the offset `T1`.
🧠 Summary:

| TAC Line | Meaning | Explanation |
|---|---|---|
| `T1 = i * 4` | Byte offset of `A[i]` | Each element of `A` occupies 4 bytes |
| `A[T1]` access | Load `A[i]` | Retrieves the value of `A` at offset `T1` |
This is a typical low-level representation of array access in intermediate code where arrays are stored
in contiguous memory.
Would you like to see this example simulated in memory for better understanding?
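A tiny simulation in Python; the base address 100 and the stored values are made-up numbers purely for illustration.

```python
# Simulating the offset computation for A[i] with 4-byte elements.
base_of_A = 100                       # assumed base address of array A
memory = {100: 7, 104: 42, 108: 9}    # A[0], A[1], A[2] at byte addresses

i = 1
T1 = i * 4                            # TAC: T1 = i * 4 (byte offset)
value = memory[base_of_A + T1]        # TAC: T2 = A[T1] (load from memory)
print(T1, value)  # 4 42
```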
To show that a grammar is LR(1) but not LALR(1), you need to find a grammar where:
1. ✅ The LR(1) parsing table has no conflicts (so the grammar is LR(1)), but
2. ❌ Merging LR(1) states with identical LR(0) cores introduces conflicts, making the LALR(1)
table invalid.
✅ Step-by-Step Method:
1. Construct the Canonical LR(1) Item Sets
Now merge LR(1) items that have the same LR(0) items but different lookaheads.
If merging creates conflicting actions (like reduce/reduce) → The grammar is not LALR(1).
```
S → Aa | bAc | Bc | bBa
A → d
B → d
```

📌 Step 2: LALR(1) Table Has Conflict
The canonical LR(1) collection contains two states with the same core:

```
State I: [A → d •, a], [B → d •, c]
State J: [A → d •, c], [B → d •, a]
```

Merging them for LALR(1) combines the lookaheads:

```
[A → d •, a/c]
[B → d •, a/c]
```

Then, in the merged state, a reduce/reduce conflict occurs for lookaheads `a` and `c`: the parser cannot tell whether to reduce by `A → d` or `B → d`.
Hence, this grammar is LR(1) but not LALR(1).
✅ Conclusion:
To prove a grammar is LR(1) but not LALR(1), do this:
Would you like a detailed construction of the LR(1) and LALR(1) tables for the example above?
Explain with an example “why left recursion is an issue in top down parsing” and
write steps for left recursion elimination
A top-down parser repeatedly expands the leftmost nonterminal; left-recursive grammars can cause infinite recursion during this process.
```
A → A α | β
```
Indirect Left Recursion:
```
A → B α
B → A β
```
```
E → E + T | T
T → id
```
```
A → A α1 | A α2 | ... | β1 | β2 | ...
```
```
E → E + T | T
```

Transformed:

```
E → T E'
E' → + T E' | ε
```
This ensures all indirect left recursion is handled before removing direct left recursion.
✅ Summary

| Topic | Explanation |
|---|---|
| Why left recursion is a problem | A top-down parser expanding `A → Aα` recurses forever without consuming input |
| Elimination | Rewrite `A → Aα \| β` as `A → βA'`, `A' → αA' \| ε` |
Would you like help transforming a specific grammar to eliminate left recursion?
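As a sketch, the direct-left-recursion rewrite above (A → βA', A' → αA' | ε) can be automated. This handles only the direct case, not indirect recursion; productions are lists of symbol strings.

```python
def eliminate_direct_left_recursion(nonterminal, productions):
    """Split productions of `nonterminal` into left-recursive (A -> A alpha)
    and non-recursive (A -> beta), then rewrite as:
        A  -> beta A'
        A' -> alpha A' | epsilon
    """
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:                      # nothing to do
        return {nonterminal: productions}
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [["ε"]],
    }

# E -> E + T | T  becomes  E -> T E',  E' -> + T E' | ε
result = eliminate_direct_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)
```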
A syntax-directed definition is a context-free grammar together with attributes and rules. Attributes are associated with grammar symbols and rules are associated
with productions. If X is a symbol and a is one of its attributes, then we write X.a to
denote the value of a at a particular parse-tree node labeled X. For example, an infix-to
postfix translator might have a production and rule
PRODUCTION: E → E1 + T
SEMANTIC RULE: E.code = E1.code || T.code || ‘+’
This production has two non terminals, E and T; the subscript in E1 distinguishes the
occurrence of E in the production body from the occurrence of E as the head. Both E and
T have a string-valued attribute code. The semantic rule specifies that the string E.code is formed by concatenating E1.code, T.code, and the character '+'.
1.1 Inherited and Synthesized Attributes
Two kinds of attributes for nonterminals:
1. A synthesized attribute for a nonterminal A at a parse-tree node N is defined by a
semantic rule associated with the production at N. Note that the production must
have A as its head. A synthesized attribute at node N is defined only in terms of
attribute values at the children of N and at N itself.
2. An inherited attribute for a nonterminal B at a parse-tree node N is defined by a
semantic rule associated with the production at the parent of N. Note that the
production must have B as a symbol in its body. An inherited attribute at node N
is defined only in terms of attribute values at N's parent, N itself, and N's
siblings.
Example 3.3.1: The SDD in Fig. 3.3.1 is based on our familiar grammar for arithmetic
expressions with operators + and *. It evaluates expressions terminated by an end marker
n. In the SDD, each of the nonterminals has a single synthesized attribute, called val. We
also suppose that the terminal digit has a synthesized attribute lexval, which is an integer
value returned by the lexical analyzer.
Figure 3.3.1 : Syntax-directed definition of a simple desk calculator
The rule for production 1, L → E n, sets L.val to E.val, which we shall see is the numerical value of the entire expression.
Production 2, E → E1 + T, also has one rule, which computes the val attribute for the head E as the sum of the values at E1 and T. At any parse-tree node N labeled E, the value of val for E is the sum of the values of val at the children of node N labeled E and T.
III [Link] I SEM I.T, KHIT
Page 1
UNIT-3- PART-1
COMPILER DESIGN
Production 3, E → T, has a single rule that defines the value of val for E to be the same as the value of val at the child for T.
Production 4 is similar to the second production; its rule multiplies the values at
the children instead of adding them.
The rules for productions 5 and 6 copy values at a child, like that for the third production. Production 7 gives F.val the value of a digit, that is, the numerical value of the token digit that the lexical analyzer returned.
An SDD that involves only synthesized attributes is called S-attributed; the SDD in Fig.
3.3.1 has this property. In an S-attributed SDD, each rule computes an attribute for the
nonterminal at the head of a production from attributes taken from the body of the
production.
An SDD without side effects is sometimes called an attribute grammar. The rules in an
attribute grammar define the value of an attribute purely in terms of the values of other
attributes and constants.
Evaluation Orders for SDD's
1. Dependency Graphs
A dependency graph depicts the flow of information among the attribute instances in a
particular parse tree; an edge from one attribute instance to another means that the
value of the first is needed to compute the second. Edges express constraints implied by
the semantic rules.
For each parse-tree node, say a node labeled by grammar symbol X, the
dependency graph has a node for each attribute associated with X.
Suppose that a semantic rule associated with a production p defines the value of
synthesized attribute A.b in terms of the value of X.c. Then, the dependency graph
has an edge from X.c to A.b. More precisely, at every node N labeled A where
production p is applied; create an edge to attribute b at N, from the attribute c at
the child of N corresponding to this instance of the symbol X in the body of the
production.
Suppose that a semantic rule associated with a production p defines the value of
inherited attribute B.c in terms of the value of X.a. Then, the dependency graph
has an edge from X.a to B.c. For each node N labeled B that corresponds to an
occurrence of this B in the body of production p, create an edge to attribute c at N
from the attribute a at the node M that corresponds to this occurrence of X. Note
that M could be either the parent or a sibling of N.
Example 3.4.1: Consider the following production and rule:
PRODUCTION: E → E1 + T
SEMANTIC RULE: E.val = E1.val + T.val
At every node N labeled E, with children corresponding to the body of this production,
the synthesized attribute val at N is computed using the values of val at the two children,
labeled E and T. Thus, a portion of the dependency graph for every parse tree in which
this production is used looks like Fig. 3.4.1. As a convention, we shall show the parse tree
edges as dotted lines, while the edges of the dependency graph are solid.
Figure 3.4.1: E.val is synthesized from E1.val and T.val
Example 3.4.2: An example of a complete dependency graph appears in Fig. 3.4.2. The
nodes of the dependency graph, represented by the numbers 1 through 9, correspond to
the attributes in the annotated parse tree in Fig. 3.3.5
Figure 3.4.2: Dependency graph for the annotated parse tree of Fig. 3.3.5
Nodes 1 and 2 represent the attribute lexval associated with the two leaves
labeled digit.
Nodes 3 and 4 represent the attribute val associated with the two nodes labeled F.
The edges to node 3 from 1 and to node 4 from 2 result from the semantic rule that defines F.val in terms of digit.lexval. In fact, F.val equals digit.lexval, but the edge represents dependence, not equality.
Nodes 5 and 6 represent the inherited attribute T'.inh associated with each of the occurrences of nonterminal T'. The edge to 5 from 3 is due to the rule T'.inh = F.val, which defines T'.inh at the right child of the root from F.val at the left child.
We see edges to 6 from node 5 for T'.inh and from node 4 for F.val, because these values are multiplied to evaluate the attribute inh at node 6.
Nodes 7 and 8 represent the synthesized attribute syn associated with the occurrences of T'. The edge to node 7 from 6 is due to the semantic rule T'.syn =
T'.inh associated with production 3 in Fig. 3.3.4. The edge to node 8 from 7 is due
to a semantic rule associated with production 2.
Finally, node 9 represents the attribute T.val. The edge to 9 from 8 is due to the semantic rule T.val = T'.syn, associated with production 1.
The dependency graph of Fig. 3.4.2 has no cycles. One topological sort is the order in
which the nodes have already been numbered: 1,2,... ,9. Notice that every edge of the
graph goes from a node to a higher-numbered node, so this order is surely a topological
sort. There are other topological sorts as well, such as 1,3,5,2,4,6,7,8,9.
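The topological sort just described can be checked mechanically. Below is a small sketch using Kahn's algorithm, with the edges of Fig. 3.4.2 taken from the dependencies listed in the text.

```python
from collections import deque

# Edges of the dependency graph: to 3 from 1, to 4 from 2, to 5 from 3,
# to 6 from 5 and 4, to 7 from 6, to 8 from 7, to 9 from 8.
edges = [(1, 3), (2, 4), (3, 5), (5, 6), (4, 6), (6, 7), (7, 8), (8, 9)]
nodes = range(1, 10)

indeg = {n: 0 for n in nodes}
adj = {n: [] for n in nodes}
for u, v in edges:
    adj[u].append(v)
    indeg[v] += 1

queue = deque(n for n in nodes if indeg[n] == 0)   # nodes with no prerequisites
order = []
while queue:
    u = queue.popleft()
    order.append(u)
    for v in adj[u]:
        indeg[v] -= 1
        if indeg[v] == 0:
            queue.append(v)

print(order)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```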
2. S-Attributed Definitions
Given an SDD, it is very hard to tell whether there exist any parse trees whose
dependency graphs have cycles. In practice, translations can be implemented using
classes of SDD's that guarantee an evaluation order, since they do not permit dependency
graphs with cycles. Moreover, the two classes can be implemented efficiently in
connection with top-down or bottom-up parsing.
The first class is defined as follows:
An SDD is S-attributed if every attribute is synthesized.
Example 3.4.2: The SDD of Fig 3.3.1 is an example of an S-attributed definition. Each attribute, L.val, E.val, T.val, and F.val, is synthesized.
When an SDD is S-attributed, we can evaluate its attributes in any bottom up order of the
nodes of the parse tree. It is often especially simple to evaluate the attributes by
performing a postorder traversal of the parse tree and evaluating the attributes at a node
N when the traversal leaves N for the last time. That is, we apply the function postorder,
defined below, to the root of the parse tree
postorder(N) {
for ( each child C of N, from the left ) postorder(C);
evaluate the attributes associated with node N;
}
S-attributed definitions can be implemented during bottom-up parsing, since a bottom-up
parse corresponds to a postorder traversal. Specifically, postorder corresponds exactly to
the order in which an LR parser reduces a production body to its head.
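A minimal sketch of this postorder evaluation in Python, for the expression 3 + 4 * 5 under the desk-calculator SDD; node labels here are simplified to the operator or `digit`.

```python
class Node:
    def __init__(self, label, children=(), lexval=None):
        self.label, self.children, self.lexval = label, list(children), lexval
        self.val = None

def postorder(n):
    for c in n.children:          # evaluate children first, left to right
        postorder(c)
    if n.label == "digit":
        n.val = n.lexval          # F.val = digit.lexval
    elif n.label == "+":
        n.val = n.children[0].val + n.children[1].val
    elif n.label == "*":
        n.val = n.children[0].val * n.children[1].val

# Parse tree for 3 + 4 * 5
root = Node("+", [Node("digit", lexval=3),
                  Node("*", [Node("digit", lexval=4), Node("digit", lexval=5)])])
postorder(root)
print(root.val)  # 23
```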
3. L-Attributed Definitions
The second class of SDD's is called L-attributed definitions. The idea behind this class is
that, between the attributes associated with a production body, dependency-graph edges
can go from left to right, but not from right to left(hence "L-attributed"). More precisely,
each attribute must be either
1. Synthesized, or
2. Inherited, but with the rules limited as follows. Suppose that there is a production A → X1 X2 … Xn, and that there is an inherited attribute Xi.a computed by a rule associated with this production. Then the rule may use only:
(a) Inherited attributes associated with the head A.
(b) Either inherited or synthesized attributes associated with the occurrences of symbols X1, X2, …, Xi−1 located to the left of Xi.
(c) Inherited or synthesized attributes associated with this occurrence of Xi itself, but only in such a way that there are no cycles in a dependency graph formed by the attributes of this Xi.
Example 3.4.2: The SDD in Fig. 3.3.4 is L-attributed. To see why, consider the semantic
rules for inherited attributes, which are repeated here for convenience:
PRODUCTION          SEMANTIC RULE
T → F T'            T'.inh = F.val
T' → * F T1'        T1'.inh = T'.inh × F.val
The first of these rules defines the inherited attribute T'.inh using only F.val, and F appears to the left of T' in the production body, as required. The second rule defines T1'.inh using the inherited attribute T'.inh associated with the head, and F.val, where F appears to the left of T1' in the production body.
In each of these cases, the rules use information "from above or from the left," as
required by the class. The remaining attributes are synthesized. Hence, the SDD is L-attributed.

(Explain about SDD’s and also evaluation orders for SDD’s.)
Here’s a complete explanation of Syntax Directed Definitions (SDD) and their evaluation orders,
organized and simplified for clarity:
🧱 Basic Structure:
For a grammar symbol `X` and attribute `a`, we write `X.a` to denote the value of `a` at a parse-tree node labeled `X`.
🧩 Types of Attributes
1. Synthesized Attributes
🔹 Example:
For production: `E → E1 + T`
Rule: `E.val = E1.val + T.val`
`E.val` is synthesized from its children `E1` and `T`.
2. Inherited Attributes
🔹 Example:
For production: `T → F T'`
Rule: `T'.inh = F.val`
`T'.inh` is inherited from sibling `F`.
E → E1 + T
Semantic Rule: `E.val = E1.val + T.val`
1. Dependency Graphs
For the production `E → E1 + T` with rule `E.val = E1.val + T.val`, the dependency graph has the edges:

```
E1.val → E.val
T.val → E.val
```
2. S-Attributed Definitions
Evaluated bottom-up
Can use postorder traversal (visit children before evaluating parent)
E → E1 + T
E.val = E1.val + T.val
This is S-attributed.
3. L-Attributed Definitions
```
A → X1 X2 ... Xi ... Xn
```
🔹 Example:

```
T → F T'
T'.inh = F.val
```
🔄 Summary Table
| Feature | S-Attributed SDD | L-Attributed SDD |
|---|---|---|
| Attribute Type | Synthesized only | Synthesized + Inherited |
| Evaluation Order | Bottom-up (postorder) | Top-down (left to right) |
💡 Conclusion
SDDs provide a structured way to attach meaning to grammar rules.
Synthesized attributes work best in bottom-up parsing.
Would you like a worked-out example of a full L-attributed or S-attributed grammar in action?
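Here is a minimal sketch of an L-attributed evaluation, implementing the two inherited-attribute rules above (`T'.inh = F.val`, `T1'.inh = T'.inh × F.val`) as a recursive-descent evaluator; tokens are single digits, an assumption made for brevity.

```python
# T  -> F T'            T'.inh = F.val
# T' -> * F T1' | ε     T1'.inh = T'.inh * F.val ;  T'.syn = T'.inh (on ε)
def F(tokens):
    return int(tokens[0]), tokens[1:]      # F.val = digit.lexval

def Tprime(inh, tokens):
    if tokens and tokens[0] == "*":
        fval, rest = F(tokens[1:])
        return Tprime(inh * fval, rest)    # pass T1'.inh = T'.inh * F.val down
    return inh, tokens                     # T'.syn = T'.inh

def T(tokens):
    fval, rest = F(tokens)
    return Tprime(fval, rest)              # T'.inh = F.val

val, _ = T(["3", "*", "5"])
print(val)  # 15
```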
For the statement `c[i] = a[i] + b[i]`, where `a`, `b`, and `c` are integer arrays, we break the expression into a sequence of 3-address operations, then represent them using triples, which are indexed representations of these operations.
Alternatively, if we want to keep only three fields (as traditional triples), we can handle the store
separately in code generation.
Simplified triple format (without explicit store):
```
(0) = (LOAD, a, i)
(1) = (LOAD, b, i)
(2) = (+, (0), (1))
(3) = (STORE, c, (2))   // interpreted as c[i] = result at (2)
```
Let me know if you want this in quadruple or indirect triple form too.
In a compiler, the phases are generally categorized as machine-independent or machine-dependent,
based on whether they rely on the architecture or instruction set of the target machine.
✅ Machine-Independent Phases
These phases do not depend on the hardware architecture:
1. Lexical Analysis
→ Converts source code into tokens
(Machine-independent because it only deals with character streams.)
2. Syntax Analysis (Parsing)
→ Analyzes token structure according to grammar
(Independent of the target machine.)
3. Semantic Analysis
→ Checks for semantic consistency (e.g., type checking)
(Works with language rules, not machine specifics.)
⚙️ Machine-Dependent Phases
These phases rely on the target machine's architecture:
1. Low-Level Code Optimization
→ Uses registers, addressing modes, and instruction choices of the target
2. Target Code Generation
→ Produces assembly or machine code for the specific architecture
3. Linking/Assembly
→ Resolves addresses and object formats for the target platform
🔄 Summary Table

| Phase | Machine Dependency |
|---|---|
| Lexical Analysis | Machine-Independent |
| Syntax Analysis | Machine-Independent |
| Semantic Analysis | Machine-Independent |
| Intermediate Code Generation | Machine-Independent |
| Code Optimization (generic) | Machine-Independent |
| Code Optimization (low-level) | Machine-Dependent |
| Target Code Generation | Machine-Dependent |
| Linking/Assembly | Machine-Dependent |
Would you like a diagram showing how these phases connect in a compiler?
In shift-reduce parsing, conflicts occur when the parser cannot decide what action to take at a certain
point in the parse due to ambiguity or grammar design issues. The two main types of conflicts are:
⚔️ 1. Shift-Reduce Conflict
🔹 Definition:
A shift-reduce conflict occurs when the parser cannot decide whether to shift (read the next input
symbol) or reduce (apply a grammar rule) at a particular step.
🔸 Why It Happens:
Happens in ambiguous grammars or in situations where multiple parse paths seem valid.
Typically seen in LR parsers during construction of the parsing table.
📌 Example:
Consider the classic "dangling else" problem:
```
S → if E then S
S → if E then S else S
S → other
```
For input like:

```
if E1 then if E2 then S1 else S2
```

when the parser reaches the `else`, it can either:
Shift the `else` and associate it with the nearest unmatched `if`,
or Reduce the previous `if E then S` as a complete statement.
👉 This is a shift-reduce conflict.
⚔️ 2. Reduce-Reduce Conflict
🔹 Definition:
A reduce-reduce conflict occurs when the parser can reduce by two or more different grammar
rules and cannot decide which to choose.
🔸 Why It Happens:
This usually indicates a serious problem in the grammar (e.g., ambiguous or improperly factored).
Typically arises during parser table construction when multiple items in a state indicate reductions
by different rules.
📌 Example:
```
S → A
S → B
A → α
B → α
```
Here, for the same input that matches `α`, the parser could reduce by either `A → α` or `B → α`. It
doesn't know which rule to choose — this is a reduce-reduce conflict.
🧠 Key Differences
| Aspect | Shift-Reduce Conflict | Reduce-Reduce Conflict |
|---|---|---|
| Decision Between | Shift or Reduce | Reduce by Rule 1 or Reduce by Rule 2 |
| Cause | Ambiguity in continuation or grouping | Ambiguity in rule matching |
Would you like a visual parse table example to demonstrate these conflicts?
🟩 1. S-Attributed Definitions
🔹 Definition:
An S-attributed definition uses only synthesized attributes.
🔸 Synthesized Attribute:
Computed from children in the parse tree.
Flow is bottom-up.
📌 Example:

```
E → E1 + T    { E.val = E1.val + T.val }
```

Explanation: `E.val` is computed only from the attributes of the children, so evaluation is bottom-up.
🟦 2. L-Attributed Definitions
🔹 Definition:
An L-attributed definition uses both synthesized and inherited attributes, but inherited ones are
restricted.
🔸 Inherited Attribute:
Passed from parent or left siblings.
Flow is left-to-right.
For a production `A → X1 X2 ... Xn`, an inherited attribute of `Xi` may use only:
1. `A`’s attributes
2. Attributes of symbols to the left of `Xi` (i.e., `X1` to `Xi−1`)
3. Its own attributes, as long as there’s no cycle
📌 Example:

```
Production: A → B C
Semantic Rules:
C.inh = B.syn   (inherited from the left sibling)
A.syn = C.syn   (synthesized from the children)
```
🧠 Summary Table:
| Feature | S-Attributed | L-Attributed |
|---|---|---|
| Uses Inherited Attr? | ❌ No | ✅ Yes |
| Uses Synthesized Attr? | ✅ Yes | ✅ Yes |
| Suitable for | Bottom-Up Parsing (LR) | Top-Down Parsing (LL) or Bottom-Up Parsing |
| Evaluation Order | Bottom-Up | Left-to-Right with restrictions |
Would you like me to generate a dependency graph example for one of these?
Let’s analyze and justify both of the statements using concepts from LR parsing theory.

🔹 (i) If there is no shift-reduce conflict in the CLR(1) parsing table, then there is no shift-reduce conflict in the LALR(1) parsing table of the same grammar.

✅ Justification:
CLR(1) (Canonical LR(1)) parsers are more powerful than LALR(1) parsers because they keep full
lookahead information in each item.
LALR(1) merges states with same cores but possibly different lookaheads.
When merging states in LALR(1), shift actions are determined only by the core items. That
means no new shift actions are introduced when merging.
Hence:
If no shift-reduce conflict exists in CLR(1), the same shift actions will not create a conflict in
LALR(1), because merging cannot introduce a new shift action where none existed.
✅ Conclusion:
True. No shift-reduce conflict in CLR(1) implies none in LALR(1) for the same grammar.
🔹 (ii) Even if there is no reduce-reduce conflict in CLR(1) parsing table, there may be
a reduce-reduce conflict in LALR(1) parsing table of the same grammar.
✅ Justification:
In CLR(1), the lookaheads in each state are precise and separated.
In LALR(1), merging two CLR(1) states with the same core but different lookaheads combines
their lookaheads.
This can lead to a case where two reductions, which were in separate CLR(1) states with non-
overlapping lookaheads, now appear in the same LALR(1) state with combined
lookaheads.
If both reductions are valid for a common terminal, this causes a reduce-reduce conflict.
✅ Conclusion:
True. Reduce-reduce conflicts may appear in LALR(1) even if CLR(1) is conflict-free, because merging
states may lead to overlapping lookaheads for multiple reductions.
Write quadruple, triple , and indirect triple for the following statement
x = y*-z + y*-z
```
x = y * -z + y * -z
```

Three-address code:

```
t1 = -z
t2 = y * t1
t3 = -z
t4 = y * t3
t5 = t2 + t4
x = t5
```
✅ 1. Quadruples
Each instruction is represented as a 4-tuple: (Operator, Arg1, Arg2, Result)

| # | Operator | Arg1 | Arg2 | Result |
|---|---|---|---|---|
| 1 | UMINUS | z | | t1 |
| 2 | * | y | t1 | t2 |
| 3 | UMINUS | z | | t3 |
| 4 | * | y | t3 | t4 |
| 5 | + | t2 | t4 | t5 |
| 6 | = | t5 | | x |
✅ 2. Triples
Each instruction is stored using indices (instead of temporary variables).
Triples: (Operator, Arg1, Arg2) — Arg1 and Arg2 can be constants, variables, or pointers to other rows.
| # | Operator | Arg1 | Arg2 |
|---|---|---|---|
| 0 | UMINUS | z | |
| 1 | * | y | (0) |
| 2 | UMINUS | z | |
| 3 | * | y | (2) |
| 4 | + | (1) | (3) |
| 5 | = | (4) | x |
✅ 3. Indirect Triples
Indirect triples use a pointer table to the actual triples, making it easier to reorder instructions if needed.
Pointer Table:

| Index | Points to |
|---|---|
| 0 | Instruction 0 |
| 1 | Instruction 1 |
| 2 | Instruction 2 |
| 3 | Instruction 3 |
| 4 | Instruction 4 |
| 5 | Instruction 5 |
| No | Operator | Arg1 | Arg2 |
|---|---|---|---|
| 0 | UMINUS | z | |
| 1 | * | y | (0) |
| 2 | UMINUS | z | |
| 3 | * | y | (2) |
| 4 | + | (1) | (3) |
| 5 | = | (4) | x |
Would you like this expression optimized to avoid redundant calculations (like computing `y*-z` twice)?
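One way is local value numbering: identical (operator, operands) computations get the same temporary. A sketch follows; the input list mirrors the triples above, with integer arguments referring to earlier rows, and the function name is my own.

```python
def value_number(instrs):
    """instrs: list of (op, arg1, arg2) in source order; arg may be an int
    referring to an earlier row. Returns TAC lines with duplicate
    computations reused (local common-subexpression elimination)."""
    seen, names, code = {}, {}, []
    counter = 0
    def resolve(a):
        return names.get(a, a)             # map row indices to temp names
    for i, (op, a, b) in enumerate(instrs):
        key = (op, resolve(a), resolve(b))
        if key in seen:                    # already computed: reuse its temp
            names[i] = seen[key]
            continue
        counter += 1
        t = f"t{counter}"
        seen[key], names[i] = t, t
        rhs = f"{op}{resolve(a)}" if b is None else f"{resolve(a)} {op} {resolve(b)}"
        code.append(f"{t} = {rhs}")
    return code, names

# Triples for x = y * -z + y * -z
instrs = [("-", "z", None), ("*", "y", 0), ("-", "z", None),
          ("*", "y", 2), ("+", 1, 3)]
code, names = value_number(instrs)
code.append(f"x = {names[4]}")
print("\n".join(code))
```

Running this produces only three temporaries: `t1 = -z`, `t2 = y * t1`, `t3 = t2 + t2`, `x = t3`.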
Here’s a clear comparison of SLR(1), CLR(1), and LALR(1) parsers, focusing on their power, table size,
complexity, and conflict handling:
🔶 1. SLR(1) – Simple LR
| Aspect | Details |
|---|---|
| Full Form | Simple LR |
| Lookahead | 1 token |
| Parsing Table | Smallest among the three |
| Uses | Follow sets to construct parsing table |
| Strength | Least powerful (can’t handle many real-world grammars) |
| Conflicts | May have shift-reduce or reduce-reduce conflicts in many valid cases |
| Ease of Construction | Easiest to build manually |
| Example | Good for simple grammars but fails on ambiguous constructs |
🔶 2. CLR(1) – Canonical LR
| Aspect | Details |
|---|---|
| Full Form | Canonical LR |
| Lookahead | 1 token |
| Parsing Table | Largest, because it tracks full LR(1) items |
🔶 3. LALR(1) – LookAhead LR
| Aspect | Details |
|---|---|
| Full Form | LookAhead LR |
| Lookahead | 1 token |
| Parsing Table | Same size as SLR(1), but more precise than SLR |
| Uses | Merges similar LR(1) states with same core, combines lookaheads |
| Strength | Middle ground: more powerful than SLR, less than CLR |
🔽 Summary Table

| Feature | SLR(1) | LALR(1) | CLR(1) |
|---|---|---|---|
| Power | Least | Middle | Most |
| Table Size | Small | Small (same as SLR) | Largest |
| Lookahead Handling | Follow sets | Merged LR(1) lookaheads | Full LR(1) lookaheads |
Would you like a visual diagram to better understand how their state diagrams differ?
✅ Original Statement:
`c[i] = a[i] + b[i]`
🔧 Triples Representation:
| Index | Operation | Arg1 | Arg2 |
|---|---|---|---|
| (0) | `[]` | a | i |
| (1) | `[]` | b | i |
| (2) | `+` | (0) | (1) |
| (3) | `[]=` | c | i, (2) |
🧠 Explanation:
(0): Access `a[i]`
(1): Access `b[i]`
(2): Add the two values from above
(3): Store the result in `c[i]`
Note: `[]` represents array access, and `[]=` indicates array assignment.
Would you also like the quadruples and indirect triples for this?
Sure! Let's begin with the Three-Address Code (TAC) for the statement:
`c[i] = a[i] + b[i]`
```
t1 = a[i]
t2 = b[i]
t3 = t1 + t2
c[i] = t3
```
🧠 Explanation:
`t1` holds the value of `a[i]`
`t2` holds the value of `b[i]`
Let me know if you'd like to see the quadruples, triples, or indirect triples for this as well.
What are different intermediate code forms? Discuss different Three Address
code types and implementations of Three Address statements.
1. Three-Address Code (TAC)
2. Quadruples
3. Triples
4. Indirect Triples
5. Polish Notation (Prefix and Postfix forms)
6. Syntax Trees
7. Directed Acyclic Graphs (DAG)
8. Stack-based Code (used in stack machines or JVM)
Example:
For:
`x = a + b * c`
TAC:
```
t1 = b * c
t2 = a + t1
x = t2
```
1. Assignment with binary operator
`x = y op z`
2. Assignment with unary operator
`x = op y`
3. Copy/Assignment statement
`x = y`
4. Unconditional jump
`goto L`
5. Conditional jump
`if x relop y goto L`
6. Indexed assignment
`x = y[i]` or `x[i] = y`
7. Pointer assignment
`x = *y` or `*x = y`
✅ Implementations of TAC
There are three main ways to represent TAC in compilers:
1. Quadruples
Example: `x = y + z` is stored as the 4-tuple `(+, y, z, x)`.
2. Triples
Example: `x = y + z` becomes `(0) (+, y, z)` followed by `(1) (=, x, (0))`.
3. Indirect Triples
Used to avoid complications in code optimization due to position-based references. It uses a pointer
table.
✅ Summary
| Form | Structure | Notes |
|---|---|---|
| Three-Address Code | Human-readable, high-level IR | Easier for analysis and optimization |
| Quadruples | Explicit temporary storage | Easy to translate to machine code |
| Triples | Saves space by avoiding result names | Harder to reorder/optimize |
Would you like a visual example comparing all three (TAC, quadruples, triples) for a specific statement
like `x = (a + b) * (c - d)`?
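Here is that comparison written out as Python data, so the correspondence between the three forms is explicit; the temporaries `t1`–`t3` are the usual naming convention, hand-assigned for the sketch rather than produced by a real compiler.

```python
# x = (a + b) * (c - d) in the three forms discussed above.
tac = ["t1 = a + b", "t2 = c - d", "t3 = t1 * t2", "x = t3"]

quadruples = [              # (Operator, Arg1, Arg2, Result)
    ("+", "a", "b", "t1"),
    ("-", "c", "d", "t2"),
    ("*", "t1", "t2", "t3"),
    ("=", "t3", "", "x"),
]

triples = [                 # (Operator, Arg1, Arg2); (n) refers to row n
    ("+", "a", "b"),
    ("-", "c", "d"),
    ("*", "(0)", "(1)"),
    ("=", "(2)", "x"),
]

for t, q, tr in zip(tac, quadruples, triples):
    print(f"{t:15} {q}  {tr}")
```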
Write a note on simple type checker and list the different types of type
checking.
| Type Checking | Description |
|---|---|
| Strong Type Checking | Enforces strict rules to avoid type errors (e.g., no implicit conversions). |
| Weak Type Checking | Allows more implicit conversions and flexibility (risk of subtle bugs). |
Static type checking (C):

```c
int x = 5;
x = "hello"; // Error: assigning string to int
```

Dynamic type checking (Python):

```python
x = 5
x = "hello"  # No error; type is checked during execution
```
✅ Summary:
A simple type checker ensures program correctness by enforcing type rules early in compilation. Type
checking can be static or dynamic, strong or weak, depending on the language's design philosophy.
Would you like an example of a type checker implemented in a mini-compiler or interpreter (in Python
or C)?
What do you mean by attributed grammars? Discuss the translation scheme for
Converting an infix expression to its equivalent postfix form.
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id
Attributes:
Each non-terminal has a synthesized attribute `postfix` which stores the postfix form of the
expression.
Semantic Rules:
E → E1 + T { E.postfix = E1.postfix || T.postfix || '+' }
E → T { E.postfix = T.postfix }
T → T1 * F { T.postfix = T1.postfix || F.postfix || '*' }
T → F { T.postfix = F.postfix }
F → (E) { F.postfix = E.postfix }
F → id { F.postfix = id.name }
Annotated parse tree for `a + b * c`:
          E
        / | \
       E  +  T
       |   / | \
       T  T  *  F
       |  |     |
       F  F     c
       |  |
       a  b
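The semantic rules can be sketched as a recursive-descent translator in Python, where each parse function returns its synthesized `postfix` attribute (a simplified sketch; it assumes well-formed input and single-character identifiers):

```python
# Each parse function returns the synthesized postfix attribute as a token list.

def to_postfix(tokens):
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def E():                      # E -> T { + T }
        nonlocal pos
        out = T()
        while peek() == "+":
            pos += 1
            out += T() + ["+"]    # E.postfix = E1.postfix T.postfix '+'
        return out
    def T():                      # T -> F { * F }
        nonlocal pos
        out = F()
        while peek() == "*":
            pos += 1
            out += F() + ["*"]    # T.postfix = T1.postfix F.postfix '*'
        return out
    def F():                      # F -> ( E ) | id
        nonlocal pos
        if peek() == "(":
            pos += 1
            out = E()
            pos += 1              # consume ')'
            return out
        tok = tokens[pos]         # F.postfix = id
        pos += 1
        return [tok]
    return " ".join(E())
```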
🔁 Summary:
Feature Attributed Grammar
Base Structure Context-Free Grammar
Added Elements Attributes + Semantic Rules
Purpose Perform syntax-directed translation
Would you like this process illustrated with a parse tree diagram or in code form (e.g., Python)?
To show that the given grammar is LR(1) but not LALR(1), we need to follow these steps:
📌 Grammar:
text
1. S → Aa
2. S → bAc
3. S → Bc
4. S → bBa
5. A → d
6. B → d
text
[S → • Aa, $]
[A → • d, a]
[B → • d, c]
This grammar produces no conflicts in its full LR(1) parsing table, because the two reduce items for `d` always carry distinct lookaheads within any single state.
Hence, it is LR(1).
❌ Step 2: Show it's not LALR(1)
To check if it’s not LALR(1):
LALR(1) parsers are created by merging LR(1) states with the same core, i.e., the same productions
and dot positions, ignoring lookaheads.
The two states reached after shifting `d` contain:
`[A → d•, a]` and `[B → d•, c]` (from the start state)
`[A → d•, c]` and `[B → d•, a]` (after shifting `b`)
If we merge them for LALR(1) (the items share the cores `A → d•` and `B → d•`), the lookahead set for each reduction becomes `{a, c}`.
Now consider:
After seeing `d`, we must decide whether to reduce using `A → d` or `B → d`, based on the
lookahead (`a` or `c`).
Merging these would cause a reduce-reduce conflict on lookaheads `a` and `c`.
🛑 Conflict:
In LR(1): No conflict due to distinct lookaheads.
In LALR(1): Conflict occurs because, after merging, both `A → d` and `B → d` may be reduced on either lookahead `a` or `c`.
✅ Conclusion:
✔ The grammar is LR(1) — no conflicts when using full lookaheads.
❌ The grammar is not LALR(1) — because merging items causes a reduce-reduce conflict.
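The merge step can be sketched in Python: take the two reduce states above, merge items by core, and check whether two reductions now share a lookahead (the data layout is invented for illustration):

```python
# Reduce items after shifting 'd', as (core, lookahead-set) pairs.
state1 = {("A -> d .", frozenset("a")), ("B -> d .", frozenset("c"))}
state2 = {("A -> d .", frozenset("c")), ("B -> d .", frozenset("a"))}

def merge(states):
    """LALR-style merge: union lookaheads of items with the same core."""
    merged = {}
    for state in states:
        for core, looks in state:
            merged[core] = merged.get(core, frozenset()) | looks
    return merged

def reduce_reduce_conflict(merged):
    """True if two distinct reductions share any lookahead symbol."""
    items = list(merged.items())
    for i, (core1, looks1) in enumerate(items):
        for core2, looks2 in items[i + 1:]:
            if looks1 & looks2:
                return True
    return False

merged = merge([state1, state2])
```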
Would you like a step-by-step construction of the LR(1) and LALR(1) tables for this grammar?
Generate the three address code for the following code fragment.
a=b+1 x=y+3 y=a/b a=b+c
Here's how we generate three-address code (TAC) for the given code fragment:
🧾 Given Statements:
text
1. a = b + 1
2. x = y + 3
3. y = a / b
4. a = b + c
✅ Three Address Code (TAC):
We introduce temporary variables (`t1`, `t2`, etc.) for intermediate computations.
plaintext
t1 = b + 1
a = t1
t2 = y + 3
x = t2
t3 = a / b
y = t3
t4 = b + c
a = t4
💡 Explanation:
Each arithmetic operation is isolated into a separate step using a temporary variable.
This makes optimization and code generation easier in compiler backends.
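The translation above can be sketched as a tiny TAC emitter in Python (the statement encoding is an assumption made for illustration):

```python
# Each source assignment (target, y, op, z) gets a fresh temporary.

def gen_tac(stmts):
    code, n = [], 0
    for target, y, op, z in stmts:
        n += 1
        t = f"t{n}"
        code.append(f"{t} = {y} {op} {z}")   # compute into a temporary
        code.append(f"{target} = {t}")       # then copy to the target
    return code

tac = gen_tac([("a", "b", "+", "1"),
               ("x", "y", "+", "3"),
               ("y", "a", "/", "b"),
               ("a", "b", "+", "c")])
```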
Would you like me to also generate triples, quadruples, or indirect triples for this code?
When a function is called, an activation record (stack frame) is pushed onto the run-time stack; when the function returns, its activation record is popped off the stack. A typical activation record contains:
Return address
Parameters
Local variables
Temporaries
Control link (pointer to the caller’s frame)
📌 Example:
Consider the following C-like code:
int main() {
int a = 5;
int result = square(a);
return result;
}
int square(int x) {
int y = x * x;
return y;
}
🧱 Stack Behavior:
Let’s trace stack allocation during execution:
1. Call `main()` → push main's activation record (holds `a`, `result`).
2. Call `square(a)` → push square's record (holds `x`, `y`) on top.
3. `square` returns → its record is popped; the result is stored in main's frame.
4. `main` returns → its record is popped and the stack is empty.
❌ Limitations
Not suitable for values that must persist beyond the function call (like dynamic objects or closures).
Stack size is limited (stack overflow is possible).
Would you like a diagram of how the stack frames are organized for this example?
An edge is drawn between two nodes if the variables interfere (i.e., are live at the same time).
The number of colors corresponds to the number of available registers.
📌 Example
Consider the following simplified code:
c
a = 1
b = 2
c = a + b
d = c + b
e = d + a
Live variables: `a` is live from statement 1 through statement 5 (used in `e = d + a`); `b` through statement 4; `c` through statement 4; `d` through statement 5; `e` from statement 5.
Interference graph:
text
a -- b
a -- d
b -- c
c -- d
d -- e
a -- e
Coloring (3 registers suffice):
a → R1, b → R2, c → R1, d → R2, e → R3
No two interfering variables share a register.
🔍 Key Terms
Spilling: Moving a variable from a register to memory due to lack of registers.
Coalescing: Removing unnecessary move instructions by assigning the same register to both
variables.
✅ Advantages
Produces efficient code with fewer loads/stores.
Works well in optimizing compilers.
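A minimal sketch of greedy coloring over this interference graph (a toy allocator; real allocators order nodes by degree and handle spilling far more carefully):

```python
# Greedy graph coloring: assign each node the first register not used
# by an already-colored neighbor; "SPILL" if no register is left.

def color(nodes, edges, registers):
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    assignment = {}
    for n in nodes:
        used = {assignment[m] for m in adj[n] if m in assignment}
        free = [r for r in registers if r not in used]
        assignment[n] = free[0] if free else "SPILL"
    return assignment

edges = [("a", "b"), ("a", "d"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("a", "e")]
regs = color("abcde", edges, ["R1", "R2", "R3"])
```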
Explain about different forms object code forms used as target code in target
code generation.
In target code generation, the compiler translates intermediate representation (IR) into target
(machine) code. This target code may appear in several object code forms, each suitable for a different
stage of compilation or linking.
Here are the major forms of object code used as target code:
1. Absolute Code
Description: Machine code that is ready to be loaded into a fixed memory location and executed.
Fixed addresses are already assigned.
No linking or relocation is needed.
Example:
asm
LOAD R1, 2050 ; operand address 2050 is fixed at compile time
Advantages:
Simple and fast to load; no relocation or linking step.
Disadvantages:
Cannot be moved in memory; must be regenerated if the load address changes.
2. Relocatable Code
Description: Object code that includes relative addresses instead of fixed ones.
Uses a relocation table that tells the loader how to adjust addresses based on where the program
is loaded into memory.
Example:
Code might use `LOAD R1, VAR1` where `VAR1` is defined as an offset (e.g., +20) in a table.
Advantages:
Can be loaded anywhere in memory; supports separate compilation and linking.
Disadvantages:
Loader must handle relocation.
3. Position-Independent Code (PIC)
Description: Code that can execute correctly regardless of its absolute memory address.
Often used in shared libraries and dynamic linking.
Mechanism:
Uses relative addressing (e.g., address of a variable is computed relative to current instruction
pointer).
Advantages:
Can be shared by multiple processes; supports dynamic linking and ASLR.
Disadvantages:
Slight runtime overhead from relative or indirect addressing.
4. Assembly Code
Description: Human-readable symbolic code, translated to machine code by an assembler.
Example:
asm
MOV R1, #5
ADD R1, R2
Advantages:
Human-readable; easy to debug and hand-optimize.
Disadvantages:
Requires an extra assembly step; still machine-specific.
5. Binary Machine Code (Object File)
Description: Encoded machine instructions stored in binary format, produced by the compiler or
assembler.
May include:
Code section
Data section
Symbol table
Relocation info
🔁 Summary Table
Code Form Fixed Address Relocatable Human-Readable Use Case
Absolute Code Yes No No Simple systems, embedded code
Relocatable Code No Yes No Modular programming, linking
Position-Independent Code No Optional No Shared libraries, dynamic linking
Assembly Code No Yes Yes Debugging, compiler output
Would you like an example of how a high-level statement (like `x = a + b`) is translated into one of
these forms?
In simple code generation, particularly for intermediate code like three-address code (TAC), we aim
to generate efficient target code using a limited number of registers. To do this effectively, we use three
important data structures:
🔹 1. Next-Use Information
Purpose:
Tells whether a variable will be used again in the future and when.
Helps in deciding whether to keep a variable in a register or store it back to memory.
Use:
If a variable is not used again, the register holding it can be reused immediately.
If a variable will be used later, it may need to be preserved.
Computed During:
Liveness analysis on the intermediate code (e.g., TAC).
Format Example:
Statement x y z
x=y+z no yes yes
🔹 2. Register Descriptor
Purpose:
Tracks, for each register, which variable's value it currently holds.
Structure:
A table mapping each register to the variable(s) it contains (or "empty").
Example:
R1 holds `x`, R2 holds `z`, R3 is empty.
Use:
Before loading a value, check whether it is already in a register, avoiding redundant loads.
🔹 3. Address Descriptor
Purpose:
Tells where the current value of each variable is stored (register, memory, or both).
Used to generate correct code and avoid reloading values unnecessarily.
Structure:
A table mapping each variable to every location (register and/or memory) that holds its current value.
Example:
Variable Location(s)
x R1, Memory[x]
y Memory[y]
z R2
Use:
When generating code for:

x = y + z;

Steps:
1. Consult the address descriptor to find where `y` and `z` currently reside.
2. Use the register descriptor to choose registers, loading an operand only if it is not already in one.
3. Generate the operation, then update both descriptors: the result register now holds `x`, and `x`'s address descriptor records that register.
✅ Summary Table
Data Structure Tracks Purpose
Next-Use Info Future uses of variables Help decide register reuse or spilling
Register Descriptor Variable(s) in each register Avoid redundant loads
Address Descriptor Location(s) of each variable’s value Generate correct and efficient code
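A sketch of how the two descriptors evolve while generating code for `x = y + z`, assuming `y` starts in R1 and `z` in memory (the dictionary layout is illustrative):

```python
# register descriptor: register -> variable it holds
reg_desc = {"R1": "y", "R2": None}
# address descriptor: variable -> set of locations holding its value
addr_desc = {"y": {"R1"}, "z": {"mem"}, "x": set()}

code = []
# z is not in any register, so it must be loaded first.
if not any(loc.startswith("R") for loc in addr_desc["z"]):
    code.append("LOAD R2, z")
    reg_desc["R2"] = "z"
    addr_desc["z"].add("R2")
code.append("ADD R1, R2")        # R1 = y + z
reg_desc["R1"] = "x"             # R1 now holds x, not y
addr_desc["y"].discard("R1")     # y's value is no longer in R1
addr_desc["x"] = {"R1"}          # x's current value lives in R1
```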
Would you like to see a worked-out example of how these are used in code generation for a specific TAC
block?
Write simple code generation algorithm.
For each three-address statement of the form:

text
x = y op z

Steps:
1. Call getreg() to select a register `r` to hold the result.
2. If `y` is not already in `r`, generate `MOV r, y`.
3. Generate the operation:

text
r = y op z

4. Update descriptors:
`Register Descriptor[r] = x`
`Address Descriptor[x] = r`
5. If `y` or `z` have no next use, mark their registers as free.
🧾 Pseudocode:
plaintext
for each statement x = y op z:
    R = getreg(x)              // choose a register for the result
    if y is not in R: Generate: MOV R, y
    Generate: R = Ry op Rz
    RegisterDescriptor[R] = x  // R now holds x
    AddressDescriptor[x] = R   // x's current value is in R
    if y or z have no next use, free their registers
💡 Example:
Given: `a = b + c`, with initial Address Descriptor:
text
b in R1
c in memory

Steps:
1. getreg() returns R1 (if `b` has no next use, its register may be overwritten).
2. `c` is only in memory, so generate: ADD R1, c
3. Update descriptors: R1 now holds `a`, and `a`'s address descriptor is R1.
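The steps above can be sketched in Python with a toy `getreg()` (spilling is not modeled; the function names follow the text but the implementation is an illustrative assumption):

```python
def getreg(y, registers, reg_desc, y_has_next_use):
    # Prefer the register already holding y when y is dead after this statement.
    for r in registers:
        if reg_desc.get(r) == y and not y_has_next_use:
            return r
    # Otherwise take any empty register.
    for r in registers:
        if reg_desc.get(r) is None:
            return r
    return registers[0]          # real generators would spill here

def gen(x, y, op, z, registers, reg_desc, addr_desc, y_has_next_use=False):
    code = []
    r = getreg(y, registers, reg_desc, y_has_next_use)
    if addr_desc.get(y) != r:    # load y only if it is not already in r
        code.append(f"MOV {r}, {y}")
    code.append(f"{op} {r}, {z}")
    reg_desc[r] = x              # register descriptor update
    addr_desc[x] = r             # address descriptor update
    return code

# The example from the text: a = b + c, with b in R1 and c in memory.
reg_desc = {"R1": "b", "R2": None}
addr_desc = {"b": "R1", "c": "mem"}
code = gen("a", "b", "ADD", "c", ["R1", "R2"], reg_desc, addr_desc)
```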
Would you like this implemented for a full TAC block or translated to a specific target like x86 or MIPS?
Generate simple assembly language target code for the following intermediate
code statements using simple code generator. Assume that target machine has
three registers (R1, R2, R3). Initially all registers are empty and no variable is
live at the end of all statements
T1 = a*b
T2 = a*c
T3 = T1 + b
T4 = a+T2
T5 = b + T4
To generate simple assembly language target code using a simple code generator, we assume:
three registers (R1, R2, R3), all initially empty;
two-operand MOV/MUL/ADD instructions;
no variable is live at the end, so no final STOREs are required.
📌 Intermediate Code:
text
T1 = a * b
T2 = a * c
T3 = T1 + b
T4 = a + T2
T5 = b + T4
🔧 Code Generation (statement by statement):
1. `T1 = a * b` → load `a` into R1, multiply by `b`.
2. `T2 = a * c` → load `a` into R2, multiply by `c`.
3. `T3 = T1 + b` → copy T1 into R3, add `b`.
4. `T4 = a + T2` → T1 is dead after statement 3, so R1 is reused.
5. `T5 = b + T4` → T2 is dead after statement 4, so R2 is reused.

✅ Full Target Code:

asm
MOV R1, a ; R1 = a
MUL R1, b ; R1 = a * b (T1)
MOV R2, a ; R2 = a
MUL R2, c ; R2 = a * c (T2)
MOV R3, R1 ; R3 = T1
ADD R3, b ; R3 = T1 + b (T3)
MOV R1, a ; R1 = a (T1 dead, R1 reused)
ADD R1, R2 ; R1 = a + T2 (T4)
MOV R2, b ; R2 = b (T2 dead, R2 reused)
ADD R2, R1 ; R2 = b + T4 (T5)
Notes:
Since no variable is live at the end of the block, no STORE instructions are needed.
T1 dies after statement 3 and T2 after statement 4, so R1 and R2 can be reused and three registers suffice.
Would you like this code adapted for a specific real-world architecture like x86 or MIPS?
In compiler design, the object code is the output of the code generation phase. It is the machine-level
representation of the source code or an intermediate form ready for linking and execution. There are
several object code forms, depending on the target machine, the execution model, and the compiler
design strategy.
1. Absolute Code
This is complete machine code that can be loaded directly into memory and executed.
Memory addresses are fixed.
No linking or relocation is required.
Usually generated when the exact memory location is known at compile time.
Example:
asm
LOAD R1, 2050 ; operand address fixed at compile time
Advantages:
Fast loading and execution; no relocation or linking needed.
Disadvantages:
Must be regenerated if the load address changes; unsuitable for multiprogramming.
2. Relocatable Code
Code generated with relative addresses (e.g., offsets), not absolute addresses.
A relocation loader adjusts the addresses at load time.
Useful when code location in memory isn't known at compile time.
Example:
asm
LOAD R1, [BASE + 20] ; offset 20 from the module's (not yet known) load address
Advantages:
Can be loaded at any address; enables separate compilation and linking.
Disadvantages:
The loader must patch addresses using the relocation table, adding load-time work.
3. Position-Independent Code (PIC)
Code that executes correctly regardless of where in memory it is loaded.
Often used in shared libraries and dynamic loading.
Uses registers or relative addressing rather than fixed addresses.
Example:
asm
LOAD R1, [PC + 12] ; address computed relative to the program counter
Advantages:
Can be shared among processes; supports ASLR and dynamic loading.
Disadvantages:
Slight runtime overhead from indirect or relative addressing.
4. Assembly Code
Human-readable symbolic code, translated by an assembler.
Example:
asm
MOV R1, a
ADD R1, b
MOV c, R1
Advantages:
Human-readable; easy to debug, inspect, and hand-optimize.
Disadvantages:
Requires an extra assembly step; still machine-specific.
5. Object File
The final object code stored in a standard file format (like ELF, COFF, or Mach-O).
Contains:
Machine code
Symbol tables
Relocation info
Debugging info
Used by the linker to generate the final executable.
🔸 Summary Table
Form Fixed Address Needs Linking Human-Readable Use Case
Absolute Code Yes No No Embedded systems, bootloaders
Relocatable Code No Yes No Modular applications
Position-Independent No Optional No Shared libraries, ASLR
Assembly Code No Yes Yes Debugging, optimization
Would you like a diagram to visualize these object code forms and how they fit into the compilation
pipeline?
Register allocation is a key optimization technique in the code generation phase of a compiler. It
determines how variables and intermediate values are assigned to a limited number of CPU registers.
Since registers are the fastest form of storage, efficient register allocation significantly improves
program performance.
With optimal register usage, intermediate results can stay in registers across multiple instructions.
Reduces the total number of load, store, and move instructions.
Example:
Without good allocation:
asm
LOAD R1, a
STORE R1, temp1
LOAD R2, b
ADD R1, R2
STORE R1, temp2
With good allocation:
asm
LOAD R1, a
ADD R1, b
🔹 Conclusion
Efficient register allocation keeps frequently used values in registers, reduces memory traffic (fewer loads and stores), and improves both speed and code size.
Peephole optimization is a local code optimization technique that examines a small set of
instructions (called a peephole, usually a few lines at a time) and attempts to replace inefficient
instruction sequences with more efficient ones without changing the program's semantics.
It is called "peephole" because it only looks at a small portion of the code at a time — like looking
through a peephole.
1. Redundant Load Elimination
Before:
asm
LOAD R1, a
LOAD R1, a ; redundant
After:
asm
LOAD R1, a
2. Algebraic Simplifications
Replace operations with simpler equivalents.
Before:
asm
MUL R1, 2 ; R1 = R1 * 2
After:
asm
ADD R1, R1 ; R1 = R1 + R1 (an add is cheaper than a multiply)
3. Strength Reduction
asm
MUL R1, 8
After:
asm
SHL R1, 3 ; left shift by 3 multiplies by 8
4. Unreachable Code Elimination
Before:
asm
JMP L1
MOV R1, 0 ; Unreachable
L1:
After:
asm
JMP L1
L1:
5. Jump-to-Jump Optimization
Before:
asm
JMP L1
...
L1: JMP L2
After:
asm
JMP L2
6. Use of Algebraic Identities
Before:
asm
LOAD R1, #0
ADD R1, R2
After:
asm
MOV R1, R2 ; 0 + R2 is simply R2
✅ Combined Example
Before Optimization:
asm
LOAD R1, a
LOAD R2, b
ADD R1, R2
STORE R1, a
LOAD R1, a ; redundant
After Optimization:
asm
LOAD R1, a
LOAD R2, b
ADD R1, R2
STORE R1, a
🔸 Summary
Feature Description
Scope Small window of instructions
Speed Fast and simple
Type Local optimization
Examples Redundant loads, jump simplification, algebraic optimizations
Peephole optimizations are often the last phase of compilation, applied just before or during code
generation, and can significantly improve performance and reduce code size.
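A minimal sketch of a peephole pass that slides a two-instruction window over the code and removes the redundant-load patterns shown above (the instruction strings are illustrative):

```python
def peephole(instrs):
    """One pass over adjacent instruction pairs."""
    out = []
    for ins in instrs:
        if out:
            prev = out[-1]
            # STORE R, x immediately followed by LOAD R, x:
            # the value is already in the register, so drop the load.
            if prev.startswith("STORE") and ins.startswith("LOAD") \
                    and prev.split()[1:] == ins.split()[1:]:
                continue
            # The same LOAD twice in a row is also redundant.
            if ins.startswith("LOAD") and prev == ins:
                continue
        out.append(ins)
    return out

before = ["LOAD R1, a", "LOAD R2, b", "ADD R1, R2",
          "STORE R1, a", "LOAD R1, a"]
after = peephole(before)
```

Real peephole optimizers repeat such passes until no rule fires, since one removal can expose another.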
Would you like a live example or illustration of how a peephole optimization engine might work?
🔹 How It Works
When a program is compiled:
The compiler emits addresses as offsets from the start of the module.
A relocation table records every location that must be patched.
At link or load time, the actual base address is added to each recorded location.
Feature How It Helps
✅ Modular Compilation Code modules compiled separately can be linked on different systems.
✅ Dynamic Linking & Loading Shared libraries or DLLs can be loaded and relocated at runtime.
✅ Portability Across Systems The same object code can be linked differently on different architectures.
✅ Ease of Integration Object files from different compilers/platforms can be integrated into the
same program.
🔸 Example
Suppose we compile the following C code:
The compiler will produce relocatable object code for the `sum` function, where the addresses of `sum`'s instructions and any data it references are stored as offsets, with relocation entries telling the linker which locations to patch.
🔹 Summary
Relocatable object code can be loaded at any address, enables separate compilation and linking of modules, and underlies dynamic loading of shared libraries.
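The relocation process can be sketched in Python: object code stores offsets, and a loader patches the entries named in the relocation table (the instruction tuples are invented for illustration):

```python
# Each instruction is (opcode, operand); operands in the relocation
# table are offsets from the module base, others are left alone.
object_code = [("LOAD", 0x10), ("ADD", 0x14), ("JMP", 0x02)]
relocation_table = [0, 1]        # instructions 0 and 1 need patching

def load_at(code, reloc, base):
    loaded = list(code)
    for i in reloc:
        op, offset = loaded[i]
        loaded[i] = (op, base + offset)   # offset -> absolute address
    return loaded

loaded = load_at(object_code, relocation_table, 0x4000)
```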
Would you like a diagram showing the relocation process from compilation to execution?
Explain which phases of compilation are machine independent and which are
machine dependent?
In a compiler, different phases handle various aspects of translating source code into machine code.
These phases can be classified into machine-independent and machine-dependent based on whether
they depend on the details of the target hardware architecture.
🔷 Machine-Independent Phases
These phases deal with language syntax and semantics, not the target hardware. They are usually the
front end of the compiler:
1. Lexical Analysis
Converts the source code into tokens.
Example: Identifies keywords, identifiers, operators, etc.
❌ No hardware dependency.
2. Syntax Analysis (Parsing)
Checks for grammatical structure using context-free grammars.
Builds a parse tree or syntax tree.
❌ Hardware independent.
3. Semantic Analysis
Checks types, declarations, and scope rules.
❌ Hardware independent.
4. Intermediate Code Generation
Produces a machine-neutral IR such as three-address code.
❌ Hardware independent.
🔷 Machine-Dependent Phases
These phases form the back end of the compiler and depend on the target architecture:
1. Code Generation
Converts intermediate code to machine code or assembly.
Takes into account registers, addressing modes, and instruction formats.
✅ Tightly tied to hardware.
2. Optimization (Machine-Dependent)
Improves performance based on specific hardware.
Example: Instruction scheduling, pipeline optimizations.
✅ Depends on target CPU.
3. Register Allocation and Assignment
Maps variables to machine registers.
✅ Depends on the number and usage of CPU registers.
4. Instruction Selection and Scheduling
Chooses optimal instructions and reorders them for better CPU utilization.
✅ Target-machine-specific.
🔸 Summary Table
Phase Machine Independent Machine Dependent
Lexical Analysis ✅ Yes ❌ No
Syntax Analysis ✅ Yes ❌ No
Semantic Analysis ✅ Yes ❌ No
Intermediate Code Generation ✅ Yes ❌ No
Machine-Independent Optimization ✅ Yes ❌ No
Code Generation ❌ No ✅ Yes
Machine-Dependent Optimization ❌ No ✅ Yes
Register Allocation ❌ No ✅ Yes
Would you like a visual diagram showing how front-end and back-end phases interact?
To perform available expression analysis, we analyze where expressions like `a*b` or `b*c` are
available — meaning they have already been computed earlier and none of the variables involved
have been redefined in the meantime.
🔹 At Statement 6:
Path: From statement 1 → 2 → 3 → 6
`a*b` is computed at line 1 and again at line 4 (which is skipped if line 3 jumps directly to 6).
Between line 1 and line 6, neither `a` nor `b` are redefined.
✅ Yes, `a*b` is available at statement 6
🔹 At Statement 8:
Path: Statement 7 modifies `a`
Since `a` is changed at line 7, any previously computed `a*b` is no longer valid
❌ No, `a*b` is not available at statement 8
🔹 At Statement 5:
`b*c` is computed at line 2
No change to `b` or `c` between line 2 and line 5
✅ Yes, `b*c` is available at statement 5
🔹 At Statement 9:
No redefinition of `b` or `c` occurs between line 2 and line 9
Thus, the result of `b*c` at line 2 is still valid
✅ Yes, `b*c` is available at statement 9
✅ Final Answers
(i) `a*b` is available at statement 6, but not at statement 8 (`a` is redefined at statement 7).
(ii) `b*c` is available at both statement 5 and statement 9.
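The availability check used above can be sketched in Python over a hypothetical straight-line fragment (the statement encoding is an assumption for illustration):

```python
def available(stmts, expr, at):
    """stmts: list of (target, rhs) pairs, rhs an expr tuple or None.
    expr like ('a', '*', 'b'). Returns True if expr is available
    just before statement index `at`."""
    y, _, z = expr
    avail = False
    for target, rhs in stmts[:at]:
        if rhs == expr:
            avail = True           # expression computed here
        if target in (y, z):
            avail = False          # operand redefined: expression killed
    return avail

# Hypothetical fragment: a*b computed, then 'a' redefined.
frag = [("t1", ("a", "*", "b")),   # 0: t1 = a*b
        ("c",  None),              # 1: c = ...
        ("a",  None),              # 2: a = ...   (kills a*b)
        ("t2", ("a", "*", "b"))]   # 3: t2 = a*b  (recomputed)
```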
Would you like a control flow graph (CFG) to visualize these paths and dataflow?
To analyze redundancy in the computation of `4*i` in statements 7 and 11, we need to examine the control flow and check whether `i` is modified between the computations. The code:
text
1 i = 0
2 if i < n goto 4
3 goto 11
4 t1 = 4 * i
5 t2 = a[t1]
6 t3 = t2 + 1
7 t4 = 4 * i
8 b[t4] = t3
9 i = i + 1
10 goto 2
11 t5 = 4 * i
12 b[t5] = 0
text
+------+
| 1 |
+--+---+
|
v
+--+---+
| 2 |<-----------+
+--+---+ |
/ \ |
true false |
/ \ |
v v |
+---+ +----+ |
| 4 | | 11 | |
+---+ +----+ |
| | |
v v |
+---+ +----+ |
| 5 | | 12 | |
+---+ +----+ |
| |
v |
+---+ |
| 6 | |
+---+ |
| |
v |
+---+ |
| 7 | |
+---+ |
| |
v |
+---+ |
| 8 | |
+---+ |
| |
v |
+---+ |
| 9 | |
+---+ |
| |
v |
+---+ |
|10 |----------------------+
+---+
✅ Statement 4: `t1 = 4 * i`
First computation of `4*i`
✅ Statement 7: `t4 = 4 * i`
`i` is not modified between statements 4 and 7, so `t4` recomputes the value already in `t1` — redundant.
✅ Statement 11: `t5 = 4 * i`
Reached after `i = i + 1` (statement 9), so the value of `4*i` has changed — not redundant.
✅ Final Answer
Statement 7: Redundant (same `4*i` already computed in statement 4, and `i` has not changed)
Statement 11: Not Redundant (`i` has been modified in statement 9)
Would you like a diagram of the control flow graph to visualize this better?
Here's the Control Flow Graph (CFG) with clearly defined basic blocks:
B1: Statements 1 and 2 (initialization and loop condition)
B2: Statement 3 (exit jump if condition fails)
B3: Statements 4 to 10 (loop body)
B4: Statements 11 and 12 (post-loop actions)
Statement 7 (`t4 = 4*i`) is redundant because `4*i` was already computed in statement 4 and `i`
hasn't changed between 4 and 7.
Statement 11 (`t5 = 4*i`) is not redundant since `i` is updated in statement 9 (`i = i + 1`) before
reaching 11, hence the value of `4*i` has changed.
Would you like this CFG saved as an image or explained in more detail?
Show the contents of the activation record for the function call f(6) where
definition of f is given as follows and f is called from main function
int f(int n){
if n==0 return 1
else return n * f(n-1)
}
To show the activation record (AR) for the function call `f(6)` from `main`, we need to understand how
recursion builds a stack of activation records, one for each call to `f(n)` until `n == 0`.
📌 Given Function:
c
int f(int n) {
if(n == 0) return 1;
else return n * f(n - 1);
}
🧠 Function Call: `f(6)`
Each call creates an activation record and pushes it onto the call stack.
text
Top of Stack
-------------
| f(0) AR |
| n = 0 |
-------------
| f(1) AR |
| n = 1 |
-------------
| f(2) AR |
| n = 2 |
-------------
| f(3) AR |
| n = 3 |
-------------
| f(4) AR |
| n = 4 |
-------------
| f(5) AR |
| n = 5 |
-------------
| f(6) AR |
| n = 6 |
-------------
| main AR |
-------------
Bottom of Stack
Each activation record holds the parameter `n`, a return address, and a link to the caller's frame.
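The growth and shrinkage of this stack can be sketched by instrumenting the recursion in Python (the frame dictionaries stand in for real activation records):

```python
# Simulate the run-time stack: push a frame on each call, pop on return.
stack = []
max_depth = 0

def f(n):
    global max_depth
    stack.append({"fn": "f", "n": n})      # push this call's activation record
    max_depth = max(max_depth, len(stack))
    result = 1 if n == 0 else n * f(n - 1)
    stack.pop()                            # record popped on return
    return result

value = f(6)
```

The maximum depth of 7 frames (f(6) down to f(0)) matches the diagram above, and the stack is empty again once `f(6)` returns.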
Would you like a visual diagram of the activation record stack for this recursive call?
🔍 Key Characteristics:
Do not alter the flow of control.
Preserve program structure, making them safe and easy to apply.
Usually applied locally (within a basic block) or globally (across basic blocks).
1. Common Subexpression Elimination
Reuses the value of an expression that was already computed.
// Before
x = a + b;
y = a + b;
// After
x = a + b;
y = x;
2. Copy Propagation
Replaces the occurrences of variables that are simply copies of other variables.
// Before
x = y;
z = x + 1;
// After
z = y + 1;
3. Dead Code Elimination
Removes statements whose results are never used.
// Before
x = a + b;
// x is never used
// After
// (x = a + b;) is removed
4. Constant Folding
Evaluates constant expressions at compile time.
c
// Before
x = 3 * 4;
// After
x = 12;
5. Strength Reduction
Replaces expensive operations with cheaper ones.
// Before
x = y * 2;
// After
x = y + y;
📌 Why Important?
Improves performance (speed and/or size)
Keeps program semantics unchanged
Makes further optimizations possible
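Two of these transformations can be sketched over a tiny TAC list in Python (the tuple encoding is an assumption for illustration):

```python
# Constant folding and strength reduction over (target, (op, a, b)) tuples.

def optimize(tac):
    out = []
    for target, (op, a, b) in tac:
        if isinstance(a, int) and isinstance(b, int):   # constant folding
            val = a * b if op == "*" else a + b
            out.append((target, ("const", val, None)))
        elif op == "*" and b == 2:                      # strength reduction
            out.append((target, ("+", a, a)))           # y * 2 -> y + y
        else:
            out.append((target, (op, a, b)))
    return out

opt = optimize([("x", ("*", 3, 4)),      # x = 3 * 4
                ("y", ("*", "a", 2))])   # y = a * 2
```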
Would you like examples or a visual showing how these work in a control flow graph?
Draw the activation tree for the function call f(5) where definition of f is given
as follows
int f(int n){
if n==0 return 0
else if n==1 return 1
else return f(n-1) + f(n-2)
}
To draw the activation tree for the function call `f(5)` for the Fibonacci-like function:
We visualize the function calls as a tree, where each node represents a call to `f(n)` and its children
represent the recursive calls it makes.
text
                        f(5)
                      /      \
                  f(4)        f(3)
                 /    \      /    \
             f(3)    f(2)  f(2)   f(1)
            /    \   /  \   /  \
         f(2) f(1) f(1) f(0) f(1) f(0)
         /  \
      f(1) f(0)
🔍 Explanation:
Each node `f(n)` makes:
A call to `f(n-1)`
A call to `f(n-2)`
The recursion terminates when `n == 0` or `n == 1`, which are the base cases.
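The number of activation records in this tree can be counted by instrumenting the recursion in Python:

```python
from collections import Counter

calls = Counter()

def f(n):
    calls[n] += 1                 # one activation record per call
    if n <= 1:
        return n                  # base cases f(0)=0, f(1)=1
    return f(n - 1) + f(n - 2)

result = f(5)
```

The counts (f(3) twice, f(2) three times, f(1) five times, f(0) three times, 15 calls in total) match the nodes of the tree and show why naive recursive Fibonacci repeats work.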
Write quadruple, triple , and indirect triple for the following statement
x = y*-z + y*-z
Let's break the expression `x = y * -z + y * -z` into smaller parts and generate the Three Address
Code (TAC), then derive:
1. Quadruples
2. Triples
3. Indirect Triples
text
t1 = -z
t2 = y * t1
t3 = -z
t4 = y * t3
t5 = t2 + t4
x = t5
📘 Quadruples
Index Operator Arg1 Arg2 Result
1 UMINUS z — t1
2 * y t1 t2
3 UMINUS z — t3
4 * y t3 t4
5 + t2 t4 t5
6 = t5 — x
📘 Triples
Index Operator Arg1 Arg2
1 UMINUS z —
2 * y (1)
3 UMINUS z —
4 * y (3)
5 + (2) (4)
6 = (5) —
📘 Indirect Triples
We store pointers (indexes) into the triple table:
Pointer Table
Pointer Triple Index
(0) 1
(1) 2
(2) 3
(3) 4
(4) 5
(5) 6
The triples themselves are unchanged; the optimizer can reorder the pointer table without renumbering the references inside the triples.
What is an induction variable and identify the induction variables in the
following statements in C
for( i=0,j=1,k=2,l=4 ; i< n ; i++){
j= j+2
k = k+3
l = l+4
sum = sum + i*j*k
}
An induction variable is a variable whose value changes by a fixed amount on every iteration of a loop, either directly or as a linear function of another induction variable. Typical forms:
c
i = i + c; // linear induction
i = i * c; // sometimes treated as (non-linear) induction
🧠 Let's analyze:
`i` starts at `0` and is incremented by 1 → induction variable.
`j` starts at `1` and is incremented by `2` each iteration → induction variable.
`k` starts at `2` and is incremented by `3` each iteration → induction variable.
`l` starts at `4` and is incremented by `4` each iteration → induction variable.
`sum` accumulates a computed value → not an induction variable, it's an accumulator.
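The closed forms of these induction variables (after m iterations, j = 1 + 2m, k = 2 + 3m, l = 4 + 4m) can be checked with a direct simulation in Python:

```python
# Run the loop from the question and compare against the closed forms.

def run(n):
    i, j, k, l = 0, 1, 2, 4
    while i < n:
        j += 2
        k += 3
        l += 4
        i += 1
    return i, j, k, l

i, j, k, l = run(10)
```

These linear relationships are what let a compiler rewrite `j`, `k`, and `l` in terms of `i` (or eliminate them), which is the basis of induction-variable strength reduction.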
Would you like to see how a compiler might optimize this using strength reduction?
Differentiate between
i) Phase and a pass ii) single-pass and multi-pass compiler.
Here's the difference between Phase and Pass, and between Single-pass and Multi-pass compilers:
✅ i) Phase vs Pass
Aspect Phase Pass
Definition A phase is a logical step or component of compilation (e.g., lexical analysis, parsing). A pass is a complete traversal of the source code or its intermediate representation.
Granularity More fine-grained – each compilation task is a phase. More coarse-grained – a pass may include multiple phases.
Examples Lexical analysis, Syntax analysis, Semantic analysis, Code generation. First pass for syntax + symbol table creation, second pass for code generation.
Objective Each phase transforms the code in some way for the next phase. Each pass processes the entire code once and may combine several phases.
Summary: