Sub: Compiler Design
ASSIGNMENT
1. Lexical Analyzer vs Syntax Analyzer
The lexical analyzer (lexer) reads the source code and converts it into tokens like keywords,
identifiers, and symbols.
The syntax analyzer (parser) takes these tokens and checks if they follow the correct
grammar rules of the language.
The lexer catches word-level errors (such as an illegal character or a malformed number), while the parser catches sentence-level errors in the structure of statements.
Example: the lexer breaks int a = 5; into the tokens int, a, =, 5, ;.
The parser then checks whether this sequence of tokens forms a valid declaration statement.
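As a rough sketch (not part of the question), the lexer's output can be modelled as a list of (token kind, lexeme) pairs; the token names below are made up for illustration:

#include <stdio.h>

/* Hypothetical token kinds for the statement "int a = 5;" */
enum TokenKind { TOK_KEYWORD, TOK_IDENTIFIER, TOK_ASSIGN, TOK_NUMBER, TOK_SEMICOLON };

struct Token { enum TokenKind kind; const char *lexeme; };

int main(void) {
    /* This is what the lexer would hand to the parser; the parser then
       checks that the sequence matches the grammar rule for a declaration. */
    struct Token tokens[] = {
        { TOK_KEYWORD,    "int" },
        { TOK_IDENTIFIER, "a"   },
        { TOK_ASSIGN,     "="   },
        { TOK_NUMBER,     "5"   },
        { TOK_SEMICOLON,  ";"   },
    };
    for (int i = 0; i < 5; i++)
        printf("%s\n", tokens[i].lexeme);
    return 0;
}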
2. DFA for strings where number of a’s is divisible by 2 and b’s by 3
We need a machine that accepts strings over {a, b} where:
Number of a's is even (0, 2, 4, ...)
Number of b's is divisible by 3 (0, 3, 6, ...)
There will be 2 × 3 = 6 states, because the machine only has to remember:
the number of a's mod 2 (2 possibilities)
the number of b's mod 3 (3 possibilities)
The start state, which is also the only accepting state, is the one where a mod 2 = 0 and b mod 3 = 0.
On reading an a the machine flips the a-parity part of its state; on reading a b it adds 1 (mod 3) to the b-count part.
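A minimal C sketch of this DFA, assuming the state is simply the pair (a mod 2, b mod 3); the function name accepts is illustrative:

#include <stdio.h>

/* Returns 1 if the string over {a, b} has an even number of a's
   and a number of b's divisible by 3. The 6 DFA states are the
   pairs (a_mod_2, b_mod_3); the start state is (0, 0). */
int accepts(const char *s) {
    int a_mod_2 = 0, b_mod_3 = 0;
    for (; *s; s++) {
        if (*s == 'a') a_mod_2 = (a_mod_2 + 1) % 2;   /* flip a-parity */
        else if (*s == 'b') b_mod_3 = (b_mod_3 + 1) % 3;
        else return 0;                                /* symbol not in the alphabet */
    }
    return a_mod_2 == 0 && b_mod_3 == 0;              /* accepting state (0, 0) */
}

int main(void) {
    printf("%d\n", accepts("aabbb"));   /* 1: two a's, three b's */
    printf("%d\n", accepts("ab"));      /* 0: one a, one b */
    return 0;
}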
3. SLR, CLR, and LALR Parsers
SLR (Simple LR) is the simplest: it builds LR(0) item sets and uses FOLLOW sets to decide when to reduce, so it reports conflicts on some grammars the other two can handle.
CLR (Canonical LR) is the most powerful: every item carries its own lookahead symbol, which resolves more conflicts but makes the parsing table much larger.
LALR (Lookahead LR) is a compromise: it has the same number of states as SLR but computes more precise lookaheads (by merging CLR states with the same cores), so it resolves many of the conflicts SLR cannot.
It is widely used in parser generators such as YACC.
4. Declaration in Procedure using Syntax Directed Translation
When we declare a variable such as int x;, the parser executes a semantic action attached to the grammar rule, which enters x with type int into the symbol table.
For example, during parsing:
Rule: Decl → Type id ;
Action: Add id with Type to symbol table
This way, each variable and its type are recorded in the symbol table while the declaration is being parsed.
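A small C sketch of the kind of action the parser might run when it reduces by the rule Decl → Type id ; above. The table layout and the insert_symbol helper are illustrative assumptions, not a fixed compiler API:

#include <stdio.h>
#include <string.h>

/* Very small fixed-size symbol table of (name, type) pairs. */
struct Symbol { char name[32]; char type[16]; };
static struct Symbol table[100];
static int count = 0;

/* Semantic action for Decl -> Type id ; : record id together with its type. */
void insert_symbol(const char *name, const char *type) {
    strncpy(table[count].name, name, sizeof table[count].name - 1);
    strncpy(table[count].type, type, sizeof table[count].type - 1);
    count++;
}

int main(void) {
    insert_symbol("x", "int");   /* effect of parsing "int x;" */
    printf("%s : %s\n", table[0].name, table[0].type);
    return 0;
}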
5. Intermediate Code for Flow of Control Statements
if (a < b)
x = y + z;
else
x = y - z;
Intermediate code (three-address format):
t1 = a < b
ifFalse t1 goto L1
t2 = y + z
x = t2
goto L2
L1:
t3 = y - z
x = t3
L2:
• Here, t1, t2, and t3 are temporary variables.
• L1 and L2 are labels to handle jump and branching.
6. Storage Organization in Runtime Environment
• Code segment: stores the compiled program instructions.
• Static area: holds global variables and constants whose size is known at compile time.
• Stack: stores activation records for function calls (local variables, parameters, return addresses).
• Heap: used for dynamic memory allocation (malloc in C, new in C++).
• Registers: hold the values in most frequent use, for fast access during execution (a short C example of where each item lives follows below).
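A short C example (illustrative, not part of the question) showing where typical objects live at run time:

#include <stdlib.h>

int global_counter = 0;        /* static area: global variable */

void work(void) {
    int local = 5;                                 /* stack: part of this call's frame */
    int *buffer = malloc(100 * sizeof *buffer);    /* heap: dynamic allocation */
    local = local + 1;
    free(buffer);                                  /* heap memory is released explicitly */
}

int main(void) {
    work();    /* the call pushes a stack frame holding locals and the return address */
    return 0;
}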
7. Name and Structure Equivalence in Types
• Name equivalence: Two types are equal only if they have the same type name.
Even if their structures are identical, they are not considered equal unless they are declared with the same name.
• Structure equivalence: Two types are equal if their components and layout are the same, even if their names differ (see the C illustration below).
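A short C illustration: the two struct types below have identical structure, but under name equivalence (which C uses for distinct struct tags) they are different types:

struct Point { int x; int y; };
struct Coord { int x; int y; };   /* same layout, different name */

int main(void) {
    struct Point p = {1, 2};
    /* Under structure equivalence the next line would be allowed;
       under name equivalence (as in C) it is a type error:
       struct Coord c = p;
    */
    return p.x + p.y;
}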
8. Ordered, Unordered, and Binary Search Tree in Symbol Table
• Unordered: entries are stored in no particular order, so a lookup may have to scan the whole table (O(n)).
• Ordered: entries are kept in sorted order, so lookup is faster (binary search, O(log n)), but insertion is slower because entries must be shifted to keep the order.
• Binary Search Tree (BST): entries are stored in a tree; searching, inserting, and deleting all take O(log n) on average, provided the tree stays balanced (a small sketch follows below).
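A minimal sketch of a BST-based symbol table in C, storing only names; the function names insert and lookup are illustrative:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct Node { char name[32]; struct Node *left, *right; };

/* Insert a name, keeping smaller names in the left subtree. */
struct Node *insert(struct Node *root, const char *name) {
    if (root == NULL) {
        root = calloc(1, sizeof *root);
        strncpy(root->name, name, sizeof root->name - 1);
    } else if (strcmp(name, root->name) < 0) {
        root->left = insert(root->left, name);
    } else if (strcmp(name, root->name) > 0) {
        root->right = insert(root->right, name);
    }
    return root;
}

/* Search: O(log n) on average when the tree stays balanced. */
struct Node *lookup(struct Node *root, const char *name) {
    if (root == NULL || strcmp(name, root->name) == 0) return root;
    return strcmp(name, root->name) < 0 ? lookup(root->left, name)
                                        : lookup(root->right, name);
}

int main(void) {
    struct Node *t = NULL;
    t = insert(t, "count");
    t = insert(t, "avg");
    printf("%s\n", lookup(t, "avg") ? "found" : "not found");
    return 0;
}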
9. DAG (Directed Acyclic Graph)
• A DAG is used in compilers for optimization.
• It helps avoid repeated calculations of the same expression.
• For example:
a = b + c;
d = b + c;
• Instead of computing b + c twice, the DAG builds a single node for b + c and uses its value for both a and d (rewritten form shown below).
• DAG has no cycles and shows how values are derived.
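• The effect can be shown by rewriting the two statements (a C-like sketch; t1 is just an illustrative temporary):

int main(void) {
    int b = 2, c = 3, a, d;
    int t1 = b + c;   /* the single DAG node for b + c */
    a = t1;           /* both a and d take their value from that node */
    d = t1;
    return a + d;
}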
10. Register Allocation and Assignment
• Register allocation decides which variables should be kept in fast CPU registers instead of main memory.
• Register assignment then maps each of those variables to a specific physical register (such as R1 or R2); a small sketch follows below.
• This speeds up the program, because accessing a register is much faster than accessing memory.
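• As a rough illustration (pseudo-assembly for a hypothetical machine, not a specific instruction set), the statement x = y + z might become:

LOAD  R1, y     ; allocation decided y should live in a register; assignment chose R1
LOAD  R2, z     ; z assigned to R2
ADD   R1, R2    ; the addition happens entirely in registers
STORE R1, x     ; only the final result is written back to memory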