COMPILER CONSTRUCTION (COM 414)
PRACTICAL ONE
1) Install a high level language compiler and identify elements of programming environments
2) Run a simple program using the computer installed and state that compilation process of the program
1) Install a High-Level Language Compiler and Identify Elements of
Programming Environments
Example: C Language (using GCC or Turbo C), or Python (using Python IDLE).
• Steps to Install (Example: Python):
1. Download Python from the official site (python.org).
2. Run the installer and check "Add to PATH".
3. Complete the installation.
4. Open IDLE (Python’s Integrated Development and Learning Environment)
or Command Prompt.
• Programming Environment Elements:
o Editor → where code is written (e.g., IDLE editor, VS Code).
o Compiler/Interpreter → translates source code into machine code (e.g., GCC
for C, Python Interpreter).
o Debugger → helps find and fix errors in the code.
o Linker → combines compiled files into an executable (for languages like
C/C++).
o Libraries → pre-written code modules (e.g., math, string, I/O).
o Runtime Environment → where the program is executed on the computer.
COMPILER CONSTRUCTION (COM 414)
2) Run a Simple Program and State the Compilation Process
Example Program in C (Hello World):
Compilation Process (C Language Example):
1. Writing the Code: Program written in a text editor.
2. Preprocessing: The preprocessor handles header files (#include <stdio.h>).
3. Compilation: The compiler converts source code into assembly language.
4. Assembly: The assembler translates assembly into machine code (object file).
5. Linking: The linker combines object code with libraries → produces executable file
(a.exe or a.out).
6. Execution: The program is run by the operating system, output is displayed.
COMPILER CONSTRUCTION (COM 414)
PRACTICAL TWO
Identify errors generated by the compiler during the compilation process of the sample
program run in Practical One
1. Types of Errors in Compilation
1. Syntax Errors
o Occur when code violates programming language grammar rules.
o Example (C):
Error Message: error: expected ';' before 'return'
Missing Header File Error
• Occurs if required library/header is not included.
• Example (C):
Error Message: implicit declaration of function 'printf'
COMPILER CONSTRUCTION (COM 414)
PRACTICAL THREE
Write and run a high level language to implement a scanner
import re
# Sample source code
source_code = "int x = 100 + 20;"
# Define token patterns using regular expressions
token_specification = [
("NUMBER", r'\d+'), # Integer
("ID", r'[A-Za-z_]\w*'), # Identifier
("ASSIGN", r'='), # Assignment operator
("OP", r'[+\-*/]'), # Arithmetic operators
("SEMI", r';'), # Semicolon
("SKIP", r'[ \t]+'), # Skip spaces/tabs
("MISMATCH", r'.'), # Any other character (error)
# Build the regex
token_regex = '|'.join(f'(?P<{name}>{pattern})' for name, pattern in
token_specification)
# Scanner implementation
for match in re.finditer(token_regex, source_code):
kind = match.lastgroup
COMPILER CONSTRUCTION (COM 414)
value = match.group()
if kind == "SKIP":
continue
elif kind == "MISMATCH":
print(f"Unexpected token: {value}")
else:
print(f"{kind}: {value}")
output
COMPILER CONSTRUCTION (COM 414)
PRACTICAL FOUR
1) Write a top-dowj larder for the output of the scanner implemented in PRACTICAL
THREE
What is a Top-Down Parser?
• A top-down parser starts from the start symbol (root of grammar) and tries to rewrite
it until the input tokens are matched.
• It uses recursive descent parsing based on grammar rules.
Grammar (simplified for the scanner’s output int x = 100 + 20;)
tokens = [
("ID", "int"),
("ID", "x"),
("ASSIGN", "="),
("NUMBER", "100"),
("OP", "+"),
("NUMBER", "20"),
("SEMI", ";")
pos = 0
def match(expected_type):
global pos
if pos < len(tokens) and tokens[pos][0] == expected_type:
pos += 1
return True
return False
COMPILER CONSTRUCTION (COM 414)
def Stmt():
if Decl() and match("ID") and match("ASSIGN") and Expr() and match("SEMI"):
print("Valid Statement")
return True
return False
def Decl():
if tokens[pos][1] == "int":
match("ID")
return True
return False
def Expr():
if Term():
while pos < len(tokens) and tokens[pos][0] == "OP":
match("OP")
if not Term():
return False
return True
return False
def Term():
if match("NUMBER") or match("ID"):
COMPILER CONSTRUCTION (COM 414)
return True
return False
# Run parser
if Stmt() and pos == len(tokens):
print("Parsing successful!")
else:
print("Parsing failed!")
output
2) Write a bottom up parser for output of the scanner implemented in PRACTICAL THREE
What is a Bottom-Up Parser?
• A bottom-up parser starts with the tokens (leaves) and tries to reduce them to the start
symbol by applying grammar rules in reverse.
• Common examples are Shift-Reduce parsers.
Shift-Reduce Parsing Example (for int x = 100 + 20;)
Steps:
1. Input tokens: int x = 100 + 20 ;
2. Shift until a rule matches, then Reduce.
COMPILER CONSTRUCTION (COM 414)
Step Stack Input Action
1 int x = 100 + 20 ; Shift
2 Decl x = 100 + 20 ; Reduce (int→Decl)
3 Decl x = 100 + 20 ; Shift
4 Decl ID = 100 + 20 ; Reduce
5 Decl ID = 100 + 20 ; Shift
6 Decl ID = NUMBER + 20 ; Reduce (NUMBER→Term)
7 Decl ID = Term + 20 ; Reduce
8 Decl ID = Expr ; Reduce
9 Stmt `` (empty) Reduce
COMPILER CONSTRUCTION (COM 414)
PRACTICAL FIVE
Write simple program and matched date type and array bounds
# PRACTICAL FIVE: Data types and Array bounds
# Matching data types
age = 20 # integer
height = 5.9 # float
name = "Joel" # string
is_student = True # boolean
print("Age:", age, "->", type(age))
print("Height:", height, "->", type(height))
print("Name:", name, "->", type(name))
print("Is Student:", is_student, "->", type(is_student))
# Array (list in Python)
scores = [10, 20, 30, 40, 50] # size = 5
COMPILER CONSTRUCTION (COM 414)
print("\nArray Elements:")
for i in range(len(scores)):
print("Index", i, ":", scores[i])
# Testing array bounds
try:
print("\nAccessing index 5 (out of bounds):", scores[5])
except IndexError:
print("\nError: Tried to access index outside array bounds!")
output
COMPILER CONSTRUCTION (COM 414)
COMPILER CONSTRUCTION (COM 414)
PRACTICAL SIX
Implement stack machine code for postfix stack machine
# PRACTICAL SIX: Postfix Stack Machine Implementation
def evaluate_postfix(expression):
stack = []
for token in expression.split():
if token.isdigit(): # if operand, push to stack
stack.append(int(token))
else: # operator
b = stack.pop()
a = stack.pop()
if token == '+':
stack.append(a + b)
elif token == '-':
stack.append(a - b)
elif token == '*':
stack.append(a * b)
elif token == '/':
stack.append(a / b)
return stack.pop()
# Example
expr1 = "3 4 +"
expr2 = "5 1 2 + 4 * + 3 -"
COMPILER CONSTRUCTION (COM 414)
print("Postfix Expression:", expr1, "=", evaluate_postfix(expr1)) # 7
print("Postfix Expression:", expr2, "=", evaluate_postfix(expr2)) # 14
output
COMPILER CONSTRUCTION (COM 414)
PRACTICAL SEVEN
Implement code optimization
# PRACTICAL SEVEN: Non-optimized code
def sum_of_squares(n):
total = 0
for i in range(0, n):
total = total + i * i
return total
print(sum_of_squares(1000))
COMPILER CONSTRUCTION (COM 414)
PRACTICAL EIGHT
Using Assembly language generate machine code from Three Address statements, abstract syntax tree,
DAG
1. Three Address Statements (TAC)
TAC is an intermediate code used by compilers.
Example expression:
x=(a+b)∗(c−d)x = (a + b) * (c - d)x=(a+b)∗(c−d)
2. Abstract Syntax Tree (AST)
AST represents the structure of the expression.
COMPILER CONSTRUCTION (COM 414)
3. Directed Acyclic Graph (DAG)
DAG is like AST but avoids repeating common subexpressions.
If the expression had (a+b) used multiple times, DAG would reuse it once.
For our case:
4. Assembly Code Generation
Let’s convert the TAC into x86-like assembly:
COMPILER CONSTRUCTION (COM 414)
PRACTICAL NINE
Implement the Symbol Table using Linear List, Binary Tree, and Hash Table
techniques
1. Linear List Implementation
• Stores identifiers in a simple array or linked list.
• Search time = O(n) (sequential search).
Python Example:
class Node:
def __init__(self, key, type_, value):
self.key = key
self.type = type_
self.value = value
self.left = None
self.right = None
class SymbolTableBST:
def __init__(self):
self.root = None
def insert(self, root, key, type_, value):
if root is None:
return Node(key, type_, value)
if key < root.key:
root.left = self.insert(root.left, key, type_, value)
elif key > root.key:
root.right = self.insert(root.right, key, type_, value)
return root
COMPILER CONSTRUCTION (COM 414)
def lookup(self, root, key):
if root is None or root.key == key:
return root
if key < root.key:
return self.lookup(root.left, key)
return self.lookup(root.right, key)
# Example
bst = SymbolTableBST()
bst.root = bst.insert(bst.root, "x", "int", 10)
bst.insert(bst.root, "y", "float", 3.14)
print(bst.lookup(bst.root, "y").value)
COMPILER CONSTRUCTION (COM 414)
2. Binary Tree Implementation
• Organizes identifiers in a BST for faster lookup.
• Search time = O(log n) (on average).
Python Example:
class Node:
def __init__(self, key, type_, value):
self.key = key
self.type = type_
self.value = value
self.left = None
self.right = None
class SymbolTableBST:
def __init__(self):
self.root = None
def insert(self, root, key, type_, value):
if root is None:
return Node(key, type_, value)
if key < root.key:
root.left = self.insert(root.left, key, type_, value)
elif key > root.key:
root.right = self.insert(root.right, key, type_, value)
return root
def lookup(self, root, key):
if root is None or root.key == key:
return root
if key < root.key:
return self.lookup(root.left, key)
return self.lookup(root.right, key)
# Example
bst = SymbolTableBST()
bst.root = bst.insert(bst.root, "x", "int", 10)
bst.insert(bst.root, "y", "float", 3.14)
print(bst.lookup(bst.root, "y").value)
COMPILER CONSTRUCTION (COM 414)
3. Hash Table Implementation
• Uses hashing for O(1) average search/insert time.
Python Example:
symbol_table_hash = {}
def insert_hash(name, type_, value):
symbol_table_hash[name] = {"type": type_, "value": value}
def lookup_hash(name):
return symbol_table_hash.get(name, None)
# Example
insert_hash("x", "int", 10)
insert_hash("y", "float", 3.14)
print(lookup_hash("x"))
COMPILER CONSTRUCTION (COM 414)
PRACTICAL TEN
1. Trap Syntactically Wrong Statement in HLL
Using Python (try-except to catch syntax errors dynamically).
code_snippet = "a = 5 + " # Wrong statement
try:
exec(code_snippet)
except SyntaxError as e:
print("Syntax Error caught:", e)
2. Generate Error List of Wrong Statements
Check multiple code snippets and list errors.
snippets = [
"a = 5 + ", # Error
"b = 10", # Correct
"for i in range(5" # Error
]
error_list = []
for code in snippets:
try:
exec(code)
except Exception as e:
error_list.append({"statement": code, "error": str(e)})
print("Error List:")
for err in error_list:
print(err)
COMPILER CONSTRUCTION (COM 414)
PRACTICAL ELEVEN
Implement Bootstrap and Use Compiler Writing Tools
1. Bootstrap Compiler Concept
• A bootstrap compiler is a compiler written in the same programming language it
compiles.
• Example: Writing a C compiler in C, or a Python interpreter in Python.
• Steps:
1. Write a small simple compiler in another language (like Assembly).
2. Use it to compile a more advanced version of the compiler written in the target
language.
3. Repeat until a fully functional self-hosting compiler exists.
2. Compiler Writing Tools
• Lex/Flex → Used for lexical analysis (scanners).
• Yacc/Bison → Used for parsing (grammar to parser generator).
• ANTLR → Generates lexer and parser in Java, Python, C#, etc.
• LLVM → Used for code generation and optimization.
Example using Python’s PLY (Python Lex-Yacc):
import ply.lex as lex
tokens = ('NUMBER','PLUS','MINUS')
t_PLUS = r'\+'
t_MINUS = r'-'
t_NUMBER = r'\d+'
t_ignore = ' \t\n'
def t_error(t):
print("Illegal character:", t.value[0])
t.lexer.skip(1)
lexer = lex.lex()
lexer.input("3 + 5 - 2")
for tok in lexer:
print(tok)