
Lab Terminal

The Mini Compiler project is an educational tool designed to simulate the core functionalities of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and target code generation, using a simplified programming language. Developed in C#, it features a console-based interface that allows users to input code, run sample programs, and view detailed outputs of each compilation stage. Key challenges faced during development included understanding compiler concepts, building a robust lexer, handling semantic analysis, and implementing stack-based code generation.


QUESTION # 01:

Briefly explain your project

Project Title: Mini Compiler

Project Overview:
A Mini Compiler is a simplified version of a real-world compiler, created for educational purposes to demonstrate how
each stage of compilation works. The goal of this project is to simulate the core functionalities of a full compiler on a
smaller scale, focusing on the most essential compiler phases like lexical analysis, syntax analysis, semantic analysis,
intermediate code generation, optimization, and target code generation.

This project has been developed using C# and runs through a console-based interface. It supports a basic programming
language that includes variable declarations (int, float), arithmetic operations (+, -, *, /), conditional statements (if,
else), loops (while), and output statements (print).

The compiler processes source code written in this mini-language and translates it step by step into stack-based
virtual machine instructions, much as real compilers translate high-level code into machine code.

Key Functionalities and Phases:


1. Lexical Analysis:

This is the first phase of the compilation process. The source code entered by the user is scanned and broken down into
tokens, which are the smallest units of meaning in a program—such as keywords, identifiers, operators, numbers, and
symbols. The Lexer class handles this step. It removes whitespaces and comments and generates a list of valid tokens.
Invalid symbols are flagged as errors.
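As a concrete illustration, using the TokenType names that appear in the Lexer code excerpt in Question 2 (the exact token set is assumed from that excerpt), the statement x = 10; would be broken into a token stream like:

    IDENTIFIER("x")  ASSIGN("=")  NUMBER("10")  SEMICOLON(";")  EOF

Each token also carries the line and column where it starts, which the later phases use for error reporting.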

2. Syntax Analysis:

In this phase, the compiler checks whether the sequence of tokens follows the grammatical rules of the language. It
constructs an Abstract Syntax Tree (AST) to represent the program’s structure. This is handled by the Parser class. If a
syntax rule is violated, errors are generated and displayed with line and column numbers.
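The node types referenced elsewhere in this report (NumberExpression, IdentifierExpression, BinaryExpression, AssignmentStatement, and so on) suggest an AST class hierarchy along the following lines. This is a sketch of one plausible shape, not the project's exact code:

    using System.Collections.Generic;

    // Sketch only: the actual AST classes may differ in detail.
    public abstract class Expression { }
    public class NumberExpression : Expression { public double Value; }
    public class IdentifierExpression : Expression { public string Name; }
    public class BinaryExpression : Expression
    {
        public Expression Left, Right;
        public string Operator;   // "+", "-", "*", "/", "<", "&&", ...
    }

    public abstract class Statement { }
    public class DeclarationStatement : Statement { public string Type, Variable; }
    public class AssignmentStatement : Statement { public string Variable; public Expression Value; }
    public class PrintStatement : Statement { public Expression Expression; }
    public class IfStatement : Statement
    {
        public Expression Condition;
        public List<Statement> ThenBlock = new List<Statement>();
        public List<Statement> ElseBlock = new List<Statement>();
    }
    public class WhileStatement : Statement
    {
        public Expression Condition;
        public List<Statement> Body = new List<Statement>();
    }

The Parser builds a tree of these nodes, which the later phases then walk recursively.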

3. Semantic Analysis:

Semantic analysis verifies that the code makes logical sense. It checks:

• If variables are declared before use
• If types match in expressions (e.g., int + float)
• If variables are initialized before they are used

The SemanticAnalyzer class performs this step and maintains a Symbol Table that stores variable names, their data
types, and whether they are initialized. Errors and warnings are provided accordingly.
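
Based on the fields accessed in the AnalyzeStatement code in Question 2 (Type, IsInitialized, Value), a Symbol Table entry plausibly looks like this sketch (inferred, not the project's exact code):

    // Sketch of a Symbol Table entry, inferred from how SemanticAnalyzer uses it.
    public class Symbol
    {
        public string Name;
        public string Type;          // "int" or "float"
        public bool IsInitialized;   // set to true on first assignment
        public object Value;         // tracked so expressions can be evaluated

        public Symbol(string name, string type)
        {
            Name = name;
            Type = type;
            IsInitialized = false;
        }
    }

    // The table itself maps variable names to their symbols:
    // Dictionary<string, Symbol> symbolTable = new Dictionary<string, Symbol>();
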
4. Intermediate Code Generation:

Once the code passes semantic checks, an intermediate representation is generated, often called
Three-Address Code (TAC). This is machine-independent and easy to optimize. For example:

t1 = y * 2
t2 = x + t1
result = t2

The IntermediateCodeGenerator class is responsible for this phase.
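
One common way to produce the temporary names (t1, t2, ...) is a simple counter. The helper below is an illustrative assumption, not necessarily the project's exact IntermediateCodeGenerator code:

    // Hypothetical helper: fresh temporary names for three-address code.
    private int tempCounter = 0;

    private string NewTemp() => "t" + (++tempCounter);

    // Generating TAC for result = x + y * 2 would then emit, in order:
    //   t1 = y * 2
    //   t2 = x + t1
    //   result = t2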

5. Optimization (Optional Phase):

Simple optimizations are applied to improve efficiency, like:

• Constant folding: Replacing expressions with their evaluated values (e.g., 3 + 2 becomes 5)
• Dead code elimination: (Not implemented here, but can be added)

The Optimizer class applies such improvements where applicable.
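
A minimal constant-folding pass over the AST could look like the sketch below. It assumes the BinaryExpression/NumberExpression node shapes used elsewhere in this report and is illustrative rather than the project's exact Optimizer code:

    // Sketch: fold binary expressions whose operands are both numeric literals.
    private Expression Fold(Expression expr)
    {
        if (expr is BinaryExpression bin)
        {
            var left = Fold(bin.Left);
            var right = Fold(bin.Right);
            if (left is NumberExpression l && right is NumberExpression r)
            {
                switch (bin.Operator)
                {
                    case "+": return new NumberExpression { Value = l.Value + r.Value };
                    case "-": return new NumberExpression { Value = l.Value - r.Value };
                    case "*": return new NumberExpression { Value = l.Value * r.Value };
                    case "/": return new NumberExpression { Value = l.Value / r.Value };
                }
            }
            bin.Left = left;   // keep any folding done deeper in the tree
            bin.Right = right;
        }
        return expr;
    }

Applied to 3 + 2, the pass returns a single NumberExpression with value 5.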

6. Target Code Generation:

In this final step, the compiler generates stack-based virtual machine instructions which can be run on a virtual
machine simulator. Instructions include operations like PUSH, POP, LOAD, STORE, ADD, SUB, etc. This code mimics how
machine instructions would be generated in a real compiler.
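
Following the post-order traversal in the GenerateExpression code shown in Question 3 (left operand, right operand, then operator), the statement result = x + y * 2; would translate to something like the listing below (the comment style is illustrative):

    LOAD x       ; push value of x
    LOAD y       ; push value of y
    PUSH 2       ; push constant 2
    MUL          ; y * 2
    ADD          ; x + (y * 2)
    STORE result ; pop result into variable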

Console Application Features:

The Mini Compiler offers a user-friendly menu-based interface:

1. Enter Custom Code: Users can write and test their own code.
2. Run Sample Program: A pre-defined code snippet is compiled.
3. Language Reference: Displays supported keywords, operators, syntax, and rules.
4. Exit: Closes the compiler.

Each compilation stage is clearly displayed with timings and outputs, including:

• Tokens list
• Syntax Tree
• Symbol Table
• Errors/Warnings
• Intermediate Code
• Target VM Code

Example Program Supported:


int x;
float y;
x = 10;
y = 20.5;
int result;
result = x + y * 2;
print(result);
if (x < y && y > 15.0) {
    print(1);
} else {
    print(0);
}
int i;
i = 0;
while (i < 3) {
    print(i);
    i = i + 1;
}

The above program is correctly parsed, checked semantically, and compiled into stack-based code.

QUESTION # 02:

Explain any 2 analysis functionalities along with screenshots (function code + output)

1. Lexical Analysis – Tokenization Phase


Lexical analysis is the first phase of the compilation process. It involves scanning the source code and
converting it into tokens, which are the smallest meaningful units of the program. These tokens include
keywords (int, float), identifiers (x, value1), operators (+, =, ==), punctuation (;, (, )), and more.

This task is performed by the Lexer class. It eliminates whitespaces and comments, and ensures only
meaningful parts of the code are passed forward to the next phase (Syntax Analysis).

Function Code (Tokenize Function):

public List<Token> Tokenize()
{
    var tokens = new List<Token>();
    while (currentChar != '\0')
    {
        SkipWhitespace();
        if (currentChar == '\0') break; // guard: input may end in trailing whitespace

        if (char.IsDigit(currentChar))
        {
            string number = ReadNumber();
            tokens.Add(new Token(TokenType.NUMBER, number, line, column));
        }
        else if (char.IsLetter(currentChar) || currentChar == '_')
        {
            string identifier = ReadIdentifier();
            TokenType type = keywords.ContainsKey(identifier) ? keywords[identifier] : TokenType.IDENTIFIER;
            tokens.Add(new Token(type, identifier, line, column));
        }
        else
        {
            switch (currentChar)
            {
                case '+':
                    tokens.Add(new Token(TokenType.PLUS, "+", line, column)); Advance(); break;
                case '-':
                    tokens.Add(new Token(TokenType.MINUS, "-", line, column)); Advance(); break;
                case '*':
                    tokens.Add(new Token(TokenType.MULTIPLY, "*", line, column)); Advance(); break;
                case '/':
                    tokens.Add(new Token(TokenType.DIVIDE, "/", line, column)); Advance(); break;
                case '=':
                    Advance();
                    if (currentChar == '=')
                    {
                        tokens.Add(new Token(TokenType.EQUAL, "==", line, column)); Advance();
                    }
                    else
                    {
                        tokens.Add(new Token(TokenType.ASSIGN, "=", line, column));
                    }
                    break;
                default:
                    tokens.Add(new Token(TokenType.UNKNOWN, currentChar.ToString(), line, column)); Advance();
                    break;
            }
        }
    }
    tokens.Add(new Token(TokenType.EOF, "", line, column));
    return tokens;
}

2. Semantic Analysis – Logical Validity Checking


Semantic analysis is the third phase of compilation. Once the code is syntactically correct (grammar-wise),
semantic analysis ensures that the code is logically meaningful and contextually valid.

Key responsibilities:

• Ensure variables are declared before use.
• Prevent assigning float to int without casting.
• Track whether variables are initialized before being used.
• Build and manage the Symbol Table.

This logic is handled by the SemanticAnalyzer class.

Function Code (AnalyzeStatement Function):

private void AnalyzeStatement(Statement stmt)
{
    switch (stmt)
    {
        case DeclarationStatement decl:
            if (symbolTable.ContainsKey(decl.Variable))
                Errors.Add($"Variable '{decl.Variable}' already declared");
            else
                symbolTable[decl.Variable] = new Symbol(decl.Variable, decl.Type);
            break;

        case AssignmentStatement assign:
            if (!symbolTable.ContainsKey(assign.Variable))
            {
                Errors.Add($"Variable '{assign.Variable}' not declared");
            }
            else
            {
                string exprType = AnalyzeExpression(assign.Value);
                if (symbolTable[assign.Variable].Type == "int" && exprType == "float")
                {
                    Errors.Add($"Cannot assign float to int variable '{assign.Variable}'");
                }
                else
                {
                    double value = EvaluateExpression(assign.Value);
                    symbolTable[assign.Variable].IsInitialized = true;
                    symbolTable[assign.Variable].Value =
                        symbolTable[assign.Variable].Type == "int" ? (int)value : value;
                }
            }
            break;

        case PrintStatement print:
            AnalyzeExpression(print.Expression);
            break;

        case IfStatement ifStmt:
            AnalyzeExpression(ifStmt.Condition);
            foreach (var s in ifStmt.ThenBlock) AnalyzeStatement(s);
            foreach (var s in ifStmt.ElseBlock) AnalyzeStatement(s);
            break;

        case WhileStatement whileStmt:
            AnalyzeExpression(whileStmt.Condition);
            foreach (var s in whileStmt.Body) AnalyzeStatement(s);
            break;
    }
}
QUESTION # 03:
For any given input, give detail of how you arrive at the output. (Attach relevant code segments and give
screenshots of input and output.)

Lexical Analysis:

Purpose:
This is the first phase where the raw source code is read character-by-character and broken into tokens. These
tokens include keywords (int, float), identifiers (x, value), literals (2.3, 5), operators (+, -), and punctuation
(;, ()).

Example:
int a; → INT, IDENTIFIER, SEMICOLON

Handled By: Lexer class


Semantic Analysis

Purpose:
Ensures that the code is logically correct. It checks:

• If variables are declared before use
• If types match during assignments
• If variables are initialized before being used

It also builds a Symbol Table to track variable names, types, and values.

Handled By: SemanticAnalyzer class


Syntax Analysis
Syntax analysis, also called parsing, is the second phase of compilation where the parser checks whether the sequence
of tokens (from lexical analysis) forms valid sentences according to the grammar of the programming language.
Intermediate Code Generation (IR)

Target Code Generation


public void GenerateAssignment(string variable, Expression expr)
{
    GenerateExpression(expr);
    instructions.Add($"STORE {variable}");
}

public void GenerateExpression(Expression expr)
{
    if (expr is NumberExpression num)
    {
        instructions.Add($"PUSH {num.Value}");
    }
    else if (expr is IdentifierExpression id)
    {
        instructions.Add($"LOAD {id.Name}");
    }
    else if (expr is BinaryExpression bin)
    {
        GenerateExpression(bin.Left);
        GenerateExpression(bin.Right);
        switch (bin.Operator)
        {
            case "+": instructions.Add("ADD"); break;
            case "-": instructions.Add("SUB"); break;
            case "*": instructions.Add("MUL"); break;
            case "/": instructions.Add("DIV"); break;
        }
    }
}

public void GeneratePrint(Expression expr)
{
    GenerateExpression(expr);
    instructions.Add("PRINT");
}
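
To make the instruction set concrete, a tiny interpreter for these generated instructions might look like the following sketch. The variable store and the parsing of operands are assumptions based only on the opcodes listed above, not the project's actual virtual machine code:

    using System;
    using System.Collections.Generic;

    // Sketch: a minimal stack machine that runs the generated instructions.
    public void Run(List<string> instructions)
    {
        var stack = new Stack<double>();
        var memory = new Dictionary<string, double>();

        foreach (var instr in instructions)
        {
            var parts = instr.Split(' ');
            switch (parts[0])
            {
                case "PUSH": stack.Push(double.Parse(parts[1])); break;
                case "LOAD": stack.Push(memory[parts[1]]); break;
                case "STORE": memory[parts[1]] = stack.Pop(); break;
                case "ADD": { var b = stack.Pop(); stack.Push(stack.Pop() + b); } break;
                case "SUB": { var b = stack.Pop(); stack.Push(stack.Pop() - b); } break;
                case "MUL": { var b = stack.Pop(); stack.Push(stack.Pop() * b); } break;
                case "DIV": { var b = stack.Pop(); stack.Push(stack.Pop() / b); } break;
                case "PRINT": Console.WriteLine(stack.Pop()); break;
            }
        }
    }

Note that for SUB and DIV the right operand is popped first, matching the left-then-right push order used by GenerateExpression.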
QUESTION # 04:
What Challenges Did You Face During the Project?

During the development of the Mini Compiler project, we faced a variety of technical and logical challenges across
different phases. These challenges helped us improve our understanding of compiler architecture, programming logic, and
problem-solving strategies.

Understanding Compiler Concepts


One of the first challenges was to understand the theoretical concepts of compiler design such as lexical analysis, syntax
parsing, semantic checking, and code generation. These topics were new and abstract, and it was difficult to grasp how
each phase worked together to form a complete compiler.

Building a Robust Lexer (Tokenization)


Creating the lexical analyzer that could identify keywords, identifiers, numbers, symbols, and operators was tricky.
Handling multi-character operators (like ==, !=) and floating-point numbers required additional logic and testing.

Semantic Analysis and Type Checking


Creating a symbol table to keep track of declared variables and their types was essential. One of the major challenges was
handling type mismatches, such as assigning a float value to an integer variable (e.g., int x = 2.5;). This required strict
type-checking logic to prevent logical errors.

Tracking Line and Column Numbers


To make error messages user-friendly, we had to show the exact line and column of each token or error. Tracking this
information while handling whitespaces, newlines, and complex expressions was challenging and required careful
implementation in the lexer.

Stack-Based Code Generation


Translating the high-level code into stack-based virtual machine instructions (like PUSH, POP, LOAD, ADD) was
another complex phase. Generating correct instruction sequences based on operator precedence and expression depth
needed thorough testing.

Handling Expression Evaluation and Temporary Variables


While generating intermediate code (three-address code), we faced issues in correctly handling nested expressions. We
had to manage temporary variables (like t1, t2) and ensure the correct order of operations, such as evaluating
multiplication before addition.
Console User Interface Design
We wanted the compiler to be user-friendly, even though it was a console application. Designing a clear, interactive menu
and showing phase-wise outputs like tokens, syntax tree, and symbol table required formatting and output structuring
effort.

Implementing Basic Optimization


Although optional, we attempted to implement basic optimization techniques such as constant folding. Detecting and
simplifying constant expressions during compile-time (e.g., 2 + 3 becoming 5) improved performance but required
additional logic in the optimizer.

Debugging and Testing


Debugging the compiler, especially the parser and semantic analyzer, was time-consuming. One small mistake in
grammar implementation or symbol tracking would cause incorrect outputs or crashes. Extensive testing with different
code samples helped us identify and fix issues.

Conclusion:
Each of these challenges contributed to making the project a complete learning experience. Solving them step-by-step not
only strengthened our technical skills but also gave us deep insight into how real-world compilers work internally. The
experience of building a compiler from scratch improved our confidence in both programming and system-level thinking.

QUESTION # 05:
Design a Domain-Specific Language (DSL) in C# to define and generate gameplay elements like police units,
criminal waves, backup support, and city levels for a dynamic police shooter.
OUTPUT:

Load Script:
Killing The Criminals:

Backup:
