Chapter 1 - Introduction To Comp

The document discusses the basics of compiler design including definitions, phases, and concepts. It defines a compiler as a program that translates code from one language to another. It describes the main phases of compilation as analysis and synthesis, and covers the various steps within each phase like lexical analysis, syntax analysis, and code generation.

Uploaded by

Aschalew Ayele

Compiler Design

(CoSc-3112)

Chapter One
Introduction

Dawit Kassahun (MSc.)

Department of Computer Science
Dilla University

Mar 2024
Objective
At the end of this session, students will be able to:
➢ Understand the basic concepts and principles of compiler design.
➢ Understand the term compiler, its functions, and how it works.
➢ Be familiar with the different classifications of compilers.
➢ Be familiar with the cousins of the compiler: linkers, loaders, interpreters, and assemblers.
➢ Understand the need for studying compiler design and construction.
➢ Understand the phases and steps of compilation.

Outline

• Introduction
• Analysis and Synthesis in compilation
• Various phases in a compilation
• Compiler construction tools
Introduction to Language Processors

➢ A compiler is an executable program that reads a program written in one high-level language and translates it into an equivalent executable program in machine language.
➢ More generally, a compiler is a computer program that translates a program written in a source language into an equivalent program in a target language.
➢ A source program/code is a program/code written in the source language, which is usually a high-level language.
➢ A target program/code is a program/code written in the target language, which is often a machine language or an intermediate code (object code).

[Figure: source program (e.g. C++) → Compiler → target program (e.g. Assembly); the compiler also emits error messages]
Cousins of Compilers

A. Interpreter:- another common kind of language processor. Instead of producing a target program, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.
✓ It produces the output of each statement as that statement is translated.
✓ It generally uses one of the following strategies for program execution:
i. Execute the source code directly.
ii. Translate the source code into some efficient intermediate representation and immediately execute it.
iii. Explicitly execute stored precompiled code made by a compiler that is part of the interpreter system.

B. Assembler:- a translator that converts programs written in assembly language into machine code. It:
✓ Translates mnemonic operation codes to their machine-language equivalents.
✓ Assigns machine addresses to symbolic labels.
C. Linker:- a program that takes one or more object files generated by a compiler and combines them into a single executable program.
D. Loader:- the part of an operating system that is responsible for loading programs from executables (i.e., executable files) into memory, preparing them for execution, and then executing them.

A language-processing system has four translator phases: preprocessor, compiler, assembler, and linker/loader.
source program
→ preprocessor → modified source program
→ compiler → target assembly program
→ assembler → relocatable machine code
→ linker/loader (together with library files and relocatable object files) → target machine code
Analysis and Synthesis in compilation

There are two parts to compilation: analysis and synthesis.
▪ During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
▪ During synthesis, the tree structure is taken and its operations are translated into the target program.

Analysis (front end)
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
✓ Breaks up the source program into constituent pieces.
✓ Imposes a grammatical structure on these pieces.
✓ Creates an intermediate representation of the source program.
✓ Collects information about the source program and stores it in a symbol table.

Synthesis (back end)
4. Code Generation
5. Optimization
✓ Constructs the target program from the intermediate representation.
Various phases in a compilation
Analysis
1. Linear/Lexical analysis (Scanning)
✓ The stream of characters is read from left to right and grouped into tokens.
✓ A token is a sequence of characters having a collective meaning.
✓ The token is the basic lexical unit.
✓ Examples:
• Identifiers (e.g. variable names)
• Keywords
• Symbols (+, -, …)
• Numbers
• Etc.
✓ Blanks, new lines, and tabulation marks are removed during lexical analysis.
✓ Example:
    DIST1 = DIST2 + 5 * 4
    <IDENT,1> <ASSIGN> <IDENT,2> <PLUS> <NUMB,3> <MULT> <NUMB,4>
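The tokenization of DIST1 = DIST2 + 5 * 4 can be sketched in a few lines of Python. This is only an illustration, not a production scanner: the token names follow the slide, while the regular expressions, the function name scan, and the choice to store identifiers and numbers in one table numbered from 1 are assumptions made so that the output matches the example.

```python
import re

# Hypothetical token definitions; the names follow the slide's example.
TOKEN_SPEC = [
    ("NUMB",   r"\d+"),
    ("IDENT",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN", r"="),
    ("PLUS",   r"\+"),
    ("MULT",   r"\*"),
    ("SKIP",   r"[ \t\n]+"),   # blanks, tabs and new lines are dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Return (tokens, symbol_table); table entries are numbered from 1."""
    symbol_table = []   # lexemes of identifiers and numbers, in order of first appearance
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind in ("IDENT", "NUMB"):
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append((kind, symbol_table.index(lexeme) + 1))
        else:
            tokens.append((kind,))
    return tokens, symbol_table


tokens, table = scan("DIST1 = DIST2 + 5 * 4")
print(tokens)
print(table)
```

The second component of an IDENT or NUMB token is its position in the table, which is how the later phases can look up the lexeme and its attributes.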
2. Hierarchical/Syntax analysis (Parsing)
✓ Tokens are grouped hierarchically into nested collections with collective meaning.
✓ The result is generally a parse tree.
✓ Most syntactic errors in the source program are caught in this phase.
✓ The syntactic rules of the source language are given by a grammar.
✓ Consider the previous example: DIST1 = DIST2 + 5 * 4
[Figure: parse tree over the token sequence IDENT ASSIGN IDENT PLUS NUMB MULT NUMB]
3. Semantic analysis
✓ Certain checks are performed to make sure that the components of the program fit together meaningfully.
✓ Unlike parsing, this phase checks for semantic errors in the source program (e.g. type mismatch).
✓ Semantic analysis uses the symbol table.
✓ Symbol table:- a data structure with a record for each identifier and its attributes.
• Attributes include storage allocation, type, scope, etc.
• All the compiler phases insert into and modify the symbol table.
✓ The result of semantic analysis is Intermediate Code (IC).
✓ The IC can be represented using either an abstract syntax tree or three-address code.
✓ The previous example in three-address code:
    TEMP1 = 5 * 4
    TEMP2 = inttoreal( TEMP1 )
    TEMP3 = IDENT2 + TEMP2
    IDENT1 = TEMP3
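One way to see where the four instructions above come from is to walk a syntax tree and emit one instruction per operator, inserting inttoreal wherever an integer meets a real. The sketch below is a minimal illustration under stated assumptions: the tree is written by hand as nested tuples, both identifiers are assumed to be declared real, and the class name TacGenerator and its methods are hypothetical.

```python
# Assumption: both identifiers were declared with type real.
TYPES = {"IDENT1": "real", "IDENT2": "real"}


class TacGenerator:
    def __init__(self):
        self.code = []
        self.temp_count = 0

    def new_temp(self):
        self.temp_count += 1
        return f"TEMP{self.temp_count}"

    def to_real(self, place):
        """Emit a conversion and return the temporary holding the real value."""
        temp = self.new_temp()
        self.code.append(f"{temp} = inttoreal( {place} )")
        return temp

    def expr(self, node):
        """Emit code for an expression node; return (place, type)."""
        if isinstance(node, int):
            return str(node), "int"
        if isinstance(node, str):
            return node, TYPES[node]
        op, left, right = node
        lplace, ltype = self.expr(left)
        rplace, rtype = self.expr(right)
        if ltype != rtype:
            # Widen the int operand so that both operands are real.
            if ltype == "int":
                lplace = self.to_real(lplace)
            else:
                rplace = self.to_real(rplace)
            ltype = "real"
        temp = self.new_temp()
        self.code.append(f"{temp} = {lplace} {op} {rplace}")
        return temp, ltype

    def assign(self, target, node):
        place, typ = self.expr(node)
        if typ == "int" and TYPES[target] == "real":
            place = self.to_real(place)
        self.code.append(f"{target} = {place}")


gen = TacGenerator()
# Hand-built tree for IDENT1 = IDENT2 + 5 * 4
gen.assign("IDENT1", ("+", "IDENT2", ("*", 5, 4)))
print("\n".join(gen.code))
```

Each emitted instruction has at most one operator on the right-hand side, which is what makes the code "three-address".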
Synthesis
Synthesis is composed of two phases:
1. Code optimization
2. Code generation
1. Code optimization
✓ This phase changes the IC so that the code generator produces a faster program that consumes less memory.
✓ The optimized code does the same thing as the original (non-optimized) code, but at a lower cost in CPU time and memory space.
✓ There are several techniques for optimizing code; they will be discussed in the last chapter.
✓ Example:
Unnecessary lines of code in loops (i.e. code that could be executed outside of the loop) are moved out of the loop.
Before:
    for(i=1; i<10; i++){
        x = y+1;
        z = x+i;
    }
After:
    x = y+1;
    for(i=1; i<10; i++)
        z = x+i;
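The loop example can be checked by running both versions and comparing their results. The Python sketch below mirrors the two loops; the function names original and hoisted are hypothetical, and the comparison over a small range of inputs is a sanity check rather than a proof of equivalence.

```python
def original(y):
    """x = y + 1 is recomputed on every iteration."""
    x = z = None
    for i in range(1, 10):
        x = y + 1
        z = x + i
    return x, z


def hoisted(y):
    """The loop-invariant assignment is moved in front of the loop."""
    x = y + 1
    z = None
    for i in range(1, 10):
        z = x + i
    return x, z


# Both versions leave x and z with the same final values.
assert all(original(y) == hoisted(y) for y in range(-5, 6))
print(original(2), hoisted(2))
```

Moving the assignment is safe here because y is not changed inside the loop and the loop body runs at least once; a compiler has to establish such conditions before it performs the transformation.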
2. Code generation
✓ The final phase of the compiler.
✓ Generates the target code in the target language (e.g. Assembly).
✓ The instructions in the IC are translated into a sequence of machine instructions that perform the same task.
Phase I: Lexical Analysis

❖ The low-level text-processing portion of the compiler.
❖ The source file, a stream of characters, is broken into larger chunks called tokens.
For example:
    void main()
    {
        int x;
        x=3;
    }
It will be broken into 13 tokens as below:
    void main ( ) { int x ; x = 3 ; }
❖ The lexical analyzer (scanner) reads a stream of characters and puts them together into meaningful (with respect to the source language) units called tokens.
❖ Typically, spaces, tabs, end-of-line characters, and comments are ignored by the lexical analyzer.
❖ To design a lexical analyzer: input a description (regular expressions) of the tokens in the language, and output a lexical analyzer (a program).
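The idea of "regular expressions in, scanner out" can be illustrated with a very small scanner generator. The sketch below is an assumption-laden toy, not how tools such as Lex are implemented: make_scanner, C_LIKE_SPEC, and the token class names are hypothetical, and the patterns cover only what the void main() example needs.

```python
import re


def make_scanner(token_spec, ignore=r"\s+|/\*.*?\*/"):
    """Build a scanner function from (name, regex) pairs.

    Text matching `ignore` (white space and /* ... */ comments) is skipped.
    """
    parts = [f"(?P<{name}>{pattern})" for name, pattern in token_spec]
    parts.append(f"(?P<IGNORE>{ignore})")
    master = re.compile("|".join(parts), re.DOTALL)

    def scanner(source):
        tokens, pos = [], 0
        while pos < len(source):
            match = master.match(source, pos)
            if match is None:
                raise SyntaxError(f"illegal character {source[pos]!r}")
            if match.lastgroup != "IGNORE":
                tokens.append((match.lastgroup, match.group()))
            pos = match.end()
        return tokens

    return scanner


# Hypothetical token description for the small example program.
C_LIKE_SPEC = [
    ("KEYWORD", r"\b(?:void|int)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("NUMBER",  r"\d+"),
    ("SYMBOL",  r"[(){};=]"),
]

scan_c = make_scanner(C_LIKE_SPEC)
tokens = scan_c("void main()\n{\n  int x;\n  x=3;\n}")
print(len(tokens))
print(" ".join(lexeme for _, lexeme in tokens))
```

Keywords are listed before identifiers so that void and int are not classified as IDENT; the order of the alternatives is the tie-breaking rule in this sketch.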
Phase II: Parsing (Syntax Analysis)

❖ A parser gets a stream of tokens from the scanner and determines whether the syntax (structure) of the program is correct according to the (context-free) grammar of the source language.
❖ Then, it produces a data structure, called a parse tree or an abstract syntax tree, which describes the syntactic structure of the program.
✓ The parser ensures that the sequence of tokens returned by the lexical analyzer forms a syntactically correct program.
✓ It also builds a structured representation of the program, called an abstract syntax tree, that is easier for the type checker to analyze than a stream of tokens.
❖ It catches syntax errors such as the one in the statement below:
    if if (x > 3) then x = x + 1
❖ Context-free grammars are used (as the input) by the parser generator to describe the syntax of the language being compiled.
❖ Most compilers do not generate a parse tree explicitly but rather go to intermediate code directly as syntax analysis takes place.
Parse Tree

The output of parsing; it shows a top-down description of the program syntax.
The root node is the entire program, and the leaves are the tokens that were identified during lexical analysis.
It is constructed by repeated application of the rules in a Context-Free Grammar (CFG).
Syntax structures are analyzed by a DPDA (Deterministic Pushdown Automaton).

Example: parse tree for position:=initial + rate*60
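The tree for this statement can be built with a small hand-written recursive-descent parser. Note that this is a different technique from a table-driven pushdown automaton, and that the grammar below (assignment, expr, term, factor) is an assumed fragment chosen so that * binds tighter than +. The result is a condensed tree of nested tuples rather than a full parse tree with one node per grammar rule.

```python
import re

TOKEN = re.compile(r"\s*(:=|[A-Za-z_]\w*|\d+|[+*()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {token!r}")
        self.pos += 1
        return token

    def assignment(self):      # assignment -> IDENT ":=" expr
        target = self.expect()
        if not (target[0].isalpha() or target[0] == "_"):
            raise SyntaxError(f"expected an identifier, found {target!r}")
        self.expect(":=")
        tree = ("assign", target, self.expr())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):            # expr -> term ("+" term)*
        node = self.term()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.term())
        return node

    def term(self):            # term -> factor ("*" factor)*
        node = self.factor()
        while self.peek() == "*":
            self.expect("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):          # factor -> IDENT | NUMBER | "(" expr ")"
        token = self.expect()
        if token == "(":
            node = self.expr()
            self.expect(")")
            return node
        if token in ("+", "*", ")", ":="):
            raise SyntaxError(f"unexpected token {token!r}")
        return token


def parse(text):
    return Parser(tokenize(text)).assignment()


tree = parse("position:=initial + rate*60")
print(tree)
```

The multiplication ends up deeper in the tree than the addition, so a later phase that evaluates the tree bottom-up computes rate*60 first.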

Phase III: Semantic Analysis

❖ It gets the parse tree from the parser together with information about some syntactic elements.
❖ It determines whether the semantics (meaning) of the program is correct.
❖ It detects errors in the program, such as using variables before they are declared, assigning an integer value to a Boolean variable, …
❖ This part deals with static semantics:
✓ the semantics of programs that can be checked by reading the program text only;
✓ the syntax of the language that cannot be described by a context-free grammar.
❖ Mostly, a semantic analyzer does type checking (i.e. it gathers type information for subsequent code generation).
❖ It modifies the parse tree in order to get (statically) semantically correct code.
❖ In this phase, the abstract syntax tree that is produced by the parser is traversed, looking for semantic errors.
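Two of the errors named above, using a variable before it is declared and assigning an integer to a Boolean variable, can be detected with a dictionary that records each declared name and its type. The sketch below assumes a hypothetical statement format of ("declare", name, type) and ("assign", name, value) tuples; a real semantic analyzer would walk the abstract syntax tree instead.

```python
def literal_type(value):
    # bool must be tested first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    raise TypeError(f"unsupported literal {value!r}")


def check(program):
    """Return a list of static-semantic error messages for a list of statements."""
    symbols = {}   # name -> declared type
    errors = []
    for lineno, stmt in enumerate(program, start=1):
        if stmt[0] == "declare":
            _, name, typ = stmt
            if name in symbols:
                errors.append(f"line {lineno}: '{name}' is already declared")
            symbols[name] = typ
        elif stmt[0] == "assign":
            _, name, value = stmt
            if name not in symbols:
                errors.append(f"line {lineno}: '{name}' used before declaration")
                continue
            if isinstance(value, str):   # the right-hand side is a variable
                if value not in symbols:
                    errors.append(f"line {lineno}: '{value}' used before declaration")
                    continue
                value_type = symbols[value]
            else:
                value_type = literal_type(value)
            if value_type != symbols[name]:
                errors.append(
                    f"line {lineno}: cannot assign {value_type} to {symbols[name]} '{name}'"
                )
    return errors


program = [
    ("declare", "flag", "bool"),
    ("assign", "flag", 3),        # type mismatch
    ("assign", "count", 1),       # used before its declaration
    ("declare", "count", "int"),
    ("assign", "count", 7),       # fine
]
errors = check(program)
for message in errors:
    print(message)
```

Neither error could be caught by the parser: both statements are syntactically well formed, and only the recorded declarations reveal the problem.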
Phase III: Semantic Analysis (contd.)
❖ The main tool used by the semantic analyzer is a symbol table.
✓ Symbol table:- a data structure with a record for each identifier and its attributes.
✓ Attributes include storage allocation, type, scope, etc.
✓ All the compiler phases insert into and modify the symbol table.
❖ Discovery of meaning in a program using the symbol table:
✓ Do static semantics checks.
✓ Simplify the structure of the parse tree (from parse tree to abstract syntax tree (AST)).
❖ Static semantics checks:
✓ Making sure identifiers are declared before use.
✓ Type checking for assignments and operators.
✓ Checking the types and number of parameters to subroutines.
✓ Making sure functions contain return statements.
✓ Making sure there are no repeats among switch statement labels.
Phase IV: Intermediate Code Generation

❖ An intermediate code generator
✓ takes a parse tree from the semantic analyzer, and
✓ generates a program in the intermediate language.
❖ In some compilers, a source program is translated into an intermediate code first, and then the intermediate code is translated into the target language.
❖ In other compilers, a source program is translated directly into the target language.
❖ The compiler makes a second pass over the parse tree to produce the translated code.
❖ If there are no compile-time errors, the semantic analyzer translates the abstract syntax tree into the abstract assembly tree.
❖ The abstract assembly tree is passed to the code optimization and assembly code generation phases.
Phase IV: Intermediate Code Generation (contd.)

❖ Using intermediate code is beneficial when compilers that translate a single source language into many target languages are required.
✓ The front end of the compiler (scanner through intermediate code generator) can be shared by every such compiler.
✓ A different back end (code optimizer and code generator) is required for each target language.
❖ One popular intermediate code is three-address code.
✓ A three-address code instruction has the form x = y op z.

Phase V: Assembly Code Generation

❖ The code generator converts the abstract assembly tree into the actual assembly code.
❖ To do code generation:
✓ The generator covers the abstract assembly tree with tiles (each tile represents a small portion of an abstract assembly tree), and
✓ outputs the actual assembly code associated with the tiles used to cover the tree.

Phase VI: Machine Code Generation and Linking

❖ The final phase of compilation converts the assembly code into machine code and links in (by a linker) the appropriate language libraries.

Code Optimization

❖ Replacing an inefficient sequence of instructions with a better sequence of instructions.
❖ Sometimes called code improvement.
❖ Code optimization can be done:
✓ after semantic analysis (performed on a parse tree)
✓ after intermediate code generation (performed on intermediate code)
✓ after code generation (performed on target code)
❖ Two types of optimization:
1. Local
2. Global
Local Optimization

❖ The compiler looks at a very small block of instructions and tries to determine how it can improve the efficiency of this local code block.
❖ Relatively easy; included as part of most compilers.

Examples of possible local optimizations:
1. Constant evaluation
2. Strength reduction
3. Eliminating unnecessary operations
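The three local optimizations listed above can each be shown as a rewrite of a single three-address instruction. The sketch below is illustrative only: the (dest, left, op, right) tuple format, the use of None to mark a plain copy, and the restriction to + and * are assumptions made to keep the example short.

```python
import operator

FOLDABLE = {"+": operator.add, "*": operator.mul}


def optimize(instr):
    """Apply at most one local rewrite to a (dest, left, op, right) instruction.

    A result of the form (dest, value, None, None) stands for the copy dest = value.
    """
    dest, left, op, right = instr
    # 1. Constant evaluation: both operands are known at compile time.
    if op in FOLDABLE and isinstance(left, int) and isinstance(right, int):
        return (dest, FOLDABLE[op](left, right), None, None)
    # 3. Eliminating unnecessary operations: x + 0, 0 + x, x * 1, 1 * x.
    if (op == "+" and right == 0) or (op == "*" and right == 1):
        return (dest, left, None, None)
    if (op == "+" and left == 0) or (op == "*" and left == 1):
        return (dest, right, None, None)
    # 2. Strength reduction: replace x * 2 with the cheaper x + x.
    if op == "*" and right == 2:
        return (dest, left, "+", left)
    if op == "*" and left == 2:
        return (dest, right, "+", right)
    return instr


block = [
    ("t1", 5, "*", 4),       # constant evaluation
    ("t2", "x", "*", 2),     # strength reduction
    ("t3", "y", "+", 0),     # unnecessary operation
    ("t4", "a", "+", "b"),   # nothing to improve
]
optimized = [optimize(instr) for instr in block]
for instr in optimized:
    print(instr)
```

Each rewrite looks at one instruction in isolation, which is why this kind of optimization is cheap enough to include in most compilers.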

Global Optimization

The compiler looks at large segments of the program to decide how to improve performance.
Much more difficult; usually omitted from all but the most sophisticated and expensive production-level “optimizing compilers”.
Optimization cannot make an inefficient algorithm efficient.
Compiler construction tools

❖ Modern software development environments contain tools such as language editors, debuggers, version managers, profilers, test harnesses, and so on.
❖ More specialized tools have been created to help implement various phases of a compiler.

Some commonly used compiler-construction tools include:
✓ Parser generators:- automatically produce syntax analyzers from a grammatical description of a programming language.
✓ Compiler-construction toolkits:- provide an integrated set of routines for constructing various phases of a compiler.
✓ Scanner generators:- produce lexical analyzers from a regular-expression description of the tokens of a language.
✓ Syntax-directed translation engines:- produce collections of routines for walking a parse tree and generating intermediate code.
✓ Code-generator generators:- produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.
✓ Data-flow analysis engines:- facilitate the gathering of information about how values are transmitted from one part of a program to each other part. Data-flow analysis is a key part of code optimization.
