Compiler Design Ch1

A compiler is a program that translates a program written in one language (the source language) into an equivalent program in another language (the target language). Compilers provide an essential interface between applications and architectures, and embody theoretical techniques. A compiler consists of analysis and synthesis phases. Analysis breaks down the source program into tokens, constructs a parse tree via syntax analysis, and performs type checking via semantic analysis. Synthesis constructs an output from the intermediate representation.

Uploaded by

Vuggam Venkatesh


Chapter 1 Introduction

What is a compiler?

A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).

Why do we design compilers?

Using a high-level language for programming has a large impact on how fast programs can be developed.

The main reasons for this are:

• Compared to machine language, the notation used by programming languages is closer to the way humans think about problems.

• The compiler can spot some obvious programming mistakes.

• Programs written in a high-level language tend to be shorter than equivalent programs written in machine language.

• The same program can be compiled to many different machine languages and, hence, be brought to run on many different machines.

Since different platforms (hardware architectures together with their operating systems, such as Windows, Mac, and Unix) require different machine code, you must compile most programs separately for each platform.

Why do we study compiler construction techniques?

• Compilers provide an essential interface between applications and architectures.

• Compilers embody a wide range of theoretical techniques.


Programs related to compilers

1) Interpreter

An interpreter is a program that reads a source program and executes it

Works by analyzing and executing the source program commands one at a time

Does not translate the whole source program into object code
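As a rough illustration of "analyzing and executing the source program commands one at a time", the sketch below interprets a made-up mini-language. The two commands (`set` and `print`) and the function name `run` are hypothetical, invented only for this example.

```python
def run(source):
    """Interpret a tiny made-up language, one command per line."""
    variables = {}   # current values of the program's variables
    output = []      # values produced by 'print' commands
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                          # skip blank lines
        if parts[0] == "set":                 # set NAME INTEGER
            variables[parts[1]] = int(parts[2])
        elif parts[0] == "print":             # print NAME
            output.append(variables[parts[1]])
        else:
            raise ValueError("unknown command: " + parts[0])
    return output


print(run("set x 4\nset y 2\nprint x\nprint y"))
```

Each command is analyzed and executed immediately; no object code for the whole program is ever produced.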

Interpretation is important when:

o Programmer is working in interactive mode and needs to view and update variables

o Running speed is not important

o Commands have simple formats, and thus can be quickly analyzed and executed

o Modification or addition to user programs is required as execution proceeds

Well-known examples of interpreters:

o BASIC interpreter, Lisp interpreter, UNIX shell command interpreter, SQL interpreter, Java interpreter, …

In principle, any programming language can be either interpreted or compiled:

o Some languages are designed to be interpreted, others are designed to be compiled

Interpreters involve large overheads:

o Execution speed degradation can vary from 10:1 to 100:1

o Substantial space overhead may be involved

E.g., Compiling Java Programs

The Java compiler produces bytecode, not machine code.

Bytecode is executed by a Java interpreter (the Java Virtual Machine), which carries out the bytecode instructions on the actual machine at run time.

You can run bytecode on any computer that has a Java interpreter installed.

[Figure: Java program -> compiler -> Java bytecode -> interpreter]

2) Assemblers

Translator for the assembly language.

Assembly code is translated into machine code

Output is relocatable machine code.

3) Linker

o Links object files separately compiled or assembled

o Links object files to standard library functions

o Generates a file that can be loaded and executed

4) Loader

Loads the executable code produced by the linker into main memory.

5) Pre-processors

A pre-processor is a separate program that is called by the compiler before actual translation begins.

Such a pre-processor:

• produces the input to the compiler,

• can delete comments,

• performs macro processing (substitutions),

• can include other files.
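Two of these tasks, deleting comments and macro substitution, can be sketched in a few lines. This is only an illustration under simplifying assumptions: comments are C-style `/* ... */` blocks, macros are plain name-to-text replacements passed in as a dictionary, and string literals are not treated specially. The function name `preprocess` is hypothetical.

```python
import re


def preprocess(text, macros):
    """Delete /* ... */ comments, then substitute macro names."""
    # Remove block comments (non-greedy, so each comment ends at its own */).
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    # Replace each macro name wherever it appears as a whole word.
    for name, value in macros.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, lambda match, v=value: v, text)
    return text


print(preprocess("x = SIZE; /* buffer size */", {"SIZE": "100"}))
```

The result is ordinary source text, which is then handed to the compiler proper.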

A compiler consists internally of a number of steps, or phases, that perform distinct logical operations.

The phases of a compiler are shown in the next slide, together with three auxiliary components that interact with some or all of the phases: the symbol table, the literal table, and the error handler.

There are two important parts in the compilation process: analysis and synthesis.

Analysis (front end)

o Breaks up the source program into constituent pieces and

o Creates an intermediate representation of the source program.

o During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.

Synthesis (back end)

o The synthesis part constructs the desired program from the intermediate representation.

Analysis of the source program

Analysis consists of three phases:

1. Linear/Lexical analysis

2. Hierarchical/Syntax analysis

3. Semantic analysis

1. Lexical analysis or Scanning

The stream of characters making up the source program is read from left to right and is grouped into tokens.

A token is a sequence of characters having a collective meaning.

A lexical analyzer, also called a lexer or a scanner, receives a stream of characters from the source program and groups
them into tokens.

Examples:

• Identifiers

• Keywords

• Symbols (+, -, …)

• Numbers …

Blanks, newlines, and tabs are removed during lexical analysis.

Example

a[index] = 4 + 2;

a identifier

[ left bracket

index identifier

] right bracket

= assignment operator

4 number

+ plus operator

2 number

; semicolon
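The grouping above can be reproduced with a small hand-written scanner. The sketch below is an assumption-laden illustration, not the output of Lex: the token names are taken from the list above, while `TOKEN_SPEC` and `tokenize` are hypothetical names chosen for this example.

```python
import re

# (token kind, pattern) pairs; the first pattern that matches wins.
TOKEN_SPEC = [
    ("number", re.compile(r"\d+")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("left bracket", re.compile(r"\[")),
    ("right bracket", re.compile(r"\]")),
    ("assignment operator", re.compile(r"=")),
    ("plus operator", re.compile(r"\+")),
    ("semicolon", re.compile(r";")),
    ("skip", re.compile(r"\s+")),          # blanks, newlines, tabs
]


def tokenize(text):
    """Group the characters of text into (lexeme, kind) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match:
                if kind != "skip":         # white space is dropped
                    tokens.append((match.group(), kind))
                pos = match.end()
                break
        else:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
    return tokens


for lexeme, kind in tokenize("a[index] = 4 + 2;"):
    print(lexeme, kind)
```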

A scanner may perform other operations along with the recognition of tokens.

• It may enter identifiers into the symbol table, and

• It may enter literals into literal table

Lexical Analysis Tools

There are tools available to assist in the writing of lexical analyzers.

a) lex - produces C source code (UNIX/Linux).

b) flex - produces C source code (GNU).

c) JLex - produces Java source code.

We will use Lex.

2. Syntax analysis or Parsing

The parser receives the source code in the form of tokens from the scanner and performs syntax analysis.

The results of syntax analysis are usually represented by a parse tree or a syntax tree.

Syntax tree: each interior node represents an operation, and the children of the node represent the arguments of the operation.

The syntactic structure of a programming language is specified by a context-free grammar (CFG).

Abstract syntax tree

Ex. Consider again the line of C code: a[index] = 4 + 2

Sometimes syntax trees are called abstract syntax trees, since they represent a further abstraction from parse trees.
Example is shown in the following figure.
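In code, such a tree for a[index] = 4 + 2 can be built by a small recursive-descent parser. The sketch below is a minimal illustration, assuming a grammar restricted to one subscripted assignment whose expressions are sums of numbers and identifiers. The nested-tuple node layout and the name `parse_assignment` are choices made for this example only.

```python
import re


def parse_assignment(text):
    """Parse  NAME [ expr ] = expr  into a nested-tuple syntax tree."""
    # A very small scanner: characters outside these patterns are ignored.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[\[\]=+]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        pos += 1
        return token

    def operand():
        token = advance()
        if token.isdigit():
            return ("number", int(token))
        if token[0].isalpha() or token[0] == "_":
            return ("identifier", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def expression():
        # expression -> operand { '+' operand }
        node = operand()
        while peek() == "+":
            advance("+")
            node = ("+", node, operand())
        return node

    array = operand()
    advance("[")
    index = expression()
    advance("]")
    advance("=")
    value = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return ("assign", ("subscript", array, index), value)


print(parse_assignment("a[index] = 4 + 2"))
```

The interior nodes (`assign`, `subscript`, `+`) are the operations; their children are the arguments.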

Syntax Analysis Tools

There are tools available to assist in the writing of parsers.

a) yacc - produces C source code (UNIX/Linux).

b) bison - produces C source code (GNU).

c) CUP - produces Java source code.

We will use yacc.

3. Semantic analysis

The semantics of a program are its meaning, as opposed to its syntax or structure.

The semantics consist of:

o Runtime semantics – behavior of program at runtime

o Static semantics – checked by the compiler

Static semantics include:

o Declarations of variables and constants before use

o Calling functions that exist (predefined in a library or defined by the user)

o Passing parameters properly

o Type checking.

The semantic analyzer does the following:

o Checks the static semantics of the language

o Annotates the syntax tree with type information

Ex. Consider again the line of C code: a[index] = 4 + 2
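A sketch of how a semantic analyzer might check this statement and annotate its tree with types is given below. It is an illustration under stated assumptions: the tree is written as nested tuples, declarations are supplied as a plain dictionary, and the type names "integer" and "array of integer" as well as the function name `annotate` are chosen for this example.

```python
def annotate(node, symbols):
    """Return a copy of the tree in which every node carries a 'type'."""
    kind = node[0]
    if kind == "number":
        return {"kind": kind, "value": node[1], "type": "integer"}
    if kind == "identifier":
        if node[1] not in symbols:           # declaration before use
            raise NameError(f"{node[1]} is used but not declared")
        return {"kind": kind, "name": node[1], "type": symbols[node[1]]}

    left, right = (annotate(child, symbols) for child in node[1:])
    if kind == "+":
        if left["type"] != "integer" or right["type"] != "integer":
            raise TypeError("operands of + must be integers")
        result = "integer"
    elif kind == "subscript":
        if left["type"] != "array of integer" or right["type"] != "integer":
            raise TypeError("subscript needs an array and an integer index")
        result = "integer"
    elif kind == "assign":
        if left["type"] != right["type"]:
            raise TypeError("assignment between mismatched types")
        result = left["type"]
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    return {"kind": kind, "children": [left, right], "type": result}


tree = ("assign",
        ("subscript", ("identifier", "a"), ("identifier", "index")),
        ("+", ("number", 4), ("number", 2)))
declared = {"a": "array of integer", "index": "integer"}
print(annotate(tree, declared)["type"])
```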

Synthesis of the target program

I. Intermediate code generator

II. Intermediate code optimizer

III. The target code generator

IV. The target code optimizer

Code Improvement

• Code improvement techniques can be applied to:

o Intermediate code – independent of the target machine

o Target code – dependent on the target machine

• Intermediate code improvement includes:

o Constant folding

o Elimination of common sub-expressions

o Improving loops

o Improving function calls

• Target code improvement includes:

o Allocation and use of registers

o Selection of better (faster) instructions and addressing modes

I. Intermediate code generator

Comes after syntax and semantic analysis

Separates the compiler front end from its back end

Intermediate representation should have 2 important properties:

o Should be easy to produce

o Should be easy to translate into the target program

Intermediate representation can have a variety of forms:

o Three-address code, P-code for an abstract machine, Tree or DAG representation

Intermediate code

Three-address code for the original C expression a[index] = 4 + 2 is:

t1 = 2

t2 = 4 + t1

a[index] = t2
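One way to produce code of this shape is to walk the syntax tree and introduce a fresh temporary for every operator node. The sketch below assumes the tree is given as nested tuples and uses hypothetical names (`three_address`, `place`). It emits a slightly more compact sequence than the listing above, because constants are used directly as operands instead of being loaded into a temporary first.

```python
import itertools


def three_address(tree):
    """Translate an ('assign', target, value) tree to three-address code."""
    code = []
    counter = itertools.count(1)

    def place(node):
        """Return the name or constant that holds the value of node."""
        kind = node[0]
        if kind in ("number", "identifier"):
            return str(node[1])
        if kind == "subscript":
            return f"{place(node[1])}[{place(node[2])}]"
        # Binary operator: evaluate both operands, then use a new temporary.
        left, right = place(node[1]), place(node[2])
        temp = f"t{next(counter)}"
        code.append(f"{temp} = {left} {kind} {right}")
        return temp

    _, target, value = tree
    source = place(value)                    # right-hand side first
    code.append(f"{place(target)} = {source}")
    return code


tree = ("assign",
        ("subscript", ("identifier", "a"), ("identifier", "index")),
        ("+", ("number", 4), ("number", 2)))
for line in three_address(tree):
    print(line)
```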

II. IC optimizer

An IC optimizer reviews the code, looking for ways to reduce:

o the number of operations and

o the memory requirements.

A program may be optimized for speed or for size.

This phase changes the IC so that the code generator produces a faster program that consumes less memory.

The optimized code does the same thing as the original (non-optimized) code but with less cost in terms of CPU time and
memory space.

There are several techniques of optimizing code and they will be discussed in the forthcoming chapters.

Ex. Unnecessary lines of code in loops (i.e. code that could be executed outside of the loop) are moved out of the loop.

for (i = 1; i < 10; i++) {
    x = y + 1;
    z = x + i;
}

becomes

x = y + 1;
for (i = 1; i < 10; i++)
    z = x + i;

In our previous example, we have included an opportunity for source-level optimization; namely, the expression 4 + 2 can be precomputed by the compiler to the result 6 (this particular optimization is called constant folding).

This optimization can be performed directly on the syntax tree as shown below.

Many optimizations can be performed directly on the tree.
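Constant folding on the tree can be sketched as a bottom-up rewrite: fold the children first, then replace any addition whose operands are both constants by a single number node. The nested-tuple tree layout and the name `fold` are assumptions made for this illustration, and only the `+` operator is handled.

```python
def fold(node):
    """Return the tree with constant additions replaced by their value."""
    kind = node[0]
    if kind in ("number", "identifier"):
        return node                           # leaves are left unchanged
    children = [fold(child) for child in node[1:]]
    if kind == "+" and all(child[0] == "number" for child in children):
        return ("number", children[0][1] + children[1][1])
    return (kind, *children)


tree = ("assign",
        ("subscript", ("identifier", "a"), ("identifier", "index")),
        ("+", ("number", 4), ("number", 2)))
print(fold(tree))
```

After folding, the right-hand side of the assignment is the single node `("number", 6)`.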

However, in a number of cases, it is easier to optimize a linearized form of the tree that is closer to assembly code.

A standard choice is Three-address code, so called because it contains the addresses of up to three locations in memory.

In our example, three-address code for the original C expression might look like this:

o t1 = 2
o t2 = 4 + t1
o a[index] = t2

Now the optimizer would improve this code in two steps, first computing the result of the addition:

o t = 6
o a[index] = t

And then replacing t by its value to get the three-address statement

o a[index] = 6

III. Code generator

The machine code generator receives the (optimized) intermediate code, and then it produces either:

o Machine code for a specific machine, or

o Assembly code for a specific machine and assembler.

Code generator

o Selects appropriate machine instructions

o Allocates memory locations for variables

o Allocates registers for intermediate computations

The code generator takes the IR code and generates code for the target machine.

Here we will write the target code for a[index] = 6 in assembly language:

MOV R0, index ;; value of index -> R0

MUL R0, 2 ;; double value in R0

MOV R1, &a ;; address of a ->R1

ADD R1, R0 ;; add R0 to R1

MOV *R1, 6 ;; constant 6 -> address in R1

&a - the address of a (the base address of the array)

*R1 - indirect register addressing (the last instruction stores the value 6 to the address contained in R1)
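The instruction sequence above follows a fixed template, which a code generator could fill in for any store of a constant into an array element. The sketch below is only an illustration: the function name `generate_store` is hypothetical, the registers R0 and R1 are hard-wired, and the element size of 2 is an assumption inferred from the "double value in R0" comment.

```python
def generate_store(array, index, constant, element_size=2):
    """Emit target code for  array[index] = constant."""
    return [
        f"MOV R0, {index}",          # value of index -> R0
        f"MUL R0, {element_size}",   # scale the index by the element size
        f"MOV R1, &{array}",         # address of the array -> R1
        "ADD R1, R0",                # R1 now holds the element's address
        f"MOV *R1, {constant}",      # store the constant at that address
    ]


for line in generate_store("a", "index", 6):
    print(line)
```

A real code generator would also choose registers and memory locations itself rather than using fixed ones.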

IV. The target code optimizer

In this phase, the compiler attempts to improve the target code generated by the code generator.

Such improvement includes:

• Choosing addressing modes to improve performance

• Replacing slow instructions with faster ones

• Eliminating redundant or unnecessary operations

In the sample target code given, one improvement is to use a shift instruction to replace the multiplication in the second instruction.

Another is to use a more powerful addressing mode, such as indexed addressing, to perform the array store.

With these two optimizations, our target code becomes:

MOV R0, index ;; value of index -> R0

SHL R0 ;; double value in R0

MOV &a[R0], 6 ;; constant 6 -> address a + R0
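Both improvements can be expressed as pattern rewrites over short windows of instructions (often called peephole optimization). The sketch below is a simplified illustration with a hypothetical function name `peephole`. It works on instruction strings without the `;;` comments, and it assumes the address register is not needed after the store, which a real optimizer would have to verify.

```python
import re


def peephole(code):
    """Apply the two target-code improvements to a list of instructions."""
    # 1. Replace multiplication by 2 with a shift-left instruction.
    code = [re.sub(r"^MUL (\w+), 2$", r"SHL \1", line) for line in code]

    # 2. Fuse "load base address, add index, store indirect" into one
    #    store that uses indexed addressing.
    fused = re.compile(r"MOV (\w+), &(\w+)\nADD \1, (\w+)\nMOV \*\1, (\w+)")
    result = []
    i = 0
    while i < len(code):
        match = fused.fullmatch("\n".join(code[i:i + 3]))
        if match:
            _, array, index_reg, value = match.groups()
            result.append(f"MOV &{array}[{index_reg}], {value}")
            i += 3
        else:
            result.append(code[i])
            i += 1
    return result


original = [
    "MOV R0, index",
    "MUL R0, 2",
    "MOV R1, &a",
    "ADD R1, R0",
    "MOV *R1, 6",
]
for line in peephole(original):
    print(line)
```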

Grouping of phases

The discussion of phases deals with the logical organization of a compiler.

In practice most compilers are divided into:

Front end - language-specific and machine-independent.

Back end - machine-specific and language-independent.

Compiler passes:

A pass consists of reading an input file and writing an output file.

Several phases may be grouped in one pass.

For example, the front-end phases of lexical analysis, syntax analysis, semantic analysis, and intermediate code
generation might be grouped together into one pass.

Single pass

o A single-pass (one-pass) compiler passes through the source code of each compilation unit only once.

o A one-pass compiler does not "look back" at code it has previously processed.

o A one-pass compiler is faster than a multi-pass compiler,

o but it cannot generate programs that are as efficient, because of the limited scope available to it.

Multi pass

o A multi-pass compiler processes the source code or abstract syntax tree of a program several times.

o A collection of phases is run multiple times.

Major Data Structures in a Compiler

Token

o Represented by an integer value or an enumeration literal

o Sometimes, it is necessary to preserve the string of characters that was scanned

o For example, the name of an identifier or the value of a literal

Syntax Tree

o Constructed as a pointer-based structure

o Dynamically allocated as parsing proceeds

o Nodes have fields containing information collected by the parser and semantic analyzer

Symbol Table

o Keeps information associated with all kinds of tokens: identifiers, numbers, variables, functions, parameters, types, fields, etc.

o Tokens are entered by the scanner and parser

o Semantic analyzer adds type information and other attributes

o Code generation and optimization phases use the information in the symbol table

Performance Issues

o Insertion, deletion, and search operations need to be efficient because they are frequent

o Hash table with constant-time operations is usually the preferred choice

o More than one symbol table may be used
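A minimal sketch of such a symbol table is shown below, using Python's built-in `dict` (a hash table) for the entries. The class name `SymbolTable`, its method names, and the idea of chaining one table per scope through a `parent` link are assumptions made for this illustration; they show one common way of using more than one table.

```python
class SymbolTable:
    """Symbol table for one scope, backed by a hash table (dict)."""

    def __init__(self, parent=None):
        self.entries = {}        # name -> dictionary of attributes
        self.parent = parent     # table of the enclosing scope, if any

    def insert(self, name, **attributes):
        self.entries[name] = attributes

    def lookup(self, name):
        """Search this scope first, then the enclosing scopes."""
        table = self
        while table is not None:
            if name in table.entries:
                return table.entries[name]
            table = table.parent
        return None

    def delete(self, name):
        self.entries.pop(name, None)


global_scope = SymbolTable()
global_scope.insert("a", type="array of integer")
local_scope = SymbolTable(parent=global_scope)
local_scope.insert("index", type="integer")
print(local_scope.lookup("a"))
```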

Literal Table

o Stores constant values and string literals in a program.

o One literal table applies globally to the entire program.

o Used by the code generator to assign addresses for literals.

o Avoids the replication of constants and strings.

o Quick insertion and lookup are essential.
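The sketch below illustrates how a literal table can avoid replication: each distinct literal is stored once and is afterwards referred to by its position. The class name `LiteralTable` and the method name `intern` are hypothetical. The key includes the value's type so that, for example, the integer 1 and the string "1" get separate entries.

```python
class LiteralTable:
    """Stores each distinct constant or string literal exactly once."""

    def __init__(self):
        self.index_of = {}       # (type, value) -> position in literals
        self.literals = []       # literals in order of first appearance

    def intern(self, value):
        """Return the position of value, adding it if it is new."""
        key = (type(value), value)
        if key not in self.index_of:
            self.index_of[key] = len(self.literals)
            self.literals.append(value)
        return self.index_of[key]


table = LiteralTable()
first = table.intern("hello")
second = table.intern(6)
again = table.intern("hello")
print(first, second, again, table.literals)
```

A code generator could later map each position to a memory address.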

Compiler construction tools

Various tools are used in the construction of the various parts of a compiler.

Scanner generators

o Ex. Lex, flex, JLex

o These tools generate a scanner (lexical analyzer) when given regular expressions that describe the tokens.

Parser Generators

o Ex. Yacc, Bison, CUP

o These tools produce a parser (syntax analyzer) when given a context-free grammar (CFG) that describes the syntax of the source language.

Syntax directed translation engines

o Ex. Cornell Synthesizer Generator

o It produces a collection of routines that walk the parse tree and execute some tasks.

Automatic code generators

o Take a collection of rules that define the translation of the IC to target code and produce a code generator.
