Compilers PDF
Compilers PDF
design
Overview and History
• Cause
– Software for early computers was written in assembly language
– The benefits of reusing software on different CPUs started to
become significantly greater than the cost of writing a compiler
2
Overview and History
• Compiler technology
– is more broadly applicable and has been employed in
rather unexpected areas.
• Text-formatting languages,
like nroff and troff; preprocessor packages like eqn,
tbl, pic
• Silicon compiler for the creation of VLSI circuits
• Command languages of OS
• Query languages of Database systems
3
What Do Compilers Do
• A compiler acts as a translator,
transforming human-oriented programming
languages
into computer-oriented machine languages.
– Ignore machine-dependent details for programmer
Programming Machine
Language Compiler Language
(Source) (Target)
4
What Do Compilers Do
• Compilers may generate three types of code:
– Pure Machine Code
• Machine instruction set without assuming the existence
of any operating system or library.
• Mostly being OS or embedded applications.
– Augmented Machine Code
• Code with OS routines and runtime support routines.
• More often
– Virtual Machine Code
• Virtual instructions, can be run on any architecture with
a virtual machine interpreter or a just-in-time compiler
• Ex. Java 5
The Structure of a Compiler
• Any compiler must perform two major tasks
Compiler
Analysis Synthesis
In more detail:
Intermediate
Language
Source Target Language
Front End – Back End –
Language
language specific machine specific
•Separation of Concerns
•Retargeting
7
Compiler Architecture
Intermediate
Intermediate
Language
Language
Symbol
Table
8
Lexical Analysis - Scanning
Code
Optimizer
9
Input: result = a + b * c / d
• Tokens:
‘result’, ‘=‘, ‘a’, ‘+’, ‘b’, ‘*’, ‘c’, ‘/’, ‘d’
identifiers
operators
10
Static Analysis - Parsing
tokens Syntactic
Scanner Parser structure Semantic Code
Source Target
language
(lexical (syntax Analysis Generator
language
analysis) analysis) (IC generator)
Code
Optimizer
11
Input: result = a + b * c / d
ID ID
12
Semantic Analysis
Syntactic/semantic
Syntactic structure
Scanner Parser structure Semantic Code
Source Target
language
(lexical (syntax Analysis Generator
language
analysis) analysis) (IC generator)
Syntactic/semantic
structure
Code
Optimizer
• “Meaning”
• Type/Error Checking
• Intermediate Code Generation –
abstract machine Symbol
Table
13
Optimization
14
Code Generation
Syntactic/semantic
structure
Scanner Parser Semantic Code
Source Target
language
(lexical (syntax Analysis Generator language
analysis) analysis) (IC generator)
Syntactic/semantic
Code structure
Optimizer
15
The Role of Lexical Analyzer
• Lexical analyzer is the first phase of a compiler.
• Its main task is to read input characters and produce as
output a sequence of tokens that parser uses for syntax
analysis.
16
Tokens, Patterns, Lexemes
• A lexeme is a sequence of characters in the source
program that is matched by the pattern for a token.
• A lexeme is a basic lexical unit of a language comprising
one or several words, the elements of which do not
separately convey the meaning of the whole.
• The lexemes of a programming language include its identifier,
literals, operators, and special words.
• A token of a language is a category of its lexemes.
• A pattern is a rule describing the set of lexemes that can
represent as particular token in source program.
17
Examples of Tokens
const pi = 3.1416;
The substring pi is a lexeme for the token “identifier.”
18
Lexeme and Token
Index = 2 * count +17;
Lexemes Tokens
Index Identifier
= equal_sign
2 int_literal
* multi_op
Count identifier
+ plus_op
17 int_literal
; semicolon
19
Lexical Errors
• Few errors are discernible at the lexical level
alone, because a lexical analyzer has a very
localized view of a source program.
• Let some other phase of compiler handle any
error.
• Panic mode
• Error recovery
20