Compiler Design – Introduction
What is Language?
• Language: “any system of formalized symbols,
signs, etc., used or conceived as a means of
communication.”
– Communicate: to transmit or exchange thought or
knowledge.
– Programming language: communicate between a
person and a machine
– Programming language is an intermediary
Hierarchy of (programming) languages
• Machine language
• Assembly language
• High level language
• Problem oriented
• Natural language
Language Translators
• Translator: Translate one language into another language (e.g.,
from C++ to Java)
– A generic term.
• For high level programming languages (such as java, C):
– Compiler: translate high level programming language code into host
machine code
– Interpreter: process the source program and data at the same time.
No equivalent machine code is generated.
• Assembler: translate an assembly language to machine code.
• Decompiler and Disassembler refer to translators which attempt to
take object code at a low level and regenerate source code at a
higher level.
• Cross-translators generate code for machines other than the host
machine.
Compiler Vs Interpreter
Goals of translation
1. A better compiler is one which generates smaller code
2. Correct Code
3. Output runs fast
4. The compiler itself must run fast
5. compilation time must be proportional to program size.
6. Good Diagnostics for errors
7. Compiler must be portable
Some early machines and
implementations
• IBM developed 704 in 1954. All programming was done in
assembly language. Cost of software development far
exceeded cost of hardware. Low productivity.
• Speedcoding interpreter: programs ran about 10 times
slower than hand written assembly code
• John Backus (in 1954): Proposed a program that translated
high level expressions into native machine code. Skeptism
all around. Most people thought it was impossible
• Fortran I project . (1954-1957): The first compiler was
released
– The whole new field of compiler design was started
– Modern compilers preserve the basic structure of the Fortran I
compiler !!!
Why study compilers?
• Compilers use the whole spectrum of
language processing technology
• Used in Variety of other fileds
– Editors/ Word processors
– Interpreters
– Silicon Complier design – Circuit
– Query Compilers etc…
The process of understanding a sentence
1. Recognize characters (alphabet, mathematical
symbols, punctuations).
2. Group characters into logical entities (words).
3. Check the words form a structurally correct
sentence
4. Check that the combination of words make sense
5. Plan what you have to do to accomplish the task
6. Execute it.
The structure (phases) of a compiler
Front end (analysis): depend on source language, independent on machine
- This is what we will focus (mainly the blue parts).
Backend (synthesis): dependent on machine and intermediate code,
independent of source code.
Why front end and backend?
4 languages on 3 machines=12 compilers?
4 front end + 3 back end = 7 !
Efficiency Issues
• Pass- Single reading of the source file
– Single Pass compliers
– Multi-Pass Compilers
One pass compiler Multi pass compiler
Intr.1 Instr.2 Intr.1 Instr.2
... ...
Instr.n Instr.3
Scanner Pass 1
Parser Pass 2
Symbol Table Pass 3
Code Gen. Pass 4
Optimizer Pass 5
Assignments overview
The Big picture
• Compiler is part of program development environment
– The other typical components of this environment are editor, assembler, linker, loader,
debugger, profiler etc.
– The compiler (and all other tools) must support each other for easy program development
Compiler engineering techniques
• Straightforward approach
• Bootstrapping
• Using cross-compilers
• Using virtual machines
• Just-in-time compiling
Bootstrapping
1. Suppose that there exists a compiler KA: P→A, where P is a language
with level higher than that of assembly.
.
2. We implement a compiler KP: L→A, and apply KA to process the
source code of the compiler KP, resulting in a compiler KA=KA(KP):
L→A.
3. The described scheme (also shown on the slide) is called
bootstrapping; the compiler KP is said to be "bootstrapped" by the
compiler KA.
Cross Compilers
Suppose that we have two computers:
1. One is the computer M with the assembler language A.
2. The other one is the computer M1 with the assembler language A1,
and a compiler KA1: P→A1.
3. The computer M is unavailable, or there is no compiler KA: P→A.
4. Our objective is a compiler KA: L→A.
5. In this situation, M1 can be used as the so-called host computer to
implement a compiler KP: L→A.
6. The compiler KP: L→A is called a cross-compiler.
7. Once M becomes available, KP can be ported to M and bootstrapped with
the aid of KA.
A compiler is said to be retargetable if it can generate code for a range of
target processors.
Usually retargetable compiler contains almost no processor-specific code, and
characteristics of the target machines are captured in explicit target
descriptions.
Virtual Machines
1. Another method uses virtual machines to obtain portable
compilers
2. This approach implies translation of the source language into
machine codes of a specially designed machine, which is not
supposed to be implemented as a hardware device.
3. Then a virtual machine emulator is written for every target
platform.
Just-in-time Compiling
1. The main disadvantage of virtual machines is the low performance of a
program being interpreted compared to a program compiled to native
machine code.
2. To improve application performance, the just-in-time compiling
technology (sometimes also referred to as dynamic compilation) is used.
3. The idea is that JIT-compiler generates machine code directly in memory
without storing it, which creates a great improvement of the performance.
Components of a Compiler
Analysis phase
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Synthesis Phase
ICG
Code Optimization
Code Generation
Structure of a Compiler
Efficiency Issues
• Pass- Single reading of the source file
– Single Pass compliers
– Multi-Pass Compilers
One pass compiler Multi pass compiler
Intr.1 Instr.2 Intr.1 Instr.2
... ...
Instr.n Instr.3
Scanner Pass 1
Parser Pass 2
Symbol Table Pass 3
Code Gen. Pass 4
Optimizer Pass 5
Two Pass Compiler
One Pass Complier
Design Issues
Symbol Table
Assignments overview
Compiler Generator
Lexical Analysis – Next class