L1 - Introduction to
Compiler
By
Dr. Mubarak Sufyan
1
Review from pervious classes
Languages* Converting an NFA to a DFA-Theory
Regular Expressions* From a Regular Expression to an NFA*
Regular Definitions* Grammars
A Regular-Expression Grammar Classes of Grammars
Finite Automata Context-Free Grammars
NFA Pushdown Machines
Deterministic finite automata (DNF)
2
Agenda
Objectives How is the converting done?
A few basic definitions Symbol Table Management
Compiler Error Detection and Reporting
A bit history A complete example
Definition Implementation of Compiler
Translation involves Summary
Compiler V.S Interpreters Qualities in compiler
Why compiler? For next lectures
Accompanies of a compiler References
Overall process
Construction of compiler
The phases of a compiler
Prototype compiler structure
How dose a compiler work?
An example of compilation 3
Objectives
Define compilers and why we study them.
Define the high-level structure of compilers.
Associate specific tasks, theories, and technologies with achieving
the different structural elements of a compiler
4
A few basic definitions
Source code, or simply code
The statements that a programmer writes in a high-level language.
Target code
The statements that writes in a Low-level language using test software.
Translate:
to turn into one’s own language or another.
to transform or turn from one of symbols into another
Assembler:
Convert assembly code into machine code.
Constructs final machine code from processor specific Assembly code.
Often used as last phase of a compilation process to produce binary
executable
5
A few basic definitions
Interpreter:
are translators that translate only as much as is necessary to run the next
statement of the program
is a program that both translates and executes the instructions in a high-
level language program.
Compiler:
Convert the source code of a high level language into machine code
are translators that produce object code (machine-runnable version) from
source code (humanreadable version).
Debuggers:
Used to determine execution error in a compiled program.
6
A few basic definitions (2)
Execute Time:
Refers to the duration taken by a program to run from the moment it receives input(s)
to the point where it produces output(s).
Compile Time:
Is the time at which the source code is converted into an executable code.
Run Time:
How long the execution of a program takes
Is the time at which the executable code is started running.
Compiler bugs (compilation error or compiler error):
Refers to a state when a compiler fails to compile a piece of computer program
source code,
either due to errors in the code, or,
more unusually, due to errors in the compiler itself
A compilation error message often helps programmers debugging the source code
A syntax error
is a mistake such as a misspelled key word, a missing punctuation character, or7 the
incorrect use of an operator.
Language Translator
❑ Translator programs are software which translate
high level language into machine language.
❑ Writing a program using binary digits 0 and 1 is
called machine language program because
machine language consists of 0 and 1 only
Punched
Card
8
Language Translator
❑ As the personal computer or
organizational computer can
only understand machine
language so, need to
convert our code of high-
level language to the
machine language.
❑ Can do it by various
translator programs or
translator software now a
days (translator software).
9
Language Translator High Level Language (HLL)
❑ A high-level language is a programming language that uses
English and mathematical symbols in its instructions.
❑ To execute a program in a high-level language, it can be
compiled or interpreted.
Source Code/ HLL Code Machine Code
Language
Translato
r
10
Language Translator
(Types of translator programs)
❑ Compiler
▪ It reads the program at once, and tells what errors are there.
▪ That program can be executed when those errors are eliminated.
❑ Interpreter
▪ It does not read the program all at once and reads it one by one
line.
▪ The line in which an error is found tells the error about and stop.
▪ Then when that error is set aside then it reaches the next line.
❑ Assembler
▪ These software converts programs written in high level language
into machine language.
▪ They work to convert the object code into source code.
11
Compilers vs. interpreters
Compilers implement languages by translation • Compiler:
Interpreters implement languages directly • Translation of a program written in a source
language into a semantically equivalent program
Note: the line is not always crystal-clear
written in a target language
Compilers and interpreters have tradeoffs
• Interpreter:
Execution speed of program
• Performing the operations implied by the
Start-up overhead, turn-around time source program
Ease of implementation
Programming environment facilities
Conceptual clarity
12
Difference between an interpreter and a compiler
Compiler Interpreter
Scans the entire program and translates it as a whole into machine
Translates program one statement at a time.
code.
It takes large amount of time to analyze the source code but the It takes less amount of time to analyze the source code but the overall
overall execution time is comparatively faster. execution time is slower.
Generates intermediate object code which further requires linking,
No intermediate object code is generated, hence are memory efficient.
hence requires more memory.
Compiler displays all errors and warning at the compilation time. The interpreter reads a single statement and shows the error if any.
Therefore, you can’t run the program without fixing errors. You must correct the error to interpret next line.
It generates the error message only after scanning the whole Continues translating the program until the first error is met, in which
program. Hence debugging is comparatively hard. case it stops. Hence debugging is easy.
Store machine language as machine code on the disk. Not saving machine code at all.
Generates output program which can be run independently from the Do not generate output program. So they evaluate the source
original program. program at every time during execution.
Programming language like C, C++ use compilers. Programming language like Python, Ruby use interpreters.
13
Language Translator
Internal Architecture
❑ Preprocessor
❑ Compiler
❑ Assembler
❑ Linker/Loader
Preprocessor Compiler Assembler Linker/Loader
14
Language Translator
Internal Architecture (Preprocessor)
❑ The preprocessor is a program that is part of the compiler and
“prepares” or modifies the source code before it is translated to
binary code.
❑ The changes are done by interpreting.
Pure
HLL
Source Code/ HLL
Code
15
Language Translator Internal Architecture (Compiler)
❑ A compiler or source-to-source translator are tools that read in a text file
containing a program and write out another text file containing the translation of
that program.
❑ For compilers, that generated program is in a low-level language such as assembly
language, byte-codes for a virtual machine such as the Java virtual machine (JVM),
or sometimes even C.
❑ For source-to-source translators the generated program is a higher-level language,
which may or may not be the same language as the input program.
Assembly Language
Pure
HLL
Compiler
16
Language Translator
Internal Architecture (Assembler)
❑ The Assembler is used to
translate the program written in
Assembly language into machine
code.
❑ The source program is an input of
an assembler that contains
assembly language instructions.
❑ The output generated by the
assembler is the object code or
machine code understandable by
the computer.
17
Language Translator
Internal Architecture (Linker/Loader)
❑ Linker
▪ The linker spawns to action after
the assembler has done its job.
Linker/
▪ The linker combines all external
programs (such as libraries and Loader
other shared components) with our
program to create a final
executable. Relocatable Machine Absolute Machine
Code
Code
❑ Loader
▪ The loader is a specialized
operating system module that
comes last in the picture.
▪ It loads the final executable code
into memory.
18
Compiler
19
A bit of history
1952:
First compiler (linker/loader) written by Grace Hopper for A-0 programming language
1957:
First complete compiler for FORTRAN by John Backus and team
1960:
COBOL compilers for multiple architectures
1962:
First self-hosting compiler for LISP
20
Introduction to Compiler
What is a Compiler?
❑ A compiler is a language translator that takes as input a
program written in a high level language and produces
an equivalent program in a low-level language.
❑ For example, a compiler may translate a C++ program
into an executable program running on a MIPS processor.
21
Introduction to Compiler
A software tool that translates
A program in source code form to
An equivalent program in an executable (target) form.
Converts from a form good for people to a form good for computers.
It takes a whole program at a time.
Time of conversion is called compile time.
The object program is executed at run time.
Terminology:
Structure ≡ Syntax
Meaning ≡ Semantics
Note that a compiler does not execute the program.
22
Definition of Compiler (2)
A compiler is a complex program
From 10,000 to 1,000,000 lines of codes
Compilers are used in many forms of computing.
Command interpreters, interface programs
Source Target
Compiler
code code
Errors
23
Translation involves
Read and understand the program
Precisely determine what actions it require
Figure-out how to faithfully carry-out those actions
Instruct the computer to carry out those actions
24
Why Compiler
If the code contains a syntax error, however, it cannot be translated.
Correctness.
Manage storage of all variables and code
The compiler or interpreter displays an error message indicating that the program
contains a syntax error.
Writing machine language-numeric codes is time consuming and tedious
C7 06 0000 0002
Mov x, 2
X=2
The assembly language has a number of defects
Not easy to write
Difficult to read and understand
25
Accompanies of a compiler
مرفقات المترجم
Editor:
Overall process
is the environment where you can
type-in your source code, sometimes
with highlights, automatic matching, or
intelligent completion.
Preprocessor
Assembler
Linker
Loader
26
Construction of Compiler
Front- and back-end .
Parses the source code into an intermediate representation
(IR)
These parts are often lumped into two categories
The front-end
Focuses on (repeated) analysis
Determines what the program is
The back-end
Focuses on synthesis
Produces target program equivalent to source program
27
Construction of Compiler (2)
28
Construction of Compiler (4)
Phases
Is the logical organization of compiler.
In implementation, sometime different
phases are group together in a pass that is
able to read the input and generate
output.
Six phases: • Three auxiliary components
lexical analysis (Scanner). • Literal table.
• Symbol table.
syntax analysis (Parser).
• Error Hand
Semantic Analyzer.
Intermediate code generation.
code generation (Code generator).
code optimization (Target Code Optimizer) 29
The Phases
of a
Compiler
30
Prototype compiler structure
31
References:
Andrew W. Appel, Modern compiler implementation in Java (Second edition),
Cambridge University Press, New York, NY, USA, 2002, with Jens Palsberg.
Compilers: Principles, Techniques, and Tools, Aho, Sethi and Ullman —
[Link] >
Parsing Techniques, Grune and Jacobs —
[Link]
Advanced Compiler Design and Implementation, Muchnik.
32
End
33