Compiler Construction
Lecture 1: Introduction
Dr. Sabina Akhtar
About me
• MS(Computer Science)
– FAST-NU Islamabad (2002-2005)
• MS(Computer Science)
– Université de Lorraine,
Nancy France (2007-2008)
About me
• Ph.d (Formal Verification)
– Researched at INRIA Nancy
– Université de Lorraine, Nancy, France (2008-2012)
Contact details
• Email: [email protected]
• Office hours:
Some Rules
– Attendance during first 10 minutes for each hour
– After that you will be marked as absent
– Raise your hand before asking any question and
then WAIT for the permission
– Must wear your ID card in the class
Some Rules
– Submission must be done on time
– No compensation for the missed quizzes,
assignments & projects
– Keep mobile phones on silent mode in the class
– Whatever you do, please do not create
disturbance in the class
Dishonesty, Plagiarism
• Students involved in any kind of cheating in any
exam (Quizzes/Assignments/Projects) will get 0
(Zero) in that exam OR get F in the course
• In case of quizzes with cheating case, their
weightage will be highest.
Pre-Requisites
• Must have cleared Automata Theory
• Should be familiar with programming
languages
Tentative Evaluation Breakdown
Total 100
Quizzes 10
Assignments 8
Projects 12
Mid-Term Examination 20
Final Examination 50
Assignments and Projects
• 2 Assignments
• 2 Projects
– Phase 1: Write language specification and
generate scanner and parser for the language.
– Phase 2: Write intermediate code generator and
target code generator in java for the output of
Phase 1.
– Documentation is mandatory along with the
projects.
No late submissions will be accepted!!!!
Groups for Projects
• 4-5 students per group
– All the students must participate in project
development process
– Marking will done on individual understanding
– List to be submitted today
• Assignments – in group
Tools required
• JavaCC
– a parser generator and a lexical analyzer generator
• Javac
– compiler for java programs
• Java Virtual Machine
• Eclipse(optional)
– plugin available for javacc
Course outline
Week # Topic
1 Introduction to compilers
2 Revision RE, DFA,…Project Introduction
3 Lexical Analysis
4 Introduction to Parsing
5 Issues in parsing grammars
6 Top down parsing
7 Syntax directed translation
8 Semantic analysis
9 Midterms
Course outline
Week # Topic
10 Intermediate code generation
11 Optimizations
12 Code generation
13 Runtime Environments
14 Bottom-up analysis
15,16 Demos
17 Revision
18 Final Exams
Books
• Text Book:
– Compilers Principles, Techniques and Tools, 2nd edition, the
“Dragon Book”, by Aho/Lam/Sethi/Ullman. Pearson
Education Inc. 2009.
• Web Reference:
– Lectures from New York University.
– Link: http://cs.nyu.edu/courses/fall14/CSCI-GA.2130-001/
– Lectures from Stanford University.
– Link: http://web.stanford.edu/class/cs143/
15
What is a compiler, what does it do?
What is this course about?
• How are programs written in a high-level
language transformed into machine code?
• How can we ensure a program is somewhat
meaningful?
• How is the program’s memory managed?
• How can we analyze programs to discover
invariant properties and improve their runtime
performance?
Compiler
• Source programs: typically high level,
• Target programs: typically assembler or
object/machine code,
• Compiler: implementations, e.g. C, Python, Java ...
Historical background
Modern languages and compilers
• Object-oriented languages
– Java, C++, C#, Python, Ruby,…
– Compilers are more complex
• What does Java compiler do?
• What about C#?
We already have many languages and their
compilers,...
Do we still need more compilers??
Context denpendency
• Designed for specific context
• Cannot be used in a different scenario
• Example
– SQL
• Where is it used?
– Java
–C
– Python
Example
• Go programming language by Google
– Why did they develop it?
– Large server software in C++
– Compilation was too slow
• Go programs compile faster than other languages
• Why?
Thats why we need to study compiler construction!
Language Processors
• Interpreter:
– a program (written in a meta language) for
executing another program.
• Compiler:
– a program (written in a meta language) that
translates a program into an equivalent program.
• Meta-language:
– a language to talk about another language
Advantages of compilers
• allow programming at an understandable abstraction
level,
• allow programs to be written in machine-
independent languages,
• help in verifying software and error reporting,
• help code optimization.
What is a compiler?
• A programming language translator
Source code Compiler Target code
Structure
Source Front IR Back Target
Code code Code
End End
Over Simplified!
Source
Code
Front IR IR IR Back
End code Optimization code End
Target
Code
Front End
Input string
Lexical Syntactic Semantic
analysis analysis analysis
IR Code
Lexical analysis
Le t t e r s Words
Lexical
Characters analyzer
tokens
Lexical analysis
Stream of characters
What are the tokens that will be produced by
lexer?
Lexemes
From Linguistics we have that a lexeme is the smallest
meaningful entity of a language.
The lexemes here: position, =, initial, +, rate, *, and 60.
Lexical analysis
Stream of characters
scanned into list of tokens, one for each lexeme:
⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩
Syntactic analysis
⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩
parsed into syntax tree (precedence):
Semantic analysis
enriched with semantic information (explicit
type conversion):
Intermediate Representation
translated to intermediate code:
Back End
Intermediate code
IR Code
Optimization generation
Target code
Optimization
optimized to
Code Generation
generates
Class Activity
Your task is to show the output of each of the
compiler phases discussed in this lecture for
the source code statement
position = (initial + rate)
Class Activity 2
Your task is to show the output of each of the
compiler phases discussed in this lecture for
the source code statement
if x == y then z = 1;
Note that keywords will be identified separately by the lexer
Compiler-construction tools
• Commonly used tools:
– scanner generators: token description → lexical
analyser,
– parser generators: grammar → syntax analyser,
– syntax-directed translation engines: syntax tree → IR,
– code-generator generators: translation rules → code
generator,
• Compiler-generation: integrated set of the above.
JavaCC
• JavaCC is Java’s Compiler Compiler
– creates scanners (lexical analyzers, lexers) and
parsers written in Java
• Since it creates both the lexer and the parser,
JavaCC is thus equivalent to a combination of
the traditional lex/flex and yacc/bison tools
• Free download from
– https://javacc.dev.java.net
JavaCC
• Unlike yacc/bison, JavaCC support is limited to
LL(k)
• As with lex/flex/yacc/bison, JavaCC’s .jj file
includes the grammars as well as embedded
code that defines what happens as
productions are recognized
• Output: seven Java source files, comprising
the lexer, parser, and supporting classes such
as exceptions, I/O utilities, and constants
Assignment 1
• Your task is to show the output of each of the
compiler phases discussed in this lecture for
the source code statement
celsius = (fahrenheit – 32) * (5/9)
Assignment 1
• What sequence of tokens does the lexical analyzer output?
• What abstract syntax tree does the syntax analyzer output?
• What is the abstract syntax tree after the semantic analyzer
modifies it?
• What sequence of three-address instructions does the IR
generator output?
• What sequence of three-address instructions does the
optimizer output?
• What sequence of machine instructions does the code
generator output?
Assignment 1
• Due date
– next week before the class
• No late submissions allowed after the
deadline
Quiz in the next week
• Revise your Automata Theory
• Topics
– Regular Expressions
– Deterministic Finite Automata
– Context Free Grammar
– Kleene’s closure
– Positive closure,…
References
• Book Section 1.1, 1.2
• Lectures from New York University. Link:
http://cs.nyu.edu/courses/fall14/CSCI-
GA.2130-001/
• Lectures from Stanford University. Link:
http://web.stanford.edu/class/cs143/