Compilation Techniques

The document covers compilation techniques, focusing on formal languages, regular expressions, and finite automata. It explains concepts such as alphabets, strings, concatenation, union, and the closure of languages, as well as the differences between nondeterministic and deterministic finite automata. Additionally, it discusses the conversion of regular expressions to finite automata and the optimization of these automata.

Uploaded by

Istin Codruta

Compilation Techniques

(2)
Formal languages
Regular expressions
Finite Automata
Alphabet
A set of symbols, letters, digits,
punctuators, spaces, …
Noted as a set: A={…}

Binary:{0, 1}
Morse: { . , _ }
ASCII
UNICODE
String (sentence or word)
Finite sequence of symbols from an
alphabet
Empty sequence: ϵ
|s| – the length of a string (in characters)

Binary strings: ϵ, 0, 1, 01, 10, 11, 1000, 0011
ASCII strings: ϵ, speed, 3.14159, "hi"
Parts of a string
• PREFIX – any string obtained by removing zero or more symbols from the end of the string
( prefixes of "flower": ϵ, f, flowe, flower )
• SUFFIX – any string obtained by removing zero or more symbols from the beginning of the string
( suffixes of "flower": ϵ, r, lower, flower )
• SUBSTRING – any string obtained by removing any prefix and any suffix from a string
( substrings of "flower": ϵ, w, lowe, flower )
• PROPER prefixes, suffixes or substrings – the ones which are not ϵ and are not the same as the original string
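
The definitions above can be checked mechanically. The following Python sketch enumerates the parts of a string; the function names are illustrative and do not come from the slides, and ϵ is represented by the empty string "".

```python
def prefixes(s):
    """All prefixes of s, from the empty string (ϵ) up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]

def suffixes(s):
    """All suffixes of s, from the empty string (ϵ) up to s itself."""
    return [s[i:] for i in range(len(s), -1, -1)]

def substrings(s):
    """All substrings of s: drop any prefix and any suffix."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

def proper(parts, s):
    """Keep only the parts that are neither ϵ nor the whole string."""
    return [p for p in parts if p != "" and p != s]

print(prefixes("flower"))
print(suffixes("flower"))
print(proper(prefixes("flower"), "flower"))
```

Every prefix and every suffix is also a substring, which is why ϵ and "flower" itself appear in all three example lists.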
Language
A countable set of strings over an
alphabet
Noted as a set: L={…}
Formal language – a language
constrained by rules
L+ – a language without ϵ

The language of strictly two binary digits:

{00, 01, 10, 11}
A baby language: {ϵ, mom, dad}
Concatenation of strings
The operation to obtain a new string by
adjoining two strings (the juxtaposition of
two strings)
Let L be a language and x, y, z ∈ L
xy is the concatenation of x and y
(xy)z = x(yz)
ϵx = xϵ = x

Ex: L={ϵ, sun, flower} => concatenated strings:

sun, flower, sunflower, flowerflower, …
The (Kleene) closure
The set of strings obtained by concatenating a language's strings zero or more times
Noted L*
By definition: L^0 = {ϵ}
Inductively: L^n = L^(n-1) L
L* = L^0 ∪ L^1 ∪ L^2 ∪ …
L+ – the positive closure, L^1 ∪ L^2 ∪ … (ϵ is present in L+ only if it is present in L)
Ex: let L={a,b,…,z}
L^2={aa,ab,…,az,ba,bb,…,bz,…,za,zb,…,zz}
L*={…any sequence of letters, including ϵ…}
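
Since L* is infinite for any language that contains a non-empty string, it can only be enumerated up to a bound. The sketch below computes the powers of a language from the inductive definition and collects them up to a chosen exponent; `concat`, `power` and `closure_up_to` are hypothetical helper names, and languages are modeled as Python sets of strings.

```python
def concat(L, M):
    """Concatenation of two languages: every x in L followed by every y in M."""
    return {x + y for x in L for y in M}

def power(L, n):
    """L^n, built from L^0 = {ϵ} and L^n = L^(n-1) L."""
    result = {""}                      # L^0 contains only the empty string
    for _ in range(n):
        result = concat(result, L)
    return result

def closure_up_to(L, max_n):
    """Finite approximation of L*: the union of L^0 .. L^max_n."""
    out = set()
    for n in range(max_n + 1):
        out |= power(L, n)
    return out

print(sorted(power({"a", "b"}, 2)))
print(sorted(closure_up_to({"a"}, 3)))
```

Starting the union at exponent 1 instead of 0 gives the corresponding approximation of L+.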
Union
A new language containing all the strings
of two other languages
L ∪ M = {s | s ∈ L or s ∈ M }

Ex: L={a,b}, M={0,1,2}

L ∪ M = {a,b,0,1,2}
Regular expressions
A regular expression “e” denotes a language L(e) over an
alphabet A, obtained by one of the following rules:
◦ ϵ is a regular expression and L(ϵ)={ϵ}
◦ if a∈A, L(a)={a}
◦ e* – repetition zero or more times: L(e)*. The operator * has the
highest precedence.
◦ e1e2 – concatenation: L(e1)L(e2). Concatenation (sequence) has the
second highest precedence.
◦ e1|e2 – union: L(e1)∪L(e2). Union (alternative) has the lowest
precedence.
◦ (e) – parenthesis around e does not change e: L(e)
Note: all operators are left associative
• A language that can be defined by a regular expression is called a regular set. If two regular expressions r and s denote the same regular set, we say they are equivalent: r=s (ex: a|b = b|a )
Extensions of regular expressions
• e+ – one or more instances of e: L(e)+.
e* = e+|ϵ    e+ = ee* = e*e
• e? – zero or one instance of e: L(e)∪{ϵ}
• if a1, a2, …, an ∈ A (an alphabet):
a1|a2|…|an = [a1a2…an] (character class)
• if a1, a2, …, an form a logical sequence (they are consecutive in A):
a1|a2|…|an = [a1-an] (range)
• [^a1…an] – any character except a1…an
• \a – a loses any special significance and it will be treated as a simple character
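
The equivalences above can be spot-checked by comparing the sets of strings two patterns accept. This is only a bounded check, not a proof: the sketch uses Python's `re` module (whose syntax happens to cover these extensions) and compares patterns over all strings up to a small length. The helper `language` and its parameters are assumptions of this example.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=4):
    """Strings over `alphabet`, up to length max_len, fully matched by `pattern`."""
    words = ("".join(p)
             for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if re.fullmatch(pattern, w)}

# e* = e+|ϵ (the empty alternative stands for ϵ)
print(language("(ab)*") == language("(ab)+|"))
# e+ = ee* = e*e
print(language("(ab)+") == language("(ab)(ab)*") == language("(ab)*(ab)"))
# a character class is a union of single symbols
print(language("[ab]") == language("a|b"))
```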

Examples of regular expressions
flower – the sequence ‘f’ ‘l’ ‘o’ ‘w’ ‘e’ ‘r’
[0-9] – any digit
[^a-zA-Z] – any character except letters
[0-9]+(,[0-9]+)* –
a list of at least one integer, separated
by comma
[0-9]+(\.[0-9]+)?[eE][+\-]?[0-9]+ –
a number in scientific notation (mantissa
and exponent): 1.05e3, 7E-2
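
These example expressions can be tried directly. The sketch below uses Python's `re` module, whose syntax accepts them unchanged; `re.fullmatch` is used so that the whole string must belong to the language. The constant names and the `matches` helper are illustrative.

```python
import re

INT_LIST = r"[0-9]+(,[0-9]+)*"                      # at least one integer, comma separated
SCIENTIFIC = r"[0-9]+(\.[0-9]+)?[eE][+\-]?[0-9]+"   # mantissa and exponent
NON_LETTER = r"[^a-zA-Z]"                           # any single character except letters

def matches(pattern, text):
    """True if the entire text is in the language of the pattern."""
    return re.fullmatch(pattern, text) is not None

for text in ["1.05e3", "7E-2", "3.14", "e5"]:
    print(text, matches(SCIENTIFIC, text))
for text in ["1,22,333", "42", "1,", ""]:
    print(repr(text), matches(INT_LIST, text))
```

Note that the scientific-notation expression requires the exponent part, so a plain decimal such as 3.14 is not in its language.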
Finite automata
• A finite set of states: S={S0, S1, …, Sn-1}
• A set of input symbols (the input alphabet): Σ={a1, a2, …, am}. (ϵ is never a member of Σ)
• A state S0∈S, distinguished as start state (or initial state)
• A set of states F⊆S, distinguished as accepting states (or final states)
• A transition function
Nondeterministic Finite Automata (NFA)
• From the same state multiple transitions are possible with the same input symbol
• ϵ is accepted as a transition label (a transition can be taken without consuming input)
• The transition function is defined as:

T: S × (Σ ∪ {ϵ}) → P(S)
• P(S) – the power set of S: the set of all subsets of S

S={a,b,c} => P(S)={{},{a},{b},{c},{a,b},{a,c},{b,c},{a,b,c}}
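
The power set example can be reproduced with the standard library. This is a minimal sketch; `power_set` is a name chosen here, and the input is sorted only so that the subsets come out in a stable order.

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, ordered by size."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(c) for c in subsets]

P = power_set({"a", "b", "c"})
print(len(P))   # a set with n elements has 2^n subsets
print(P)
```

Because an NFA transition returns a subset of S, a set of n states gives up to 2^n possible targets for each transition.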

Deterministic Finite Automata (DFA)
For each state s and input symbol a, there is
exactly one transition from s labeled a
ϵ is not accepted as input symbol
The transition function is defined as:

T: S × Σ → S

Any DFA is also an NFA

Finite automata representation
• S={0,1,2,3}, Σ={a,b}, initial state 0, F={3}
• T given as a transition table: for each state and input, the possible transitions are given

State   a     b
0       0,1   0
1       ∅     2
2       ∅     3
3       ∅     ∅

• If no transitions are possible from a given state with a given input, that cell is empty (or ∅)
• The initial state is sometimes drawn with an arrow from nowhere pointing to it

[Figure: NFA for (a|b)*abb. State 0 loops to itself on a and on b; 0 goes to 1 on a, 1 goes to 2 on b, 2 goes to 3 on b; state 3 is accepting.]
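
A transition table like this maps naturally onto a dictionary. The sketch below encodes the NFA for (a|b)*abb and simulates it by tracking the set of states the automaton could be in after each character; the names `NFA`, `START`, `FINAL` and `nfa_accepts` are assumptions of this example, and missing dictionary entries play the role of the empty cells.

```python
# (state, symbol) -> set of next states; absent keys mean "no transition" (∅)
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START, FINAL = 0, {3}

def nfa_accepts(word):
    """Follow every possible transition at once and test for a final state."""
    current = {START}
    for ch in word:
        current = set().union(*(NFA.get((s, ch), set()) for s in current))
    return bool(current & FINAL)

for w in ["abb", "aabb", "babb", "ab", ""]:
    print(repr(w), nfa_accepts(w))
```

This automaton has no ϵ-transitions, so the simulation does not need an ϵ-closure step.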
NFA -> DFA (1)
A DFA can be traversed faster, because in each state there is only one possible transition for a given input character => it is useful to convert an NFA to a DFA
Rabin-Scott power set construction
For a state s, ϵ-closure(s): the set of all states reachable from s, including s itself, without consuming any characters (by ϵ transitions)
SDFA – the resulting DFA states. Initially SDFA will contain sets of NFA states, which will be renamed later as new states
TDFA – the resulting DFA transitions
NFA -> DFA (2)
• If s0 is the NFA initial state: SDFA={ϵ-closure(s0)}, TDFA={}
• For each SDFA state (or set of states) s, including the ones added along the way:
◦ For each input symbol a:
  ▪ Let p = the set of all NFA states reachable from the states of s by a transition labeled a
  ▪ Let c = ϵ-closure(p) in the NFA
  ▪ If c is not in SDFA, add it to SDFA
  ▪ Add the transition (s, a) → c to TDFA
• Final states in SDFA are the ones whose sets contain final states of the NFA
• Rename all new sets in SDFA as new states
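
The steps above can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the slides' own code: the NFA is a dictionary from (state, label) to a set of states, the empty string is used as the label for ϵ-transitions, DFA states are kept as frozensets instead of being renamed, and an empty target set is simply skipped (no transition is recorded).

```python
from collections import deque

EPS = ""  # label used for ϵ-transitions (a convention of this sketch)

def eps_closure(states, trans):
    """All NFA states reachable from `states` using only ϵ-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in trans.get((s, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def nfa_to_dfa(trans, start, finals, alphabet):
    """Power set construction: returns (states, transitions, start, finals)."""
    start_set = eps_closure({start}, trans)
    dfa_states, dfa_trans = {start_set}, {}
    queue = deque([start_set])
    while queue:
        s = queue.popleft()
        for a in alphabet:
            p = set()
            for q in s:                       # NFA moves from s on symbol a
                p |= trans.get((q, a), set())
            if not p:                         # assumption: no entry for a dead move
                continue
            c = eps_closure(p, trans)
            if c not in dfa_states:
                dfa_states.add(c)
                queue.append(c)
            dfa_trans[(s, a)] = c
    dfa_finals = {s for s in dfa_states if s & finals}
    return dfa_states, dfa_trans, start_set, dfa_finals

# NFA for (a|b)*abb with states 0..3 and final state 3
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
states, table, start, finals = nfa_to_dfa(nfa, 0, {3}, "ab")
for (s, a), c in sorted(table.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(s), a, "->", sorted(c))
```

Only the subsets that are actually reachable from the start set are created, which is why the result usually has far fewer than 2^n states.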
NFA -> DFA (3)
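[Figure: the NFA for (a|b)*abb, with states 0, 1, 2, 3.]

SDFA          a       b
0             {0,1}   0
{0,1} => 4    {0,1}   {0,2}
{0,2} => 5    {0,1}   {0,3}
{0,3} => 6    {0,1}   0

[Figure: the resulting DFA, with states 0, 4, 5, 6. On a, every state goes to 4. On b: 0 goes to 0, 4 goes to 5, 5 goes to 6, 6 goes to 0. State 6 is accepting.]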
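
Once the DFA is available, recognizing a string takes a single table lookup per character. The sketch below hard-codes the renamed states 0, 4, 5, 6 from the table above, with 6 as the accepting state because its set {0,3} contains the NFA final state 3; the names `DFA` and `dfa_accepts` are illustrative.

```python
# (state, symbol) -> next state, using the renamed states 0, 4, 5, 6
DFA = {
    (0, "a"): 4, (0, "b"): 0,
    (4, "a"): 4, (4, "b"): 5,
    (5, "a"): 4, (5, "b"): 6,
    (6, "a"): 4, (6, "b"): 0,
}

def dfa_accepts(word, start=0, finals=frozenset({6})):
    """Run the DFA: exactly one current state, one lookup per character."""
    state = start
    for ch in word:
        if (state, ch) not in DFA:    # symbol outside the alphabet {a, b}
            return False
        state = DFA[(state, ch)]
    return state in finals

for w in ["abb", "ababb", "abab", "abc"]:
    print(w, dfa_accepts(w))
```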
Regular expression -> NFA (1)
• In order to implement regular expressions, these can be converted to NFA
• Thompson-McNaughton-Yamada algorithm
• Each component of the regular expression is represented as in the following figures. The rectangles are substituted with the NFA corresponding to that regular expression.

[Figure: NFA fragments, connected by ϵ-transitions, for ϵ, a, e1e2, e1|e2, e*, e+ and e?]
Regular expression -> NFA (2)
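(a|b)*abb

[Figure: the NFA obtained for (a|b)*abb by composing the fragments for a|b, the * operator and the concatenations, linked by ϵ-transitions.]

Optimization: if between two states there is only an ϵ transition, these states can be joined, preserving all the incoming/outgoing transitions of both states

[Figure: the optimized NFA for (a|b)*abb. State 0 loops to itself on a and on b; 0 goes to 1 on a, 1 goes to 2 on b, 2 goes to 3 on b.]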
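
The construction can be sketched with one small function per operator. The exact shape of each fragment in the original figures is not recoverable here, so this follows the common textbook form (one start and one accepting state per fragment, glued with ϵ-transitions); `Frag`, `symbol`, `concat`, `union`, `star` and `accepts` are hypothetical names, and no optimization step is applied.

```python
from itertools import count

EPS = None          # marker for an ϵ-transition in this sketch
_ids = count()      # generator of fresh state numbers

class Frag:
    """An NFA fragment with one start state and one accepting state."""
    def __init__(self, start, end, edges):
        self.start, self.end, self.edges = start, end, edges  # edges: (src, label, dst)

def symbol(a):
    s, e = next(_ids), next(_ids)
    return Frag(s, e, [(s, a, e)])

def concat(f, g):
    # link the end of f to the start of g with an ϵ-transition
    return Frag(f.start, g.end, f.edges + g.edges + [(f.end, EPS, g.start)])

def union(f, g):
    s, e = next(_ids), next(_ids)
    extra = [(s, EPS, f.start), (s, EPS, g.start), (f.end, EPS, e), (g.end, EPS, e)]
    return Frag(s, e, f.edges + g.edges + extra)

def star(f):
    s, e = next(_ids), next(_ids)
    extra = [(s, EPS, f.start), (f.end, EPS, e), (s, EPS, e), (f.end, EPS, f.start)]
    return Frag(s, e, f.edges + extra)

def closure(states, edges):
    """ϵ-closure of a set of states."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(frag, word):
    current = closure({frag.start}, frag.edges)
    for ch in word:
        moved = {dst for src, label, dst in frag.edges if src in current and label == ch}
        current = closure(moved, frag.edges)
    return frag.end in current

# (a|b)*abb, composed bottom-up
a_or_b = union(symbol("a"), symbol("b"))
nfa = concat(concat(concat(star(a_or_b), symbol("a")), symbol("b")), symbol("b"))
for w in ["abb", "babb", "ab", ""]:
    print(repr(w), accepts(nfa, w))
```

The unoptimized result has many more states than the four-state automaton, which is what the ϵ-joining optimization is meant to reduce.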
Bibliography reading
• Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman: Compilers: Principles, Techniques, and Tools, 2nd edition, Chapters 3.3, 3.6, 3.7
