Chapter Two
Lexical Analysis
1
Outline
Introduction
Interaction of Lexical Analyzer with Parser
Token, pattern, lexeme
Specification of patterns using regular expressions
Regular expressions
Regular expressions for tokens
NFA and DFA
Conversion from RE to NFA to DFA…
Lex Scanner Generator
Creating a Lexical Analyzer with Lex
Regular Expressions in Lex
Lex specifications and examples
2
Introduction
The role of the lexical analyzer is:
• to read a sequence of characters from the source
program
• group them into lexemes and
• produce as output a sequence of tokens for each
lexeme in the source program.
The scanner can also perform the following
secondary tasks:
stripping out blanks, tabs, and newlines
stripping out comments
keeping track of line numbers (for error reporting)
3
Interaction of the Lexical Analyzer
with the Parser
[Diagram: the lexical analyzer reads characters from the source program ("get next char") and returns the next token to the syntax analyzer each time the parser asks for one ("get next token"); both phases consult the symbol table, which contains a record for each identifier.]
token: smallest meaningful sequence of characters
of interest in source program
4
Token, pattern, lexeme
A token is a sequence of characters from the source
program having a collective meaning.
A token is a classification of lexical units.
- For example: id and num
Lexemes are the specific character strings that make up
a token.
– For example: abc and 123A
Patterns are rules describing the set of lexemes
belonging to a token.
– For example: “letter followed by letters and digits”
Patterns are usually specified using regular expressions,
e.g. [a-zA-Z][a-zA-Z0-9]*
Example: printf("Total = %d\n", score);
5
Example
Token        Informal description                     Sample lexemes
if           the characters i, f                      if
else         the characters e, l, s, e                else
comparison   < or > or <= or >= or == or !=           <=, !=
id           letter followed by letters and digits    pi, score, D2
number       any numeric constant                     3.14159, 0, 6.02e23
literal      anything but ", surrounded by "          "core dumped"
• In general, in programming languages, the following
are tokens:
keywords, operators, identifiers, constants, literals,
punctuation symbols…
Attributes of tokens
When more than one pattern matches a lexeme, the
scanner must provide additional information about the
particular lexeme to the subsequent phases of the
compiler.
For example, both 0 and 1 match the pattern for the
token num.
But the code generator needs to know which number is
recognized.
The lexical analyzer collects information about tokens
into their associated attributes.
• Tokens influence parsing decisions;
• Attributes influence the translation of tokens after
parsing
7
Attributes of tokens…
Practically, a token has one attribute:
a pointer to the symbol table entry in which
information about the token is kept.
The symbol table entry contains various pieces of
information about the token,
such as its lexeme, its type, and the line number in which
it was first seen…
Ex. y = 31 + 28 * x. The tokens and their
attributes can be written as:
<id, pointer to symbol-table entry for y>
<assign_op>
<num, integer value 31>
<add_op>
<num, integer value 28>
<mult_op>
<id, pointer to symbol-table entry for x>
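As a rough illustration (not from the slides), a scanner written in C might represent a token as a small struct whose attribute field points into the symbol table; the token-kind names below are purely hypothetical.

#include <stddef.h>

/* Illustrative token classes for the expression example above. */
enum TokenKind { TOK_ID, TOK_NUM, TOK_ASSIGN, TOK_PLUS, TOK_TIMES, TOK_EOF };

/* A token carries its class plus one attribute: for identifiers, an index
 * into the symbol table; for numbers, the literal value itself. */
struct Token {
    enum TokenKind kind;
    size_t symtab_index;   /* attribute when kind == TOK_ID  */
    long   value;          /* attribute when kind == TOK_NUM */
};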
8
Errors
Very few errors are detected by the lexical
analyzer.
For example, if the programmer mistakes fi for
if, the lexical analyzer cannot detect the error
since it will simply treat fi as a valid identifier.
Nonetheless, if a certain sequence of
characters follows none of the specified
patterns, the lexical analyzer can detect the
error.
10
Errors…
When an error occurs, the lexical analyzer
recovers by:
skipping (deleting) successive characters from the
remaining input until the lexical analyzer can find a
well-formed token (panic-mode recovery)
deleting one character from the remaining input
inserting missing characters into the remaining input
replacing an incorrect character by a correct
character
transposing two adjacent characters
11
Specification of patterns using
regular expressions
Regular expressions
Regular expressions for tokens
12
Regular expression: Definitions
Represents patterns of strings of characters.
An alphabet Σ is a finite set of symbols
(characters)
A string s is a finite sequence of symbols
from Σ
|s| denotes the length of string s
ε denotes the empty string, thus |ε| = 0
A language L is a specific set of strings over
some fixed alphabet Σ
13
Regular expressions…
A regular expression is one of the following:
Symbol: a basic regular expression consisting of a single
character a, where a is from:
an alphabet Σ of legal characters;
the metacharacter ε; or
the metacharacter ø.
In the first case, L(a)={a};
in the second case, L(ε)= {ε};
in the third case, L(ø)= { }.
{ } – contains no strings at all.
{ε} – contains only the empty string, i.e. the single string consisting of no characters.
14
Regular expressions…
Union/Alternation: an expression of the form r|s,
where r and s are regular expressions.
In this case, L(r|s) = L(r) ∪ L(s)
Concatenation: An expression of the form rs, where r
and s are regular expressions.
In this case, L(rs) = L(r)L(s)
Repetition: An expression of the form r*, where r is a
regular expression.
In this case, L(r*) = L(r)* = {ε, r, rr, rrr, …}
15
Regular expression: Language Operations
Union of L and M:
  L ∪ M = {s | s ∈ L or s ∈ M}
Concatenation of L and M:
  LM = {xy | x ∈ L and y ∈ M}
Exponentiation of L:
  L^0 = {ε};  L^i = L^(i-1) L
Kleene closure of L:
  L* = ∪ i=0,…,∞ L^i
Positive closure of L:
  L+ = ∪ i=1,…,∞ L^i
The following shorthands are often used:
  r+ = rr*
  r* = r+ | ε
  r? = r | ε
16
Regular expressions…
Examples:
1- a | b = {a,b}
2- (a|b)a = {aa,ba}
3- (ab) | ε ={ab, ε}
4- ((a|b)a)* = {ε, aa,ba,aaaa,baba,....}
Reverse direction: from a description of the language to a regular expression
1 – Even binary numbers (0|1)*0
2 – An alphabet consisting of just three alphabetic
characters: Σ = {a, b, c}. Consider the set of all strings
over this alphabet that contains exactly one b.
(a | c)*b(a|c)* {b, abc, abaca, baaaac, ccbaca, cccccb}
17
Exercises
Describe the languages denoted by the following
regular expressions:
1- a(a|b)*a
2- ((ε|a)b*)*
3- (a|b)*a(a|b)(a|b)
4- a*ba*ba*ba*
5- (aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*
18
Regular expressions for tokens
Regular expressions are used to specify the
patterns of tokens.
Each pattern matches a set of strings. It falls into
different categories:
Reserved (Key) words: They are represented by
their fixed sequence of characters,
Ex. if, while and do....
If we want to collect all the reserved words into
one definition, we could write it as follows:
Reserved = if | while | do |...
19
Regular expressions for tokens…
Special symbols: including arithmetic operators,
assignment and equality such as =, :=, +, -, *
Identifiers: defined to be a sequence of
letters and digits beginning with a letter;
we can express this in terms of regular definitions as
follows:
letter = A|B|…|Z|a|b|…|z
digit = 0|1|…|9
or
letter= [a-zA-Z]
digit = [0-9]
identifiers = letter(letter|digit)*
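As a quick illustration (not part of the slides), the identifier pattern can be checked directly with the POSIX regular-expression API in C; the ^ and $ anchors force the whole string to match.

#include <regex.h>
#include <stdio.h>

/* Returns 1 if s matches letter(letter|digit)*, written here as a POSIX ERE. */
static int is_identifier(const char *s) {
    regex_t re;
    if (regcomp(&re, "^[a-zA-Z][a-zA-Z0-9]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                                /* pattern failed to compile */
    int ok = (regexec(&re, s, 0, NULL, 0) == 0); /* 0 means "matched" */
    regfree(&re);
    return ok;
}

int main(void) {
    /* prints: 1 1 0 */
    printf("%d %d %d\n", is_identifier("D2"), is_identifier("score"),
           is_identifier("2pac"));
    return 0;
}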
20
Regular expressions for tokens…
Numbers: Numbers can be:
sequence of digits (natural numbers), or
decimal numbers, or
numbers with exponent (indicated by an e or E).
Example: 2.71E-2 represents the number 0.0271.
We can write regular definitions for these numbers as
follows:
nat = [0-9]+
signedNat = (+|-)? nat
number = signedNat(“.” nat)?(E signedNat)?
Literals or constants, which can include:
numeric constants such as 42, and
string literals such as "hello, world".
21
Regular expressions for tokens…
relop = < | <= | = | <> | > | >=
Comments: e.g. /* this is a C comment */
delimiter = newline | blank | tab | comment
whitespace = (delimiter)+
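Similarly, as an illustrative sketch (not from the slides), the number definition above can be written as a single POSIX extended regular expression; note that the literal dot must be escaped.

#include <regex.h>
#include <stdio.h>

/* number = signedNat ("." nat)? (E signedNat)?   where signedNat = (+|-)? nat */
static int is_number(const char *s) {
    regex_t re;
    if (regcomp(&re, "^[+-]?[0-9]+(\\.[0-9]+)?(E[+-]?[0-9]+)?$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    int ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}

int main(void) {
    /* prints: 1 1 1 0  -- 2.71E-2, 42 and +3 match, "3." does not */
    printf("%d %d %d %d\n", is_number("2.71E-2"), is_number("42"),
           is_number("+3"), is_number("3."));
    return 0;
}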
22
Design of a Lexical Analyzer/Scanner
Finite Automata
Lex – turns its input program (a token specification) into a lexical analyzer.
At the heart of this translation is the formalism known as
finite automata.
1. Finite automata are recognizers; they simply say "yes" or
"no" about each possible input string.
2. Finite automata come in two flavors:
a) Nondeterministic finite automata (NFA) have no restrictions
on the labels of their edges.
ε, the empty string, is a possible label.
b) Deterministic finite automata (DFA) have, for each state
and for each symbol of the input alphabet, exactly one edge
with that symbol leaving that state.
23
The Whole Scanner Generator Process
Overview
Direct construction of a nondeterministic finite
automaton (NFA) to recognize a given regular
expression.
Easy to build in an algorithmic way
Requires ε-transitions to combine regular subexpressions
Construct a deterministic finite automaton
(DFA) to simulate the NFA
Uses a set-of-states construction
Minimize the number of states in the DFA (optional)
Generate the scanner code.
24
Design of a Lexical Analyzer …
Token → Pattern
Pattern → Regular Expression
Regular Expression → NFA
NFA → DFA
DFAs or NFAs for all tokens → Lexical Analyzer
25
Non-Deterministic Finite Automata
(NFA)
Definition
An NFA M is a 5-tuple (Σ, S, T, s0, F):
A set of input symbols Σ, the input alphabet
a finite set of states S,
a transition function T: S × (Σ ∪ {ε}) → P(S) (sets of next states),
a start state s0 from S, and
a set of accepting/final states F from S.
The language accepted by M, written L(M), is defined as:
The set of strings of characters c1c2...cn, with each ci from Σ
∪ {ε}, such that there exist states s1 in T(s0, c1), s2 in
T(s1, c2), ..., sn in T(s(n-1), cn) with sn an element of F.
26
NFA…
It is a finite automaton which has a choice of
edges:
• The same symbol can label edges from one state to
several different states.
An edge may be labeled by ε, the empty
string
• We can have transitions without any input
character consumption.
27
Transition Graph
The transition graph for an NFA recognizing the
language of regular expression (a|b)*abb
all strings of a's and b's ending in the
particular string abb
[Transition graph: state 0 has edges labeled a and b back to itself and an edge labeled a to state 1; state 1 goes to state 2 on b; state 2 goes to state 3 on b; state 3 is accepting.]
S = {0, 1, 2, 3}
Σ = {a, b}
s0 = 0
F = {3}
28
Transition Table
The mapping T of an NFA can be represented
in a transition table:

State    a        b      ε
0        {0,1}    {0}    ø
1        ø        {2}    ø
2        ø        {3}    ø
3        ø        ø      ø

T(0,a) = {0,1}
T(0,b) = {0}
T(1,b) = {2}
T(2,b) = {3}

The language defined by an NFA is the set of input
strings it accepts, such as (a|b)*abb for the example
NFA.
29
Acceptance of input strings by NFA
An NFA accepts input string x if and only if there is some path in the transition graph from the start state to one of the accepting states whose edge labels spell out x
The string aabb is accepted by the NFA:
0 -a-> 0 -a-> 1 -b-> 2 -b-> 3    YES (ends in an accepting state)
0 -a-> 0 -a-> 0 -b-> 0 -b-> 0    NO  (this path does not accept)
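A hedged sketch (not from the slides) of how this particular NFA, which has no ε-edges, can be simulated in C using the transition table above; each set of states is kept as a bitmask.

#include <stdio.h>

#define NSTATES 4            /* NFA for (a|b)*abb: states 0..3, accepting state 3 */

/* delta[state][symbol] = bitmask of successor states; symbol 0 = 'a', 1 = 'b' */
static const unsigned delta[NSTATES][2] = {
    { (1u << 0) | (1u << 1), 1u << 0 },   /* T(0,a) = {0,1},  T(0,b) = {0} */
    { 0,                     1u << 2 },   /* T(1,b) = {2} */
    { 0,                     1u << 3 },   /* T(2,b) = {3} */
    { 0,                     0       }
};

static int nfa_accepts(const char *x) {
    unsigned cur = 1u << 0;                        /* start in the set {0} */
    for (; *x; x++) {
        int sym = (*x == 'a') ? 0 : 1;
        unsigned next = 0;
        for (int s = 0; s < NSTATES; s++)
            if (cur & (1u << s))
                next |= delta[s][sym];             /* union of all successor sets */
        cur = next;
    }
    return (cur & (1u << 3)) != 0;                 /* did any path reach state 3? */
}

int main(void) {
    printf("aabb: %s\n", nfa_accepts("aabb") ? "YES" : "NO");   /* YES */
    printf("aaba: %s\n", nfa_accepts("aaba") ? "YES" : "NO");   /* NO  */
    return 0;
}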
30
Another NFA
[NFA with ε-transitions: from the start state, one ε-edge leads to a branch that reads an a and then loops on a; another ε-edge leads to a branch that reads a b and then loops on b.]
An ε-transition is taken without consuming any character from
the input.
What does the NFA above accept?
aa*|bb*
31
Deterministic Finite Automata (DFA)
A deterministic finite automaton is a special
case of an NFA
No state has an ε-transition
For each state S and input symbol a there is at
most one edge labeled a leaving S
Each entry in the transition table is a single state
At most one path exists to accept a string
Simulation algorithm is simple
32
DFA example
A DFA that accepts (a|b)*abb
[DFA: states 0–3, start state 0, accepting state 3; moves: 0 -a-> 1, 0 -b-> 0, 1 -a-> 1, 1 -b-> 2, 2 -a-> 1, 2 -b-> 3, 3 -a-> 1, 3 -b-> 0.]
33
Simulating a DFA: Algorithm
How to apply a DFA to a string.
INPUT:
An input string x terminated by an end-of-file character eof.
A DFA D with start state s0, accepting states F, and
transition function move.
OUTPUT: Answer ''yes" if D accepts x; "no" otherwise
METHOD
Apply the algorithm on the next slide to the input string x.
The function move(s, c) gives the state to which there is an
edge from state s on input c.
The function nextChar() returns the next character of the
input string x.
34
Simulating a DFA
s = s0;
c = nextChar();
while ( c != eof ) {
    s = move(s, c);
    c = nextChar();
}
if ( s is in F ) return "yes";
else return "no";

[Figure: the DFA accepting (a|b)*abb shown on the previous slide.]
Given the input string ababb, this DFA enters the
sequence of states 0,1,2,1,2,3 and returns "yes"
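A minimal, self-contained C version of this simulation (an illustration, not the slides' own code), hard-coding the move table of the (a|b)*abb DFA described above:

#include <stdio.h>

/* move_tbl[state][symbol]: symbol 0 = 'a', 1 = 'b'; start state 0, F = {3} */
static const int move_tbl[4][2] = {
    { 1, 0 },   /* state 0 */
    { 1, 2 },   /* state 1 */
    { 1, 3 },   /* state 2 */
    { 1, 0 }    /* state 3 */
};

static const char *simulate(const char *x) {
    int s = 0;                                    /* s = s0 */
    for (; *x; x++)                               /* c = nextChar(); while c != eof */
        s = move_tbl[s][*x == 'a' ? 0 : 1];       /* s = move(s, c) */
    return (s == 3) ? "yes" : "no";               /* is s in F? */
}

int main(void) {
    printf("ababb -> %s\n", simulate("ababb"));   /* states 0,1,2,1,2,3 -> yes */
    printf("aabab -> %s\n", simulate("aabab"));   /* -> no */
    return 0;
}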
35
DFA: Exercise
Draw DFAs for the string matched by the
following definition:
digit = [0-9]
nat = digit+
signedNat = (+|-)? nat
number = signedNat("." nat)? (E signedNat)?
36
Design of a Lexical Analyzer Generator
Regular Expression → DFA
Two algorithms:
1- Translate a regular expression into an NFA
(Thompson’s construction)
2- Translate NFA into DFA
(Subset construction)
37
From regular expression to an NFA
It is known as Thompson’s construction.
Rules:
1- For the basic regular expressions ε and a single character a, construct:
[Figure: a start state with a single edge, labeled ε or a, to an accepting state.]
38
From regular expression to an NFA…
2- For a composition of regular expression:
Case 1: Alternation: for the regular expression s|r, assume
that NFAs equivalent to r and s have already been
constructed.
39
From regular expression to an NFA…
Case 2: Concatenation: for the regular expression sr
[Figure: the NFA for s and the NFA for r connected in sequence by an ε-edge.]
Case 3: Repetition: for the regular expression r*
[Figure: the NFA for r wrapped in new start and accepting states, with ε-edges that allow skipping r entirely or looping back to repeat it.]
40
From RE to NFA: Exercises
Construct NFA for token identifier.
letter(letter|digit)*
Construct NFA for the following regular
expression:
(a|b)*abb
41
From an NFA to a DFA
(subset construction algorithm)
Input: NFA N
Output: DFA D (both accept the same language)
Rules:
The start state of D is ε-closure(s0),
where s0 is the start state of N.
The start state of D is initially unmarked.
42
NFA to a DFA…
ε- closure
ε-closure(S') is the set of states with the following
characteristics:
1- S' ∈ ε-closure(S') itself
2- if t ∈ ε-closure(S') and there is an edge labeled
ε from t to v, then v ∈ ε-closure(S')
3- Repeat step 2 until no more states can be added
to ε-closure(S').
E.g: for NFA of (a|b)*abb
ε-closure (0)= {0, 1, 2, 4, 7}
ε-closure (1)= {1, 2, 4}
43
NFA to a DFA…
Algorithm
While there is an unmarked state X = {s0, s1, s2, ..., sn} of D do
Begin
    Mark X
    For each input symbol a do
    Begin
        Let T be the set of states to which there is a transition on a from some state si in X
        Y = ε-closure(T)
        If Y has not been added to the set of states of D then
            add Y as an unmarked state of D
        Add a transition from X to Y labeled a, if not already present
    End
End
44
NFA for identifier: letter(letter|digit)*
[Figure: Thompson-style NFA for letter(letter|digit)* with states 0–8: 0 goes to 1 on letter, 3 goes to 4 on letter, 5 goes to 6 on digit; ε-edges link 1, 2, 7 and 8 around the (letter|digit)* loop, including a skip edge into the accepting state 8 and a loop back from 7; state 8 is the accepting state.]
45
NFA to a DFA…
Example: Convert the following NFA into the corresponding
DFA: letter(letter|digit)*

A = {0}
B = {1, 2, 3, 5, 8}
C = {4, 7, 2, 3, 5, 8}
D = {6, 7, 8, 2, 3, 5}

[Resulting DFA: start state A; A goes to B on letter; B goes to C on letter and to D on digit; C goes to C on letter and to D on digit; D goes to C on letter and to D on digit; B, C and D are accepting.]
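A hedged C sketch of ε-closure and the subset construction applied to this identifier NFA (illustrative only; the ε-edge table below is an assumption consistent with the ε-closures used above, and state sets are stored as bitmasks). Running it reproduces the DFA states A–D.

#include <stdio.h>

#define NSTATES 9                     /* NFA states 0..8, accepting state 8 */
enum { LETTER = 0, DIGIT = 1, NSYMS = 2 };

/* eps[s]      = bitmask of states reachable from s by one ε-edge (assumed layout) */
/* trans[s][c] = bitmask of states reachable from s on symbol c                    */
static const unsigned eps[NSTATES] = {
    0,                                  /* 0 */
    1u << 2,                            /* 1 */
    (1u << 3) | (1u << 5) | (1u << 8),  /* 2 */
    0,                                  /* 3 */
    1u << 7,                            /* 4 */
    0,                                  /* 5 */
    1u << 7,                            /* 6 */
    (1u << 2) | (1u << 8),              /* 7 */
    0                                   /* 8 */
};
static const unsigned trans[NSTATES][NSYMS] = {
    { 1u << 1, 0 }, { 0, 0 }, { 0, 0 }, { 1u << 4, 0 }, { 0, 0 },
    { 0, 1u << 6 }, { 0, 0 }, { 0, 0 }, { 0, 0 }
};

static unsigned eps_closure(unsigned set) {
    unsigned prev;
    do {                                /* keep adding ε-successors until no change */
        prev = set;
        for (int s = 0; s < NSTATES; s++)
            if (set & (1u << s))
                set |= eps[s];
    } while (set != prev);
    return set;
}

static unsigned move_set(unsigned set, int sym) {
    unsigned out = 0;
    for (int s = 0; s < NSTATES; s++)
        if (set & (1u << s))
            out |= trans[s][sym];
    return out;
}

int main(void) {
    unsigned dstates[64];
    int ndstates = 0, marked = 0;
    dstates[ndstates++] = eps_closure(1u << 0);   /* start = ε-closure({0}) = A */
    while (marked < ndstates) {                   /* process unmarked DFA states */
        unsigned X = dstates[marked++];
        for (int c = 0; c < NSYMS; c++) {
            unsigned Y = eps_closure(move_set(X, c));
            if (Y == 0) continue;                 /* no transition on this symbol */
            int found = 0;
            for (int i = 0; i < ndstates; i++)
                if (dstates[i] == Y) found = 1;
            if (!found) dstates[ndstates++] = Y;  /* add Y as an unmarked state */
        }
    }
    for (int i = 0; i < ndstates; i++) {
        printf("DFA state %c = {", 'A' + i);
        for (int s = 0; s < NSTATES; s++)
            if (dstates[i] & (1u << s)) printf(" %d", s);
        printf(" }%s\n", (dstates[i] & (1u << 8)) ? "  (accepting)" : "");
    }
    return 0;
}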
46
Exercise: convert the NFA of (a|b)*abb into a DFA.
47
The Lexical- Analyzer Generator: Lex
The first phase in a compiler reads the
input source and converts strings in the source
to tokens.
Lex: generates a scanner (lexical analyzer or
lexer) given a specification of the tokens using
REs.
The input notation for the Lex tool is referred to as
the Lex language and
The tool itself is the Lex compiler.
The Lex compiler transforms the input patterns into a
transition diagram and generates code, in a file called
lex.yy.c, that simulates this transition diagram.
48
Lex…
By using regular expressions, we can specify
patterns to lex that allow it to scan and match
strings in the input.
Each pattern in lex has an associated action.
Typically an action returns a token, representing the
matched string, for subsequent use by the parser.
It uses patterns that match strings in the input and
converts the strings to tokens.
49
General Compiler Infra-structure
[Diagram: program source (a stream of characters) → Scanner/tokenizer (built with Lex) → tokens → Parser (built with Yacc) → parse tree → Semantic Routines → annotated/decorated tree → Analysis/Transformations/optimizations → IR (Intermediate Representation) → Code Generator → assembly code; the symbol and literal tables are shared by the phases.]
50
Scanner, Parser, Lex and Yacc
51
Generating a Lexical Analyzer using Lex
Lex is a scanner generator: it takes a lexical specification as
input and produces a lexical analyzer written in C.
Lex source program (lex.l) → Lex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
Input stream → a.out (the lexical analyzer) → sequence of tokens
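As an illustrative sketch (not from the slides), a small C driver that could be compiled and linked with the generated lex.yy.c; it assumes the usual Lex/Flex conventions that yylex() returns the next token code (0 at end of input) and that yytext holds the matched lexeme (some Lex versions declare yytext as a char array rather than a pointer).

#include <stdio.h>

extern int yylex(void);     /* provided by the scanner generated in lex.yy.c */
extern char *yytext;        /* lexeme matched by the most recent rule        */

int main(void) {
    int tok;
    while ((tok = yylex()) != 0)                  /* 0 signals end of input */
        printf("token code %d, lexeme \"%s\"\n", tok, yytext);
    return 0;
}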
52
Implementation
A DFA can be implemented by a 2D table T
One dimension is “states”
Other dimension is “input symbols”
For every transition Si -a-> Sk define T[i, a] = k
DFA “execution”
If in state Si on input a, read T[i, a] = k and
move to state Sk
Very efficient
53
Table Implementation of a DFA
[Figure: a DFA with states S, T and U; from every state, input 0 leads to T and input 1 leads to U.]

State    0    1
S        T    U
T        T    U
U        T    U
54
Implementation (Cont.)
NFA -> DFA conversion is at the heart of
tools such as flex or jflex
But, DFAs can be huge
In practice, flex-like tools trade off speed
for space in the choice of NFA and DFA
representations
55
Readings
Chapter 3 of the textbook: Aho et al., Compilers: Principles, Techniques, and Tools, 2nd edition.