0% found this document useful (0 votes)
13 views20 pages

Lec 5

Uploaded by

Mohammad Humayun
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views20 pages

Lec 5

Uploaded by

Mohammad Humayun
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 20

Contents

 Recognition of Tokens
 Transition Diagrams
 Recognition of Reserved Words and Identifiers
 Recognizing Whitespace
 Recognizing Numbers

 Finite Automata
 NFA
 Transition Tables

1
Recognition of Tokens
 Now we see how to build a piece of code that examines the input
string and finds a prefix that is a lexeme matching one of the
patterns.
 Our current goal is to perform the lexical analysis needed for the
following grammar.

 Recall that the terminals are the tokens & the non-terminals
produce terminals.

2
Recognition of Tokens..
 A regular definition for the terminals is

3
Recognition of Tokens…
 We also want the lexer to remove whitespace so we define a new
token
ws → ( blank | tab | newline ) +

 where blank, tab, and newline are symbols used to represent the
corresponding ascii characters.

 If the lexer recognizes the token ws, it does not return it to the
parser but instead goes on, to recognize the next token, which is
then returned.

4
Recognition of Tokens..
 Our goal for the lexical analyzer is summarized below:

5
Transition Diagram
 As an intermediate step in the construction of a lexical analyzer, we
first convert patterns into stylized flowcharts, called "transition
diagrams”.

 Conversion of RE patterns to Transition Diagram.

 Transition diagrams have a collection of nodes or circles, called


states
 Each state represents a condition that could occur during the process
of scanning the input looking for a lexeme that matches one of
several patterns.

6
Transition Diagram..
 Edges are directed from one state of the transition diagram to
another.
 Each edge is labeled by a symbol or set of symbols.
 Some important conventions:
 The double circles represent accepting or final states at which point a
lexeme has been found. There is often an action to be done (e.g.,
returning the token), which is written to the right of the double circle.

 If we have moved one (or more) characters too far in finding the
token, one (or more) stars are drawn.

 An imaginary start state exists and has an arrow coming from it to


indicate where to begin the process.
7
Transition Diagram…
 A transition diagram that recognizes the lexemes matching the
token relop.

8
Recognition of Reserved Words and Identifiers
 Recognizing keywords and identifiers presents a problem.
 The transition diagram below corresponds to the regular definition
given previously.

9
Recognition of Reserved Words and Identifiers
 Two questions arises:

 How do we distinguish between identifiers and keywords such


as then, which also match the pattern in the transition diagram?

 What is (gettoken(), installID())?

 We will use the method, i.e having the keywords installed into the
identifier table prior to any invocation of the lexer.
 The table entry will indicate that the entry is a keyword.

10
Recognition of Reserved Words and Identifiers..
 installID() checks if the lexeme is already in the table. If it is not
present, the lexeme is installed as an id token. In either case a
pointer to the entry is returned.

 gettoken() examines the lexeme and returns the token name,


either id or a name corresponding to a reserved keyword.

 So far we have transition diagrams for identifiers (this diagram also


handles keywords) and the relational operators.

 What remains are whitespace, and numbers.

11
Recognizing Whitespace
 Recognizing Whitespace

 The delim in the diagram represents any of the whitespace


characters, say space, tab, and newline.
 The final star is there because we needed to find a non-whitespace
character in order to know when the whitespace ends and this
character begins the next token.
 There is no action performed at the accepting state.

12
Recognizing Numbers
 The transition diagram for token number

13
Finite Automata
 Finite automata are like the graphs in transition diagrams but they
simply decide if an input string is in the language (generated by our
regular expression).

 Finite automata are recognizers, they simply say "yes" or "no" about
each possible input string.

 There are two types of Finite automata:

 Nondeterministic finite automata (NFA) have no restrictions on the


labels of their edges. A symbol can label several edges out of the
same state, and ε, the empty string, is a possible label.

14
Finite Automata..

 Deterministic finite automata (DFA) have exactly one edge, for


each state, and for each symbol of its input alphabet with that
symbol leaving that state.

 So if you know the next symbol and the current state, the next state is
determined. That is, the execution is deterministic, hence the name.

 Both deterministic and nondeterministic finite automata are


capable of recognizing the same languages.

15
N - Finite Automata
 A nondeterministic finite automaton (NFA) consists of:

1. A finite set of states S.


2. A set of input symbols Σ, the input alphabet. We assume that ε, which
stands for the empty string, is never a member of Σ.
3. A transition function that gives, for each state, and for each symbol in
Σ U {ε} a set of next states.
4. A state S0 from S that is distinguished as the start state (or initial state)
5. A set of states F, a subset of S, that is distinguished as the accepting
states (or final states).

16
N - Finite Automata..
 An NFA is basically a flow chart like the transition diagrams we
have already seen.

 Indeed an NFA can be represented by a transition graph whose


nodes are states and whose edges are labeled with elements of Σ
∪ ε.
 The differences between a transition graph and previous transition
diagrams are:
 Possibly multiple edges with the same label leaving a single state.
 An edge may be labeled with ε.

17
N - Finite Automata...
 Ex: The transition graph for an NFA recognizing the language of
regular expression (a | b) * abb
 This ex, describes all strings of a's and b's ending in the particular
string abb.

18
Transition Tables
 Transition Table is an equivalent way to represent an NFA, in which,
for each state s and input symbol x (and ε), the set of successor
states x leads to from s.

 The empty set φ is used when there is no edge labeled x


emanating from s.

Transition Table for (a | b) * abb


19
Thank You

You might also like