LEXICAL ANALYSIS
• Lexical analysis is the first phase of the compiler; the component that performs it is also known as a scanner.
• It converts the high-level input program into a sequence of tokens.
• Main task: to read input characters and group them into “tokens.”
• Secondary tasks:
• Skip comments and whitespace;
• Correlate error messages with the source program (e.g., the line number of an error).
• Lexical analysis can be implemented with deterministic finite automata.
• The output is a sequence of tokens that is sent to the parser for syntax analysis.
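As a minimal sketch of the finite-automaton idea, the identifier pattern “a letter followed by letters or digits” can be recognized by a two-state, table-driven DFA. The function and state names here are illustrative, not from any particular compiler:

```c
#include <ctype.h>

/* Character classes: 0 = letter, 1 = digit, 2 = anything else. */
static int char_class(char c) {
    if (isalpha((unsigned char)c)) return 0;
    if (isdigit((unsigned char)c)) return 1;
    return 2;
}

/* Table-driven DFA for the pattern: letter (letter | digit)*
   State 0 is the start state, state 1 is accepting, -1 rejects. */
int matches_identifier(const char *s) {
    static const int trans[2][3] = {
        { 1, -1, -1 },   /* start: only a letter moves to state 1 */
        { 1,  1, -1 },   /* in identifier: letters and digits stay */
    };
    int state = 0;
    for (; *s != '\0'; s++) {
        state = trans[state][char_class(*s)];
        if (state < 0) return 0;   /* dead state: reject immediately */
    }
    return state == 1;             /* accept only in the accepting state */
}
```

Each row of the table is a DFA state and each column a character class, so adding token types amounts to adding rows and columns rather than rewriting control flow.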
What is a Token?
• A lexical token is a sequence of characters that can be treated
as a unit in the grammar of the programming languages.
Example of tokens:
• Type tokens (id, number, real, ...)
• Punctuation tokens (IF, void, return, ...)
• Alphabetic tokens (keywords)
• Keywords; examples: for, while, if, etc.
• Identifiers; examples: variable names, function names, etc.
• Operators; examples: '+', '++', '-', etc.
• Separators; examples: ',', ';', etc.
Lexical Analysis: Terminology
• token: a name for a set of input strings with related structure.
Example: “identifier,” “integer constant”
• pattern: a rule describing the set of strings associated with a token.
Example: “a letter followed by zero or more letters, digits, or underscores.”
• lexeme: the actual input string that matches a pattern.
Example: count
Examples
Input: count = 123
Tokens:
identifier : Rule: “letter followed by letters or digits (max 8 characters)”
Lexeme: count
assg_op : Rule: =
Lexeme: =
integer_const : Rule: “digit followed by digits”
Lexeme: 123
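The rules above can be sketched as a small scanning function. The token names follow the example (identifier, assg_op, integer_const); the fixed-size lexeme buffer and the treatment of any other character as an operator are simplifications for illustration:

```c
#include <ctype.h>
#include <string.h>

typedef enum { IDENTIFIER, ASSIGN_OP, INT_CONST, END } TokType;

/* Scans one token starting at *src, copies its lexeme into `lexeme`,
   and advances *src past it. Returns the token's type. */
TokType next_token(const char **src, char *lexeme) {
    const char *p = *src;
    while (isspace((unsigned char)*p)) p++;          /* skip whitespace */
    if (*p == '\0') { *src = p; lexeme[0] = '\0'; return END; }
    const char *start = p;
    TokType t;
    if (isalpha((unsigned char)*p)) {                /* letter (letter|digit)* */
        while (isalnum((unsigned char)*p)) p++;
        t = IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {         /* digit digit* */
        while (isdigit((unsigned char)*p)) p++;
        t = INT_CONST;
    } else {                                         /* single-char operator, e.g. '=' */
        p++;
        t = ASSIGN_OP;
    }
    memcpy(lexeme, start, (size_t)(p - start));
    lexeme[p - start] = '\0';
    *src = p;
    return t;
}
```

Calling it repeatedly on the input "count = 123" yields the three token/lexeme pairs listed above, then END.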
Attributes for Tokens
• If more than one lexeme can match the pattern for a token, the scanner
must indicate the actual lexeme that matched.
• This information is given using an attribute associated with the token.
Example: The program statement
count = 123
yields the following token-attribute pairs:
identifier, pointer to the string “count”
assg_op, (no attribute needed)
integer_const, the integer value 123
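One common way to represent such token-attribute pairs is a struct whose attribute field is a union, since different token types carry different kinds of attributes. The type names and union layout here are illustrative choices, not a fixed convention:

```c
#include <stdlib.h>

typedef enum { TOK_IDENTIFIER, TOK_ASSIGN, TOK_INT_CONST } TokenType;

/* A token plus its attribute: identifiers carry a pointer to the
   lexeme, integer constants carry the converted numeric value. */
typedef struct {
    TokenType type;
    union {
        const char *name;   /* identifier: pointer to the lexeme string */
        int value;          /* integer constant: the converted value */
    } attr;
} Token;

/* Builds the pair <integer_const, value> from a lexeme such as "123". */
Token make_int_const(const char *lexeme) {
    Token t;
    t.type = TOK_INT_CONST;
    t.attr.value = atoi(lexeme);   /* convert the lexeme to its value */
    return t;
}
```

The union keeps every token the same size while still letting each token type carry the attribute it needs.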
• The tokens of a language are specified using regular expressions.
• Tokens are recognized by finite automata.
Structure of a Scanner Automaton
Example of Non-Tokens:
• Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
How Does a Lexical Analyzer Work?
1. Input preprocessing: This stage cleans up the input text and prepares it for lexical analysis. It may include removing comments, whitespace, and other non-essential characters.
2. Tokenization: This is the process of breaking the input text into a sequence of tokens, usually by matching the characters of the input against a set of patterns or regular expressions that define the different token types.
3. Token classification: In this stage, the lexer determines the type of each token. For example, in a programming language the lexer might classify keywords, identifiers, operators, and punctuation symbols as separate token types.
4. Token validation: In this stage, the lexer checks that each token is valid according to the rules of the programming language. For example, it might check that a variable name is a valid identifier or that an operator has the correct syntax.
5. Output generation: In this final stage, the lexer generates the output of the lexical analysis process, typically a list of tokens. This list can then be passed to the next stage of compilation or interpretation.
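The classification step (stage 3) can be sketched as a lookup: an alphabetic lexeme is a keyword if it appears in the language's keyword table, otherwise it is an ordinary identifier. The table below is a small illustrative subset of C's keywords, not a complete list:

```c
#include <string.h>

/* A few C keywords, for illustration only. */
static const char *keywords[] = { "if", "else", "for", "while", "int", "return" };

/* Returns 1 if the lexeme is a keyword, 0 if it is an identifier. */
int is_keyword(const char *lexeme) {
    size_t n = sizeof keywords / sizeof keywords[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return 1;
    return 0;
}
```

Real scanners often use a hash table or a perfect-hash function for this lookup instead of a linear scan, but the classification logic is the same.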
int main()
{
    // 2 variables
    int a, b;
    a = 10;
    return 0;
}
All the valid tokens are:
'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';' 'a' '='
'10' ';' 'return' '0' ';' '}'
Exercise 1: Count the number of tokens:
int main()
{
    int a = 10, b = 20;
    printf("sum is:%d", a + b);
    return 0;
}
Answer: Total number of tokens: 27.