Chapter 6 Part 1

The document provides an overview of Python's character set, which includes letters, digits, special symbols, whitespaces, and Unicode characters. It explains tokens or lexical units, detailing types such as keywords, identifiers, literals, operators, and punctuators. Additionally, it covers the rules for forming identifiers, the different types of literals (string, numeric, boolean, and special), and the various operators used in Python programming.
Created by Turbolearn AI

Python Character Set


A character set is the set of valid characters that a language can recognize. Python supports Unicode, so it can handle letters,
digits, and symbols from virtually any writing system.

Python's character set includes:

Letters: A-Z, a-z
Digits: 0-9
Special Symbols: + - * / % ! = & # <= >= _ (underscore)
Whitespaces: Blank space, tab, newline, carriage return, formfeed
Other characters: all other ASCII and Unicode characters (ASCII is a subset of Unicode)

Any character from this set can be part of Python data, literals, statements, expressions, or other program components.

Tokens or Lexical Units


Individual words, punctuation marks, and other elements in a program are known as tokens or lexical units. They're the smallest
individual units in a program that carry meaning.

A token is the smallest individual unit in a program.

Consider the following figure, which helps to describe the composition of tokens in Python:

Types of Tokens
Python has the following types of tokens:


1. Keywords
2. Identifiers (Names)
3. Literals
4. Operators
5. Punctuators
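As a rough illustration, the standard-library tokenize module can split a line of source into tokens. This is only a sketch: the sample statement is made up, and the tokenizer's own categories are coarser than the five listed here (it reports keywords and identifiers alike as NAME, and operators and punctuators alike as OP).

```python
import io
import tokenize

# Illustrative one-line program (not from the notes).
source = "total = price * 2  # double it\n"

# generate_tokens() takes a readline callable and yields TokenInfo tuples.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

for kind, text in tokens:
    print(f"{kind:10} {text!r}")
```

Each printed row pairs a token category with the exact text of that token, so `total` and `price` appear as names, `=` and `*` as operators, and `2` as a number.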

Keywords
Keywords are words that have special meaning to the language's compiler/interpreter. They are reserved for specific purposes
and cannot be used as normal identifier names.

A keyword is a word having special meaning reserved by a programming language.

Here is a list of keywords in Python:

False      assert     del        for
in         None       break      elif
from       or         while      True
class      else       is         pass
with       global     lambda     raise
and        continue   except     if
nonlocal   yield      return     as
def        finally    import     not
try
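The keyword list can also be inspected from Python itself using the standard-library keyword module. A minimal sketch follows; note that the exact list depends on the Python version in use (for example, Python 3.7 and later also reserve async and await).

```python
import keyword

# kwlist holds every reserved word for the running interpreter.
print(len(keyword.kwlist))
print(keyword.kwlist)

# iskeyword() checks a single word; keywords are case-sensitive.
print(keyword.iskeyword("while"))   # True
print(keyword.iskeyword("While"))   # False
```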

Identifiers (Names)
Identifiers are the names given to different parts of a program, such as variables, objects, classes, functions, lists, and
dictionaries.

Identifier-forming rules in Python:

An identifier is an arbitrarily long sequence of letters, digits, and underscores; there is no length limit.
The first character must be a letter or an underscore (_); the underscore counts as a letter.
Digits (0-9) can be part of the identifier, except as the first character.
Upper and lowercase letters are distinct (Python is case-sensitive), and all characters are significant.
An identifier must not be a keyword.
An identifier cannot contain any special character except for underscore (_).

Here are some valid identifiers:

Myfile DATE9_7_77 MYFILE _DS CHK FILE13 Z2T0Z9 HJI3_JK

Here are some invalid identifiers and why:

Identifier   Reason
DATA-REC     Contains special character - (hyphen)
29CLCT       Starts with a digit
break        Reserved keyword
My.file      Contains special character . (dot)
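These rules can be checked programmatically. The sketch below combines the built-in str.isidentifier() method with keyword.iskeyword(); the helper name is_valid_identifier is hypothetical, introduced only for illustration. Note that str.isidentifier() also accepts non-ASCII letters, which Python 3 permits in names.

```python
import keyword

def is_valid_identifier(name: str) -> bool:
    """Hypothetical helper: True if name is usable as a Python identifier."""
    # isidentifier() checks the character rules; keywords must be excluded separately.
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["Myfile", "DATE9_7_77", "_DS", "DATA-REC", "29CLCT", "break", "My.file"]
results = {name: is_valid_identifier(name) for name in candidates}

for name, ok in results.items():
    print(f"{name:12} {'valid' if ok else 'invalid'}")
```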

Literals (Values)


Literals (or constant values) are data items that have a fixed value.

Python allows several kinds of literals:

1. String literals
2. Numeric literals
3. Boolean literals
4. Special literal None
5. Literal Collections

String Literals
A string literal is a sequence of characters surrounded by quotes (single, double, or triple quotes). The text enclosed in quotes
forms a string literal in Python.

Examples of string literals: 'd', 'abc', "abc"

Both single characters and multiple characters enclosed in quotes are treated as string literals.

Valid string literals:

"Amy's" "129045" 'Hello World' 'Astha' "Rizwan" "1-x-0-w255" "112FBD291"

Nongraphic Characters:

Python allows you to include certain nongraphic characters (those that cannot be typed directly from the keyboard, such as tab,
backspace, and carriage return) by using escape sequences. An escape sequence is written as a backslash (\) followed by one or
more characters.

Here's a table of escape sequences in Python:

Escape Sequence   What it does
\\                Backslash (\)
\'                Single quote (')
\"                Double quote (")
\r                Carriage Return (CR)
\t                Horizontal Tab (TAB)
\uxxxx            Character with 16-bit hex value xxxx (Unicode only)
\Uxxxxxxxx        Character with 32-bit hex value xxxxxxxx (Unicode only)
\a                ASCII Bell (BEL)
\v                ASCII Vertical Tab (VT)
\b                ASCII Backspace (BS)
\f                ASCII Formfeed (FF)
\ooo              Character with octal value ooo
\n                Newline character
\xhh              Character with hex value hh
\N{name}          Character named name in the Unicode database (Unicode only)

In Python, you can also type a single quote directly inside a double-quoted string, and a double quote directly inside a single-quoted string, without escaping it (e.g., "anu's" is a valid string).
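A short sketch of escape sequences in use. The variable names are illustrative only.

```python
tabbed = "Name\tAge"     # \t inserts a horizontal tab
path = "C:\\temp"        # \\ stores a single backslash
escaped = 'It\'s'        # \' allows a single quote inside single quotes
plain = "It's"           # no escape needed inside double quotes

print(tabbed)
print(path)              # C:\temp
print(escaped == plain)  # True

# Four ways to write the letter A: 16-bit hex, hex, octal, and Unicode name.
print("\u0041", "\x41", "\101", "\N{LATIN CAPITAL LETTER A}")
```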

String Types in Python:

Python allows you to have two string types:

1. Single-line Strings (Basic strings)


2. Multiline Strings

(i) Single-line Strings:


Strings created by enclosing text in single quotes ('') or double quotes ("") are normally single-line strings, meaning they must
terminate on the line where they begin.

Python creates single-line strings by default: if a line ends without the closing quotation mark, Python reports a syntax error.
An escape sequence represents a single character, even though it is typed as two or more characters, and takes one byte in
ASCII representation.
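The error can be reproduced safely by compiling a piece of source text rather than typing it. This is a minimal sketch; the exact wording of the error message varies between Python versions, so it is printed rather than assumed.

```python
# Source text whose string literal is never closed on its line.
broken_source = "greeting = 'Hello\n"

try:
    compile(broken_source, "<example>", "exec")
    error_raised = False
except SyntaxError as exc:
    error_raised = True
    print("SyntaxError:", exc.msg)
```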

(ii) Multiline Strings:

Sometimes, you need to store text spread across multiple lines as a single string. Python offers two ways to create multiline
strings:

(a) Using a backslash at the end of normal single-quote / double-quote strings:

Here is an example using the python shell:
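A minimal sketch of this technique, written as a script rather than a shell session (the variable name text is illustrative). The backslash must be the very last character on the line.

```python
# The backslash at the end of the first line continues the string.
text = 'Hello\
 World'

print(text)       # Hello World
print(len(text))  # 11
```

The backslash and the line break that follows it are not stored in the string, so the result is a single line of text.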


(b) By enclosing text in triple apostrophes (''' ''') or triple quotation marks (""" """):

Typing text in triple quotation marks allows you to type multiline strings without needing a backslash at the end of the line.
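For example (a small sketch with illustrative variable names), both triple-quote styles produce the same string, and the line break typed inside the quotes becomes part of it:

```python
poem = '''Hello
World'''

same_poem = """Hello
World"""

print(poem)
print(poem == same_poem)  # True
print(repr(poem))         # 'Hello\nWorld'
```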

Size of Strings:

Python determines the size of a string by counting the characters in the string. For example, the size of "abc" is 3, and the size of
'hello' is 5.

If a string literal has an escape sequence, count the escape sequence as one character.

'\\' size is 1 (\\ is an escape sequence)
'abc' size is 3
"\a\b" size is 2 (\a and \b are escape sequences, thus one character each)
"Seemal\'s pen" size is 12 (the escape sequence \' counts as one character).
"Amy's" size is 5

For multiline strings created with triple quotes, the EOL (end-of-line) character at the end of the line is also counted in the size.

For multiline strings created with single/double quotes and a backslash at the end of the line, the backslashes are not counted in
the size.

To check the size of a string, call the built-in function len(<stringname>) in the Python shell. More generally,
len(<object name>) returns the size or length of an object.

Triple-quoted multiline strings count the EOL characters in the size of the string; single- or double-quoted multiline strings do
not count the backslashes at the end of intermediate lines.
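The counting rules can be confirmed with len(). In this sketch the two multiline strings hold the same visible text, yet differ in size by one character:

```python
# Each escape sequence counts as a single character.
print(len('\\'))      # 1
print(len("\a\b"))    # 2
print(len("Amy's"))   # 5

# Triple-quoted string: the newline between the lines is counted.
triple = """abc
def"""

# Backslash continuation: neither the backslash nor the line break is stored.
continued = "abc\
def"

print(len(triple), len(continued))  # 7 6
```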


Numeric Literals
Numeric literals can belong to any of the following three different numerical types:

int (signed integers): Positive or negative whole numbers with no decimal point.
float (floating point real values): Real numbers written with a decimal point.
complex (complex numbers): Numbers of the form a + bJ, where a and b are floats, and J represents √-1.

Python allows several forms of integer literals. The three described here are:

1. Decimal Integer Literals: A sequence of digits is taken to be a decimal integer literal unless it begins with 0. For example:
1234, 41, +97, -17.
2. Octal Integer Literals: A sequence of digits starting with 0o (digit zero followed by letter o) is taken to be an octal integer.
For example, decimal integer 8 will be written as 0o10 as an octal integer. An octal value can contain only digits 0-7.
3. Hexadecimal Integer Literals: A digits sequence preceded by 0x or 0X is taken to be a hexadecimal integer. For instance,
decimal 12 will be written as 0xC as a hexadecimal integer.
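A brief sketch showing that the different notations are simply alternative spellings of the same int values. The built-in functions oct() and hex() convert in the other direction and return strings.

```python
decimal_value = 1234
octal_value = 0o10   # octal 10
hex_value = 0xC      # hexadecimal C

print(octal_value, hex_value)        # 8 12
print(oct(8), hex(12))               # 0o10 0xc
print(type(hex_value).__name__)      # int
```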

Integer literals should follow these rules:

Must have at least one digit.


Must not contain any decimal point.
May contain either a + or - sign. A number with no sign is assumed to be positive.
Commas cannot appear within an integer constant.

Examples of valid and invalid hexadecimal literals (a hexadecimal value can contain only digits 0-9 and letters A-F or a-f):

Valid: 0xC, 0X19A
Invalid: 0XBK9, 0x19AZ, 0XPrO (contain letters outside A-F)
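One way to test a candidate is to let int() parse it in base 16, which raises ValueError for digits outside 0-9 and A-F. The helper below is a hypothetical approximation for illustration, not Python's actual lexer (for instance, int() tolerates surrounding whitespace that a literal would not contain).

```python
def looks_like_hex_literal(text: str) -> bool:
    """Hypothetical helper: True if text is 0x/0X followed by valid hex digits."""
    if not text.lower().startswith("0x"):
        return False
    try:
        int(text, 16)  # base 16 accepts an optional 0x/0X prefix
    except ValueError:
        return False
    return True

for candidate in ["0xC", "0XBK9", "0x19AZ", "0XPrO"]:
    print(candidate, looks_like_hex_literal(candidate))
```

Only 0xC passes; the other three contain letters that are not hexadecimal digits.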

Floating Point literals are also called real literals. Floating numbers have fractional parts. Real literals are written in two forms:

1. Fractional Form
2. Exponent Form

1. Fractional Form:

Consists of signed or unsigned digits including a decimal point. A real constant in fractional form must have at least one digit
either before or after the decimal point. It may also have either a + or - sign before it. A real constant with no sign is assumed to
be positive.

Examples of valid real literals in fractional form: .3 (will represent 0.3), 7. (will represent 7.0)

Examples of invalid real literals:

7 (No decimal point)


+17/2 (Illegal symbol)
17,250.26.2 (Two decimal points)
17,250.262 (Comma not allowed)
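A quick sketch of fractional-form literals. The presence of the decimal point alone is what makes a literal a float.

```python
a = .3    # same as 0.3
b = 7.    # same as 7.0

print(a, b)                                   # 0.3 7.0
print(type(7).__name__, type(7.).__name__)    # int float
```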

2. Exponent Form:

Consists of two parts: mantissa and exponent. For instance, 5.8 can be written as 0.58 x 10^1 = 0.58E01, where the mantissa
part is 0.58 and the exponent part is 1. E01 represents 10^1.

The rule for writing a real literal in exponent form is:

A real constant in exponent form has two parts: a mantissa and an exponent. The mantissa must be either an integer or a real
constant. The mantissa is followed by a letter E or e and the exponent. The exponent must be an integer.

Examples of valid real literals in exponent form:

1.52E07, 0.152E08, 152E05, 152.0E08, 152E+8, 1520E04, -0.172E-3, 172.E3, .25E-4, 3.E3 (equivalent to 3.0E3)


Examples of invalid real literals in exponent form:

1.7E (No digit specified for exponent)


0.17E2.3 (Exponent cannot have fractional part)
17,225E02 (No comma allowed)

Numeric values with commas are not considered int or float values; Python treats them as a tuple.
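The sketch below illustrates both points: exponent form always produces a float, and a comma between numbers builds a tuple instead of one large number. The variable names are illustrative.

```python
x = 0.58E01    # 0.58 x 10^1
y = 152E05     # a float, even though the mantissa looks like an integer

print(x)                    # 5.8
print(type(y).__name__)     # float

pair = 17,250.262           # the comma creates a tuple of two values
print(pair)                 # (17, 250.262)
print(type(pair).__name__)  # tuple
```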


Boolean Literals
A Boolean literal in Python represents one of two Boolean values: True (Boolean true) or False (Boolean false).

Special Literal None


Python has one special literal, None. The None literal indicates the absence of a value. It means "There is no useful
information" or "There's nothing here."

In the interactive shell, Python displays nothing when asked to show the value of a variable containing None. Printing the
variable with print(), however, shows None.
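A small sketch of where None typically comes from. The function log_message is hypothetical; any function that finishes without a return statement gives back None.

```python
def log_message(text):
    """Hypothetical function with no return statement, so it returns None."""
    pass

result = log_message("saved")

print(result)          # None
print(result is None)  # True
```

Comparing with `is None` is the usual idiom, since there is only one None object.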


Python also supports literal collections such as tuples and lists, but these will be discussed later.

True and False are the only two Boolean literal values in Python. The None literal represents the absence of a value. Boolean
literals True, False, and the special literal None are some built-in constants/literals of Python.


Operators
Operators are tokens that trigger some computation or action when applied to variables and other objects in an expression.
Variables and objects to which an operator is applied are called operands.


Types of Operators:

Unary Operators
Binary Operators

Unary Operators: Operators that require one operand.

- Unary minus
+ Unary plus
not Logical negation
~ Bitwise complement

Binary Operators: Operators that require two operands.


Arithmetic operators
+       Addition
-       Subtraction
*       Multiplication
/       Division
%       Remainder/Modulus
**      Exponent (raise to power)
//      Floor division

Bitwise operators
&       Bitwise AND
^       Bitwise exclusive OR (XOR)
|       Bitwise OR

Shift operators
<<      Shift left
>>      Shift right

Identity operators
is      Are both operands the same object?
is not  Are the operands different objects?

Relational operators
<       Less than
>       Greater than
<=      Less than or equal to
>=      Greater than or equal to
==      Equal to
!=      Not equal to

Logical operators
and     Logical AND
or      Logical OR

Assignment operators
=       Assignment
/=      Assign quotient
+=      Assign sum
*=      Assign product
%=      Assign remainder
-=      Assign difference
**=     Assign exponent
//=     Assign floor division

Membership operators
in      Is the value present in the sequence?
not in  Is the value absent from the sequence?
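A compact sketch exercising one or two operators from each group. All names and values are illustrative.

```python
a, b = 7, 2

# Arithmetic
print(a / b, a // b, a % b, a ** b)   # 3.5 3 1 49

# Bitwise and shift (7 is 0b111, 2 is 0b010)
print(a & b, a ^ b, a | b)            # 2 5 7
print(a << 1, a >> 1)                 # 14 3

# Relational and logical
print(a > b and a != b)               # True

# Identity versus equality: equal contents, but two separate list objects
first = [1, 2]
second = [1, 2]
print(first == second, first is second)   # True False

# Membership
print(2 in first, 5 not in first)     # True True

# Augmented assignment
total = 10
total += 5     # total is now 15
total //= 4    # total is now 3
print(total)   # 3
```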

Punctuators
Punctuators are symbols used in programming languages to organize sentence structures and indicate the rhythm and emphasis
of expressions, statements, and program structure.

Common punctuators in Python:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
