Ultimate Python Programming
Learn Python with 650+ programs, 900+ practice questions, and 5 projects
Deepali Srivastava
www.bpbonline.com
First Edition 2024
Copyright © BPB Publications, India
ISBN: 978-93-55516-558
All Rights Reserved. No part of this publication may be reproduced, distributed or transmitted in any
form or by any means or stored in a database or retrieval system, without the prior written permission
of the publisher with the exception to the program listings which may be entered, stored and executed
in a computer system, but they can not be reproduced by the means of publication, photocopy,
recording, or by any electronic and mechanical means.
LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY
The information contained in this book is true and correct to the best of the author’s and publisher’s
knowledge. The author has made every effort to ensure the accuracy of this publication, but the
publisher cannot be held responsible for any loss or damage arising from any information in this
book.
All trademarks referred to in the book are acknowledged as properties of their respective owners but
BPB Publications cannot guarantee the accuracy of this information.
Dedicated to
Sri Anjaneya Swamy
About the Author
Deepali Srivastava has a Master’s degree in Mathematics and is an author
and educator in the field of computer science and programming. Her books
“C in Depth” and “Data Structures Through C in Depth” are widely used as
reference materials by students, programmers and professionals looking to
enhance their understanding of programming languages and data structures.
These books are known for their clarity, depth of coverage, and practical
approach to learning. In addition to her writing, Deepali Srivastava has been
involved in creating online video courses on data structures and
algorithms, Linux, and Python programming. Her books and courses have
helped 350,000+ students learn computer science concepts. Her work has
been appreciated by students and has been a valuable resource for those
looking to build their programming skills.
Acknowledgement
I would like to thank God for blessing me with the opportunity and
inspiration to write this book, and for giving me the strength to do it.
I am grateful to my husband Suresh Kumar Srivastava for always believing
in my capabilities and consistently inspiring me to give my best. He
introduced me to book writing and helped me unleash my potential. His
thoughtful suggestions and feedback helped me improve the content and
presentation of this book.
I would like to thank my parents, my brother and my sister-in-law for their
unwavering love and support. Blessings of my parents and late parents-in-
law are a major source of my inner strength.
I am indebted to my teachers in my journey of education, especially my
teachers and friends in MJP Rohilkhand University Bareilly, where I got
introduced to the world of programming.
I extend my appreciation to the readers of my books and students of my
online courses for their interest in my work, and for their appreciation and
suggestions. Any sort of feedback is valuable to me and helps me in
improving my work and creating better content.
I am grateful to the BPB Publications team for their guidance and support
provided during every step of the publishing journey. Special appreciation
goes to the editing team, layout team, and all other contributors involved in
bringing this book to life.
Preface
Python is a widely used general-purpose programming language. Its
popularity can be attributed to its simplicity and a rich set of powerful
features. The clean and intuitive syntax makes it an excellent choice for
novices, allowing them to grasp the fundamentals of programming quickly,
and the advanced features make it appealing to experienced programmers
too. It can run on various platforms, including Windows, macOS, and
Linux. Since it is open-source software, it is freely available to all.
The widespread usage of Python is evident in the technology world, with
major companies and organizations such as Google, Amazon, Instagram,
Facebook, and NASA using it in different ways. Whether you are involved
in machine learning, data science, artificial intelligence, scientific
computing, automation or you need to create robust web applications and
games, Python provides the necessary tools and resources. The extensive
collection of libraries available in Python can be effectively utilized across
diverse domains. Therefore, adding Python to your skill set can greatly
enhance your capabilities and open up numerous opportunities in various
fields.
This book provides a thorough and comprehensive introduction to Python,
focusing on the core programming concepts and problem-solving skills
required for building a solid foundation in programming. Throughout the
book, there are numerous programming examples and end-of-chapter
exercises to give you a hands-on experience. The exercises include
multiple-choice questions and programming problems; multiple-choice
questions will assess both your memory and comprehension of the topic,
while the programming exercises will provide you with a chance to apply
the acquired concepts. The book includes coding conventions and best
practices for writing efficient, readable, and maintainable code. The code in
the book is written and tested using Python version 3.11, which is the most
recent version at the time of publishing the book.
Python is easy to learn. You can start writing Python programs within a few
days. However, if you wish to leverage all the powerful features of Python,
a more in-depth exploration is required. The content in this book can assist
you in achieving that. This book includes 21 chapters that gradually
introduce new topics so learners can proceed at a sustainable pace. If you
are a beginner, start from the first chapter and go through all the chapters in
order, and work out the examples and exercises along the way. If you have a
working knowledge of Python, you can quickly browse through the initial
chapters and then randomly jump to topics that are new to you or that you
want to master. However, I would still recommend reading the chapters in
sequence to get the most out of the book. If you are transitioning from some
other language, you might be tempted to skip the initial information, but I
would suggest you go through all the basic details to avoid any confusion
later. Here is a brief summary of the chapters presented in the book.
Chapter 1 is an introduction to Python and shows the installation process.
Chapter 2 covers the fundamental elements of Python, such as data types,
variables, input, output, and many other basic concepts you need to get
started in Python. Chapter 3 provides a detailed explanation of strings that
represent textual data in Python. Chapters 4 and 5 cover the container types:
lists, tuples, dictionaries, and sets. Chapter 6 provides an insight into
conditional execution. In chapter 7, we will see how to perform repetitive
tasks using loops, and chapter 8 discusses some common looping
techniques in Python. Chapter 9 introduces the concept of comprehensions
which help us write shorter and more readable code.
Chapter 10 contains detailed coverage of functions. We will see how to
create our own functions and will discuss parameters, arguments, argument
passing, function objects, and many other details about functions. Chapter
11 shows how to create and use modules and packages. Chapter 12 is about
namespaces and scoping rules. Chapter 13 shows how to write programs
that can create files, write data into files, and read the data stored in files.
Chapters 14, 15, and 16 provide you with a strong understanding of the
object-oriented concepts. We will discuss classes, objects, methods,
inheritance, polymorphism, and magic methods. Chapters 17 and 18 are
devoted to advanced topics like iterators, generators, and decorators.
Chapter 19 is about functional programming and lambda functions. Chapter
20 shows how to handle run-time errors in Python, and Chapter 21
discusses context managers that are used to automate common resource
management patterns.
At the end of each chapter, you will find exercises, and their solutions are
provided at the end of the book. I would suggest that you try to solve these
exercises by yourself before looking at the solution. Solving exercises and
writing code will help you to internalize the concepts presented in the book.
Some typographical conventions are followed throughout the book for a
good reading experience. The code snippets and programs in the book
appear in this font to differentiate them from the regular text. Program
elements, such as variable names, types, etc., within the regular text, are in
this font. Any output produced by the code on the screen as a result of
running a program or anything that the user has to input through the screen
appears in this font.
My aim was to write an absolute hands-on book that is simple enough to
follow and yet gives detailed knowledge. Reading this book will be a
breeze, yet it will give you a comprehensive knowledge of Python and
instill the confidence to excel in any written test, interview, or professional
work. Programming is fun only when you get your hands dirty with code.
Reading a book is not enough for learning programming. I highly
recommend that you try the coding examples and exercises presented in the
book. The efforts you put in to strengthen your fundamentals of core
programming concepts will take you a long way in your software
development journey.
By the end of this book, you will have developed a strong foundation in core
Python skills and gained the ability to explore the vast range of
functionalities offered by the standard library and third-party libraries. As
you progress, you will continue to be amazed by the capabilities of Python
and the remarkable libraries available. With your newfound skills you can
venture into diverse fields like data science or machine learning. Moreover,
if this is the first programming language you are learning, equipped with the
foundation of programming concepts and problem-solving skills, you can
easily learn any other programming language.
After using this book as a tutorial to learn the language, you can always
refer to it as a handy resource whenever you need to recall or review any
concept and apply it to your work.
Writing this book was a very enjoyable, insightful, and amazingly satisfying
journey for me and I am sure my readers will have a similar experience
while reading the book. I hope you enjoy reading the book and start loving
Python.
Happy programming!
Code Bundle and Coloured Images
Please follow the link to download the
Code Bundle and the Coloured Images of the book:
https://rebrand.ly/z815rfg
The code bundle for the book is also hosted on GitHub at
https://github.com/bpbpublications/Ultimate-Python-Programming. In
case there’s an update to the code, it will be updated on the existing GitHub
repository.
We have code bundles from our rich catalogue of books and videos
available at https://github.com/bpbpublications. Check them out!
Table of Contents
1. Introduction to Python
1.1 What makes Python so popular
1.2 Python implementation
1.3 Installing Python
1.4 Python Interactive Mode
1.5 Executing a Python Script
1.6 IDLE
1.7 Getting Help
2. Getting Started
2.1 Identifiers
2.2 Python Types
2.3 Objects
2.4 Variables and assignment statement
2.5 Multiple and Pairwise Assignments
2.6 Deleting a name
2.7 Naming convention for constants
2.8 Operators
2.8.1 Arithmetic operators
2.8.2 Relational operators
2.8.3 Logical operators
2.8.4 Identity operators
2.8.5 Membership operators
2.8.6 Bitwise operators
2.9 Augmented assignment statements
2.10 Expressions
2.11 Order of operations: Operator Precedence and Associativity
2.12 Type Conversion
2.13 Statements
2.14 Printing Output
2.15 Getting user input
2.16 Complete programs
2.17 Comments
2.18 Indentation in Python
2.19 Container types
2.20 Mutable and Immutable Types
2.21 Functions and methods
2.22 Importing
2.23 Revisiting interactive mode
2.24 Errors
2.25 PEP8
Exercise
3. Strings
3.1 Indexing
3.2 Strings are immutable
3.3 String Slicing
3.4 String Concatenation and Repetition
3.5 Checking membership
3.6 Adding whitespace to strings
3.7 Creating multiline strings
3.8 String methods
3.9 Case-changing methods
3.10 Character classification methods
3.11 Aligning text within strings
3.12 Removing unwanted leading and trailing characters
3.13 Searching and replacing substrings
3.14 Chaining method calls
3.15 String comparison
3.16 String conversions
3.17 Escape Sequences
3.18 Raw string literals
3.19 String formatting
3.20 String formatting using the format() method of string class
3.21 Representation of text - character encodings
Exercise
4. Lists and Tuples
4.1 Accessing individual elements of a list by indexing
4.2 Getting parts of a list by slicing
4.3 Changing an item in a list by index assignment
4.4 Changing a Portion of the list by slice assignment
4.5 Adding an item at the end of the list by using append()
4.6 Adding an item anywhere in the list by using insert()
4.7 Adding multiple items at the end by using extend() or +=
4.8 Removing a single element or a slice by using the del statement
4.9 Removing an element by index and getting it by using pop()
4.10 Removing an element by value using remove()
4.11 Removing all the elements by using clear()
4.12 Sorting a List
4.13 Reversing a List
4.14 Finding an item in the list
4.15 Comparing Lists
4.16 Built-in functions used on lists
4.17 Concatenation and Replication
4.18 Using a list with functions from the random module
4.19 Creating a list
4.20 Using range to create a list of integers
4.21 Using the repetition operator to create a list of repeated values
4.22 Creating a list by splitting a string
4.23 Converting a list of strings to a single string using join()
4.24 List of Lists (Nested lists)
4.25 Copying a list
4.26 Shallow copy and deep copy
4.27 Repetition operator with nested lists
4.28 Tuples
4.29 Tuple packing and unpacking
Exercise
5. Dictionaries and Sets
5.1 Dictionaries
5.2 Adding new key-value pairs
5.3 Modifying Values
5.4 Getting a value from a key by using the get() method
5.5 Getting a value from a key by using the setdefault() method
5.6 Getting all keys, all values, and all key-value pairs
5.7 Checking for the existence of a key or a value in a dictionary
5.8 Comparing dictionaries
5.9 Deleting key-value pairs from a dictionary
5.10 Creating a Dictionary at run time
5.11 Creating a dictionary from existing data by using dict()
5.12 Creating a dictionary by using the fromkeys() method
5.13 Combining dictionaries
5.14 Nesting of dictionaries
5.15 Aliasing and Shallow vs. Deep Copy
5.16 Introduction to sets
5.17 Creating a set
5.18 Adding and Removing elements
5.19 Comparing sets
5.20 Union, intersection, and difference of sets
5.21 Frozenset
Exercise
6. Conditional Execution
6.1 if statement
6.2 else clause in if statement
6.3 Nested if statements
6.4 Multiway selection by using elif clause
6.5 Truthiness
6.6 Short circuit behavior of operators and and or
6.7 Values returned by and and or operators
6.8 if else operator
Exercise
7. Loops
7.1 while loop
7.1.1 Indentation matters
7.1.2 Removing all occurrences of a value from the list using the while loop
7.1.3 while loop for input error checking
7.1.4 Storing user input in a list or dictionary
7.2 for loop
7.2.1 Iterating over a string with for loop
7.2.2 Unpacking in for loop header
7.2.3 Iterating over dictionaries and sets
7.2.4 Iterating through a series of integers
7.3 Nesting of Loops
7.3.1 Using nested loops to generate combinations
7.3.2 Iterating over nested data structures
7.4 Premature termination of loops using the break statement
7.5 continue statement
7.6 else block in Loops
7.7 pass statement
7.8 for loop vs. while loop
Exercise
8. Looping Techniques
8.1 Iterating in sorted and reversed order
8.2 Iterating over unique values
8.3 Index-Based for loops
8.4 Making in-place changes in a list while iterating
8.5 Skipping some items while iterating
8.6 Using range and len combination to shuffle a sequence
8.7 enumerate function
8.8 Iterating over multiple sequences using zip
8.9 Modifying a collection while iterating in a for loop
8.10 Infinite loop with break
8.11 Avoiding complex logical conditions using break
Exercise
9. Comprehensions
9.1 List Comprehensions
9.2 if clause in list comprehension
9.3 Ternary operator in list comprehension
9.4 Modifying a list while iterating
9.5 Getting keys from values in a dictionary using list comprehension
9.6 Using list comprehensions to avoid aliasing while creating lists of lists
9.7 Multiple for clauses and Nested list Comprehensions
9.8 Extracting a column in a matrix
9.9 Dictionary Comprehensions
9.10 Inverting the dictionary
9.11 Set Comprehensions
9.12 When not to use comprehensions
Exercise
10. Functions
10.1 Function Definition
10.2 Function call
10.3 Flow of control
10.4 Parameters and Arguments
10.5 No type checking of arguments
10.6 Local Variables
10.7 return statement
10.8 Returning Multiple Values
10.9 Semantics of argument passing
10.9.1 Why study argument passing
10.9.2 Pass by assignment
10.9.3 Assignment inside function rebinds the parameter name
10.9.4 Immutables vs Mutables as arguments
10.9.5 How to get the changed value of an immutable type
10.9.6 How to prevent change in mutable types
10.9.7 Digression for programmers from other languages
10.9.8 Advantages of Python’s information passing
10.10 Default Arguments
10.11 Default arguments that could change over time
10.12 Positional and Keyword Arguments
10.13 Unpacking Arguments
10.14 Variable number of positional arguments
10.15 Variable number of keyword arguments
10.16 Keyword-only arguments
10.17 Positional-Only Arguments
10.18 Multiple Unpackings in a Python Function Call
10.19 Arguments and Parameters summary
10.20 Function Objects
10.21 Attributes of a function
10.22 Docstrings
10.23 Function Annotations
10.24 Recursive Functions
Exercise
11. Modules and Packages
11.1 Modules
11.2 Types of modules
11.3 Exploring modules
11.4 Creating and naming a new module
11.5 Importing a module
11.6 Importing all names from a module
11.7 Restricting names that can be imported
11.8 Importing individual names from a module
11.9 Using an alias while importing
11.10 Documenting a module
11.11 Module search Path
11.12 Module object
11.13 Byte compiled version of a module
11.14 Reloading a module
11.15 Scripts and modules
11.16 Packages
11.17 Importing a package and its contents
11.18 Subpackages
11.19 Relative imports
Exercise
12. Namespaces and Scope
12.1 Namespaces
12.2 Inspecting namespaces
12.3 Scope
12.4 Name Resolution
12.5 global statement
12.6 nonlocal statement
Exercise
13. Files
13.1 Opening a File
13.2 File opening modes
13.3 Buffering
13.4 Binary and Text Files
13.5 Closing a file
13.6 with statement
13.7 Random Access
13.8 Using seek in text mode
13.9 Calling seek in append mode
13.10 Reading and writing to the same file
13.11 Reading a File using read()
13.12 Line oriented reading
13.13 Writing to a file
13.14 Redirecting output of print to a file
13.15 Example Programs
13.16 File Related Modules
13.17 Command Line Arguments
13.18 Storing and Retrieving Python objects using pickle
Exercise
Project : Hangman Game
14. Object Oriented Programming
14.1 Programming Paradigms
14.2 Introduction to object-oriented programming
14.3 Defining Classes and Creating Instance Objects
14.4 Adding methods to the class
14.5 Adding instance variables
14.6 Calling a method inside another method
14.7 Common pitfalls
14.8 Initializer
14.9 Data Hiding
14.10 Class Variables
14.11 Class and object namespaces
14.12 Changing a class variable through an instance
14.13 Class Methods
14.14 Creating alternative initializers using class Methods
14.15 Static Methods
14.16 Creating Managed Attributes using properties
14.16.1 Creating read only attributes using properties
14.16.2 Creating Computed attributes using properties
14.16.3 Deleter method of property
14.17 Designing a class
Exercise
Project : Quiz creation
Project : Snakes and Ladders Game
Project : Log in system
15. Magic Methods
15.1 Overloading Binary Arithmetic operators
15.2 Reverse methods
15.3 In-place methods
15.4 Magic Methods for comparison
15.5 Comparing objects of different classes
15.6 String representation of an instance object
15.7 Construction and destruction of objects
15.8 Making instance objects callable
15.9 Overloading type conversion functions
15.10 List of magic methods
Exercise
Project : Date Class
16. Inheritance and Polymorphism
16.1 Inheriting a class
16.2 Adding new methods and data members to the derived class
16.3 Overriding a base Method
16.4 Invoking the base class methods
16.5 Multilevel Inheritance
16.6 object class
16.7 Multiple Inheritance
16.8 Method Resolution Order (MRO)
16.9 super and MRO
16.10 Polymorphism
16.11 Abstract Base classes
16.12 Composition
Exercise
17. Iterators and Generators
17.1 Iterables
17.2 Iterators
17.3 for loop Iteration Context – How for loop works
17.4 Iteration Tools
17.5 Iterator vs Iterable
17.6 Creating your own Iterator
17.7 Making your class Iterable
17.8 Some More Iterators
17.9 Lazy evaluation
17.10 itertools Module
17.11 Generators
17.12 Generator function vs Normal function
17.13 Generator expressions
Exercise
18. Decorators
18.1 Prerequisites for understanding decorators
18.2 Introduction to decorators
18.3 Writing your first decorator
18.4 Applying your decorator to multiple functions
18.5 Automatic decoration syntax
18.6 Decorator Example: Timer
18.7 Decorator Example: Logger
18.8 Decorator Example: Counting function calls
18.9 Applications of decorators
18.10 Decorating functions that take arguments
18.11 Returning values from decorated functions
18.12 Decorator Example: Checking return values
18.13 Decorator Example: Checking argument values
18.14 Applying Multiple Decorators
18.15 Preserving metadata of a function after decoration
18.16 General template for writing a decorator
18.17 Decorators with parameters
18.18 General template for writing a decorator factory
18.19 Decorator factory example
18.20 Applying decorators to imported functions
18.21 Decorating classes
18.22 Class Decorators
18.23 Class Decorators with parameters
Exercise
19. Lambda Expressions and Functional Programming
19.1 Lambda expression
19.2 Comparing def statement and lambda expression
19.3 Examples of lambda expressions
19.4 Using Lambda expressions
19.5 Using lambda expressions for returning function objects
19.6 Lambda expressions as closures
19.7 Creating jump tables using lambda functions
19.8 Using lambda expressions in sorted built-in function
19.9 Functional programming
19.10 map
19.11 map with multiple iterables
19.12 filter
19.13 Reducing an iterable
19.14 Built-in reducing functions
19.15 operator module
Exercise
20. Exception Handling
20.1 Types of Errors
20.2 Strategies to handle exceptions in your code
20.3 Error Handling by Python (Default exception handling)
20.4 Built-in Exceptions: Python Exceptions Class Hierarchy
20.5 Customized Exception Handling by using try…except
20.6 Catching multiple exceptions using multiple except handlers and single except handler
20.7 How to handle an exception
20.8 Guaranteed execution of finally block
20.9 else Block
20.10 Why do we need an else block
20.11 How to get exception details
20.12 Nested try statements
20.13 Raising Exception
20.14 Re-raising Exception
20.15 Chaining Exceptions
20.16 Creating your own exceptions in Python (Custom exceptions)
20.17 Assertions
Exercise
21. Context Managers
21.1 with statement
21.2 Implementing our own context manager
21.3 Exception raised inside with block
21.4 Why we need with statement and context managers
21.5 Runtime context
21.6 Example: Sending output of a portion of code to a file
21.7 Example: Finding time taken by a piece of code
21.8 Using context managers in the standard library
21.9 Nested with statements and multiple context Managers
21.10 Implementing a context manager by using a decorator on a generator
Exercise
Solutions
Index
1. Introduction to Python
Python is a widely used high-level and general-purpose programming
language originally developed by Guido van Rossum in the early 1990s in
the Netherlands. It is maintained by a community of core developers who are
actively engaged in its growth and advancement. Although the official logo
of Python shows two intertwined snakes, it is not named after any snake.
Van Rossum named this language after a 1970s comedy show 'Monty
Python's Flying Circus'.
Python has three major versions; the initial version, Python 1.0, was released
in January 1994. The second major version, Python 2.0, was released in
2000, and the third major version, Python 3.0, was released in 2008. Python
3 is not backward compatible with Python 2; this means that the code written
in Python 2 may not work as expected in Python 3 without making some
modifications. In this book, we will use Python 3. The latest release of
Python is available on its official website www.python.org. Python is
open-source software, which means that it is free to use and distribute.
1.1 What makes Python so popular
Python is a general-purpose language used in a wide variety of domains. It is
used extensively in different fields such as web development, data mining,
artificial intelligence, image processing, robotics, network programming,
developing user interfaces, database programming, scientific and
mathematical computing, game programming, and even education. Most of
the top companies and organizations, such as Google, Facebook, Amazon,
and NASA, use Python in different ways. Let us see some of the key factors
that contribute to Python's popularity.
Python is very easy to learn. It doesn't take much time to become productive
with Python. This is why it is often the introductory programming language
taught in many universities. Compared to languages such as C++ or Java,
Python code tends to be more concise, requiring fewer lines of code to
achieve the same functionality. Due to the simple syntax of Python,
programmers can focus more on finding the solution to a problem instead of
getting caught up in complex language features. Python uses indentation for
grouping together statements, resulting in a visually clean layout that
enhances code readability.
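As a small illustration of how indentation groups statements, here is a minimal sketch. The function name classify and the sample values are invented for this example and do not come from the book's code bundle.

```python
# Indentation, not braces, marks which statements belong to a block.
def classify(number):
    """Return a short label describing the sign of number."""
    if number < 0:
        label = "negative"      # belongs to the if block
    elif number == 0:
        label = "zero"          # belongs to the elif block
    else:
        label = "positive"      # belongs to the else block
    return label                # back at function level


for value in [-5, 0, 12]:
    print(value, "is", classify(value))   # belongs to the for block
```

Every statement indented under a header line (def, if, for) is part of that block; returning to the previous indentation level ends it.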
Python offers a convenient command line interface known as the 'Python
interactive shell' or 'Python REPL' (Read-Eval-Print Loop). With the Python
interpreter, you have the option to work interactively, allowing you to test
and debug small sections of code in real-time. The interactive mode serves
as a useful tool for experimenting and exploring Python's features.
One of the main advantages of Python is that it takes care of memory
management automatically. Python's built-in memory management system
allocates memory when needed and frees it up when it is no longer in use.
Programmers do not have to worry about managing memory manually, as
they would have to do in other languages like C or C++.
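The following sketch gives a rough feel for automatic memory management. It assumes the standard CPython implementation, where an object is normally reclaimed as soon as nothing refers to it; the class name Buffer is hypothetical, and the weakref module is used only to observe whether the object still exists.

```python
import gc
import weakref


class Buffer:
    """Hypothetical class standing in for any object that occupies memory."""

    def __init__(self, size):
        self.data = [0] * size


buf = Buffer(1000)             # memory is allocated when the object is created
watcher = weakref.ref(buf)     # a weak reference does not keep the object alive
alive_before = watcher() is not None

del buf                        # remove the only name bound to the object
gc.collect()                   # request a collection; usually unnecessary in CPython
alive_after = watcher() is not None

print("alive before del:", alive_before)
print("alive after del:", alive_after)
```

Notice that the program never frees the memory explicitly; it only stops referring to the object, and the interpreter does the rest.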
Python includes a vast standard library of modules; this is why the phrase
'Batteries included' is often used for Python. These modules contain code
that you can use in your own programs. In addition to the extensive standard
library, many third-party libraries are also available for use. Thus, you have
access to lots of prewritten reusable code in the form of standard library
modules and third-party modules, which can do most of the work for you
and save you from reinventing the wheel. This code can be incorporated into
your code to develop complex solutions with minimal effort. Whether you
are working on web programming, creating graphics, analyzing data,
performing mathematical calculations, engaging in scientific computing, or
developing games, you will find reusable code modules that can help you
achieve your goals.
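To give a flavour of the 'Batteries included' idea, the sketch below uses three standard library modules (statistics, collections, and json) without installing anything. The list of scores is sample data made up for this illustration.

```python
import json
import statistics
from collections import Counter

scores = [72, 88, 95, 88, 61]           # sample data invented for this example

average = statistics.mean(scores)       # arithmetic mean of the scores
most_common_score, times = Counter(scores).most_common(1)[0]   # most frequent score
report = json.dumps({"average": average, "mode": most_common_score})

print(average)
print(most_common_score, times)
print(report)
```

Each of these tasks would take noticeably more code if written from scratch, which is the point of reusing prewritten modules.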
Python supports multiple programming paradigms, including procedural,
functional, and object-oriented programming. Thus, programmers have the
flexibility to choose the coding structure that best suits their needs. The
object-oriented features of Python are often considered easier to use and
more intuitive than comparable features found in many other
programming languages.
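The three paradigms can be compared on one small task, computing the sum of squares of a list. This is only a sketch; the class name SquareSummer and the variable names are invented for the illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Procedural style: an explicit loop and a running total.
total_procedural = 0
for n in numbers:
    total_procedural += n * n

# Functional style: combine map and reduce instead of updating a variable.
total_functional = reduce(
    lambda acc, sq: acc + sq,
    map(lambda n: n * n, numbers),
    0,
)


# Object-oriented style: the data and the operation live together in a class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


total_oo = SquareSummer(numbers).total()

print(total_procedural, total_functional, total_oo)
```

All three versions produce the same result; which one reads best depends on the problem and on personal or team preference.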
Python is a cross-platform and portable programming language, which
means that programs written in Python can be developed and executed on
various hardware platforms and operating systems. The same code can be
executed on multiple platforms without making any significant changes. The
cross-platform development minimizes the efforts required to adapt the
programs to different systems and thus facilitates code reuse and sharing on
different platforms.
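One common way to keep code portable is to let the standard library handle platform differences such as path separators. The sketch below assumes a hypothetical file location (data/2024/report.txt) chosen only for illustration.

```python
import platform
from pathlib import Path

# Build a path from parts instead of hard-coding "/" or "\\" separators.
report_path = Path("data") / "2024" / "report.txt"   # hypothetical file location

print("Running on:", platform.system())   # for example Windows, Linux, or Darwin
print("Path parts:", report_path.parts)
print("File name:", report_path.name)
print("Extension:", report_path.suffix)
```

The same script runs unchanged on different operating systems; only the separator used when the path is displayed or passed to the system differs.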
Python has the capability to interact with software components written in
other languages. Python code can call libraries written in C and C++, and it
can also integrate with components developed in Java and .NET. This allows
Python programmers to tap into the strengths and functionalities of other
languages and libraries written in them. Python is also embeddable which
means that Python code can be placed within the code of another language
like C or C++.
Another reason for Python's popularity is its large, active, and
supportive developer community. Community members are actively engaged
in improving and enhancing the capabilities of Python as well as in
developing various libraries and tools. Numerous resources and
extensive support are available thanks to this vibrant community.
Python has emerged as the preferred programming language for developers
because of its ease of use and powerful features. It is suitable both for
beginners and experts alike, and due to its versatility, it can be used in a
variety of applications.
In the next section we will learn about Python implementations and will see
what happens internally when a Python program is executed. While it is not
necessary to have this knowledge in order to write and run programs, having
a fundamental understanding of what occurs behind the scenes during
program execution is beneficial for a comprehensive understanding of the
language.
1.2 Python implementation
The terms C, C++, Basic, Java, or Python refer to programming languages,
which are essentially sets of rules and specifications. In order to use these
languages, they need to be implemented by creating software that allows us
to write programs in that language and run them on a computer. The
implementation of a language is the program that actually runs the code that
you write in that language. An implementation translates the source code to
native machine code instructions (binary 0s and 1s) so that the computer's
processor can execute it.
There are primarily two approaches to implementing a programming
language: compilation and interpretation. In compilation, a compiler
translates the complete program code in one go to another language such as
machine code or bytecode. If the translated code is machine code that is
understood by the processor, then it is directly executed, and if it is
bytecode, then it has to be again input to another interpreter or compiler. In
interpretation, an interpreter translates the code to machine code one line at a
time; a line of code is read, translated, and executed, then the next line is
read, translated, and executed, and so on. The code is translated line by line
at run time, so the interpreted implementations tend to be slower than the
compiled ones, which translate the whole code at once.
An implementation of a language can be a compiler, interpreter, or a
combination of both. A programming language can have multiple
implementations, and these implementations can be written in different
languages and can use different approaches to compile or interpret code. The
notion of interpretation and compilation is associated with a language
implementation rather than the language itself, so describing a language as
compiled or interpreted is not technically correct; it is the implementations
of a language that are compiled or interpreted. Compilation or interpretation
is not a part of the language specification; it is an implementation decision. The
implementations of C and C++ mostly use the compilation approach, while
Java, Python, and C# implementations generally use a combination of
compilation and interpretation techniques. C and C++ compilers translate
source code to machine code, which is executed directly by the processor.
Python has multiple implementations. The original and standard
implementation of Python is CPython written in C language. It is the most
widely used and up-to-date implementation of Python. When you download
Python software from the official site python.org, this is the implementation
that you get. The other implementations are Jython written in Java, and
IronPython written for the .NET platform. PyPy is the implementation that is
written in RPython, which is a subset of Python.
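If you are curious about which implementation is running your code, the standard library can report it. The following is a small sketch using the platform and sys modules; the printed values depend on your installation, so none are shown here.

```python
import platform
import sys

# Name of the implementation running this code, such as 'CPython' or 'PyPy'.
implementation_name = platform.python_implementation()

# Version of the language supported by this interpreter, as a string.
version_text = platform.python_version()

print(implementation_name, version_text)
print(sys.implementation.name)  # lowercase identifier of the implementation
```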
The software that is used for running Python programs is referred to as the
Python interpreter. Let us understand how the CPython interpreter combines the
compilation and interpretation techniques to execute a Python program.
We write our Python code in a source file (.py file), but the computer cannot
understand and execute this code; it can execute only machine code, which
consists of instructions written in binary form (0s and 1s). The source code
has to be converted to machine code so that the processor can execute it. The
source code is not directly converted to machine code. It is first compiled
into an intermediate form known as the bytecode. This bytecode is a low-
level code that is Python-specific and platform-independent, but it is not
understandable to the processor.
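The compilation step can be observed from within Python itself. The sketch below assumes the CPython implementation and uses the built-in compile() function together with the dis module of the standard library; the one-line source string and the name namespace are made up for the illustration, and the exact instructions listed vary between Python versions.

```python
import dis

source = "result = 8 + 2"

# compile() translates the source text to a code object that holds bytecode.
code_object = compile(source, "<example>", "exec")

# The raw bytecode is a bytes object; it is not native machine code.
print(type(code_object.co_code))

# dis.dis() prints a human-readable listing of the bytecode instructions.
dis.dis(code_object)

# exec() hands the code object to the interpreter for execution.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])  # prints: 10
```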
There is another piece of software, called the Python Virtual Machine (PVM), that is
responsible for executing this bytecode on a specific platform. The bytecode
passes through the Python Virtual Machine; it interprets this bytecode, which
means that it converts the bytecode instructions to machine code instructions
one by one and sends these machine code instructions to the processor for
execution, and we get the output. So, the job of PVM is to convert the
bytecode instructions to machine code instructions that the processor can
understand and execute.
Figure 1.1: The execution of a Python program
This is what happens when we execute a Python program. The intermediate
compilation step is hidden from the programmer; we can just type and run
our program immediately. The programmer does not have to explicitly
compile the code, so there is no separate compile time in Python; there is
only runtime. The compilation to bytecode is done to improve the efficiency
as the bytecode can be interpreted faster than the original source code.
In this whole process, the bytecode compiler is the software that converts
source code to bytecode, and the PVM is the software that converts bytecode to
machine code for the target platform. The Python Virtual Machine contains some
platform-specific components that may be implemented differently for each
platform. This allows the virtual machine to convert the bytecode into native
machine code according to the platform. It abstracts away the underlying
hardware and operating system details and thus provides a consistent
runtime environment for Python programs across different platforms. Both
the bytecode compiler and the virtual machine are part of the Python
interpreter software and are included in your Python installation.
The intermediate bytecode is generally cached for faster execution. It is
stored in .pyc files inside a folder named __pycache__, and the
programmer can just ignore these files. When the program is run multiple
times without modifying the source code, the compiled bytecode from the
cached file is loaded and executed instead of re-compiling from source code
to bytecode every time. This bytecode is stored only for imported files, not
for the top-level scripts; we will see the difference between the two later in
the book.
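To see where the cached bytecode of a module would be placed, you can ask the standard library. This is a minimal sketch using importlib.util.cache_from_source(); the filename hello.py is only an example, and the exact result depends on your Python version and settings.

```python
import importlib.util

# cache_from_source() only computes a path; it does not create any file.
cached_path = importlib.util.cache_from_source("hello.py")

# The filename normally includes a tag naming the implementation and version.
print(cached_path)
```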
The Jython implementation translates Python code into Java bytecode,
enabling its execution on a Java virtual machine. An advantage of Jython is
its ability to directly access Java libraries. Similarly, IronPython is designed
for the .NET framework and facilitates integration with .NET components.
Some implementations of virtual machines (bytecode interpreters) use a
just-in-time (JIT) compilation approach to speed up the interpretation process.
The PyPy implementation of Python is generally faster because it includes a
just-in-time compiler for faster execution of the bytecode. The just-in-time
compiler compiles the frequently executed blocks of bytecode to machine code
and caches the result. The next time the virtual machine has to execute the
same block of bytecode, the precompiled (cached) machine code is
executed, resulting in faster execution. So, the JIT compiler uses the
compilation approach to improve the efficiency of bytecode execution.
1.3 Installing Python
To download Python, visit the official website of Python. On the homepage,
select the Downloads option to go to the download page, or you can directly
go to www.python.org/downloads/. The website will automatically detect
your operating system and provide a suitable installer that corresponds to
your system's requirements, whether it be 32-bit or 64-bit. Click on the
Download button to download the installer (.exe) file for the latest version of
Python. At the time of writing this book, the latest version is 3.11.3. If you
wish to download any previous version of Python, you can scroll down the
page and click on the download button located next to the version number
you desire.
Figure 1.2: Official website of Python
Once the download is complete, double-click on the installer to execute it
and begin the installation process. On the first screen of the installer, you
will be presented with two choices: "Install Now" and "Customize
Installation." Clicking on "Install Now" will install Python with the default
features, while clicking on "Customize Installation" will allow you to
change the installation location or install other optional and advanced
features. The defaults should work well for now, so we will go with Install
Now. Before clicking on Install Now, make sure to select the Add
python.exe to PATH checkbox, as this will add Python to your system's
PATH environment variable and will enable you to run Python from the
command prompt.
Figure 1.3: Installing Python
Click Yes if it asks for permission to make changes to your device. The
installation begins, and all the required Python files, along with the standard
library, will be installed on your system.
Figure 1.4: Installation in progress
After the installation is complete, the following pop-up box will appear. This
shows that Python is installed on your system. Click on Close to complete
the installation and exit the installer. The appearance of the images shown in
the screenshots may vary depending on the version of Python that you
choose to install.
Figure 1.5: Installation successful
To verify the installation, write cmd in the Start search menu to open the
command prompt window and type the command python --version.
If Python has been successfully installed on your system, it will show the
version of the Python installed. Now write python (all in lowercase) in the
command window. You will see a line with some text describing the Python
version, and after that, you will see a prompt with three greater-than signs
(>>>). This is the Python shell prompt. Write 8 + 2 and press Enter; you
will get the output as 10 on the next line. The prompt appears again; this
time, write print('Hello world'), and the text Hello world will
appear on the next line. This verifies that Python is up and working on your
system. On this interactive Python shell, you can execute single statements
of Python. To quit this Python shell and come back to your command
prompt, type quit() or exit(), or press Ctrl+Z followed by Enter.
Figure 1.6: Verifying installation on the command line
You can also verify your installation by opening the Integrated Development
and Learning Environment (IDLE) application, which is installed by default
with Python. To open IDLE, type idle or python in the Start search menu
and click on the IDLE app. If the installation is successful, IDLE will show
an interactive Python shell window in which you can type Python commands
at the shell prompt (>>>) and execute them.
Figure 1.7: Verifying installation on IDLE
Installation on Mac is done in a similar way. Older versions of macOS came
with Python 2.x preinstalled, while recent versions no longer include it. To
check if Python is installed,
type python --version on your terminal. To check if Python 3 is
installed, type python3 --version on your terminal. If Python 3 is not
installed, you can install it from the official website, and if it is installed, you
can update it to get the latest version.
Visit the official Python website and download the installer package (.pkg
file) that it offers for your system. After downloading, double-click on the
installer to run the installation process. Proceed with the installation by
following the on-screen instructions and accepting the defaults. You may
need to enter your administrator password to authorize the installation.
After the installation process is complete, Python's installation folder will
automatically open up. Inside this folder, you will find IDLE application,
which, as we have seen, is the development environment that comes with
Python. Double-click on this application to open it. If the installation is
successful, IDLE will display the interactive Python shell. You can type
print('Hello world') at the shell prompt to verify that it is
functioning correctly. To confirm the installation on the terminal, open the
Terminal application, type python3 --version, and press Enter.
This should show the version of Python that you have installed. Type
python3 to open the Python shell, which shows the >>> prompt where you
can start typing Python statements. You can close this Python shell by
pressing Ctrl+D or typing exit().
Installation on Linux can be done through the package manager specific to
your distribution. Most Linux distributions come with Python already installed. To
check whether Python is installed correctly or to check before installation
whether Python is already there on the system, execute the following
command irrespective of your operating system:
$ python3 --version
or
$ python --version
On Mac and Linux systems that still have Python 2 installed, the
python --version command will show the Python 2 version and the
python3 --version command will show the Python 3 version.
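The version can also be checked from inside Python. The following is a small sketch using the sys module; the name is_python_3 is introduced here only for the illustration, and the printed version depends on your installation.

```python
import sys

print(sys.version)           # full version text, as shown when the shell starts
print(sys.version_info[:3])  # (major, minor, micro) as a tuple of integers

# True when the interpreter running this code belongs to the Python 3 series.
is_python_3 = sys.version_info.major == 3
print("Python 3 interpreter:", is_python_3)
```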
After you have installed Python on your machine, you can either use an
Integrated Development Environment (IDE) to write and execute your
Python scripts, or you can write your script in a text editor and execute them
on the command line. An IDE combines a text editor and software tools to
provide a program development environment. You can create, edit, run, and
debug your programs using a single interface, and this makes program
development easier.
We will be working on IDLE, which is the built-in IDE for Python. IDLE is
included with the Python standard distribution for Windows and macOS, so
there is no need to install it separately. To get started with Python, IDLE is a
good IDE. It has an interactive interpreter and features like smart
indentation, auto-completion, and syntax highlighting, and also includes a
basic debugger. There are many other popular text-based editors and IDEs
available that can work with Python. If you want, you can choose any of
them to write your programs.
Eclipse is a popular IDE for Java development. If you are familiar with Eclipse,
you can install the PyDev plugin to develop Python programs. If you are
comfortable working with Vim, you can use it for Python development by
adding some plugins. PyCharm is the Python IDE for professional
developers by JetBrains and comes in both free and paid editions. Sublime
Text is a code editor that supports Python and many other languages. If you
don't want to install Python on your computer, there are online platforms
available that provide a web-based Python interpreter.
1.4 Python Interactive Mode
Whether you work on the command line or use IDLE, there are two ways in
which you can use the Python interpreter - script mode and interactive mode.
In the script mode, we write our program statements in a file and then
execute the contents of that file to get the output of the whole program. In
the interactive mode, we type single Python statements on the prompt, and
we get to see the output immediately. This interactive experimentation is
particularly useful for beginners who have just started to learn the language.
Even when we have learned the language well, we can use this mode to
write short snippets of code and see how they work before putting them into
a big program. As we have seen in the previous section, we can enter the
interactive mode (Python shell) either through the command line or through
IDLE.
The shell prompt or interactive prompt (>>>) denotes that you are in the
Python interactive mode, so you can type just any valid Python statement or
expression, hit Enter, and the result will be displayed immediately.
>>> 4 + 6
10
>>> print('Hello')
Hello
The interactive mode has active memory that remembers the previously
executed statements on the prompt. However, this memory is active only for
the current session. If you exit the interpreter and open it again, the code you
typed in will not be available or remembered. So, if you want to retain and
reuse your code, you should place it in a file and save it.
Although the interactive mode is not used for developing programs, it can
serve as an excellent learning tool and can also be used to test code snippets.
This book will often use this mode to explain different language features.
You can also use this mode to play around with different Python constructs
and functionalities and explore more about them.
Here is something different and interesting that you can try on the interactive
prompt. Type import this, and you will get a short poem written by
Tim Peters. This poem summarizes the style and philosophy of Python in the
form of some guiding principles.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
1.5 Executing a Python Script
Interactive mode is good for experimenting and exploring, but when we have
to write complete working programs that can be reused, we need to save our
work in a file. In script mode, we create a file of Python statements and
instruct the interpreter to execute the whole file, which is often called a
script. The interpreter will execute all statements in the file sequentially,
maintaining the order in which they appear.
There are two different ways of creating and executing scripts - we can write
the script in a text editor and execute it on the command line, or we can
create and execute the script in a development environment like IDLE. First,
let us see how to execute a script from the command line.
Create a new text file using a text editor like Notepad++. Write the following
two lines of code in the file and save it with the name hello.py; all Python
files are conventionally saved with the .py extension.
print('Hello!')
print(5 + 3)
You have written your first Python program. Now, let us see how to run it
from the command line. On the command prompt, type python followed by
the filename with full path and press Enter. Our program will be executed,
and we can see the output.
Figure 1.8: Executing a Python program on the command line
If you do not want to type the whole path, you can first change your current
directory to the directory in which you have your file by using the cd
command. On recent versions of Windows and Python 3.3 onwards, you can
write py instead of python or even write the name of the file to execute it.
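What the command line does here can also be reproduced from within Python, which is sometimes handy for automation. The sketch below is only an illustration under a few assumptions: it writes the two-line hello.py program to a temporary folder and runs it with the subprocess module, using sys.executable as the path of the interpreter that is currently running.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

script_source = "print('Hello!')\nprint(5 + 3)\n"

with tempfile.TemporaryDirectory() as folder:
    script_path = Path(folder) / "hello.py"
    script_path.write_text(script_source)

    # Equivalent to typing: python <full path to hello.py>
    completed = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True,
        text=True,
        check=True,
    )

output_lines = completed.stdout.splitlines()
print(output_lines)  # prints: ['Hello!', '8']
```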
Next, let us see how to run a Python script using IDLE, the built-in IDE of
Python. When you open IDLE on your system, the Shell window appears. In
the File Menu, click on New File, and a new window will open with Untitled
written on its title bar. Save the file with the name hi.py. By default, your file
will be saved in the Python installation folder where the Python code is
stored. It is better to make a working folder for your programs in some other
location and save your files in that folder. After saving the file, write the
following code in the file:
print('Hi!')
print(5 - 2)
To run this program, either press F5 or click on Run Module in the Run
menu. The output of the program appears in the Python Shell window.
Similarly, we can execute any existing Python program in IDLE; for
example, we can open and execute our file hello.py that we had created
using a text editor.
So now you know how to create a Python program and execute it. You can
either use IDLE to write and run your programs, or you can write your
program in a text editor and then run it on the command line. For beginners,
using IDLE is recommended. If you are using a text editor, Notepad++
would be a better choice than Notepad. You should not use a word processor
like MS Word, which uses formatted text. The text editor should store text in
its pure form.
1.6 IDLE
To write programs effectively, you need to have a good understanding of the
programming environment. Therefore, it is worth spending some time
looking at the features of IDLE, the IDE that you will be using to write your
programs. If you choose to use a different programming environment, make
sure that you familiarize yourself with it before starting to write programs.
IDLE is the abbreviated form for "Integrated Development and Learning
Environment"; Van Rossum probably named it after Eric Idle, who is a
member of 'Monty Python's Flying Circus'. It is a very simple integrated
development environment with features like syntax highlighting, automatic
code indentation, auto completion, call tips, and a basic debugger. It is coded
in Python using the tkinter GUI toolkit, and it is not platform-specific. It
works mostly in the same way on Windows, Unix, and macOS. IDLE
provides you with a simple graphical user interface (GUI) for performing
your programming tasks, so it is easier to use than the command line.
As we have seen, there are two window types in IDLE - Shell Window and
Code Editor Window. The Shell window provides a Read-Eval-Print Loop
(REPL) environment for executing single statements. This window is
interactive; it gives output for your command immediately. When you launch
IDLE, the shell window opens up. Within this shell window, if you select
"New File" or "Open" from the File menu, the code editor window will
appear. In the code editor window, you can write and save a new program, or
you can open an existing program. The editor of IDLE is multi-window, so
you can have multiple code editor windows open at a time. In any of the
open windows, the Windows menu presents a list of currently active
windows, enabling you to switch between them.
When you run the program written in the code editor window, the Shell
window automatically becomes active, and any output or error messages for
the program will be displayed in this window. This means that if you have to
do some editing in the editor window, then you have to activate it either by
clicking on it or by switching to it through the Windows menu. If you want,
you can arrange both the code editor window and the shell window side by
side on your screen and then click on the one that you want to work in.
While the menu options in both windows are mostly similar, each one has
some distinct options. We will briefly discuss some menu options; more
detailed information can be found in the IDLE doc option of the Help menu
of any of the two windows.
The File menu has its regular features like creating, opening, saving a file,
closing the current window and exiting IDLE.
The Edit menu also has its typical options like Undo, Redo, Cut, Copy,
Paste, Select All, Goto line, Find and Replace. You can use Ctrl+space to see
a list of possible completions while typing a word. The Expand Word option
can be used to expand a prefix to match a full word used in the same
window. You can also use Tab for expanding words or for seeing a list of
possible completions. This feature can be used to avoid typing long names,
for instance, if you have defined the name total_marks, you can simply
type "to" and press the tab key to quickly access and reuse the name in your
current window. The Show Call Tip option is used while calling functions,
and Show Surrounding Parentheses highlights the surrounding parentheses.
In the Shell menu, Restart Shell will restart the shell and clean the
environment. All the names that you have defined will be gone. The View
Last Restart will scroll this shell window to the last Shell restart. To access
command line history, you can use the options Previous History or Next
History or press Alt+p and Alt+n, respectively. This way, you can scroll
through previously entered commands.
The Debug menu is for debugging the program, which involves detecting
and removing errors from your program.
In the Options menu, we have the Configure IDLE option, which can be
used to change the settings of IDLE. We can change the preferences for font,
indentation, key shortcuts, startup windows and size, and text color theme.
IDLE uses color coding for highlighting different types of text, for example,
red for comments and orange for keywords. You can change the colors in the
Settings box and save your selections as a theme. IDLE comes with a
built-in set of shortcut keys, and you can also define your own shortcut keys.
You can change the window that opens up when you launch IDLE. By
default, the shell window is open when IDLE is launched; you can change
this default to make the editor window open. You can create new help
sources for IDLE. For example, you can provide a link to an online resource or
any other file on your computer. The help source that you provide will
appear in the help menu. For most of the tasks, the default settings are pretty
good, and most of the time, there is no need to change them.
The Options menu in the editor window has the Show Code Context option,
which is useful in programs that have long functions or classes. If the name
of the function or class has scrolled above the top of the screen, you can
enable this option to see which function or class you are currently in.
The Help menu gives you access to the IDLE and Python documentation
available on the official website. You can use this documentation even when
you are not connected to the internet.
The code editor window includes a Format menu, which can be used to
format a selection in different ways. The Indent Region and Dedent Region
will shift the selected lines right or left. The default indent width is 4 spaces,
and it can be changed. However, changing it is not recommended since this
is the standard. The options Comment Out Region and Uncomment Region
will comment or uncomment the selected text; we will learn about comments
in the next chapter. The Tabify Region will turn leading stretches of spaces
into tabs, and the Untabify Region will turn all tabs into spaces. Toggle Tabs
is there to switch between indenting with spaces and tabs.
The code editor window also includes the Run menu, which can be used to
run your code or check your code for syntax errors. You can also use it to
open or activate the Python Shell window.
There are context menus available, which you can open by right-clicking in
the window. In the Shell window, you have Cut, Copy, Paste, Go to file/line.
In the Editor window, you have Cut, Copy, Paste, Set Breakpoint, and Clear
Breakpoint. The last two options are used while debugging.
1.7 Getting Help
To get help on any Python feature, you can utilize the shell prompt by typing
help followed by an opening parenthesis, a single quote, the keyword or
topic you require help with, another single quote, and finally, a closing
parenthesis. Here are some examples:
>>> help('print')
>>> help('keywords')
>>> help('for')
If you write help('topics'), the available topics will appear, and you
can get help on any of those topics.
The other way to get help is to get into the help mode. To start the help
mode, type help followed by empty parentheses and press Enter.
>>> help()
help>
In the help mode, we can see the help prompt in the window. Now, here, we
can directly type the item on which we are seeking help.
help> print
help> keywords
To see what topics are available, you can type topics and press Enter. The
help mode is suitable when you want to browse the help topics. To quit this
help mode and return to the interactive interpreter, type quit, or you can
just press Enter without typing anything.
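The help function also accepts an object itself instead of a quoted name, for example help(len). If you want the help text as a string rather than on the screen, one possible approach is the pydoc module of the standard library, on which help is built. The following is a minimal sketch; the name help_text is introduced only for the illustration.

```python
import pydoc

# render_doc() returns the help text as a string instead of displaying it.
help_text = pydoc.render_doc(len, renderer=pydoc.plaintext)
print(help_text)

# Many objects also carry a short description in their __doc__ attribute.
print(len.__doc__)
```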
2 Getting Started
Before you start writing programs, it is important to have a strong base in the
fundamentals of Python. This chapter will introduce you to the basic
concepts and building blocks that are used to construct a Python program.
While many of the concepts presented in this chapter will be explored in
more depth later in the book, it is important to familiarize yourself with
certain terms right from the beginning. This chapter will provide a gentle
introduction to these terms, offering you a solid foundation to build upon as
we move to more comprehensive discussions in the following chapters.
In this chapter, you will learn how to name things in Python, what type your
data can be, what operators you can use, how to input and output data from
your program, how to structure your program, and many more things. Even
if you have programmed in any other language before, the subject matter
presented in the chapter will prove to be useful because you will find that
Python operates differently in many aspects. If you dive into coding without
having a solid foundation, you will always find yourself looking back at the
basics. While you may manage to make your programs work to some extent,
you will lack a comprehensive understanding of how they work and the
underlying processes going on. A strong grasp of the fundamentals will serve
as a solid framework for further exploration and growth in your Python
programming journey.
2.1 Identifiers
As you start writing programs, you will create different program elements
like variables, functions, classes, modules, instance objects, etc. To identify
these elements in a program, you will have to give them some names. These
names are called identifiers, as they are used to identify program elements.
There are some rules and conventions for naming identifiers. You have to
follow the rules to prevent any errors and make your program work.
Following the conventions increases the readability of your code and makes
it easier to understand and maintain. Let us first see the rules for naming
identifiers.
The first character should be a letter or an underscore.
The rest of the characters can be any combination of letters (A to Z, a
to z), digits (0 to 9), and underscores. Special characters like @, %, $,
#, & are not allowed.
There is no limit on the length of an identifier.
Identifiers are case-sensitive. For example, marks, Marks and
MARKS are considered different identifiers.
Here are some examples of valid and invalid identifiers:
Valid:    p, part3, min_length, Student
Invalid:  cost$, min-length, 3rd_part, cost price
cost$ is invalid as it contains the illegal character dollar sign, min-length
is invalid as it contains the illegal character dash, 3rd_part is invalid as it
starts with a number, and cost price is invalid as it contains a space.
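These rules can be checked programmatically with the string method isidentifier(). The following sketch tests the example names listed above; the names candidate_names and validity are introduced only for this illustration.

```python
candidate_names = [
    "p", "part3", "min_length", "Student",
    "cost$", "min-length", "3rd_part", "cost price",
]

# isidentifier() checks only the character rules; it does not check
# whether the name is a reserved word of the language.
validity = {name: name.isidentifier() for name in candidate_names}

for name, is_valid in validity.items():
    print(f"{name!r}: {'valid' if is_valid else 'invalid'}")
```

The first four names are reported as valid and the last four as invalid.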
There are some special words that programmers cannot use for naming their
program elements even though they satisfy all these rules. Here is the list of
those words:
False     class     from      or
None      continue  global    pass
True      def       if        raise
and       del       import    return
as        elif      in        try
assert    else      is        while
async     except    lambda    with
await     finally   nonlocal  yield
break     for       not
You cannot use any of these words for naming your program elements. For
example, you cannot have a variable named import or a function named
raise. These names are reserved by the language for specific purposes and
have predefined meanings; they are called the keywords of the language. You
can see the list of keywords in the interactive shell by using help.
>>> help('keywords')
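The same list is available to programs through the keyword module of the standard library. The sketch below also shows, as an illustration, that an attempt to use a keyword as a variable name is rejected with a SyntaxError; the name assignment_rejected is hypothetical.

```python
import keyword

print(keyword.kwlist)               # the list of reserved words
print(keyword.iskeyword("import"))  # prints: True
print(keyword.iskeyword("marks"))   # prints: False

# Using a keyword as a variable name is rejected when the code is compiled,
# before any statement has a chance to run.
try:
    compile("import = 5", "<example>", "exec")
    assignment_rejected = False
except SyntaxError:
    assignment_rejected = True

print(assignment_rejected)  # prints: True
```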
These were the rules that need to be followed while naming identifiers. Now,
let us see some conventions.
It is good to choose meaningful and descriptive names for identifiers. The
name should indicate the purpose; for example, a variable name should
describe the contents of the variable, a function name should indicate what
the function does, and so on. This approach makes your code self-
documenting and, therefore, easy to understand. For example,
shortest_path and spath are both valid identifiers, but the former
makes more sense. Similarly, min_height is better than mheight.
However, there are some exceptions where single-letter or abbreviated
names are fine. For example, names like i, j, k are generally used for loop
indices. When names have to be used in big and complex expressions, longer
names would make the code harder to read, so shorter names are acceptable in
these cases as well.
We have seen that spaces are not allowed in identifiers, so when we need
names with multiple words, we can use underscore as the word separator
(e.g., marks_maths, calculate_tax). For most of the names, all
lowercase letters are used, but for class names, we generally use the
CapWords convention, in which the initial letters of all the words are
capitalized. As we proceed through the chapters and get introduced to
different program elements, we will see the naming conventions for them.
There are some built-in names, like all, any, print, sum, max, etc., that
you should not use as identifiers, although Python will not complain if you
use them. Using these names as your identifiers will overwrite the built-in
names and may cause subtle problems in your program. To view the built-in
names, you can type the following on the prompt:
>>> dir(__builtins__)
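To see the kind of problem that reusing a built-in name can cause, consider the following sketch. The function shadow_demo is a hypothetical name used only for this illustration; inside it, the name max is bound to an integer, so the built-in function max is hidden there.

```python
def shadow_demo():
    max = 5                      # legal, but max now names an int inside this function
    try:
        return max(3, 7)         # fails: an int cannot be called like a function
    except TypeError as exc:
        return str(exc)

message = shadow_demo()
print(message)

# Outside the function the built-in name is untouched.
largest = max(3, 7)
print(largest)                   # 7
```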
When you write your program, you will notice that the editor will highlight
different terms in your program with different colors. For example, IDLE
will highlight the keywords and built-in names in orange and purple colors,
respectively. This feature of text editors is called syntax highlighting. They
can recognize the category of a term and highlight it accordingly. So, in
IDLE, if a word is highlighted in orange, it means that it is a keyword, and
you cannot use it for your identifier name; if you try to do so, you will get an
error. If a word is highlighted in purple, it is a built-in name, and it is better
not to use it for your identifiers.
Another convention is to avoid names that start with single or double leading
underscores. However, a single underscore on its own can be used as an
identifier, and it has special meaning in the interactive mode.
2.2 Python Types
The programs that we write mainly store and process data, and data can be in
different forms; it can be numeric, text, or a list. Data can be categorized,
and each category is called data type or simply type. A data type represents a
domain of values and a set of possible operations that can be performed on
those values. For example, for integer data type, the domain of values
contains all integers, and the set of possible operations are addition,
subtraction, multiplication, etc. The data types that are predefined in Python
are called built-in data types. Python has a variety of data types that you can
use to represent your data. Here are some of them:
int    float    complex    str    bool    list    tuple    set    dict
You can also define your own types by combining these types. We will see
how to do that later when we learn how to define classes. In this chapter, we
will look at some of Python's basic built-in data types. Before looking at
Python types, let us see what the term literal means. A literal is a notation
for a constant value of some built-in type. A literal can be a number or some
text that appears in a program; it is just a value. For example, these are some
literals:
12 35.2 'hello' True
12 is a literal of type int, 35.2 is a literal of type float, 'hello' is a
literal of string type str, and True is a literal of boolean type bool. Now,
let us see the types in detail. The first one that we will see is int type.
These are some examples of int literals:
34 123 1233 6532216
Integer values can be arbitrarily long. There is no limit on the size of
integers in Python. In practice, they are limited by the size of your
computer's memory. If we enter an int literal on the interactive prompt, it
prints the literal back. We can also perform simple arithmetic operations on
integers at the interactive prompt.
>>> 25
25
>>> 3 + 42
45
>>> 6 ** 200
426825223812027400796974891518773732342988745354489429495479078935112929549619739019072139340757097296812815466676129830954465240517595242384015591919845376
In the last example, we are performing an exponentiation operation, which
gives us the value of 6 raised to the power 200. We can have such big
integers in Python; they can be of unlimited size. While performing
arithmetic calculations on integers,
we do not have to think about overflow.
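A short sketch of this behaviour, written as a script rather than at the prompt. The variable names are chosen only for this illustration.

```python
big = 2 ** 64
print(big)                     # 18446744073709551616

huge = 6 ** 200
digit_count = len(str(huge))   # number of decimal digits in 6 ** 200
print(digit_count)             # 156

# Arithmetic stays exact, however large the values become.
difference = (huge + 1) - huge
print(difference)              # 1
```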
By default, the integer literal values are expressed in decimal base (number
system), but they can also be expressed in hexadecimal, octal, or binary
base.
In Hexadecimal form (Base 16) Prefix 0x (or 0X)
In Octal form (Base 8) Prefix 0o (or 0O)
In Binary form (Base 2) Prefix 0b (or 0B)
If you want to express an integer value in hexadecimal, then you have to
prefix the value with zero and letter x; for octal, you have to prefix the value
with zero and letter o; and for binary, you have to prefix the value with zero
and letter b. Here are some examples of int literals in different bases:
0x1abc    0X1ABC    0o1776    0b11000011
The first 2 integers are expressed in hexadecimal, the third integer is in octal
base, and the fourth one is in binary base. If you enter these numbers on the
interactive prompt, it will print them in decimal form, as shown below:
>>> 0x1abc
6844
>>> 0X1ABC
6844
>>> 0o1776
1022
>>> 0b11000011
195
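The conversion also works in the other direction. The sketch below uses the built-in functions hex(), oct(), bin() and int(), which are not introduced in this section; it is meant only as an illustration of how the prefixes relate to decimal values.

```python
n = 0x1abc
print(n == 6844)               # True: the literal is just another way to write 6844

hex_text = hex(6844)           # '0x1abc'
oct_text = oct(1022)           # '0o1776'
bin_text = bin(195)            # '0b11000011'
print(hex_text, oct_text, bin_text)

# int() with an explicit base converts a string of digits written in that base.
from_binary = int("11000011", 2)
print(from_binary)             # 195
```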
Floating-point values are numbers written with a decimal point, an exponent,
or both. The exponent is introduced by a lowercase or uppercase E. Here are
some examples of float type literals:
2.34 5.8 3e5 7.2e42 6.5E-24
When the letter e or E is used, the floating-point value is said to be in
scientific notation. This letter separates the number from the exponent. You
can easily represent very large or very small values using this notation. For
example, the value 6.54e25 denotes 6.54 × 10^25, which is a very big
number, and the value 5.32e-13 denotes 5.32 × 10^-13, which is a
very small number.
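A small sketch showing that a literal in scientific notation is an ordinary float. The variable names are illustrative.

```python
a = 3e5
print(a)                           # 300000.0
print(type(a))                     # <class 'float'>

same_value = 6.5E-24 == 6.5e-24    # the case of the letter does not matter
thousands = 1.5e3                  # 1.5 × 10^3
print(same_value, thousands)       # True 1500.0
```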
In general mathematics, we use commas in large numbers for clarity; for
example, we would write one million as 1,000,000. In Python, commas are
not allowed inside numbers, but you can use underscores to separate the
digits of numeric literal values so that you can write 1 million as
1_000_000. This feature was added in Python 3.6 to enhance the readability of
numeric values. Here are some examples:
45_345_678    0x_234_CAB    0o_231_354    23_456.678_566
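The underscores are purely visual and do not change the value, as this brief sketch suggests (the variable names are illustrative):

```python
million = 1_000_000
print(million)                                   # 1000000

same_hex = 0x_234_CAB == 0x234CAB
same_float = 23_456.678_566 == 23456.678566
print(same_hex, same_float)                      # True True
```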
Python supports a Boolean type bool, which takes a value of either True
or False. These are the only literal values for bool type. The first letter is
capitalized in both True and False, and both of these are keywords. bool
type is generally used in comparisons; we will see it in detail when we learn
about operators.
The complex number type is mostly used in scientific applications. Complex
numbers have a real part and an imaginary part. The imaginary part is
denoted with a suffix of lowercase or uppercase J. Here are some literals of
complex type:
3 + 5j 2 + 4j 3 + 6j
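As a quick illustration, the sketch below builds two complex values and multiplies them. The attributes real and imag used here are not introduced in this section, and the variable names are chosen for this example.

```python
z = 3 + 5j
w = 2 + 4j
print(type(z))              # <class 'complex'>
print(z.real, z.imag)       # 3.0 5.0

# (3 + 5j)(2 + 4j) = 6 + 12j + 10j + 20j², and j² is -1
product = z * w
print(product)              # (-14+22j)
```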
The string type str is the most commonly used type. We will explore
strings in detail in the next chapter. We have already used strings inside the
print function. A string is just a group of characters placed inside a pair of
quotes. In Python, you can enclose a string literal within a pair of single
quotes ('...') or a pair of double quotes ("..."). You can even use triple
quotes, but the most commonly used delimiters are single quotes. Here are
some literals of type str:
'Bareilly'    '430164'    '$129'    'Enter your name: '
Python has a special type NoneType, which can be used to represent no
value or nothing. It has a single literal value None. There is only one None
object, and all references to None refer to that same object. So, whenever
you want to make any null object in your program, you can use None.
You can use the built-in type function to check the type of any value, as
shown below:
>>> type(23)
<class 'int'>
>>> type(True)
<class 'bool'>
>>> type(2.3)
<class 'float'>
>>> type('hi')
<class 'str'>
>>> type("Hello")
<class 'str'>
>>> type(None)
<class 'NoneType'>
The collection types like lists, tuples, dictionaries, sets, and frozensets are
covered in separate chapters in the book. Types are also known as classes in
Python. Later, we will see how to define our own types by using the class
keyword.
2.3 Objects
Everything in Python is implemented as an object. Any data value you write,
like any number or a string, is an object. Program elements like functions,
classes, and modules are also implemented as objects. An object is just a
chunk of memory used to store some data. So, objects are Python's
abstraction for data.
Whenever we write any literal value in our program, Python identifies its
type because of its notation and creates an object of the appropriate type. If it
sees a sequence of digits, it will create an int object; if it sees text inside
quotes, it will create a str object. For example, if we write the literal 56 in
our program, Python recognizes it as an integer literal and creates an object
of type int. Similarly, for the string literal 'Hello', it creates an object of
str type.
Figure 2.1: Objects of type int and str
Python uses objects to hold data values. Every object has a type, a value, and
an identity. For the first object, 15031263572 is the identity or the id, 56
is the value, and the type is int. For the second object, 18043139781 is
the id, 'Hello' is the value, and str is the type. The value of an object is
the data that it contains, and the type of an object determines what kind of
operations can be performed on the value. For example, we can slice str
values but not int values, and we can divide two int values but not str
values.
The identity (id) of an object is an integer that is guaranteed to be unique
among simultaneously existing objects. Each object in our program will
have a different id, which will never change during its lifetime. An object is
stored in memory, and typically, the identity of an object is the memory
address of that object, i.e., the location in the memory where the object is
stored. We can use the built-in id() function to get the identity of an object
in our program.
2.4 Variables and assignment statement
We have seen that Python uses objects to store values. If we want to work
with a value later on in our program, we can associate a name with the object
that contains the value. So, objects contain values, and to access these
objects and manipulate them in our program, we can create names and bind
them to objects. In the following example, the name x is bound to int
object with value 56, and the name p is bound to the str object with value
'Hello'.
Figure 2.2: References to objects
Whenever we write the name x in our program, the value 56 will be used,
and whenever we write p, the string 'Hello' will be used. The names x
and p are variables. In Python, variables are just names that refer (or point)
to objects; the actual data is contained in the objects. So, objects are chunks
of memory that store the actual values, and variables are names that link to
objects by storing the memory address or location of the object. We can
think of variables as object references - they are just names attached to
objects.
Now, let us see how to create a variable in our program and bind it to an
object. For that, we need to write an assignment statement:
>>> x = 56
When Python executes this statement, it creates an object of type int with
value 56. It also creates a variable named x and will make that variable refer
to this object, or we can say that it binds the name x to this object. After the
execution of this statement, whenever x appears in an expression, it will be
substituted with the value of the object that is bound to the name x. The
value 56 is contained within the object, and we can refer to the object by
using the name x. At the prompt, typing just the name of the variable will
display its value. In the program file, we have to use the print function to
print the value of the variable on the output screen:
>>> x
56
>>> print(x)
56
Since x refers to an int object, we can perform all operations on x that
make sense for int type:
>>> x + 5
61
If we send a variable to the type() or id() function, we will get the type
or id of the object that the variable is currently referring to:
>>> type(x)
<class 'int'>
>>> id(x)
15031263572
The following statement will create an object of type str with the value
'Hello', and it will create and bind the variable named p to this object:
>>> p = 'Hello'
Now, whenever we write p, it will give us the value of the object bound to
the name p:
>>> p
'Hello'
>>> print(p)
Hello
Variables x and p will be available in the interactive session until we exit it.
The following assignment statement will create a new variable named z:
>>> z = x
The name z will also refer to the same object to which x is referring:
Figure 2.3: Variables x and z refer to the same object
Now, both x and z refer to the same object, so now, we can access this
object by any of these two names. We have created an alias for x.
The following assignment statement creates one more variable named y, and
it is bound to the object to which the name z is bound.
>>> y = z
Figure 2.4: Variables x, y, and z refer to the same object
Now, all three variables, x, y, and z are bound to the same object, and any
of them can be used to access the underlying object. This is known as object
sharing or aliasing.
>>> x
56
>>> y
56
>>> z
56
When we apply the id function to a variable name, we get the identity of the
object that the variable is referring to. The following output proves that all
three variables, x, y, and z, refer to the same object.
>>> id(x)
15031263572
>>> id(y)
15031263572
>>> id(z)
15031263572
Any variable can be made to refer to another value; that is why it is called a
variable. Let us see what happens when we try to change x by writing the
following assignment statement:
>>> x = 25
A new integer object with value 25 will be created, and the variable x will
refer to this newly created object:
Figure 2.5: Variable x refers to a new object
The first time, when x appeared on the left-hand side of the assignment
statement, the variable name x was created and was bound to an object. The
second time, when x appeared on the left-hand side of the assignment
statement, the name x already existed, so this time, it was rebound to an
object. Now the value of x is 25, and its id has also changed.
>>> x
25
>>> id(x)
15031264562
Can you guess what happens when we write the following assignment
statement?
>>> z = z + 3
First, the expression on the right-hand side is evaluated, the value of z is 56,
56+3 is 59, and the value on the right-hand side is 59, so a new int object
with value 59 is created, and z is now bound to this new object:
Figure 2.6: Variable z refers to a new object
The statement z = z + 3 does not in any way change the object that z
was referring to originally; instead, it rebinds z to another object. The
variable y is still referring to the object with value 56. Now, we write the
following statement:
>>> y = 3.6
A new float object is created, and y is now rebound to this object:
Figure 2.7: Variable y refers to a new object; object 56 is orphaned
Variable names have no types associated with them. They are just names,
and they can refer to any type of object. The variable y was initially
referring to an int; now it is referring to a float. You can make it refer to
any other type of object later on. This is why Python is called a dynamically
typed language. A variable in Python can be associated with any type of
object, and it can later be rebound to any other type of object. To see the type
of the object that a variable is currently referring to, we can use the type
function:
>>> type(y)
<class 'float'>
We can see that the object 56 has been orphaned; there is no variable name
referring to it. Python will notice this, and its garbage collector will
automatically remove it from the memory. The memory that was occupied
by this object can be used for some other purpose.
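The sequence of bindings and rebindings walked through above can be replayed as a single script. This is only a recap sketch; the extra names all_same and y_before are introduced here to record intermediate results.

```python
x = 56
z = x
y = z
all_same = x is y and y is z     # True: three names, one object

x = 25          # x is rebound; y and z still refer to the object 56
z = z + 3       # a new object 59 is created and z is rebound to it
y_before = y    # y has not been touched, so this is still 56
y = 3.6         # y is rebound to a float object

print(x, y_before, z, y)         # 25 56 59 3.6
print(type(y))                   # <class 'float'>
```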
In our examples, we have used single-letter variable names. In real
applications, variables with more meaningful names will be used. While
naming variables, we need to follow the same general rules that we had seen
in naming identifiers. The convention for naming variables is to use
lowercase letters with underscores separating words.
Here are some examples of variable names:
area    marks_in_english    total_marks    simple_interest
If you have programmed in C, C++, or Java, you might be wondering how
we can use a variable without declaring it in advance. These languages are
statically typed; they require you to declare a variable along with its type
information before it can be used in the program, and once you declare a
variable, you can never change its type. For example, if you declare a
variable of type float, it will be a float for the whole duration of the program,
and you can store only float values in it. So, in these languages, variables
have predetermined types, and a variable can be used to hold values of only
that type. Python is a dynamically typed language, so there is no need to
declare the type of a variable. Variables do not have fixed types in Python;
they are generic in nature. Instead, objects have types. A variable is just a
name, and it can refer to any type of object.
In statically typed languages, a variable is considered as a storage box that
can store a value of a specific type. The variable names represent fixed
places in memory, and we need to declare the type because the amount of
space reserved depends upon the type. In Python, a variable is visualized as
a kind of label or tag that can be attached to an object of any type. There is
no need to predeclare variables in Python as they automatically come into
existence when they are initially assigned (assigned first time), and there is
no need to specify a type because variables do not have any type associated
with them. The initial assignment introduces the name of the variable in the
program and binds it to an object, and all other future assignments to the
variable rebind it.
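A small sketch of this point: a name cannot be used before its first assignment, and later assignments may rebind it to an object of another type. The name total_marks and the values are illustrative, and the NameError exception shown here is not discussed in this section.

```python
try:
    print(total_marks)           # total_marks has not been assigned yet
except NameError as exc:
    error_text = str(exc)
    print(error_text)

total_marks = 480                # the first assignment creates the name
first_type = type(total_marks)   # int

total_marks = "four eighty"      # a later assignment rebinds it, here to a str
second_type = type(total_marks)  # str
```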
2.5 Multiple and Pairwise Assignments
We have seen that a variable can be created or rebound using a simple
assignment statement. More than one variable can be created or rebound in a
single assignment statement by using multiple and pairwise assignments.
We can assign multiple variables simultaneously with a common value. For
example, in the following assignment, variables a, b, and c are assigned in a
single line, and all of them refer to the same object.
>>> a = b = c = 10
Figure 2.8: Multiple assignment makes variables refer to the same object
If any of these variable names do not exist before this assignment statement,
they will be created. Variables that already exist will be reassigned.
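The following sketch illustrates that the three names share one object after a multiple assignment, and that rebinding one of them does not affect the others. The name same_object is introduced only for this example.

```python
a = b = c = 10
print(a, b, c)                     # 10 10 10

same_object = a is b and b is c
print(same_object)                 # True

b = 20                             # only b is rebound; a and c still refer to 10
print(a, b, c)                     # 10 20 10
```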
Pairwise assignment can be done by using commas:
>>> x, y, z = 1, 2.5, 3
Figure 2.9: Pairwise assignment
The values 1, 2.5, and 3 are assigned to variables x, y, and z respectively.
As usual, if this assignment is the first for any of these variables, then it will
be created, and if the variable name already exists, then it will be reassigned.
For pairwise assignment, the number of variables on the left side should be
equal to the number of values on the right side.
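A brief sketch of pairwise assignment, including what happens when the counts do not match. The swap shown here is a common use of this form; the names p, q and mismatch are chosen for this illustration.

```python
x, y, z = 1, 2.5, 3
print(x, y, z)                  # 1 2.5 3

# The right-hand side is evaluated first, so this swaps the two values.
x, y = y, x
print(x, y)                     # 2.5 1

try:
    p, q = 1, 2, 3              # two names, three values
except ValueError as exc:
    mismatch = str(exc)
    print(mismatch)
```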
2.6 Deleting a name
The del statement can be used to delete a variable name. It consists of the
del keyword followed by the name that has to be deleted.
del name
Suppose we have three variables, x, y, and z, referring to the same object.
Figure 2.10: Variables x, y, and z refer to the same object
To delete the variable name x, we can write the following statement:
>>> del x
This statement will unbind the name x from the object and will also delete
the name x. It will not delete the object referred to by x, which means that
it will not free the memory occupied by the object. An object will be
automatically garbage collected by Python only if there is no other name
referring to it; you can never explicitly destroy an object.
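The effect of del can be sketched as follows: the name disappears, but the object stays reachable through the remaining names. The flag x_exists is introduced only for this illustration.

```python
x = 56
y = x
z = x

del x                     # removes the name x, not the object 56

try:
    print(x)              # the name no longer exists
except NameError:
    x_exists = False
else:
    x_exists = True

print(x_exists)           # False
print(y, z)               # 56 56
```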
While the program is running, objects are created as needed and are
reclaimed automatically when they become unreachable. This automatic
reclamation of the space occupied by an unreachable object is called garbage
collection. It is done to free up space so that it can be used for other objects
that may be created later on in the program. This memory management is
automatically done by Python; programmers do not have to bother about
freeing up space that is no longer in use. There is no need to manage
memory manually by writing allocation and deallocation code that is
required in other languages like C and C++. In these languages, the
programmer is responsible for allocating and deallocating memory. This is
error-prone and can cause memory leaks if not done properly. Automatic
garbage collection in Python reduces efforts and minimizes the chances of
memory management problems.
The process of garbage collection depends on the Python implementation;
typically, a reference counting mechanism is used. In each object, a
reference counter is stored, which keeps track of the number of references
referring to the object. When this number drops to 0, the object is
automatically reclaimed, which means that the memory allocated for the
object is freed. Programmers do not need to worry about how the garbage
collector works; the whole process is hidden and automatic.
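On CPython, the standard implementation, the reference count can be observed with sys.getrefcount(). The sketch below assumes CPython; other implementations may not use reference counting at all. The reported number includes a temporary reference made by the call itself, so only the differences between the counts are meaningful.

```python
import sys

data = object()                          # a fresh object with one name bound to it
count_one_name = sys.getrefcount(data)

alias = data                             # a second name for the same object
count_two_names = sys.getrefcount(data)

del alias                                # remove the second name again
count_after_del = sys.getrefcount(data)

print(count_two_names - count_one_name)      # 1 on CPython
print(count_after_del == count_one_name)     # True on CPython
```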
Continuing our example, suppose we delete the name y and reassign the
name z:
>>> del y
>>> z = 10
Now, there is no variable name referring to object 25, so it will be garbage
collected.
We can use the del statement to delete more than one name by using
commas.
>>> del a, b, c
The del statement is used very rarely; variable names have a lifetime, and
they are deleted automatically when their lifetime is over. We will discuss
this later in the book.
2.7 Naming convention for constants
In some languages, we can define names which cannot be reassigned. Once
they are given a value, they cannot be changed throughout the execution of
the program. They are called constants. In Python, there is no concept of
constants; there is no way to define names that cannot be reset to a different
value. All names in Python can be reassigned at any time. However, there is
a widely used naming convention to indicate that you do not want a name to
be reassigned. The convention is to use all capital letters with underscores
separating words. Here are some examples:
PI = 3.14159
MAXIMUM_SIZE = 100
RATE_OF_INTEREST = 5
Use all lowercase letters in the names of variables whose values might
change, and use all uppercase letters for names that should never be
reassigned values. But remember that this is just a convention and not a
restriction; these names with all uppercase letters can be reassigned, and the
interpreter will not complain. By using all caps, you are not instructing the
interpreter that it is a constant; you are telling the programmer that it should
be treated as a constant and should not be changed.
So, why do we need such names that do not change? We could just use the
literal value 3.14159 instead of the name PI or 100 instead of
MAXIMUM_SIZE. The reason is that they can help in documenting the
program. When you need to use these literals in many places, it is better to
give them a name. The number 100 does not give any real information,
while the name MAXIMUM_SIZE is clear and understandable. Using these
descriptive labels is better than literal numbers scattered throughout your
program.
Another reason is that they are good for code maintenance. Suppose that
after some time, you decide to change the maximum size from 100 to 150;
then, you will need to change it at only one place where you have defined
this name. If you use the number 100 instead of this name, then you will
have to find every single place where this number 100 is used as the
maximum size and change it to 150, which is time-consuming and
definitely error-prone.
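A minimal sketch of both points: the named value is used in several places, and Python does not stop it from being reassigned. The names current_size, space_left and is_full are illustrative; the final reassignment is shown only to demonstrate that the convention is not enforced.

```python
MAXIMUM_SIZE = 100               # by convention, treated as a constant

current_size = 80
space_left = MAXIMUM_SIZE - current_size
is_full = current_size >= MAXIMUM_SIZE
print(space_left, is_full)       # 20 False

# The interpreter does not enforce the convention: this runs without any error.
MAXIMUM_SIZE = 150
space_left = MAXIMUM_SIZE - current_size
print(space_left)                # 70
```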
2.8 Operators
An operator is a symbol or a word that specifies an operation to be
performed. Here are some examples of operators in Python:
+ ** is // >> == and <=
An operator works on operands and yields a value. An operand is a data item
on which an operator acts; it can be a literal value or a variable. Here are
some examples of operands:
24 5.8 marks x
Python includes a large number of operators that fall under several different
categories, depending on the type of task that they perform.
Figure 2.11: Operators
If an operator operates only on one operand, it is a unary operator, and if it
operates on two operands, it is a binary operator. For example, the negation
operator (-) is unary, while the addition (+) and less than (<) operators are
binary. Most of the operators are binary.
2.8.1 Arithmetic operators
Arithmetic operators perform arithmetic operations like addition or
subtraction, and relational operators perform comparisons. Most of the
operators do what you would expect them to do, but some of them require
explanation. So, we will briefly discuss these categories one by one. First, let
us discuss arithmetic operators:
Figure 2.12: Arithmetic Operators
The - sign is used for both the negation operator and the subtraction
operator. To specify a number as negative, we put the negation operator in
front of it. The addition operator + adds its operands, and the * operator
multiplies its operands. There are two division operators: true division (/)
and floor division (//). Both these operators divide the left operand by the
right operand; the true division operator returns the result as a float
value, while the floor division operator returns the floor value of the
result, which is an int when both operands are int. The floor value is
calculated by rounding toward minus infinity (rounding down); for example,
the value of 15//2 is 7, and the value of -15//2 is -8.
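The difference between the two division operators, especially for negative operands, can be seen in this short sketch (the variable names are illustrative):

```python
true_pos = 15 / 2        # 7.5
floor_pos = 15 // 2      # 7
true_neg = -15 / 2       # -7.5
floor_neg = -15 // 2     # -8, because the result is rounded down, not toward zero

print(true_pos, floor_pos)
print(true_neg, floor_neg)
```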
The modulo operator (%) returns the remainder when the left operand is
divided by the right operand. The result has the same sign as its second
operand. This operator can be used to check whether a number is divisible
by another number. For example, if x % y is zero, it means that x is
divisible by y. It can also be used to extract digits from the right of a
number; for example, if x is an integer, x % 10 will give the rightmost
digit, x % 100 will give the last two digits from the right, and x % 1000
will give the last three digits.
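These uses of the modulo operator can be sketched as follows. The value 4725 and the variable names are chosen only for this illustration.

```python
x = 4725
last_digit = x % 10           # 5
last_two = x % 100            # 25
last_three = x % 1000         # 725
divisible_by_5 = x % 5 == 0   # True
print(last_digit, last_two, last_three, divisible_by_5)

# The result takes the sign of the right operand.
r1 = -7 % 3                   # 2
r2 = 7 % -3                   # -2
print(r1, r2)
```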
The operator with two asterisks (**) is the exponentiation operator; float
values can be used both in the base and the exponent.
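For addition, subtraction, multiplication, modulo, floor division, and
exponentiation operators, if both operands are int, the result will be an
int. The one exception is exponentiation with a negative int exponent,
which gives a float (for example, 2 ** -1 is 0.5). If one of the operands
is a float, then the result will be a float.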
For the true division operator (/), the result will always be a float. The
following table will help you understand the difference between the true
division operator and the floor division operator:
Figure 2.13: Division operators
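The result types can be checked with the type function. The sketch below is illustrative and the variable names are chosen for this example; it also includes an int raised to a negative int power, a case in which Python returns a float.

```python
int_sum = 3 + 4            # both operands int, result int
mixed_sum = 3 + 4.0        # one float operand, result float
true_div = 6 / 3           # always float: 2.0
floor_div = 6 // 3         # both operands int, result int: 2
float_floor = 7.0 // 2     # one float operand, result float: 3.0
neg_power = 2 ** -1        # negative exponent, result float: 0.5

for value in (int_sum, mixed_sum, true_div, floor_div, float_floor, neg_power):
    print(value, type(value))
```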
We can see the result of an operation by typing it in the interactive terminal.
Adding space around operators makes the operations more readable in the
code:
>>> 1.2 + 4
5.2
>>> 3 ** 2
9
>>> 16 ** 0.5
4.0
>>> 17 / 5
3.4
>>> 17 // 5
3
>>> 17 % 5
2
When a variable is used with an operator, its value is used, and then the
operation is performed.
>>> x = 4
>>> y = 5
>>> x + 5
9
>>> x ** y
1024
>>> x // y
0
2.8.2 Relational operators
Relational operators, also called comparison operators, compare their
operands and return either True or False.
Figure 2.14: Relational operators
The operator == returns True if its operands are equal. This operator has two
equal signs; when only one equal sign is used, an assignment is performed,
as we have seen in previous sections. A common beginner's mistake is to use
= instead of == for comparison since, in school maths, we use = for equality.
In Python, whenever you need to perform a comparison, use two equal signs,
and when you want an assignment, use one equal sign.
The operator != returns True if its operands are not equal. We also have less
than, greater than, less than or equal to, and greater than or equal to
operators. All the relational operators have the expected meaning for
numeric types int and float, and for strings, they are defined
lexicographically and case-sensitively.
>>> x = 3
>>> y = 4
>>> x < y
True
>>> x == y
False
>>> x != y
True
>>> x >= y
False
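For strings, the comparison proceeds character by character, and uppercase letters are ordered before lowercase letters. A short sketch, with strings and variable names chosen only for illustration:

```python
alphabetical = "apple" < "banana"    # True: 'a' comes before 'b'
same_word = "cat" == "Cat"           # False: comparison is case-sensitive
upper_first = "Zebra" < "apple"      # True: 'Z' is ordered before 'a'
third_char = "abc" < "abd"           # True: decided at the third character

print(alphabetical, same_word, upper_first, third_char)
```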
2.8.3 Logical operators
There are three logical operators that can be used to combine Boolean
values. These operators can be applied to operands that have values of True
or False or to operands that can be converted to these values.
The result of and operator will be True only when both its operands are
True; otherwise, it will be False. The result of or operator will be False only
when both its operands are False; otherwise, it will be True. The not
operator will negate the value of its operand. If the operand is True, the
result will be False, and if the operand is False, the result will be True.
Figure 2.15: Logical operators
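The behaviour of and and or for every combination of Boolean operands can be printed with a small script. This sketch uses a for loop, which has not been covered yet, and the list name rows is chosen for this example.

```python
rows = []
for a in (True, False):
    for b in (True, False):
        # Record one row of the truth table: a, b, a and b, a or b
        rows.append((a, b, a and b, a or b))
        print(a, b, a and b, a or b)

negations = (not True, not False)
print(negations)                     # (False, True)
```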
Since relational operators return Boolean values, we can use relational
operations (like a < b) as operands of logical operators. This way, we can
make multiple comparisons by combining different conditions.
>>> x = 3
>>> y = 4
>>> x > 0 and x < 6
True
>>> x == 3 and y < 6
True
>>> x > 10 and y < 6
False
>>> x > 10 or y < 6
True
Python allows chaining of comparison operators. So, we can write
expressions like these:
>>> 1 < x < 8
True
The expression 1 < x < 8 will be True if 1 < x is True, and x < 8 is
also True. The expression implies logical AND; it is equivalent to writing
1 < x and x < 8, except that in the chained form the middle operand is
evaluated only once. The chained form is also more readable.
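The sketch below contrasts a chained comparison with a parenthesised one, which is a different expression. The value 10 and the variable names are chosen for illustration.

```python
x = 10

chained = 1 < x < 8              # False: 10 is not less than 8
spelled_out = 1 < x and x < 8    # False: same meaning as the chained form

# With parentheses, the first comparison is evaluated on its own. Its result,
# True, is then compared with 8, and True behaves like the integer 1.
grouped = (1 < x) < 8            # True

print(chained, spelled_out, grouped)
```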
2.8.4 Identity operators
Sometimes, we may want to know whether two variables refer to the same
object. Instead of applying the id function on both variables and then
comparing the results, we can use the two identity operators represented by
the words is and is not.
Figure 2.16: Identity operators
Suppose we have three variables: x, y, and z. The variables x and y refer to
the same object, while the variable z refers to a different object.
Figure 2.17: x and y refer to the same object, and z refers to a different object with the same value
x is y will return True because x and y both have the same identity and
refer to the same object. x is z will return False because x and z have
different identities and refer to different objects, although their values are the
same. It is important to understand the difference between equality and
identity. The relational operators == and != test for equality, and the
operators is and is not test for identity. The equality operator will return
True for both x==y and x==z as it only tests for equality of values. Here
are some examples on the prompt:
>>> a = 123456789
>>> b = 123456789
>>> c = a
>>> a is b
False
>>> a is c
True
>>> id(a)
2293201428272
>>> id(b)
2293201428240
>>> id(c)
2293201428272
>>> a is not b
True
>>> a == b
True
The is operator is commonly used to compare a variable with None,
which is the null object of Python.
>>> a = None
>>> a is None
True
Here are a few more examples:
>>> c = 2
>>> d = 2
>>> c is d
True
>>> e = 1.5
>>> f = 1.5
>>> e is f
False
>>> g = 'cat'
>>> h = 'cat'
>>> g is h
True
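We get different results here because, for small integers and short strings,
the standard Python implementation (CPython) performs an optimization and
reuses cached objects instead of creating new ones. For big integer literals
and floats, it will usually make separate objects. This caching is an
implementation detail, so use == rather than is when you want to compare
values.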
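A sketch of the difference between equality and identity that does not depend on any caching. It uses lists, which are covered in a later chapter, because two separately written lists are always distinct objects; the variable names are illustrative.

```python
first = [1, 2, 3]
second = [1, 2, 3]      # a separate object with the same contents
third = first           # another name for the first object

equal_values = first == second    # True: the values are equal
same_object = first is second     # False: two distinct objects
alias_same = first is third       # True: both names refer to one object
print(equal_values, same_object, alias_same)

result = None
print(result is None)             # True
```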
2.8.5 Membership operators
There are two membership operators named in and not in.
Figure 2.18: Membership operators
These operators look for the left operand in the collection represented by the
right operand and return True or False accordingly. We will see their use
when we learn about collection types in Python. There is also a ternary
operator in Python, which we will discuss later.
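As a small preview, here is a sketch of the membership operators applied to a string and a list. The variable names vowels and primes are illustrative only.

```python
vowels = 'aeiou'
primes = [2, 3, 5, 7]

print('e' in vowels)        # True: 'e' occurs in the string
print('z' not in vowels)    # True: 'z' does not occur in the string
print(4 in primes)          # False: 4 is not an item of the list
print(5 in primes)          # True: 5 is an item of the list
```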
2.8.6 Bitwise operators
Bitwise operators operate on individual bits in the binary representations of
their integer operands.
Figure 2.19: Bitwise operators
We will not discuss these operators in detail; at this point, it is just sufficient
to know that these low-level operations are supported in Python.
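For readers who are curious, the following short sketch shows a few of these operators at work. It assumes two things not covered so far: integer literals written in binary with the 0b prefix, and the built-in bin function, which returns the binary representation of an integer as a string.

```python
a = 0b1100    # 12 in decimal
b = 0b1010    # 10 in decimal

print(bin(a & b))    # 0b1000  (bitwise AND)
print(bin(a | b))    # 0b1110  (bitwise OR)
print(bin(a ^ b))    # 0b110   (bitwise XOR)
print(a << 1)        # 24      (shift left by one bit)
print(a >> 2)        # 3       (shift right by two bits)
```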
2.9 Augmented assignment statements
It is common to perform some mathematical binary operation on a variable
and then assign the result back to the variable. Here are some examples:
count = count + 1
salary = salary - 1000
marks_in_maths = marks_in_maths + grace_marks
price_pencil = price_pencil // 2
In the first statement, 1 is added to the variable count, and then the new
value is assigned back to the variable count. Similarly, in all the other
statements, we are performing some operations on the variable and assigning
the result back to the variable. Python supports augmented assignment
statements, which provide a shortcut for these types of expressions.
count += 1
salary -= 1000
marks_in_maths += grace_marks
price_pencil //= 2
Figure 2.20: Augmented assignment statements
Augmented assignment syntax is available for all binary arithmetic
operations.
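The following sketch applies several augmented assignments one after another to a single variable; the name total and the numbers are chosen only for illustration. Each comment shows the equivalent long form and the value after the statement.

```python
total = 10
total *= 3      # same as total = total * 3   -> 30
total -= 4      # same as total = total - 4   -> 26
total //= 5     # same as total = total // 5  -> 5
total **= 2     # same as total = total ** 2  -> 25
total %= 7      # same as total = total % 7   -> 4
print(total)    # 4
```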
2.10 Expressions
An expression is a combination of variables, literals, and operators, and it
always evaluates to a single value, which is again represented by an object.
Here are some examples of expressions:
45 + 6
20.56 - 3 * 6
marks + 50
2 + 4 * 3
(y+1) * (x-3)
a <= b
35
marks
A single literal or a variable by itself is also considered an expression that
evaluates to itself; for example, the integer literal 35 is an expression, and
the variable marks is also an expression. Parentheses can be used in
expressions for enclosing some operations. We have already seen that if we
type an expression on the interactive prompt, the result of the expression is
displayed. In the program, simply writing the expression will not do
anything. We have to use the value of the expression in some way.
Evaluation of an expression generally results in the creation of a new object
so that it can be used on the right side of an assignment statement.
z = x + y * 3
Here, first, the expression x + y * 3 will be evaluated, and a new object
will be created for the result. This object will be assigned to the variable z.
So, if you want to preserve the value produced by an expression, you can
assign it to a variable. Otherwise, the value will just vanish.
2.11 Order of operations: Operator Precedence and Associativity
When there is only one operator in an expression, it is evaluated without any
ambiguity. For example, there is no confusion in evaluating the expression
45 + 6. However, when more than one operator appears in an expression,
then you need to determine which operator will be evaluated first. For
example, consider the expression 2 + 4 * 3. There are two ways in
which this expression can be evaluated. If an addition is done first, then the
value of this expression will be 6 * 3, which is 18, and if multiplication is
done first, then the value will be 2 + 12, which is 14. According to
mathematics rules, multiplication would be done first, and 14 would be the
correct value.
In Python also, there are some rules that are followed while evaluating
expressions with multiple operators. Let us see those rules. The order of
evaluation depends on the precedence of an operator. The following table
shows the operator precedence for some common operators in Python, from
lowest to highest precedence.
Figure 2.21: Operator Precedence and Associativity
Operators in the same box have the same precedence. For example, the
operators *, /, //, % have the same precedence. To get the complete table
on your interactive prompt, you can type the following:
>>> help('PRECEDENCE')
Whenever an expression contains more than one operator, the operator with
a higher precedence is evaluated first. For example, in the expression 2 +
4 * 3, multiplication will be performed before addition because
multiplication has higher precedence than addition. In the expression x + y
< 10, the addition will be performed first and then the comparison, because
the addition operator (+) has a higher precedence than the less-than (<)
operator.
In the expression 36 / 2 * 3, division and multiplication are in the same
group, so they have the same precedence. If division is performed first, then
the value will be 54, and if multiplication is performed first, then the value
will be 6. In the expression 19 - 12 - 4 - 2, we have three subtraction
operators, which obviously have the same precedence. If we evaluate from
left to right, then the value is 19-12=7, 7-4=3, and then 3-2=1. If we
evaluate from right to left, we have 4-2=2, 12-2=10, and then 19-10=9.
So, for expressions that have operators with the same precedence, the
evaluation order is still a problem. To solve these types of problems, an
associativity is assigned to each group. Associativity defines the order of
evaluation for operators that have the same precedence.
In the precedence table, we can see that all the operators associate from left
to right except for the exponentiation operator, which associates from
right to left. So, in the expression 36 / 2 * 3, the interpreter will first
perform division and then multiplication. The expression 19 - 12 - 4 - 2
will also be evaluated from left to right, and the answer will be 1.
In the expression 2 ** 3 ** 2, we have the exponentiation operator,
which associates from right to left, therefore, firstly, 3 ** 2 will be
evaluated, which is 9, and then 2 ** 9, which is 512.
So, these were the precedence and associativity rules in Python. If you want
to override these rules and change the default evaluation order, you can use
parentheses. The operations that are enclosed within parentheses are
performed first. For example, in the expression 2 + 4 * 3, if you want to
perform addition first, you can enclose it inside parentheses. The value of the
expression (2 + 4) * 3 is 18 because addition is performed before
multiplication.
For evaluation of the expression inside parentheses, the same precedence
and associativity rules apply. For example, in the expression 39 / (5 +
2 * 4), inside the parentheses, multiplication will be performed before
addition.
You can use nested parentheses in expressions, which means a pair of
parentheses can be enclosed within another pair of parentheses. In these
cases, expressions within the innermost parentheses are always evaluated
first, and then next to innermost parentheses, and so on, till the outermost
parentheses. After evaluating all expressions within parentheses, the
remaining expression is evaluated as usual. For example, in the expression
5 * ((10 - 2) / 4), 10 - 2 is evaluated first, then 8 / 4, and
then 5 * 2.
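The expressions discussed in this section can be checked directly in a script or on the prompt. The sketch below simply prints each of them; nothing beyond the operators already introduced is assumed.

```python
print(2 + 4 * 3)             # 14: multiplication before addition
print((2 + 4) * 3)           # 18: parentheses override precedence
print(36 / 2 * 3)            # 54.0: same precedence, left to right
print(19 - 12 - 4 - 2)       # 1: left to right
print(2 ** 3 ** 2)           # 512: exponentiation is right to left
print((2 ** 3) ** 2)         # 64: parentheses force left to right
print(5 * ((10 - 2) / 4))    # 10.0: innermost parentheses first
```

Note that the / operator always produces a float, which is why 54.0 and 10.0 are displayed rather than 54 and 10.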
You can use appropriate spacing to show the evaluation order explicitly.
PEP8 suggests adding whitespace around operators with the lowest priority.
In the following expressions, the order of operations performed is clearer
due to spacing.
x + y**2 - a/b
a+b < c+d
This approach makes the code more readable. Anyone reading the code does
not need to refer to the precedence table to figure out which operation will
be performed first.
2.12 Type Conversion
You can combine different types of values in an expression. For example, 2
* 3.5 is a mixed-type expression. The two operands are of different types:
int and float. Similarly, 1.5 < x < 8 and 9 + '5' are also mixed
type expressions. Before the evaluation of such expressions, the operands
have to be converted to a common type. There can be other situations also
where you will want to convert from one type to another. For example, you
might have some numeric data in string form, and you want to convert it to
int or float so that arithmetic calculations can be performed on that data.
The process of converting a value of one type to another type is called type
conversion. There are two kinds of type conversions in Python:
Implicit type conversion (Coercion)
Explicit type conversion (Casting)
Implicit type conversion is done automatically by the interpreter when
evaluating expressions of mixed types. For example, in the expression 2 *
3.5, the interpreter will convert integer 2 to the floating point equivalent
2.0, and then both float operands will be multiplied, and the result will
be a float. The interpreter always promotes the smaller type to the larger
type to avoid any loss of data. It then performs the operation in larger type
and returns the result in larger type. The type int is considered "smaller"
than float, and float is considered "smaller" than complex. The
implicit conversion is done only in related types; it is not performed between
unrelated types like, for example, int and str.
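A quick way to observe implicit conversion is to check the type of the result with the type function. The following is a minimal sketch; the variable names product and total are illustrative.

```python
product = 2 * 3.5                # int * float
print(product, type(product))    # 7.0 <class 'float'>

total = 3 + 1.5 + (2 + 1j)       # int + float + complex
print(total, type(total))        # (6.5+1j) <class 'complex'>
```

In each case, the result has the "larger" of the operand types.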
If we try to add a string and an int, for example, '2' + 5, Python will
not perform any conversion automatically. In this case, the programmer has
to request a conversion explicitly. Explicit type conversion is performed by
writing the required type name followed by the value to be converted inside
parentheses. For example, int('2') will convert the str value '2' to
int value 2, and float(28) will convert the int value 28 to float
value 28.0. Here, int() and float() are type conversion functions.
They will try to convert a value to their respective types. For example, the
int() function will take any value and try to convert it to an integer, if
possible.
>>> int(12.3)
12
>>> int('100')
100
>>> int(True)
1
>>> int(False)
0
>>> int('two')
ValueError: invalid literal for int() with base 10: 'two'
When we convert a float to an int, the fractional part is truncated. When
Boolean values True and False are converted to int, we get 1 and 0
because True is equivalent to integer 1 and False is equivalent to integer
0. When we tried to convert the string value 'two' to an int, we got an
error because the int() function cannot convert a string to an integer if the
string does not represent a valid integer value.
The int() function can convert a string to an integer if the string
represents a number in hexadecimal or binary base. In this case, we have to
inform the int() function about the base. In the following examples, we
are converting strings that contain hexadecimal and binary values to integer
values, which are displayed in a decimal base.
>>> int('FF', 16)
255
>>> int('1010', 2)
10
We can use the str() function to convert a value to str type and
float() function to convert a value to float type.
>>> str(100)
'100'
>>> str(3.6)
'3.6'
>>> float('3.45')
3.45
>>> float(3)
3.0
If the string that you send to the float() function is something that cannot
be converted to a valid float, then Python will raise an error.
We know that the type of an object cannot be changed, so whenever there is
a type conversion, whether implicit or explicit, Python creates a new object
for the converted value.
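The sketch below illustrates that a conversion leaves the original object untouched and produces a new object of the requested type. The names n, f, s, and whole are illustrative. The last line relies on the fact that int('12.3') would raise a ValueError, because the string does not represent a valid integer; converting to float first and then to int is one way around this.

```python
n = 28
f = float(n)    # a new float object is created; n still refers to 28
s = str(n)      # a new str object is created

print(n, type(n))    # 28 <class 'int'>
print(f, type(f))    # 28.0 <class 'float'>
print(s, type(s))    # 28 <class 'str'>

whole = int(float('12.3'))    # str -> float -> int
print(whole)                  # 12
```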
2.13 Statements
A program is a sequence of statements, and a statement is an instruction that
the Python interpreter can execute. Statements can be simple or compound.
Statements like a = 5, x *= 10, y = a + b are simple statements.
Compound statements (e.g. if, while, for) are a group of statements that are
treated as a single statement. They generally consist of a header line ending
in a colon and an indented block that contains other statements. We will
learn about compound statements in the coming chapters.
Simple statements in Python generally end with a newline. Unlike other
languages like C++ or Java, there is no need to place a semicolon (;) to end a
statement. In Python, the end of the line means the end of the statement. So,
Python uses newline as the statement terminator. However, there are two
exceptions to this rule. If there is a backslash at the end of the line, then the
statement continues on the next line. For example, the following statement
continues on the next line because of the backslash character:
total_marks = marks_science + marks_maths \
+ marks_english + marks_socials \
+ grace_marks
So, if you have to write a statement that is too long to fit on a single line, you
can spread it on multiple lines by using backslash (\) as the continuation
character. This character at the end of the line indicates that the next line is a
continuation. This way, you can join multiple adjacent lines to construct a
single statement. This is called explicit line joining or explicit continuation.
Another situation when a statement does not end with a newline is when an
opening delimiter like parentheses, square brackets, or curly braces has not
been closed yet. In this case, Python automatically continues the statement
on the next adjacent line. This is called implicit line joining or implicit
continuation.
months = [
'January', 'February', 'March', 'April',
'May', 'June', 'July', 'August',
'September', 'October', 'November',
'December'
]
if (is_leap==True and month=='MARCH'
        and weekday=='SUNDAY'):
    pass    # body of the if statement goes here
student = {
'name': 'John',
'gender': 'M',
'city': 'Paris',
'age': 21,
}
Thus, any expression that is inside parentheses (), square brackets [], or curly
braces { } can be split over more than one line without using backslashes.
An exception to this is when there is an unterminated string literal enclosed
in single or double quotes.
print('Age should be less than 80
and greater than 18')
Here, the implicit line joining will not work.
You can take advantage of the implicit continuation to write more readable
code. Instead of inserting backslashes to continue the statement, it is
recommended to enclose your expression in parentheses to increase
readability.
total_marks = ( marks_science + marks_maths
+ marks_english + marks_socials
+ grace_marks )
You can place multiple statements on a single line by separating the
statements with a semicolon. For example, the following line of code
actually consists of four statements.
a = 10; x = 5; y = a + x; z = a - y;
However, this style is not recommended. Writing a single statement on each
line is preferred as it makes the code more readable and easier to understand.
2.14 Printing Output
Most computer programs interact with the user; they take some input from
the user at run time and display some sort of output on the screen. In Python,
we use the input function to get input from the console and print
function to display the output on the console. In this section, we will discuss
the print function, and in the next section, we will discuss the input
function.
We have already seen how to display information on the screen by using the
print function. We have used it to print a literal, value of a variable or any
expression. Write the following statements in a .py file, execute it, and
observe the output.
print('Let us start programming')
print(5 + 3*6)
name = 'Devank'
age = 10
print(name)
print(age)
Output-
Let us start programming
23
Devank
10
The first statement prints a string literal, the second one prints the value of
an expression, and the last two print the values of variables.
We can display multiple items in a single print call by separating the items
with commas. Here are some examples:
name = 'Devank'
age = 10
print(name, age)
print('Age =', age)
print('Five times six is', 5 * 6)
print('My name is', name, 'I will be an adult after',
      18 - age, 'years')
Output-
Devank 10
Age = 10
Five times six is 30
My name is Devank I will be an adult after 8 years
The first statement with the print function prints two variables, the second
one prints a string and a variable, the third one prints a string and an
expression, and the fourth one prints a combination of strings, a variable,
and an expression, all separated by commas. Note that only string literals are
enclosed in quotes, while other items are written without quotes.
When we use the print function to display multiple items, all the items are
separated by a single space in the output. If we want to change this default
behavior and want the items to be separated by something else, then we can
specify a separator by adding a sep parameter at the end of the print call.
We will learn about the term parameter when we discuss functions. At this
point, you just need to know that you can write sep= followed by a string
literal that you want to be used as the separator.
day = 9
month = 11
year = 1977
print(day, month, year, sep='/')
print(day, month, year, sep='-')
print(day, month, year, sep='::')
print(day, month, year, sep='')
Output-
9/11/1977
9-11-1977
9::11::1977
9111977
In the first print call, we have specified '/' for the sep parameter, so
each value in the output is separated by a '/'. Similarly, for the second
print call, each value in the output is separated by a dash, and in the third
print call, it is separated by two colons. If we do not want anything to be
printed between the values, we can specify an empty string for the sep
parameter, as we have done in the fourth print call. In this case, nothing is
printed between the values, and so all the values are just joined together in
the output.
From the output of our programs, we can see that every print call ends
with a newline. This means that after printing everything, the cursor
automatically moves to the next line. Thus, the output of the next print
call starts with a fresh line. If we want the print call to end with something
else instead of a newline, we can specify the end parameter. For example,
end='?' will end the line with a question mark.
print('Hello world', end='---')
print('Python is easy', end=' ')
print('Python is interesting!', end='')
print('Programming is fun')
print('Good Bye')
Output-
Hello world---Python is easy Python is interesting!Programming is fun
Good Bye
In the first print call, we have specified '---' for the end parameter,
so the output of this call ends with '---' instead of a newline. Similarly,
the output of the second print ends with a space because of the end
parameter. The third print has an empty string as the end parameter, so it
prints nothing at the end, and the last two print calls end with the default
newline. If required, we can write both the sep and end parameters in a
single print call to specify our own custom separator and custom line ending.
This gives us more control over the format of our output.
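Here is a small sketch that combines both parameters, reusing the day, month, and year values from the earlier example. The particular separator and line ending are arbitrary choices.

```python
day = 9
month = 11
year = 1977

print('Date', end=': ')                        # stay on the same line
print(day, month, year, sep='-', end='.\n')    # custom separator and ending
print('Done')
```

When this code runs, the first two print calls together produce the line Date: 9-11-1977. and the third prints Done on the next line.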
You can write a print call with empty parentheses to insert an empty line
in the output. For example, when the following code is executed, two empty
lines will be printed between the two lines of text.
print('Let us start programming')
print()
print()
print('Python is interesting')
Output-
Let us start programming


Python is interesting
There are other ways of formatting the output, which we will learn in the
next chapter.
2.15 Getting user input
A program that does not take any input from the user will essentially
perform the same computations and will produce the same output every time
it is executed. Most of the time, we have to write programs that interact with
the user and behave differently depending on the user's response. To write
such interactive programs, we should know how to get input from the user
and use it in our program.
The built-in function input() can be used to get keyboard input from the
user. When the input() function executes, the program is paused, and the
user is expected to enter some text on the screen.
print('Enter name of a city : ', end='')
city = input()
print('You entered', city)
When you execute this code, first, the message of the print call is
displayed on the screen. After this, the input function is called; this call
pauses the program, and the interpreter waits for the user to enter some text.
The user types the input and ends the input by pressing the Enter key, and
after this, the program execution continues. The input function returns the
entered text as a string, which means that a string object is created. To use
this string in our program, we have to assign it to a variable. In our program,
we have assigned the string to a variable named city. After this, we used
the variable city in a print function call. Here are two sample runs of the
program:
Sample Run 1-
Enter name of a city : Bangalore
You entered Bangalore
Sample Run 2-
Enter name of a city : Bareilly
You entered Bareilly
The input() function captures the data entered by the user in a string, and
that data can be used in the program by using the variable name.
Before asking the user to input something, we need to print a clear message
telling the user exactly what kind of data to enter. This message is called a
prompt. We have displayed this message by using the print function, but
the input function is also capable of displaying the prompt. Writing a
separate print function for the prompt is not required; we can place the
prompt inside the parentheses of the input function.
city = input('Enter name of a city : ')
print('You entered', city)
This input function call first prints the prompt and then returns the text
entered by the user as a string. In our next example, we are going to enter
salary, display it, and then increment by using the augmented assignment
syntax, and then again display it.
salary = input('Enter salary : ')
print('Initial salary', salary)
salary += 200
print('Incremented salary', salary)
Here is a sample run of the program -
Enter salary : 1200
Initial salary 1200
Traceback (most recent call last):
  File "C:\Users\test.py", line 3, in <module>
    salary += 200
TypeError: can only concatenate str (not "int") to str
We got a TypeError, because the input function always returns the user
input in the form of a string. We typed 1200 on the screen, and it was
returned as the string '1200'. You can check the type of variable salary by
using the type function. It will show str. When we added 200 to
salary, the interpreter complained, saying that the two types are different
and it cannot perform implicit conversion. We want salary to be of
numeric type since we will have to do arithmetic calculations with it. The
type str does not support arithmetic operations, so we will perform an
explicit conversion here.
salary = int(input('Enter salary : '))
print('Initial salary', salary)
salary += 200
print('Incremented salary', salary)
We enclosed the call to the input function inside the int function, so now
the value returned by the input function is converted to int. Now, when
we run it, it gives the expected output.
Enter salary : 1200
Initial salary 1200
Incremented salary 1400
Now, if we check the type of salary, it will show int. The input function
always returns a string; it is your responsibility to convert the data returned
by input to the required type. So, when you expect a numeric input from the
user, make sure to convert the input to a numeric type using the correct
conversion function.
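Choosing the correct conversion function matters: int() accepts only strings that represent whole numbers, while float() also accepts a fractional part. In the sketch below, two string literals stand in for what input() would return, so that the code runs without waiting for the keyboard; the values and variable names are hypothetical.

```python
# These literals stand in for the strings that input() would return
# if the user typed 1200 and 72.5 (hypothetical values).
salary_text = '1200'
weight_text = '72.5'

salary = int(salary_text) + 200    # whole number, so int() is suitable
weight = float(weight_text)        # has a fractional part, so use float()

print(type(salary_text))    # <class 'str'>
print(salary)               # 1400
print(weight + 0.5)         # 73.0
```

Passing '72.5' to int() would raise a ValueError, just as int('two') did earlier.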
2.16 Complete programs
Now we know enough basic concepts to start writing short and simple but
complete programs. We know how to get input from the user, perform some
basic calculations, and how to print and format output. So let us start writing
some programs.
I. Write a program that enters two numbers and displays their sum, product,
and difference.
n1 = int(input('Enter first number : '))
n2 = int(input('Enter second number : '))
print('Sum =', n1 + n2)
print('Difference =', n1 - n2)
print('Product =', n1 * n2)
II. Write a program that enters height in inches and displays it in feet and
inches.
ht_inches = int(input('Enter the height in inches : '))
ft = ht_inches // 12
inches = ht_inches % 12
print(ft, 'feet', inches, 'inches')
III. Write a program that inputs the length and breadth of a rectangle and
displays its area, perimeter, and length of the diagonal.
length = float(input('Enter length of rectangle in cm : '))
breadth = float(input('Enter breadth of rectangle in cm : '))
area = length * breadth
perimeter = 2 * (length + breadth)
diagonal = (length*length + breadth*breadth) ** 0.5
print('Area of rectangle is ', area, 'sq cm')
print('Perimeter of rectangle is ', perimeter,
'cm')
print('Diagonal of rectangle is ', diagonal, 'cm')
IV. Write a program that prompts the user to enter the values of principal,
interest rate, and time and compute simple interest and compound interest.
Formulas for calculating simple interest and compound interest are:
simple interest = (principal * rate * time) / 100
compound interest = amount - principal, where
amount = principal * (1 + rate / 100) ** time
principal = float(input('Enter the principal : '))
time = int(input('Enter the time in years : '))
rate = float(input('Enter the interest rate : '))
simple_interest = (principal * time * rate) / 100
print('Simple interest is ', simple_interest)
compound_interest = principal * (1 + rate / 100) ** time - principal
print('Compound interest is ', compound_interest)
V. Write a program that prompts the user to enter a student name and marks
in 3 subjects. Calculate the percentage marks and display the student's name
with the percentage.
name = input('Enter name : ')
marks_maths = int(input('Enter marks in maths : '))
marks_physics = int(input('Enter marks in physics : '))
marks_chemistry = int(input('Enter marks in chemistry : '))
total_marks = marks_maths + marks_physics + marks_chemistry
percentage = (total_marks/300) * 100
print(name, percentage)
We would suggest you type in the programs shown in the book, run them,
modify them, and experiment with them in different ways. Initially, while
typing and coding, you will make mistakes that the interpreter will flag as
errors. Fixing these mistakes and getting your program to run is an integral
part of the learning process. It will help you become familiar with the syntax
and features of the language. Active engagement with code will also help
you to understand and retain the concepts and have a solid grasp of the topic.
This hands-on approach is the most effective way to learn programming.
2.17 Comments
In your program file, you can not only write Python code but can also
include notes to explain the code. This becomes more important if your
programs are lengthy and complicated and there is a team of programmers
working together. When you are developing a program, you are deep into it
and have an understanding of how it works. However, upon revisiting the
code later, you might forget how you made things work. Understanding a
complicated program by just looking at the code is difficult; reading the
notes will help you understand the code faster and save you time. This is
also true for other fellow programmers who need to read and understand
your code.
These notes are called comments in programming languages. A comment is a
piece of text that is inserted in between the code to explain the purpose of
your code to other programmers or to yourself when you revisit the code.
Code that is properly documented with comments makes the program more
readable and understandable, and so it is easier to maintain and update.
Comments are written only for human readers; they are ignored by the
interpreter, so they have no effect on the execution of the program.
In Python, a comment starts with a hash sign (#) and lasts till the end of the
current line. Any text after the # sign till the end of the line will not be
executed. The interpreter just ignores it. A comment can be written on a new
line or after a statement on the same line. Figure 2.22 shows a code snippet
that contains some comments.
Do not try to understand the code because many structures used in it have
not been introduced yet. The code is here just to illustrate how comments are
used to explain the purpose of the code. Comments should not be written for
code that is doing something obvious; such comments are unnecessary and
should be avoided.
Figure 2.22: Comments in a Python program
If you need to write a multi-line comment (block comment), then you have
to precede each line with the # sign. In IDLE, you can easily comment
multiple lines by selecting those lines, going to the Format menu, and
selecting Comment region.
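A short sketch of these comment styles follows. The temperature conversion and its variable names are illustrative only and are not taken from the figure.

```python
# Convert a temperature from Celsius to Fahrenheit.
# The formula used is F = C * 9/5 + 32.
celsius = 37.0                       # normal body temperature
fahrenheit = celsius * 9 / 5 + 32    # apply the conversion formula

# print('debug:', celsius)           # this whole line is a comment
print(fahrenheit)                    # 98.6
```

The first two lines form a block comment, the comments after the statements are written on the same line as the code, and the commented print call is never executed.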
In addition to documentation, there is another use of comments. You can use
comments to disable part of your program while testing or debugging.
Debugging is the process where you are trying to find out why the code is
not working. You can temporarily comment out some parts of the program that
you think might be creating problems.
The code that is commented out will not be executed when you run your
program. So, if your program is not working as expected, then you can
comment out a piece of code and see if the rest of the code runs fine. Text editors generally
have the facility of commenting out pieces of code, so you do not have to
manually put a # sign in front of each line that you want to disable. Later,
you can remove the commenting signs from your disabled code by choosing
the Uncomment option in your editor.
2.18 Indentation in Python
Indentation is the whitespace (spaces or tabs) that is present before the
beginning of a code line. In most of the languages, indentation is done just to
increase the readability, but in Python, it is very important. Python forces
programmers to structure their code through indenting. So, indentation is not
a matter of style, but a part of syntax in Python. Indentation of each line
matters; wrong indentation can result in either an indentation error or
incorrect behavior of your program.
Python uses indentation for grouping statements to form code blocks. In the
code snippet that we saw in section 2.17, we can see the code blocks being
defined with different indentations. Consecutive statements with the same
indentation belong to the same code block. Higher levels of indentations
indicate nested code blocks. Unlike other languages, Python does not use
braces or words like begin or end to define the boundaries of blocks of code.
It uses indentation for this purpose. As we move through the chapters and
learn about different compound statements such as if..else, while, for, def,
etc., we will see how to use indentation for defining blocks.
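As a preview, here is a minimal sketch of indentation defining blocks. It uses the if..else statement, which is explained in a later chapter; the variable names and the pass mark of 40 are hypothetical. Four spaces per indentation level is the convention recommended by PEP8.

```python
marks = 72

if marks >= 40:
    # These two statements share the same indentation,
    # so they form one block that belongs to the if header.
    result = 'pass'
    print('Well done')
else:
    # This block belongs to the else header.
    result = 'fail'

# Back at the top level: this line runs in either case.
print(result)    # pass
```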
The code that we have written till now is top-level code of the file; it is not
indented, which means that there should be no whitespace at the beginning
of the statement. So, till we get introduced to compound statements, we will
write all our code without any indentation. The following program will give
an indentation error if you try to execute it.
name = input('Enter name : ')
  age = int(input('Enter age : '))
print(name, age)
The error is caused due to an unexpected indentation in the second
statement. The program will execute if you remove the two spaces present at
the beginning of the second statement.
2.19 Container types
In the next few chapters, we will learn in detail about the built-in data
structures or collection types in Python: lists, tuples, dictionaries, and sets.
These are also called containers, as they provide a way of combining and
organizing data. These data structures are used to hold different types of
objects. Here are some examples of literals of these types:
Lists (type list) [1, 2, 3, 4]
Tuples (type tuple) (1, 2, 4, 5)
Dictionaries (type dict) {'a': 1, 'b': 2, 'c': 3}
Sets (type set) {2, 3, 4, 6, 8}
Lists and tuples are sequence types in Python, which means that they are
ordered collections of values. These types contain a left-to-right order
among the items that they contain. We can tell which one is the first element,
which is the second, which is the last, and so on. In these types, the
contained items are accessed using their positions. Dictionaries are the
mapping type as they store objects by keys. In Python 3.6 and earlier
versions, dictionaries were unordered, but from version 3.7, they preserve
the order in which items are inserted. Sets are neither mapping nor sequences; they are just collections of
unique objects.
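The following sketch previews how items are reached in each kind of container, using the literals shown above. The variable names are illustrative, and the indexing syntax used here is explained in the chapters that cover these types.

```python
numbers = [1, 2, 3, 4]               # list
point = (1, 2, 4, 5)                 # tuple
codes = {'a': 1, 'b': 2, 'c': 3}     # dictionary
evens = {2, 3, 4, 6, 8}              # set

print(numbers[0])        # 1: first item of the list (positions start at 0)
print(point[-1])         # 5: last item of the tuple
print(codes['b'])        # 2: value stored under the key 'b'
print(list(codes))       # ['a', 'b', 'c']: keys in insertion order
print(len({2, 2, 3}))    # 2: a set keeps only unique objects
```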
2.20 Mutable and Immutable Types
We know that each object has a type, an id, and a value. The type and id of
an object remain the same throughout the program; they cannot be changed.
Whether the contained value can be changed or not depends on the
mutability of the object.
Python types can be categorized as either mutable or immutable depending
on whether the value of an object of that type can be changed or not. If a
type is immutable, the value inside an object of that type cannot be changed.
You can never overwrite the value of an immutable object. If a type is
mutable, then the value contained inside the object of that type can be
changed at run time. Here are some immutable and mutable types in Python.
Immutable - bool int float str tuple frozenset
Mutable - list set dict
Mutable types support operations that can change the value inside the object
at run time, while immutable types do not provide any operation that can
change the value inside the object. The state of an immutable object is fixed
at the time of creation and cannot be modified later.
You need to keep in mind that mutability has nothing to do with the variable
names. Let us try to understand this. Suppose the variable name x refers to a
mutable object, and the name a refers to an immutable object.
Figure 2.23: Mutable and Immutable objects
The object to which x is referring is a mutable object, so the value inside it
can be changed. We generally say that it can be changed in-place. The object
to which a is referring is an immutable object, so it will remain as it is
throughout its lifetime. You cannot overwrite it; it cannot be changed in-
place.
The variable referencing any object can always be reassigned to a different
object; we can make x or a refer to a different object. So, mutability is
associated with types and objects, not with variables.
You need to clearly understand the difference between the terms rebinding a
variable and mutating an object. Rebinding a variable means making a
variable refer to a different object, and mutating an object means making in-
place changes in that object. Only mutable objects can be mutated. In our
example, the variable a refers to an int object with value 56; the
operations like a = a + 3 seem to change the value, but remember this is
rebinding. int is an immutable type, so you cannot modify the value inside
an object once it is created. You can only create a new object with a different
value. The value 56 inside the object is not changed to 59. Instead, a new
object with value 59 is created, and a refers to that object. So, any operation
on the immutable types that seems to modify the value results in the creation
of a new object with the modified value.
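The difference between rebinding and mutating can be observed with the built-in function id. The sketch below is only an illustration; the list method append is used ahead of its formal introduction, and the names id_before, id_after, and list_id_before are made up for this example.

```python
# Rebinding: a refers to an immutable int object
a = 56
id_before = id(a)
a = a + 3                 # a new int object 59 is created, a is rebound to it
id_after = id(a)
print(a, id_before == id_after)      # 59 False

# Mutating: x refers to a mutable list object
x = [1, 2, 3]
list_id_before = id(x)
x.append(4)               # the same list object is changed in-place
print(x, id(x) == list_id_before)    # [1, 2, 3, 4] True
```

The id of a changes because a now refers to a different object, while the id of x stays the same because the original list object was modified in-place.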
If a variable refers to an immutable object, you can see changes in that
variable only by rebinding that variable. If a variable refers to a mutable
object, you can see changes in that variable by rebinding it or by making in-
place changes in the object that it is referring to. In our example, we have
variable x referring to a list object and variable a referring to an int
object. We can see changes in x by rebinding x to a different object or by
changing in-place the list object that it is currently referring to. We can see
changes in a only when we rebind it to a different object.
Mutability matters when there are multiple references referring to an object.
Suppose we have three variables x, y, and z, that refer to a list object
and three variables a, b, and c that refer to an int object.
Figure 2.24: Multiple references to objects
If we make any in-place changes to the list object through any of the
variables x, y, or z, then that change will be visible in the other two
variables also because all three of them share the same object. In the case of
immutable objects, these types of side effects will not occur because they
cannot be changed in-place. This distinction is very important to understand,
and it will become clearer as we proceed through the chapters and cover
some of the mutable and immutable types in detail.
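Here is a small sketch of the situation described above, with three names sharing one list object and three names sharing one int object. The values used are illustrative, and append is a list method that we will study later.

```python
# Three names referring to the same list object
x = [10, 20, 30]
y = x
z = x
y.append(40)              # in-place change made through y
print(x, z)               # the change is visible through x and z as well
print(x is y, y is z)     # True True

# Three names referring to the same int object
a = 56
b = a
c = a
b = b + 3                 # b is rebound to a new object; a and c are unaffected
print(a, b, c)            # 56 59 56
```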
2.21 Functions and methods
We will talk about functions and methods in detail later, but since we will be
using built-in functions and methods in the next few chapters, you need to
have a general idea of what they are and how you can use them.
A function is a reusable piece of code with a name, and it can perform
certain operations for you. You can give it some values called arguments; it
performs some work for you, and it might give you a value back. The built-
in functions are the functions that are already written for us and are always
available, so we can easily use them. We have already used some built-in
functions, like print, input, type, and id. We know that to call them,
we need to write their name followed by parentheses, which can include
some arguments. Arguments are values that provide some information to the
function for performing its work. If the function doesn't need any arguments,
then the parentheses remain empty. Here are some examples of built-in
functions:
abs(x) - Returns absolute value of x
bin(x) - Returns binary equivalent of x
oct(x) - Returns octal equivalent of x
hex(x) - Returns hexadecimal equivalent of x
max(a, b, c, ...) - Returns maximum value among the provided arguments
min(a, b, c, ...) - Returns minimum value among the provided arguments
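As a quick illustration, here is how these built-in functions can be called. The argument values are chosen arbitrarily for this sketch.

```python
print(abs(-7.5))        # 7.5
print(bin(10))          # 0b1010
print(oct(64))          # 0o100
print(hex(255))         # 0xff
print(max(4, 9, 2))     # 9
print(min(4, 9, 2))     # 2
```

Note that bin, oct, and hex return strings, not numbers.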
A method is like a function, but it is specific to a type, and we access it by
using a dot. To call a method, we write the variable name or a literal
followed by a dot, then the method name, and then the parentheses, which
can include arguments.
'hello'.upper()
list1.append(10)
Here, we are calling the method upper on a string literal, and we are calling
the method append on a variable named list1 that refers to a list
object.
Functions are not specific to any type, so they are called independently
without the dot syntax. You can think of functions as generic operations that
can work with multiple types. For example, the built-in function len can be
used to find the length of a string, list, or a dictionary. Methods are type
specific operations that are attached to types and can act on an object of a
specific type only. For example, the method upper can act only on an
object of type str, and the method append can act only on an object of
type list. We will discuss most of the methods related to the types that we
will see in the coming chapters.
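The sketch below contrasts a generic function with type-specific methods. The variable names are illustrative, and the built-in function hasattr (not discussed in this chapter) is used only to check whether an object provides a method with a given name.

```python
text = 'hello'
list1 = [1, 2, 3]
marks = {'maths': 90, 'science': 85}

# len is a function: it works with several types
print(len(text), len(list1), len(marks))    # 5 3 2

# upper is a method of str, append is a method of list
shout = text.upper()
list1.append(10)
print(shout)     # HELLO
print(list1)     # [1, 2, 3, 10]

# Methods are tied to their type: a list has no upper method
print(hasattr(list1, 'upper'))    # False
print(hasattr(text, 'append'))    # False
```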
A type defines many methods, and it is not possible to remember all methods
associated with a particular type. So whenever required, you can go to the
interactive prompt and get a listing of all the methods. To know about the
methods available for a data type, just type dir(typename) on the
interactive prompt, and it will show you all the methods available for that
type. For example, to see all the methods for the list type, we can write:
>>> dir(list)
['__add__', '__class__', '__class_getitem__',
'__contains__', '__delattr__', '__delitem__',
'__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__getitem__',
'__gt__', '__hash__', '__iadd__', '__imul__',
'__init__', '__init_subclass__', '__iter__',
'__le__', '__len__', '__lt__', '__mul__', '__ne__',
'__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__reversed__', '__rmul__',
'__setattr__', '__setitem__', '__sizeof__',
'__str__', '__subclasshook__', 'append', 'clear',
'copy', 'count', 'extend', 'index', 'insert',
'pop', 'remove', 'reverse', 'sort']
This is the result that we get. There will be lots of methods with leading and
trailing underscores, and these methods represent the implementation details
of the type and help in customization. The methods towards the end are the
ones without any underscore, and these are the methods that we will be
mostly using.
This command shows you the names of methods. If you want to know more
about a particular method, you can use help. On the interactive prompt, write
help, then inside parentheses, write typename followed by a dot and then the
method name.
>>> help(list.append)
Help on method_descriptor:
append(self, object, /)
Append object to the end of the list.
>>> help(str.upper)
Help on method_descriptor:
upper(self, /)
Return a copy of the string converted to
uppercase.
Here, we are getting help on the append method of list type and the
upper method of string type.
The functions dir() and help() accept either a type name or a
variable name. So, if you have a variable s referring to a string object,
you can use dir and help on s as well. If you write help(typename),
it will show you the description of all the methods of that type.
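If you are interested only in the methods without underscores, you can filter the result of dir. The following is a minimal sketch; it uses a list comprehension and the string method startswith, both of which are covered later, and the name public_list_methods is made up for this example.

```python
# Keep only the names that do not start with an underscore
public_list_methods = [name for name in dir(list) if not name.startswith('_')]
print(public_list_methods)

# dir also works on a variable that refers to an object
s = 'hello'
print('upper' in dir(s))     # True
```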
2.22 Importing
There are many predefined functions in the standard library that we can use
in our program, but unlike built-in functions, these functions are not
automatically available in our program. These functions are organized in
modules (Python files), and we have to import them to make them available
in our program. For example, the math module contains many mathematical
functions. The random module provides functions for randomization. In the
following code, we are importing and using sqrt and trunc functions
from the math module.
from math import sqrt, trunc
x = 34
y = 23.4
print(sqrt(x))
print(trunc(y))
Output-
5.830951894845301
23
If you import a module by writing import modulename, then all the
names in that module can be used in your program, but they have to be
preceded by the module name and a dot.
import math
x = 34
y = 23.4
print(math.sqrt(x))
print(math.trunc(y))
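Both import styles give access to the same function objects; only the way of naming them differs. The following sketch, with an arbitrarily chosen argument, shows the two styles side by side.

```python
import math              # names must be written as math.name
from math import sqrt    # sqrt can be used directly

print(math.sqrt(16))         # 4.0
print(sqrt(16))              # 4.0
print(sqrt is math.sqrt)     # True, both names refer to the same function
print(math.pi)               # constants are also accessed with the dot
```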
We can import modules from the rich standard library and make use of lots
of pre-existing functionality, and that is why the term 'batteries included' is
used for Python. You can see a list of standard library modules in the official
Python documentation, and to know more about a module, import it on the
shell and use help on it.
>>> import math
>>> help(math)
To see all the available names in a module, you can use the dir function
after importing it.
>>> dir(math)
['__doc__', '__loader__', '__name__',
'__package__', '__spec__', 'acos', 'acosh', 'asin',
'asinh', 'atan', 'atan2', 'atanh', 'cbrt', 'ceil',
'comb', 'copysign', 'cos', 'cosh', 'degrees',
'dist', 'e', 'erf', 'erfc', 'exp', 'exp2', 'expm1',
'fabs', 'factorial', 'floor', 'fmod', 'frexp',
'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose',
'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm',
'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2',
'modf', 'nan', 'nextafter', 'perm', 'pi', 'pow',
'prod', 'radians', 'remainder', 'sin', 'sinh',
'sqrt', 'tan', 'tanh', 'tau', 'trunc', 'ulp']
To get help on a specific name from the module, use help on that name.
>>> import math
>>> help(math.floor)
Help on built-in function floor in module math:
floor(x, /)
Return the floor of x as an Integral.
2.23 Revisiting interactive mode
We know that when we enter a statement on the shell prompt, it will be
executed, and when we enter an expression, it will be evaluated, and its
value will be printed. This automatic printing is there only in interactive
mode. In script mode, you need to use the print function to print the value
of expressions. The print function works in the interactive mode also, but
is not required so you can save some typing.
>>> a = 10
>>> a
10
>>> print(a)
10
A single underscore is a valid identifier name in Python, and in scripts, it is
used to ignore values, as we will see later. In interactive mode, a single
underscore ( _ ) is a special variable name that stores the result of the last
expression that was evaluated. You can use this variable in another
expression on the prompt.
>>> a = 5
>>> a + 2
7
>>> _
7
>>> _ + 5
12
When you enter multiline statements on the interactive prompt, the prompt
string changes from ">>>" to "...".
>>> total_marks = marks_science + marks_maths \
... + marks_english + marks_socials \
... + grace_marks
>>> months = ['January', 'February', 'March', 'April',
...           'May', 'June', 'July', 'August',
...           'September', 'October', 'November', 'December']
So, when you type something that occupies more than one line, the prompt
changes to three dots.
While experimenting in the interactive mode, you would often make
mistakes and get errors. In those cases, you would want to edit and rerun
your previous command. You can do this without retyping the previous
command by making use of the command history. Each interactive session
maintains a history of all the commands that you type at the shell prompt.
You can scroll through these commands by pressing "Alt+P" (for the
previous command) and "Alt+N" (for the next command). On Mac, you have
to use Control-P and Control-N. Up and down arrow keys can also be
used on some systems for scrolling through commands. Using the arrow
keys, move the cursor to the desired command and then press Enter to select
that command. You can also click on a previous command and press Enter to
get that command on your prompt. Once you get a command displayed on
your prompt, you can either edit it and then execute it, or you can just
execute the command as it is.
We know that when a program is executed, its output appears in the shell
window. When the execution of the program is over, the Shell window
retains focus and displays a shell prompt. Now, you can explore the result of
the program execution on this prompt. For example, you can see the final
values of variables you defined in the program. Any names that you had
defined in the program would be available in the Shell window after the
execution of the program.
2.24 Errors
As you start writing programs, you will encounter many errors in your
programs. Understanding and fixing errors is a part of the learning process.
It improves our understanding of the language and problem-solving skills.
We can broadly categorize errors into three types – syntax errors, run time
errors, and logical errors.
Syntax is a set of rules that define how the code instructions should be
written in a language. In the previous chapter, we saw that our source code is
compiled before being executed by PVM. During the compilation step, the
compiler checks the syntax of each instruction and translates it to bytecode.
When it finds anything written in the wrong syntax, it stops the translation
and displays an error message. These errors are called syntax errors or
parsing errors, and they occur due to the incorrect syntax of the code. For
example, you might miss a colon or a quote or use an unbalanced pair of
parentheses. When there is a syntax error in your program, and you try to run
the program, IDLE shows a dialog box, and it also highlights the location
where the syntax error is detected in your program. You need to fix the error
by making changes in your code and running the program again. As a
beginner, you will find yourself making many syntax errors, but as you get
used to the language, their frequency will reduce. It is generally not very
difficult to identify and remove these errors from your program. Some
development environments (not IDLE) underline the syntax errors as you
type the code.
If there are no syntax errors, the bytecode is generated, and your program
enters run time. The bytecode goes through the Python Virtual Machine,
which interprets and executes it. Run time is the time
when your program is executing; during this time, your program will interact
with the user and might be connected with multiple external resources. If an
error occurs during this time, then the execution of the program stops
immediately, and it is terminated with an error message. Any error that
occurs at this run time is called a run time error. There are some run time
errors that are caused due to some mistake in your code, and they can be
removed by modifying your code. Some run time errors occur due to
unusual events at run time, and they are not under the control of your
program. To handle them, you have to write the error handling code, which
we will discuss in Chapter 20.
Logical errors occur when your program runs smoothly and gives you the
output, but the output that it gives is not what was intended, so your program
works, but it doesn't do what you expect it to do. These errors occur due to
the wrong logic of the code that you have written. The problem is not with
the code. The program does exactly what it has been told to do. The problem
is that the programmer was not able to communicate properly the solution in
the form of code, or maybe the solution that the programmer has come up
with is not correct. It could be due to things like a missing assignment, the
use of a wrong operator, or an incorrect algorithm. These types of errors are
not reported by the interpreter. The programmer has to identify them, so
these errors are the most difficult to detect and remove. You have to examine
your code and debug the program, and at times, take the help of a debugger.
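The three categories can be seen in one small sketch. It relies on try and except, which are covered only in Chapter 20, and on the built-in function compile, which is used here just to check syntax without running the code. All names and values are illustrative.

```python
# 1. Syntax error: detected during compilation, before anything runs
source = "print('hello'"              # closing parenthesis is missing
try:
    compile(source, '<example>', 'exec')
    syntax_ok = True
except SyntaxError:
    syntax_ok = False
print('Syntax valid:', syntax_ok)     # Syntax valid: False

# 2. Run time error: the syntax is fine, but execution fails
total_marks = 450
subjects = 0
try:
    average = total_marks / subjects
except ZeroDivisionError as err:
    runtime_message = str(err)
    print('Run time error:', runtime_message)

# 3. Logical error: the program runs, but the result is wrong
length = 4
width = 5
area = length + width                 # bug: area needs *, not +
print(area)                           # 9, but the intended answer is 20
```

Only the first two errors are reported by Python; the third one produces output silently, which is why logical errors are the hardest to find.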
2.25 PEP8
Python Enhancement Proposals (PEPs) are documents that describe a new
Python feature or provide information to the community. There are many
PEPs that are listed in PEP0 document, which is the index of PEPs and can
be accessed at https://peps.python.org/pep-0000/. Of these PEPs, the most
useful for Python programmers is PEP8, which is a style guide for writing
Python code.
PEP8 was written by Guido van Rossum, Barry Warsaw, and Nick Coghlan
in 2001. You can read it online at https://peps.python.org/pep-0008/. This
document provides various coding conventions and best practices to write
readable and consistent code in Python. According to Guido van Rossum,
"Code is read much more often than it is written", and according to Zen of
Python, "Readability counts." Readability and consistency are important
because the code is written once but read many times by different people for
various reasons, like collaborating on a project or debugging and adding new
features. Writing PEP8-compliant code will make it easier for you and others
to read and understand your Python code.
The guidelines in the document are only recommendations; if you write code
that does not conform to PEP8, your code will still work as long as it follows
the syntax of the language but might not be considered professional by the
Python community. Therefore, it is good to be aware of the best practices
and develop a habit of writing code that adheres to the community
guidelines.
The PEP8 document includes coding conventions for indentation,
whitespaces, naming things, and other coding constructs that we have yet to
learn. We will see the conventions as we get introduced to the coding
structures. However, I recommend that you read the document at least
once. There are many tools and IDEs that will automatically format your
code according to PEP8.
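To get a feel for what the style guide asks for, compare the two versions below. Both run and give the same result; the second follows PEP8 recommendations on naming (lowercase names with underscores for variables, uppercase for constants) and on spaces around operators. The names and values are made up for this sketch.

```python
# Works, but does not follow PEP8 naming and spacing conventions
GraceMarks=5
marksScience=86
marksMaths=91
TotalMarks=marksScience+marksMaths+GraceMarks

# Same logic written in PEP8 style
GRACE_MARKS = 5
marks_science = 86
marks_maths = 91
total_marks = marks_science + marks_maths + GRACE_MARKS

print(TotalMarks, total_marks)    # 182 182
```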
Exercise
1. Which of the following cannot be used as a variable name?
(A) Null (C) Nil
(B) None (D) Not
2. Which of the following is not a valid identifier name?
(A) min_marks
(B) marks2
(C) net-sales
3. Which of these are the literal values of bool type?
(A) true,false
(B) TRUE,FALSE
(C) True,False
4. Python is a case-sensitive language.
(A) True (B) False
5. x = 56.6
What will be the type of x?
(A) rational (C) float
(B) int (D) decimal
6. All keywords in Python are in lowercase.
(A) True (B) False
7. Which of the following is not a valid int literal in Python?
(A) 0o356
(B) 0x1009
(C) 10,000
8. Which of the following will give an error?
(A) a = 0x3F
(B) b = 0o496
(C) c = 0b110
9. Which of the following is not a valid float literal?
(A) .98 (C) 9e8
(B) 9.8 (D) All are valid
10. x = "True"
What will be the type of x?
(A) int
(B) str
(C) bool
11. The value contained in an object cannot be changed if the object
belongs to __________ type, and the contained value can be changed
if the object belongs to ________ type.
(A) a mutable, an immutable
(B) an immutable, a mutable
12. x = 960
y = x
Do x and y reference the same memory location?
(A) Yes (B) No
13. 4e-5 is equivalent to
(A) 0.000004
(B) 400000.0
(C) 0.00004
14. Python is a __________ typed language.
(A) statically (B) dynamically
15. Which of these functions can be used to get the identity of an object?
(A) identify()
(B) id()
(C) identity()
16. A Python object can be dynamically assigned to any variable in
Python.
(A) True (B) False
17. Existence of a variable name in Python begins with _____
(A) a declaration
(B) an assignment statement
18. To delete a variable named x, what will you write?
(A) delete x
(B) del x
(C) remove x
19. Which of these operators can be used both as a unary operator and a
binary operator?
(A) % (C) /
(B) - (D) *
20. Which of these is the exponentiation operator in Python?
(A) % (C) *
(B) ^ (D) **
21. Which of the following expressions evaluates to False?
(A) 3 == 3 (C) 3 <= 3
(B) 3 != 3 (D) 3 >= 3
22. What will be the value of expression 23 / 2?
(A) 11 (C) 11.5
(B) 11.0 (D) 12
23. What will be the values of expressions 23//2 and -23//2 ?
(A) 11, -12
(B) 11.0, -12.0
(C) 11.5, -11.5
24. Value of expression 36 ** 0.5 is
(A) 6.0
(B) 6
(C) 12.0
25. Value of expression 23 % 3 is
(A) 2
(B) 3
(C) 7
26. x = 10 // 3
What will be the type of x?
(A) int (B) float
27. What is the value of the expression not(4 > 8) ?
(A) True (B) False
28. Which one is an equivalent logical expression for not(a > b) ?
(A) a < b
(B) a > b
(C) a <= b
29. Which one is an equivalent logical expression for a < 50 and a > 4?
(A) 4 < a < 50
(B) 50 < a < 4
(C) a < 4 < 50
30. The expression p <= q < r <= s is equivalent to
(A) p <= q and q < r and r <= s
(B) p <= q or q < r or r <= s
31. x == y will return True only when both x and y refer to the same
object
(A) Yes (B) No
32. If a == b is True, the expression a is b will definitely be True.
(A) Yes (B) No
33. A single line comment in Python begins with _____
(A) $ (C) #
(B) /* (D) //
34. What is the value of the expression 2 ** 2 ** 3?
(A) 64 (B) 256
35. What is the value of the expression 27 / 3 / 3?
(A) 27.0
(B) 9.0
(C) 3.0
36. Which of these operators has right to left associativity?
(A) + (C) **
(B) * (D) //
37. Which of these symbols is the line continuation symbol?
(A) # (C) /
(B) $ (D) \
38. Which of the following expressions will give an error?
(A) (2+30)/(5-3)
(B) (4+3)(3-5)
(C) None of these
39. What will be the output of the following print call?
print(2,000,000)
(A) 2,000,000 (C) 2e6
(B) 2000000 (D) 2 0 0
40. Which of the following expressions shows explicit type conversion?
(A) int(9.8) + 7.3
(B) 3.4 + 5.4
(C) 7 % 2
(D) 17.5 % 3
41. Which of the following expressions involves implicit type conversion?
(A) int(9.8) + 7.3
(B) 7 % 2
(C) 17.5 % 3
42. What will be the values of the following expressions?
3.5/0.2, int(3.5)/0.2, int(3.5/0.2)
43. What will be the output of the following print call?
print(3.0e250 * 1.6e150)
(A) 4.8e+400 (B) inf
44. What will be the output of the following print call?
print(2.4e-250 / 1.2e200)
(A) 2.0e-450 (B) 0.0
45. An object can have only one name associated with it.
(A) True (B) False
46. In Python, types are associated with _______
(A) Objects (B) Variables
47. del statement deletes
(A) Variable names (B) objects
48. What is the value of the expression 35 == '35' ?
(A) True (B) False
49. Correct the following print call so that it correctly prints the string
literals and values of variables.
name = 'Devank'
age = 10
print('My name is, name, and age is, age')
50. What will be the values of expressions 11//3, int(11//3) ,
-11//3 and int(-11/3) ?
What will be the output of code given in questions 51 to 65?
51. a = 5
print(3 < a < 10)
52. x = 5
x++
print(x)
53. m = 12
n = m = m-10
print(m, n)
54. n = 5
n *= n-1
print(n)
55. x = 2
y = 4
x + 4
y + 5
print(x, y)
56. n1 = 9
n2 = 3
n3 = 6
average = n1 + n2 + n3 / 3
print(average)
57. a = 2
b = 3
a+1 = b
print(a, b)
58. x = 0581
x +=1
print(x)
59. x = 2
y = 3
print(x =< y)
60. salary = 1000
raise = 100
new_salary = salary + raise
print(new_salary)
61. x = 5
y = 6
print(x + y)
62. print('Hello', end = ',')
print('Hi', end = ',')
print('Hey', end = ',')
63. a = 5
b = 6
c = 11
print(a<b or b<10 and c<a)
64. x = +92
y = -92
print(x, y)
65. print('Hello world')
print = 4
print(2 + 5)
66. What will be the output of the following program if numbers 2 and 5
are entered when it is executed?
n1 = input('Enter first number : ')
n2 = input('Enter second number : ')
x = n1 + n2 * 3
print(x)
67. Write a program that enters mass in grams and displays it in grams
and kilograms.
68. Write a program that inputs temperature in Celsius and converts it to
Fahrenheit. The formula for conversion is -
Temperature in Fahrenheit = Temperature in Celsius * 1.8 + 32
69. Write a program that prompts the user to input his/her weight in kgs
and height in cms, and calculates the body mass index (BMI). BMI is
calculated by dividing body weight in kgs by the square of height in
meters. For example, if weight is 70 kg, and height is 170 cm, then
BMI is 70/(1.7 * 1.7) = 24.2
70. Write a program that inputs radius of a circle, and displays its area
and circumference.
Area of a circle = π * radius * radius
Circumference = 2 * π * radius
Import the value of pi(π) from the math module.
71. Write a program that enters a phone number and prints its last 3 digits.
72. Write a program that accepts an integer in decimal form and prints it
in binary, octal, and hexadecimal. Use built-in functions bin, oct,
and hex.
73. Write a program that enters 4 numbers and prints the largest and
smallest number. Use built-in functions max and min.
74. Write a program that enters two numbers and finds the greatest
common divisor of those two numbers. Use gcd function from the
math module.
75. Write a program that enters two numbers and generates a random
number between those two numbers. Use randint function from
the random module.
76. Write a program that enters the base and height of a right-angled
triangle and finds its hypotenuse. According to the Pythagorean theorem,
Hypotenuse² = Base² + Height²
Use sqrt function from the math module.
3 Strings
Data comes in many forms; the most common form is textual data. Almost
every program that does something useful must input, store, process, and
output text. In programming, textual data is handled with the help of strings.
A string is a sequence of characters. In Python, the type str is used to
represent a string. In your program, you can specify a string literal by
enclosing a sequence of characters in either single quotes or double quotes.
A string literal can contain zero or more characters, including letters, digits,
special characters, and space. The enclosing quotation marks are not stored
as part of the string; they are used to delimit the string. Here are some
examples of string literals:
''                          Empty string
'abc'                       String with 3 characters
'a'                         String with 1 character
' '                         String containing a single space
'123abc!'                   String with both alphabetic and nonalphabetic characters
'456'                       String containing digits
"cdf"                       String literal enclosed in double quotes
"don't shout"               Single quotes inside a double-quoted string
'Book "C in depth" 3 ed'    Double quotes inside a single-quoted string
Table 3.1: String literals
If the single quote has to be used as an actual character inside the string, the
string can be enclosed in double-quotes. If a double quote must be used as an
actual character inside the string, the string can be enclosed in single quotes.
You can use single or double quotes to enclose the string literals in your
program. Whichever style you choose, it is better to stick to it. It is not a
good idea to mix the two styles. We will be mostly using single quotes in
this book. Python also supports triple-quoted strings, which we will discuss
later.
In Python, there is no character type that represents a single character. Single
characters enclosed in quotes are considered strings of size 1.
A string literal can be assigned to a variable, and then various string-related
operations can be performed on that variable.
>>> s1 = 'Morning'
>>> s2 = "Evening"
These assignments make variables s1 and s2 refer to string objects. The
type of these objects is str.
Figure 3.1: Objects of type str
The interactive interpreter shows the string enclosed in single quotes, even if
we define the literal using double quotes.
>>> s1
'Morning'
>>> s2
'Evening'
If we print the string using the print function, the enclosing quotes are not
displayed.
>>> print(s1)
Morning
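The same difference can be seen in a script by using the built-in function repr, which returns the form that the interactive interpreter displays. This is only a small sketch; repr has not been discussed in the text so far.

```python
s2 = "Evening"

print(repr(s2))     # 'Evening'  (the form shown by the interactive prompt)
print(s2)           # Evening    (print does not display the quotes)
print(len(s2))      # 7, the quotes are not part of the string
```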
A string is a sequence of single characters. Other sequence types in
Python include lists and tuples, both representing sequences of objects.
Sequences are types that maintain a left-to-right ordering among the
elements they contain. These sequence types have some similarities and
share some capabilities. Operations like indexing, slicing, concatenation, and
repetition apply to all sequence types. The knowledge of slicing will also
come in handy while using advanced Python libraries like NumPy and
Pandas. So, make sure that you understand these concepts and practice them
thoroughly.
3.1 Indexing
To access a single character inside the string, we must specify a numeric
index inside square brackets. Indexing is 0 based, so the first index is 0.
>>> s = 'quintessence'
If we want to access the individual characters of our string s, we can write
s[0] for accessing the first character, s[1] for the second character, s[2]
for the third character, and so on. If a string has n characters, the valid index
values are from 0 to n-1. The string example that we have taken has 12
characters, so the valid index values are from 0 to 11, and thus s[0],
s[1], ..., s[11] are valid expressions that give us individual
characters of the string. s[11] will give us the last character of the string.
>>> s[0]
'q'
>>> s[11]
'e'
Any index value larger than 11 will give an error. It will be an error to write
s[12] or s[13] or any other index greater than 11.
>>> s[12]
IndexError: string index out of range
Inside the square brackets, we can use any variable name or expression,
provided the expression evaluates to an integer.
>>> i = 5
>>> s[i]
'e'
>>> s[i-3]
'i'
The built-in function len gives the length of the string, which is equal to the
total number of characters in the string.
>>> len(s)
12
The expression len(s)-1 can be used as an index to access the last
character of the string.
>>> s[len(s)-1]
'e'
The length of string s is 12, so inside brackets, we will have 11, and we
know that s[11] will give us the last character. Similarly, to get the second
last character, we can write s[len(s)-2], and to access the third last
character, we can write s[len(s)-3], and so on.
In Python, there is a shortcut for accessing characters from the end of the
string. Instead of writing the expression s[len(s)-1], we can simply
write s[-1]. So, if we want to access the last character of any string, we do
not need to know the length of the string. We can access the character at
index -1. Similarly, we can write s[-2], which is equivalent to
s[len(s)-2] and hence gives the second last character.
>>> s[-1]
'e'
>>> s[-12]
'q'
>>> s[-13]
IndexError: string index out of range
Thus, in Python, it is not an error to write negative indices. We can go
backward in a string using these negative index values. In general, if we
have a string of length n, the valid indices are 0 to n-1 and -1 to -n.
Writing an index greater than or equal to n or less than -n will raise an
IndexError. For our string s, if we write any index greater than or equal
to 12 or less than -12, then an IndexError will be raised.
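The relationship between positive and negative indices can be checked with a short program. It is only a sketch; it uses a for loop and try/except, which are introduced in later chapters, and the names n, i, and bad_index are illustrative.

```python
s = 'quintessence'
n = len(s)                       # 12

# For every valid position i, s[i] and s[i - n] are the same character
for i in range(n):
    assert s[i] == s[i - n]

print(s[0], s[-12])              # q q
print(s[11], s[-1])              # e e

# An index >= n or < -n raises IndexError
for bad_index in (n, -n - 1):
    try:
        s[bad_index]
    except IndexError as err:
        print(bad_index, '->', err)
```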
Indexing a string gives us a one-character string. In languages like C or C++,
there is a separate character type to represent single characters, but in
Python, there is no such type. A single character inside quotes is of type
str.
3.2 Strings are immutable
In the previous chapter, we learned about the mutability of objects. An object
of immutable type cannot be modified, while an object of mutable type is
modifiable. Strings are immutable, meaning you cannot change a string
object in any way. Suppose you have a string object, and the variable s
refers to it.
>>> s = 'ring'
Figure 3.2: Variable s referring to a string object
>>> s[0] = 'p'
TypeError: 'str' object does not support item assignment
You cannot use the square brackets on the left side of the assignment
operator to change any character inside the string. This is due to the
immutability of the string object. Once it has been created, it cannot be
altered in any manner. You cannot delete any character from the string, insert
new characters, or replace anything. However, you can create a new string
object and assign it to the same variable name.
>>> s = 'ping'
When this statement executes, a new string object is created, and the variable
s starts referring to that new object.
Figure 3.3: Variable s referring to a new string object
We know that a variable in Python is just a name, and it can be reassigned
any number of times and can refer to any type of object. So, the statement s
= 'ping' is a valid statement since a new string object is created and the
name s is reassigned. This statement does not change the string object in any
way. It appears that we are changing the string, but we have just reassigned
the variable.
The statement s[0] = 'p' is not valid since it is trying to change an
immutable object in-place. This concept of objects, names, and assigning is
very important to understand in Python. If you want to change a string, the
only way is to assign the variable name to a new string object. New string
objects can be created in many ways, like slicing, concatenating, or calling
string methods.
Reassigning the variable gives the effect of changing the string without
violating the immutability of the string object. It might seem inefficient that
a new string object is created every time a string must be changed. In
practice, this is rarely a problem, as Python automatically reclaims the space
occupied by objects that are no longer referenced.
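The following sketch puts these ideas together. It assumes nothing beyond the string 'ring' from the text; the extra name t is introduced only to keep a reference to the original object.

```python
s = 'ring'
t = s                # a second name for the original string object

# In-place modification is rejected because str is immutable
try:
    s[0] = 'p'
    modified_in_place = True
except TypeError:
    modified_in_place = False

# Build a new string object from a literal and a slice, then rebind the name s
s = 'p' + s[1:]

print(s)                    # ping
print(t)                    # ring (the original object is untouched)
print(s is t)               # False
print(modified_in_place)    # False
```

The name s now refers to a different object, while t still refers to the original 'ring'.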
3.3 String Slicing
We have seen how to get a single character from a string by specifying an
index using square brackets. Using the same square brackets, we can also
access a portion of the string. It is called slicing the string. To extract a part
of the string, we must specify 2 integers inside square brackets.
s[i:j]
Inside the square brackets, we have two integers, i and j, separated by a
colon. The expression s[i:j] is a slice of the string; it gives us a new
string object that is a copy of the portion of the string s from index i to
index j-1. Note that the first index is included while the second index is
excluded. So, the slice s[i:j] returns a new string object that contains all
the characters of string s, from index i up to (but not including) index j.
The original string object does not change. Let us see some examples:
>>> s = 'homogeneous'
>>> s[2:6]
'moge'
The expression s[2:6] gives us a new string object that contains all the
characters of string s from index 2 to index 5. The sliced object can be
assigned to a name.
>>> s1 = s[4:7]
>>> s1
'gen'
The name s1 refers to the sliced object returned by the expression s[4:7].
The original object referred to by s remains unchanged.
>>> s
'homogeneous'
>>> id(s)
2182966396016
Now we make the name s refer to a new sliced object.
>>> s = s[3:7]
>>> s
'ogen'
>>> id(s)
2182965695664
The id of s has changed, which shows that s now refers to a new object.
While writing the slicing expression, we can omit the first or the second
number or both. If we omit the first index, it is assumed to be 0, i.e., the
beginning of the string. So, the slice s[:j] indicates a part of the string s
from index 0 to index j-1. It is equivalent to writing s[0:j]. If we omit
the second index, it is assumed to be the end of the string. So, the slice
s[i:] indicates a part of the string s from index i to index n-1 where n is
the length of the string. It is equivalent to writing s[i:n].
s[:j]   Part of string s from index 0 to index j-1 (same as s[0:j])
s[:7]   Part of string s from index 0 to index 6 (same as s[0:7])
s[i:]   Part of string s from index i to index n-1 (same as s[i:n], n is length of string)
s[3:]   Part of string s from index 3 to index n-1 (same as s[3:n], n is length of string)
We can omit both the indices inside the brackets. Therefore, the slice s[:]
extracts the entire string from the beginning till the end. It gives an exact
copy of the entire string. It is the same as writing s[0:n].
s[:]    Part of string s from index 0 to index n-1 (same as s[0:n], n is length of string)
So, when slicing from the start of the string, we can omit zero, and when
slicing to the end of the string, we can omit n, as they are redundant. Here
are some examples:
>>> s = 'homogeneous'
>>> s[:4]
'homo'
>>> s[5:]
'eneous'
>>> s[:]
'homogeneous'
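Omitting both indexes gives us a string with exactly the same contents as the
original. So, if we need another name that holds a copy of the string, we can
do it this way. Because strings are immutable, CPython returns the same object
here rather than creating a new one, which is harmless.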
>>> scopy = s[:]
>>> scopy
'homogeneous'
You can also specify a negative index while slicing.
s[0:-1] Part of string s from index 0 to index -2 (same as s[0:n-1])
The slice s[0:-1] indicates a part of the string from index 0 up to, but not
including, index -1, so the last index included is -2. As we have seen earlier,
writing 0 as the first integer is redundant, so you can omit the zero and just
write it as s[:-1].
The slice s[:-1] represents the whole string, excluding the last character.
If you want a part of the string that excludes the last two characters, you can
use the slice s[:-2]. In general, s[:-m] gives us a string that excludes
the last m characters.
>>> s = 'homogeneous'
>>> s[:-1]
'homogeneou'
This gives a string object that contains the whole string except the last
character. If you want a string object in which the last three characters are
removed, you can write this s[:-3].
>>> s[:-3]
'homogene'
Now, let us write a slice with a negative number as the first index.
>>> s[-5:]
'neous'
The slice s[-5:] starts at index -5 and goes up to the last index, so it
gives you the last 5 characters of the string. Similarly, the slice s[-3:] will
give you the last 3 characters of the string.
When both the indexes are equal, we get an empty string.
>>> s[3:3]
''
We have seen that if we index a string and give an invalid index inside
square brackets, an IndexError occurs. Let us see what happens if we
provide a bad index in slicing.
>>> s[2:100]
'mogeneous'
The end index is greater than the length of the string, but we did not get an
IndexError. We got a slice from index 2 to the end of the string. So, if
the second index is greater than or equal to n (the length of the string), it
means the end of the string. Similarly, if the first index is less than or
equal to -n, it means the start of the string.
>>> s[-50:6]
'homoge'
Here, the first index is assumed to be at the start of the string. You can see
that slicing is more forgiving than indexing. While indexing, if you give
such bad indexes, then you will get an error.
>>> s[100]
IndexError: string index out of range
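The forgiving behavior of slices can be verified with a small sketch. It reuses the string 'homogeneous' from the examples above; the name beyond is introduced here only for illustration.

```python
s = 'homogeneous'
n = len(s)                      # 11

# Out-of-range slice bounds are clamped to the ends of the string
assert s[2:100] == s[2:n] == 'mogeneous'
assert s[-50:6] == s[0:6] == 'homoge'

# A slice that lies entirely outside the string is simply empty
beyond = s[100:200]

# For any integer i, even an out-of-range one, the two halves rejoin into s
for i in range(-20, 20):
    assert s[:i] + s[i:] == s

print(repr(beyond))   # ''
```

No IndexError is raised anywhere in this code, whereas s[100] would raise one.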
While slicing, you can also use a third integer inside the square brackets,
which is the stride or step of the slice.
s[i:j:k]     Part of the string s from index i to index j-1, with a step of k
s[3:10:2]    Part of the string containing characters at indexes 3,5,7,9
s[3:18:3]    Part of the string containing characters at indexes 3,6,9,12,15
s[i:j:1]     Equivalent to s[i:j]
s[6:1:-1]    Part of the string containing characters at indexes 6,5,4,3,2
s[20:5:-2]   Part of the string containing characters at indexes 20,18,16,14,12,10,8,6
s[::-1]      String in reverse order
The slice s[i:j:k] will extract characters from index i to index j-1,
with each subsequent index incremented by k. When the step is omitted, it is
assumed to be 1, so s[i:j:1] is equivalent to s[i:j]. In the previous
examples, the step was implicitly 1. We can give negative
steps also. In the slice s[6:1:-1] we start at 6 and add -1 each time, so
we get indexes 6,5,4,3,2. Thus, the effect of using a negative step is that we
get the items in reverse order. The slice s[::-1] will give the whole string
in reverse order. Here are some examples:
>>> s = 'Today is the day.'
>>> s[3:13:2]
'a ste'
Each alternate character of the string from index 3 to index 12 is displayed.
>>> s[::2]
'Tdyi h a.'
Each alternate character of the whole string is displayed.
>>> s[::3]
'Tait y'
The whole string is displayed with a step of three characters.
>>> s[::-1]
'.yad eht si yadoT'
This gives the reverse of the whole string.
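One common use of the reversing slice is a palindrome check. The sketch below is only an illustration; the function name is_palindrome is hypothetical and does not come from the text.

```python
def is_palindrome(text):
    """Return True if text reads the same forward and backward."""
    return text == text[::-1]

s = 'Today is the day.'

every_second = s[::2]        # characters at indexes 0, 2, 4, ...
backward = s[6:1:-1]         # characters at indexes 6, 5, 4, 3, 2

print(every_second)              # Tdyi h a.
print(backward)                  # i yad
print(is_palindrome('level'))    # True
print(is_palindrome(s))          # False
```

The comparison is case-sensitive and counts spaces and punctuation, so it is a deliberately simple check.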
3.4 String Concatenation and Repetition
We know that when the operators + and * are used on numeric types, they
add and multiply numbers. These operators can also be used on strings, but
they are interpreted differently. The operator + performs string
concatenation, and the operator * performs string repetition.
String literals or string variables can be combined by using the + operator.
>>> 'ab' + 'cd'
'abcd'
>>> name = 'Dev'
>>> 'Hello' + name
'HelloDev'
In the first example, we have combined two string literals. In the second one,
we have combined a string literal with a string variable. In both these cases,
a new string object is created, which is displayed at the prompt. In the
second example, no space is added between the two words. If you want a
space, you must add it explicitly.
>>> 'Hello' + ' ' + name
'Hello Dev'
The new string object returned after concatenation can be assigned to a
name.
>>> s = 'Hello' + ' ' + name
>>> s
'Hello Dev'
>>> name = name + 'raj'
>>> name
'Devraj'
The asterisk symbol, when used with a string and integer, acts as a repetition
operator. We can use the repetition operator to repeat a string.
>>> name = 'Dev'
>>> name * 3
'DevDevDev'
The expression name * 3 returns a string object that contains the
characters of the string name repeated three times. The integer denotes the
number of times the string is repeated. You can think of it as an abbreviation
for n times concatenation. name * 3 is same as name + name +
name. The expression 3 * name also has the same effect but is less
intuitive.
>>> 'Hello ' * 5
'Hello Hello Hello Hello Hello '
>>> print('-' * 40)
----------------------------------------
>>> s = 'Hee..'
>>> s = s * 3
>>> s
'Hee..Hee..Hee..'
In the statement s = s * 3, we assign the string object returned by the
expression s * 3 to the variable s.
Augmented assignment syntax can be used for both concatenation and
repetition operators.
>>> s = 'butter '
>>> s += 'scotch '
>>> s
'butter scotch '
>>> s *= 3
>>> s
'butter scotch butter scotch butter scotch '
s += 'scotch ' is equivalent to s = s + 'scotch ', and s *= 3 is
equivalent to s = s * 3.
The augmented assignment does not make any changes to the original
object. It reassigns the variable name to a new object.
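A second name for the same object makes this visible. In the sketch below, the name alias is introduced only for illustration.

```python
s = 'butter '
alias = s            # second name for the same string object

s += 'scotch '       # creates a new object and rebinds only the name s

print(repr(s))       # 'butter scotch '
print(repr(alias))   # 'butter '
print(s is alias)    # False
```

If the augmented assignment had modified the object in place, alias would have shown the longer string as well.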
Here are some more examples:
>>> s1 = 'Good Morning !'
>>> s2 = 'Bye Bye See you'
We have these two strings, and we must make a string by concatenating the
first four characters of the first string and the first three characters of the
second string. We can do this by combining the slices of the two strings.
>>> s3 = s1[:4] + s2[:3]
>>> s3
'GoodBye'
This slice s1[:4] gives a string object that contains the first four characters
of the string s1, and the slice s2[:3] gives a new object that contains the
first three characters of the string s2. When these objects are combined
using the + operator, we get a new string object assigned to the name s3.
Now, we want to make a new string from the string s1, such that the first
four characters are repeated three times, and the last character is repeated
five times.
>>> s4 = s1[:4] * 3 + s1[4:-1] + s1[-1] * 5
>>> s4
'GoodGoodGood Morning !!!!!'
If we assign the result to the name s1, we get the effect of changing the
string s1.
>>> s1 = s1[:4] * 3 + s1[4:-1] + s1[-1] * 5
>>> s1
'GoodGoodGood Morning !!!!!'
String literals can also be combined by writing them one after the other.
>>> 'abc''def''hij'
'abcdefhij'
Hence, adjacent string literals are concatenated. This feature is applicable
only for literals. You cannot join string variables or expressions by using this
feature. It is useful when you want to break long string literals.
3.5 Checking membership
The in and not in operators can be used to test for the existence of a
character or substring inside a string. The in operator returns True if a
character or substring exists in the given string; otherwise, it returns False.
The not in operator returns True if a character or substring is not present
in the string.
>>> s = 'good morning !'
>>> 'ing' in s
True
>>> '?' in s
False
>>> 'good morning !' in s
True
>>> 'Good' in s
False
>>> 'you' not in s
True
>>> 'morning' not in s
False
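Two details of the in operator are worth checking: the characters of the substring must appear together and in order, and the empty string is considered to be present in every string. A minimal sketch, with variable names chosen only for illustration:

```python
s = 'good morning !'

has_word = 'morning' in s        # True
wrong_case = 'Morning' in s      # False, the test is case-sensitive
scattered = 'gm' in s            # False, 'g' and 'm' are not adjacent in s
empty = '' in s                  # True, the empty string is in every string

print(has_word, wrong_case, scattered, empty)   # True False False True
```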
3.6 Adding whitespace to strings
You can add whitespace to your string to organize and present it in a
readable way. Whitespace in programming includes tabs, newlines, and
spaces. The character combination '\n' adds a newline, and the
combination '\t' adds a tab to your string.
>>> print('Sun\tMon\tTue')
Sun     Mon     Tue
>>> print('Sun\nMon\nTue\n')
Sun
Mon
Tue
>>> print('Days : \n\tSun\n\tMon\n\tTue\n')
Days :
        Sun
        Mon
        Tue
A single print call gives multiple lines of output due to the inclusion of
'\n' character. This way, we can generate multiple lines of output with
only a few lines of code. However, some programmers prefer writing
separate print calls as the '\n' embedded inside a string is difficult to
read.
3.7 Creating multiline strings
A string literal enclosed in single or double quotes cannot span more than
one line of a program. Such a string should be contained in a single line
only. The ending quote should appear on the same line as the starting quote.
You will get a syntax error if you try to write a multiline string inside single
or double quotes.
>>> s = 'Let us get up and get going,
... With a strong heart for whatever may come our way.
... Keep working, keep trying,
... Learn to work hard and be patient each day.'
...
SyntaxError: unterminated string literal (detected at line 1)
If you want a string literal that spans across multiple physical lines, you can
use the continuation character.
>>> s = 'Let us get up and get going,\
... With a strong heart for whatever may come our way.\
... Keep working, keep trying,\
... Learn to work hard and be patient each day.'
>>> s
'Let us get up and get going,With a strong heart
for whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.'
>>> print(s)
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
The backslash indicates that the string is continued on the next line. Now, we
could define the string literal on multiple lines, but when this string is
printed, we do not get the literal printed on different lines. To achieve this,
we can include newline characters in between the literal. We already know
that '\n' is the newline control character used to begin a new line on a
screen, so we can use it inside the string.
>>> s = 'Let us get up and get going,\n\
... With a strong heart for whatever may come our way.\n\
... Keep working, keep trying,\n\
... Learn to work hard and be patient each day.'
The '\n' adds a newline character, and the backslash indicates that the
string is continued on the next line.
>>> print(s)
Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.
A better and more common way is to use triple-quoted strings. If you put a
string literal inside triple quotes, it spans across multiple lines naturally. The
triple quotes can consist of three consecutive single quotes('''abc''') or
three consecutive double quotes("""abc""").
s = '''Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.'''
If your literal starts with a triple quote, you can keep adding text to it on
multiple lines. The literal ends with terminating triple quotes.
>>> print(s)
Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.
The newline characters are naturally embedded in a string delimited by triple
quotes. Any spaces at the beginning of the line will also be included in the
string. If we display the string on the prompt instead of printing it using the
print function, we will see the newline characters.
>>> s
'Let us get up and get going,\nWith a strong heart
for whatever may come our way.\nKeep working, keep
trying,\nLearn to work hard and be patient each
day.'
When you delimit a string literal inside triple quotes, every line break that
you type inside the literal becomes a newline character in the string. When
you print such a string with the print function, you can see the original
lines because each newline character is interpreted.
When we used backslash to join the lines, then the newline was not added
automatically. If you want to prevent some newlines in a triple-quoted string,
add a backslash at the end of those particular lines.
>>> s = '''Let us get up and get going,
... With a strong heart for whatever may come our way.\
... Keep working, keep trying,
... Learn to work hard and be patient each day.'''
>>> print(s)
Let us get up and get going,
With a strong heart for whatever may come our way.Keep working, keep trying,
Learn to work hard and be patient each day.
Python supports triple-quoted strings so that we can write multiline strings.
Using triple quotes improves the readability of long multiline strings in the
source code. They are commonly used for docstrings, which we will discuss
later. Another advantage of triple-quoted strings is that we can use them to
write string literals that have to include both single and double quotes.
>>> print('''My height is 5'3" ''')
My height is 5'3"
We have seen that in Python, adjacent string literals are concatenated. If we
place more than one string literal adjacent to each other on a line (with
optional whitespace in between), then they will be automatically
concatenated.
>>> 'abc' 'def' 'hij'
'abcdefhij'
If you write the string literals on separate lines and enclose them in
parentheses, even then, they are considered adjacent and will be
concatenated.
>>> s = ('Let us get up and get going,'
... 'With a strong heart for whatever may come our way.'
... 'Keep working, keep trying,'
... 'Learn to work hard and be patient each day. ')
>>> print(s)
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
This can be another way of writing strings that span multiple lines. This
approach does not add any newline characters in the string. If you need
newlines, you need to add the newline character explicitly in the literals.
This approach can be helpful if you need to add comments to separate lines
of the string.
>>> s = ('Let us get up and get going,'
... 'With a strong heart for whatever may come our way.' # prepared for anything
... 'Keep working, keep trying,'
... 'Learn to work hard and be patient each day. ') # patience is the key
>>> print(s)
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
The comments are not included in the string. We do not see them when we
print the string. In triple-quoted strings, if you try to add comments like this,
those comments will be added to the string.
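The three ways of writing a multiline string described in this section can be compared directly. The sketch below uses a shortened two-line version of the text; the names a, b, and c are chosen only for illustration.

```python
# 1. Explicit '\n' plus a backslash continuation inside the literal
a = 'Keep working, keep trying,\n\
Learn to work hard and be patient each day.'

# 2. Triple-quoted literal: the line break becomes part of the string
b = '''Keep working, keep trying,
Learn to work hard and be patient each day.'''

# 3. Adjacent literals inside parentheses, with an explicit '\n'
c = ('Keep working, keep trying,\n'      # comments are allowed here
     'Learn to work hard and be patient each day.')

print(a == b == c)       # True
print(b.count('\n'))     # 1
```

All three literals produce equal strings; they differ only in how readable the source code is and in whether comments can be attached to individual lines.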
In the previous chapter, we saw that to add a multiline comment, we
had to precede each line with a # sign. We can also use triple single quotes or
triple double quotes to insert multiline comments in our code.
# This is a multiline comment
# It explains the code
# It has no effect on the code
''' This is also a multiline comment
It explains the code
It has no effect on the code
'''
The triple-quoted string is written all by itself. We are not printing it or
assigning it to any variable. It is an unused string, so we can use it as a
comment. However, this style of writing comments is not recommended, and
in most places, you will find comments that use the # sign. The triple-quoted
strings are used for docstrings, which we will discuss later.
3.8 String methods
The str type supports many methods that can be dot suffixed to the name
of the string. We have seen that str is an immutable type, so it does not
provide any methods that change the original string object. All methods that
seem to make changes in the string are designed such that they return a new
modified string object. They do not touch the original string object because
they are not able to, as objects of type str are immutable.
Let us understand this with the help of an example. The method upper() is
used to change the letters in a string to uppercase. We have a string variable
s.
>>> s = 'Hello'
When we call the method upper on the variable s, it returns a new object
that contains all letters of this string in uppercase.
>>> s.upper()
'HELLO'
The original object to which s was referring remains unchanged. We can
make another variable refer to the object returned by upper.
>>> s1 = s.upper()
>>> s1
'HELLO'
Now, s1 refers to the object returned by the method upper. If we want to
make s refer to this new object, we can write s = s.upper(), then s
will refer to this new object.
So, you cannot change the string object using any method, but you can
assign the new object returned by the method to the string variable referring
to the original string object. str type has lots of methods; we will explore
some of the common ones here. You can try them on the interactive prompt.
To get an up-to-date list of methods, you can call dir(str) or
help(str) on the interactive prompt. To get the description of a particular
method, type help(str.methodname) on the prompt.
3.9 Case-changing methods
The following five case-changing methods can be used to perform case
conversions in strings. All of them return a new object, which is a copy of
string s with some changes in the case of the contained letters.
s.lower() Returns a copy of s, in which each letter is converted to lowercase
s.upper() Returns a copy of s, in which each letter is converted to uppercase
s.swapcase() Returns a copy of s, in which each lowercase letter is converted to
uppercase and vice versa
s.capitalize() Returns a copy of s, in which the first letter of the string is capitalized,
and the rest of the letters are changed to lowercase
s.title() Returns a copy of s, in which the first letter of each word is capitalized,
and the rest of the letters are changed to lowercase
Table 3.2: Case-changing methods
Let us try some of them at the prompt.
>>> s = 'Life is a journey, not a race'
>>> s.lower()
'life is a journey, not a race'
>>> s
'Life is a journey, not a race'
We must assign the returned object to the original variable name to see the
required change.
>>> s = s.title()
>>> s
'Life Is A Journey, Not A Race'
Similarly, while using other methods, if you want to see a change in your
string, you need to reassign it.
When checking for membership or comparing strings, you can ignore the
case by using the upper or lower methods.
>>> 'out' in 'Output'.lower()
True
>>> s = 'telephone'
>>> s[0].upper() in 'AEIOU'
False
>>> response = input('Enter yes or no : ')
Enter yes or no : Yes
>>> response.lower() == 'yes'
True
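The idea of ignoring case during comparison can be wrapped in a small helper. This is only a sketch; the function name same_text is hypothetical.

```python
def same_text(a, b):
    """Compare two strings while ignoring differences in letter case."""
    return a.lower() == b.lower()

s = 'Life is a journey, not a race'

print(same_text('Yes', 'YES'))   # True
print(s.swapcase())              # lIFE IS A JOURNEY, NOT A RACE
print(s.capitalize())            # Life is a journey, not a race
print(s.title())                 # Life Is A Journey, Not A Race
print(s)                         # Life is a journey, not a race
```

The last line shows that s itself is unchanged after all the method calls, because each call returned a new string object.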
3.10 Character classification methods
The methods in this group check the contents of the string, and they return
either True or False. All of them start with 'is', and their names are self-
explanatory.
s.isalnum() Returns True if all characters in s are alphanumeric
s.isalpha() Returns True if all characters in s are alphabetic
s.isdecimal() Returns True if there are only decimal characters in s
s.isdigit() Returns True if all characters in s are digits
s.isidentifier() Returns True if s is a valid identifier
s.islower() Returns True if all letters in s are lowercase
s.isupper() Returns True if all letters in s are uppercase
s.istitle() Returns True if s is a title cased string
s.isnumeric() Returns True if all characters in s are numeric
s.isprintable() Returns True if all characters in s are printable
s.isspace() Returns True if all characters in s are whitespace
Table 3.3: Character classification methods
Here are some examples:
>>> s = 'Yes Sir'
>>> s.isalpha()
False
>>> s.isupper()
False
>>> s.istitle()
True
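These methods are often used to classify or validate a piece of text before processing it. The sketch below is one possible way to combine them; the function name describe and its labels are hypothetical.

```python
def describe(text):
    """Return a short label for text based on the str classification methods."""
    if text.isdigit():
        return 'digits'
    if text.isalpha():
        return 'letters'
    if text.isalnum():
        return 'letters and digits'
    if text.isspace():
        return 'whitespace'
    return 'mixed'

print(describe('2024'))      # digits
print(describe('Python'))    # letters
print(describe('Py3'))       # letters and digits
print(describe('   '))       # whitespace
print(describe('Yes Sir'))   # mixed
print(describe(''))          # mixed
```

The four methods used here all return False for an empty string, so an empty input falls through to the last label.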
3.11 Aligning text within strings
The following three methods justify a string into a given field size, and by
default, the padding is done with spaces.
s.ljust(size) Returns the string left justified in a string of length size
s.rjust(size) Returns the string right justified in a string of length size
s.center(size) Returns the string centered in a string of length size
Table 3.4: Text alignment methods
The methods ljust(), rjust(), and center() left justify, right
justify, or center a string, respectively, such that the string fits within the
number of spaces provided by the argument size. Here, size is the total
length of the string after padding. These methods can be used in printing
tabular data.
>>> s = 'Be a voice, not an echo'
>>> s.ljust(40)
'Be a voice, not an echo                 '
>>> s.rjust(40)
'                 Be a voice, not an echo'
>>> s.center(40)
'        Be a voice, not an echo         '
If size is less than the length of the string, there is no change.
>>> s.center(4)
'Be a voice, not an echo'
You can specify a fill character for padding instead of default spaces.
>>> s.center(40, '*')
'********Be a voice, not an echo*********'
The string is center justified in a field width of 40, and the padding is done
with an asterisk symbol instead of spaces.
The interactive prompt displays the string object returned by a particular
method. As we have seen before, if we want to see the change in the original
string, we need to assign this string object to the original string variable.
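As mentioned above, these methods are handy for printing tabular data. The sketch below shows one way to do it; the item names, prices, and column widths are made up for illustration.

```python
# Hypothetical data, used only to demonstrate the alignment methods
items = [('Pen', 12), ('Notebook', 150), ('Bag', 1200)]

# Heading centered in a field of 20 characters, padded with hyphens
lines = ['Price List'.center(20, '-')]

for name, price in items:
    # Name left justified in 12 columns, price right justified in 8 columns
    lines.append(name.ljust(12) + str(price).rjust(8))

for line in lines:
    print(line)
```

Every line is exactly 20 characters wide, so the names line up on the left and the prices line up on the right.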
3.12 Removing unwanted leading and trailing
characters
The str type provides methods to remove leading and trailing whitespaces
or other characters. These methods can be used to sanitize data for further
processing. For example, data read from somewhere or input by the user can
be cleaned before storing or processing.
s.lstrip(chars) Returns a copy of the string with leading characters removed
s.rstrip(chars) Returns a copy of the string with trailing characters removed
s.strip(chars) Returns a copy of the string with both leading and trailing characters
removed
Table 3.5: Methods to remove leading and trailing characters
lstrip() and rstrip() remove characters from the left and right sides
of the string, respectively, while strip() removes characters from both
the left and the right sides. The set of characters to be removed is specified
as a string argument. All the characters present in the string argument will be
removed from the left, right, or both sides of the string. Here are some
examples:
>>> '!!..Imagine .. believe .. achieve ..!! ? '.rstrip('!?. ')
'!!..Imagine .. believe .. achieve'
The argument string is '!?. ' so all exclamation marks, question marks,
full stops, and spaces are removed from the right of the string.
>>> '!!..Imagine .. believe .. achieve ..!! ? '.lstrip('!?. ')
'Imagine .. believe .. achieve ..!! ? '
Now we have called lstrip, so the characters contained in the argument
string are removed from the left of the string.
>>> '!!..Imagine .. believe .. achieve ..!! ? '.strip('!?. ')
'Imagine .. believe .. achieve'
Now, the characters are removed from both the left and right of the string, as
we have called strip.
If the argument is omitted or is None, the whitespace characters are
removed.
>>> s = ' All is well '
>>> s.lstrip()
'All is well '
>>> s.rstrip()
' All is well'
>>> s.strip()
'All is well'
These methods return a new object, allowing us to chain subsequent method
calls. In the next example, we have called the method upper on the object
returned by the method strip().
>>> s.strip().upper()
'ALL IS WELL'
As you know, to see these changes in s, you must reassign s to the new
object.
The methods removeprefix and removesuffix, available in Python 3.9 and
later, can be used to remove a prefix or suffix from the string. The match is
case-sensitive. If the prefix or suffix is not present, then a copy of the
original string is returned.
>>> 'PyTorch'.removeprefix('Py')
'Torch'
>>> 'Numpy'.removesuffix('Py')
'Numpy'
>>> 'Numpy'.removesuffix('py')
'Num'
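It is easy to confuse strip with removeprefix and removesuffix. The strip methods treat their argument as a set of characters, while the remove methods treat it as one exact substring. The sketch below illustrates the difference; the sample strings are made up, and removesuffix needs Python 3.9 or later.

```python
raw = '  ***Hello, World!***\n'

# First remove surrounding whitespace, then the surrounding asterisks
cleaned = raw.strip().strip('*')

# strip removes any of the characters '.', 'p', 'y' from both ends
stripped_set = 'python.py'.strip('.py')

# removesuffix removes the exact substring '.py' from the end only
suffix_removed = 'python.py'.removesuffix('.py')

print(cleaned)          # Hello, World!
print(stripped_set)     # thon
print(suffix_removed)   # python
```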
3.13 Searching and replacing substrings
One of the essential programming tasks is to search your data for specific
information. Python provides many useful methods for searching and
replacing information in a string.
s.find(substr) Returns index of the first occurrence of the given substring. If not
found, returns -1
s.index(substr) Returns index of the first occurrence of the given substring. If not
found, raises ValueError
s.rfind(substr) Returns index of the last occurrence of the given substring. If not
found, returns -1
s.rindex(substr) Returns index of the last occurrence of the given substring. If not
found, raises ValueError
Table 3.6: Methods to search a substring
The methods find and index return the index of the first occurrence of
the given substring. If not found, find returns -1 while index raises a
ValueError. The methods rfind and rindex are the same as find
and index except that they search through the string backward, i.e., from
right to left, so they find the last occurrence of the substring. Here are some
more methods:
s.count(substr) Returns the number of occurrences of the specified substring in
s
s.startswith(substr) Returns True if s starts with the specified substring, False
otherwise
s.endswith(substr) Returns True if s ends with the specified substring, False
otherwise
s.replace(s1,s2) Returns a copy of the string with all occurrences of the first
string replaced with the second string
Table 3.7: String methods
In all these methods except replace, you can restrict the search by specifying
the optional arguments start and end, as in a slice.
To substitute a substring with another we can use the method replace. It
returns a copy of the string with all occurrences of the first string replaced
with the second string. As usual, the original string remains unchanged, and
a new string object is returned. You can restrict the number of replacements
by providing a third argument. That argument represents the number of
occurrences that have to be replaced. Let us use these methods to understand
them better.
>>> s = '''Focus on present, not on past or future
... Focus on yourself, not on others
... Focus on the process, not on outcome
... Focus on what you can control, not on what you cannot control'''
>>> s.find('Focus')
0
This call returns the index of the first occurrence of the substring 'Focus'
in the string s. We get 0 here because this substring is present in string s at
the 0th index.
>>> s.rfind('Focus')
110
This method rfind returns the index of the last occurrence of the substring
in the string.
>>> s.find('focus')
-1
We get -1 because 'focus' with f in lowercase is not present in the string
s.
The methods index and rindex are similar to find and rfind, but
instead of returning -1, they raise a ValueError if the substring is not
found. In the previous call, if we use the index method, then instead of -1,
we get a ValueError.
>>> s.index('focus')
ValueError: substring not found
>>> s.count('on')
10
The substring 'on' comes 10 times in the string s.
>>> s.startswith('Focus')
True
>>> s.endswith('?')
False
In the search methods that we have seen, we can give the start and end
indexes between which the search will be performed.
>>> s.find('Focus', 20, 100)
40
The search is performed in the string portion from index 20 to index 99.
These start and end indexes are interpreted as in slice notation. Similarly, we
can use the start and end indexes in all the other methods of this category.
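The start argument makes it possible to continue a search from where the previous match was found. The sketch below collects the indexes of all occurrences of a substring; the function name find_all and the sample string are hypothetical.

```python
def find_all(s, sub):
    """Return the indexes of all occurrences of sub in s (overlaps included)."""
    positions = []
    pos = s.find(sub)                 # first occurrence, or -1
    while pos != -1:
        positions.append(pos)
        pos = s.find(sub, pos + 1)    # resume the search just after the last hit
    return positions

text = 'banana'

print(find_all(text, 'an'))     # [1, 3]
print(find_all(text, 'ana'))    # [1, 3]
print(text.count('ana'))        # 1
print(find_all(text, 'xy'))     # []
```

Note that count reports only non-overlapping occurrences, which is why it gives 1 for 'ana' while the helper finds two starting positions.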
Now, suppose we have a string s.
>>> s = 'Dev; 22; male; graduate; Bareilly'
We want a string that contains everything from the first occurrence of the
semicolon onward. We can get it by using the index method in the slice notation.
>>> s[s.index(';'):]
'; 22; male; graduate; Bareilly'
s.index(';') gives us the index of the first occurrence of the semicolon,
which is 3, and the expression s[s.index(';'):] gives us a slice from
index 3 till the end. So, we get everything from the first semicolon to the
end of the string, with the semicolon itself included. If we do not want it,
we can specify s.index(';')+1 as the start index for the slice.
>>> s[s.index(';')+1:]
' 22; male; graduate; Bareilly'
Now, we assign this slice to the name s2.
>>> s2 = s[s.index(';')+1:]
>>> s2
' 22; male; graduate; Bareilly'
So, s2 is a string that contains everything after the first occurrence of the
semicolon. Now, instead of index, let us write rindex.
>>> s2 = s[s.rindex(';')+1:]
>>> s2
' Bareilly'
Now we get a string that contains everything after the last occurrence of
semicolon. Let us combine both index and rindex in this slice.
>>> s2 = s[s.index(';')+1: s.rindex(';')]
>>> s2
' 22; male; graduate'
This gives us everything between the first and last occurrence of the
semicolon. We could have also used the find() method here, but it is
better to use the index() method in these types of cases, as find returns -1
if the substring is not found, and -1 is a valid index value in Python. We
might get incorrect results if we use find(). Let us understand this with
the help of an example. Suppose we want everything from the beginning of
the string s to the first occurrence of the substring xy.
>>> s2 = s[: s.index('xy')]
ValueError: substring not found
The substring 'xy' is not present in s, so we get this error. Now, instead of
index, let us use find and see.
>>> s2 = s[: s.find('xy')]
This does not give any error. Let us see what is s2.
>>> s2
'Dev; 22; male; graduate; Bareill'
The find method returned -1 since the substring was not present. So, the
slice became s[:-1], which is the whole string except its last character.
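If you still prefer find, check its result before slicing. Here is a small sketch of that idea; the function name text_before is hypothetical and chosen only for this illustration.

```python
record = 'Dev; 22; male; graduate; Bareilly'

# Hypothetical helper: return the part of text before the first sub,
# or None when sub does not occur at all.
def text_before(text, sub):
    pos = text.find(sub)
    if pos == -1:          # test for "not found" before using pos in a slice
        return None
    return text[:pos]

print(text_before(record, ';'))    # Dev
print(text_before(record, 'xy'))   # None
```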
Now, let us try the replace method. Again, we take this multiline string s.
>>> s = '''Focus on present, not on past or future
... Focus on yourself, not on others
... Focus on the process, not on outcome
... Focus on what you can control, not on what you
can't control'''
>>> s2 = s.replace('Focus', 'Concentrate')
>>> print(s2)
Concentrate on present, not on past or future
Concentrate on yourself, not on others
Concentrate on the process, not on outcome
Concentrate on what you can control, not on what
you can't control
All the occurrences of 'Focus' are replaced with 'Concentrate'.
>>> s2 = s.replace('Focus', 'Concentrate', 3)
>>> print(s2)
Concentrate on present, not on past or future
Concentrate on yourself, not on others
Concentrate on the process, not on outcome
Focus on what you can control, not on what you
can't control
Now, only the first three occurrences are replaced. By replacing it with an
empty string, we can delete characters from the string.
>>> s2 = s.replace('not', '')
>>> print(s2)
Focus on present,  on past or future
Focus on yourself,  on others
Focus on the process,  on outcome
Focus on what you can control,  on what you can't
control
All occurrences of substring 'not' were removed. As a result of removal,
we get double spaces in many places. We want only one space in those
places. For this, we can replace double spaces with a single space by making
one more call to replace method.
>>> s2 = s.replace('not', '').replace('  ', ' ')
>>> print(s2)
Focus on present, on past or future
Focus on yourself, on others
Focus on the process, on outcome
Focus on what you can control, on what you can't
control
This chained call works because the replace method returns a string
object.
3.14 Chaining method calls
Most string methods return a string object, so you can apply multiple
methods to a string to get the desired result. We saw this while using the
rstrip method and the replace method. Here is one more example:
>>> s = ' hello '
>>> s = s.strip().upper().center(20, '*')
>>> s
'*******HELLO********'
The methods are executed from left to right, one at a time. In this example,
the method strip is called on the string s, then the method upper is
called on the string returned by strip and the method center is called on
the string returned by the method upper. The string object returned by
center is assigned to s. The order of the methods matters; the output might
change if the order is changed.
>>> s = '    hello    '
>>> s = s.center(20, '*').upper().strip()
>>> s
'***    HELLO    ****'
3.15 String comparison
The operators is and is not are used to compare the identity of strings
(and other objects). They check whether the two strings occupy the same
space in memory.
The comparison operators ==, !=, <, >, <= and >= are used to compare
strings. As usual, they return a Boolean value True or False. Two strings are
considered equal if their content is exactly the same.
>>> s1 = 'Python'
>>> s2 = 'Python'
>>> s1 == s2
True
>>> s1 != s2
False
The comparisons performed by the comparison operators are case-sensitive.
For example, 'Python' and 'python' will not be considered equal. To
ignore case and perform case-insensitive comparisons, you can convert both
strings to either lowercase or uppercase by using the upper and lower
methods, as we discussed in section 3.9.
>>> s1 = 'Python'
>>> s2 = 'python'
>>> s1 == s2
False
>>> s1.lower() == s2.lower()
True
>>> s1.upper() == s2.upper()
True
The casefold() method can also be used for caseless matching of the
strings, as it returns a casefolded copy of the string. This method will work
properly even if your string contains Unicode characters.
>>> s1.casefold() == s2.casefold()
True
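The difference between lower and casefold shows up with characters such as the German letter ß. The sample words below are our own and serve only as an illustration.

```python
a = 'Straße'
b = 'STRASSE'

same_with_lower = a.lower() == b.lower()           # 'straße' vs 'strasse'
same_with_casefold = a.casefold() == b.casefold()  # 'strasse' vs 'strasse'

print(same_with_lower)      # False
print(same_with_casefold)   # True
```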
The comparison operators compare the individual characters according to
the ASCII or Unicode value (code point). Lowercase letters are considered
larger than the corresponding uppercase letters as the lowercase letters have
a bigger code point than the uppercase ones.
>>> ord('P')
80
>>> ord('p')
112
>>> 'Python' < 'python'
True
When the string contains all lowercase or all uppercase letters, the
comparison is done in regular alphabetic order as in a dictionary.
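The following short sketch, using sample words of our own, shows how the character-by-character comparison behaves in a few typical cases.

```python
# The first differing character decides the result.
alphabetical = 'apple' < 'banana'    # 'a' (97) is smaller than 'b' (98)

# Every uppercase letter has a smaller code point than any lowercase letter.
mixed_case = 'Zebra' < 'apple'       # ord('Z') is 90, ord('a') is 97

# When one string is a prefix of the other, the shorter string is smaller.
prefix = 'app' < 'apple'

print(alphabetical, mixed_case, prefix)   # True True True
```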
3.16 String conversions
A type can be converted to another type using the type name as a function if
the conversion is supported. Suppose you have a string that represents a
number.
>>> s = '23'
The type of this variable s is str, so you cannot perform any arithmetic
operation supported by int type.
>>> s + 1
TypeError: can only concatenate str (not "int") to
str
However, you can perform operations by converting s to int or float.
>>> x = int(s)
>>> x + 1
24
>>> float(s) / 2
11.5
>>> s = 'UP05788'
>>> n = int(s[2:])
>>> n + 1
5789
You can similarly convert strings to other types like list or set. We will
see these types in the coming chapters. The conversion to int is a bit
different from others as it can take a second argument also (we have
discussed this in the previous chapter).
The type name str can be used as a function to create string objects. If the
argument you send is already a string, the function str returns a string
with the same content. If the argument is a non-string type, it returns a
string object that represents the string form of the argument, provided the
argument is convertible to a string.
If we try to concatenate a string with a number, a TypeError is raised. The
number must be converted to a string by using the str function.
>>> s1 = 'UP05'
>>> n = 2456
>>> s1 + n
TypeError: can only concatenate str (not "int") to
str
>>> s1 + str(n)
'UP052456'
The functions bin, oct, and hex can also convert a number to a string in
an appropriate base.
>>> bin(100)
'0b1100100'
>>> oct(100)
'0o144'
>>> hex(100)
'0x64'
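These strings can be converted back to numbers with int by passing the matching base as the second argument. The sketch below also uses the built-in format function to get the digits without the prefix.

```python
n = 100

# int accepts the 0b, 0o and 0x prefixes when the base matches.
from_bin = int(bin(n), 2)
from_oct = int(oct(n), 8)
from_hex = int(hex(n), 16)
print(from_bin, from_oct, from_hex)   # 100 100 100

# format gives the same digits without the prefix.
print(format(n, 'b'))   # 1100100
print(format(n, 'o'))   # 144
print(format(n, 'x'))   # 64
```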
3.17 Escape Sequences
Inside a string, the backslash (\) is considered an escape character. It is used
to indicate that the following character has special meaning, so it should not
be treated in the regular way. We have already seen how to include a newline
and a tab using the character combinations '\n' and '\t'. These
character combinations are examples of escape sequences. Each escape
sequence, such as '\n' or '\t', is considered a single character. Here is a
list of more escape sequences:
\<newline> Backslash and newline ignored
\' Single Quote
\" Double Quote
\\ Backslash character(\)
\n New Line
\t Horizontal Tab
\v Vertical Tab
\b Backspace
\r Carriage Return
\f Form Feed
\a Bell
\ooo Character with octal value ooo
\xhh Character with hex value hh
\N{name} Character named name in the Unicode database
\uxxxx Unicode character with a 16-bit hex value xxxx
\Uxxxxxxxx Unicode character with a 32-bit hex value xxxxxxxx
Table 3.8: Escape sequences
Escape sequences are special character representations that are represented
by a combination of characters where the first character is a backslash,
followed by one or more characters. When they appear inside a string, they
are replaced by the single character that they represent. Escape sequences let
us embed special non-printing characters (that cannot be typed on a
keyboard) in a string. They also resolve ambiguity, such as printing a single
quote inside a single quoted string.
Let us use these escape sequences in our strings. We know that '\n'
represents a newline character, and when it is written inside a string, it will
start a new line on the screen.
>>> print('How\nare\nyou')
How
are
you
Here we are printing a string that contains the escape sequence '\n'. We
can see that each '\n' is replaced with a newline character; it is printed in
the form of a newline. So, you can print the text inside a single string in
multiple lines. Let us see the length of this string.
>>> len('How\nare\nyou')
11
The escape sequence '\n' is counted as just one character, so we have
3+1+3+1+3, which is 11. If we use the escape sequence '\t', then it is
replaced by a tab character which provides space between 2 values.
>>> print('How\tare\nyou')
How are
you
An escape sequence is called so as it escapes the usual meaning of a letter or
character (like n in '\n') and gives it a whole new meaning.
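As a quick check of the table above, the following sketch confirms that each escape sequence stands for exactly one character, and that the numeric forms are just other ways of writing the same character.

```python
tab_length = len('a\tb')         # a, tab, b
backslash_length = len('\\')     # two typed characters, one stored character

hex_form = '\x41'                # hex 41 is 65
octal_form = '\101'              # octal 101 is also 65
unicode_form = '\u0041'          # 16-bit code point form
named_form = '\N{DIGIT ONE}'     # lookup by Unicode name

print(tab_length, backslash_length)         # 3 1
print(hex_form, octal_form, unicode_form)   # A A A
print(named_form)                           # 1
```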
When Python does not recognize the character after a backslash as an escape
code, it just keeps the backslash literally in the string. For example:
>>> print('H\el\lo')
H\el\lo
Here, e and l are not escape codes, so each backslash is kept in the string
as an ordinary character. The replacement is done only when the backslash is
followed by a valid escape code. Recent Python versions warn about such
unrecognized escape sequences (a SyntaxWarning from Python 3.12 onwards), so
it is better not to rely on this behavior.
Now, suppose we want to print or use a string that contains some Windows
Path.
>>> print('C:\textfiles\newFile')
C: extfiles
ewFile
Both '\t' and '\n' are recognized as escape sequences. So,
they are replaced by their respective characters. However, we do not want
this replacement to be done in this case. We want to print the backslash
literally, even when followed by an escape code. To print a literal backslash
character, you must use double backslashes.
>>> print('C:\\textfiles\\newFile')
C:\textfiles\newFile
Now, the backslashes are printed literally. We could also use raw strings, as
we will discuss shortly.
If we try to print a string containing a single quote and enclosed inside single
quotes, we will get a syntax error.
>>> print('Don't run')
SyntaxError: unterminated string literal
One solution to this problem is to enclose the whole string inside double
quotes instead of single quotes. Another solution is to use an escape
sequence.
>>> print('Don\'t run')
Don't run
Here, the interpreter sees that the single quote is preceded by a backslash, so
it will print a single quote; it will not use this single quote to end the string.
This way, you can insert a single quote inside a string enclosed in single
quotes, and similarly, you can insert a double quote inside a double-quoted
string.
3.18 Raw string literals
If you want to turn off the backslash escape mechanism in a string, you can
precede the string literal with the letter r. These are called raw strings. They
treat backslash as a literal character and not as an escape character. Every
character inside a raw string stays the way it is written inside the string. Here
are some examples:
>>> s = r'hello\n'
>>> print(s)
hello\n
Raw strings can be helpful when you have strings that contain many
backslashes like Windows path and regular expressions.
>>> print(r'C:\Deepali\newFiles')
C:\Deepali\newFiles
Here, '\n' is not considered an escape sequence. Since the string is
preceded by r, it is a raw string. The interpreter considers the backslash as a
normal character of the string and not as a start of an escape sequence. If we
remove r, then '\n' is considered an escape sequence.
>>> print('C:\Deepali\newFiles')
C:\Deepali
ewFiles
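A raw string and a string written with doubled backslashes produce the same string object contents. The path below is only sample data.

```python
raw = r'C:\textfiles\newFile'
escaped = 'C:\\textfiles\\newFile'

print(raw == escaped)    # True, both hold the same 20 characters
print(len(raw))          # 20

# In a raw string, \n is two characters: a backslash and the letter n.
print(len(r'\n'), len('\n'))   # 2 1
```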
3.19 String formatting
We have the following 3 variables of type str, int, and float.
>>> name = 'Raj'
>>> age = 23
>>> wt = 43.567
We know that we can create a string by concatenating string literals and
variables.
>>> s = 'My name is ' + name + ', I am ' + str(age)
+ ' years old and my weight is ' + str(wt) + ' kg'
>>> s
'My name is Raj, I am 23 years old and my weight is
43.567 kg'
We had to use the conversion functions to convert non-string variables to
string, and using the + operator many times was not very readable. Another
way is to use the print function, in which you can send the strings and
variables separated by commas.
>>> print('My name is', name, ', I am', age, 'years
old and my weight is', wt, 'kg')
My name is Raj , I am 23 years old and my weight is
43.567 kg
Here, we have all the string literals and variables separated by commas. Till
now, we have been using these simple approaches for displaying our data,
but these approaches were not very readable. Python has different formatting
styles that we can use to do more value formatting and display the output in
an organized way.
We need to format strings to present data in a better way. This is required
when data is to be displayed to the program's user in a readable and
understandable manner. In the following image, you can clearly see the
difference between the data displayed without any formatting and after
formatting.
Figure 3.4: Unformatted and formatted data
String formatting also allows us to interpolate values of variables into
strings, which means that we can insert values inside strings using different
formats.
There are three ways of formatting strings in Python. There is no need to
learn all of them, but knowing them is good as you might encounter them in
someone else's code. The first is the old-style formatting, which uses the %
operator like C language. This style is still supported, but the two newer
styles are generally preferred in new code.
>>> name = 'Raj'
>>> age = 23
>>> wt = 47.5
>>> s = 'My name is %s, I am %d years old and my
weight is %f kg' % (name, age, wt)
>>> s
'My name is Raj, I am 23 years old and my weight is
47.500000 kg'
A newer style uses the format method of the string class. It was introduced
in Python 3.0 and was also backported to Python 2.6.
>>> name = 'Raj'
>>> age = 23
>>> wt = 47.5
>>> s = 'My name is {}, I am {} years old and my
weight is {} kg'.format(name, age, wt)
>>> print(s)
My name is Raj, I am 23 years old and my weight is
47.5 kg
The curly braces act as placeholders for the data, and the values are sent as
arguments to the format method.
In Python 3.6, a new formatting approach was introduced that used
formatted string literals, also called f-strings.
>>> name = 'Raj'
>>> age = 23
>>> wt = 47.5
>>> s = f'My name is {name}, I am {age} years old
and my weight is {wt} kg'
>>> print(s)
My name is Raj, I am 23 years old and my weight is
47.5 kg
Using these f-string literals, you can embed Python expressions inside a
string literal using curly braces. They are called f-strings because you get a
formatted string literal by prefixing a string with the letter f.
So, when we have a string literal prefixed with f, any variable inside curly
braces is substituted with its value. You can see that this style is much clearer
than the previous two. It is the simplest one because you can directly insert
the names inside the string literal. In this book, we will mostly use the f-
string formatting. You might encounter the format method style in some
other code, so it is discussed in the next section. In the rest of this section,
we will discuss f-strings.
Using f-strings, you can simply write your string; whenever you want to
substitute the value of a variable, just put it inside curly braces. You can even
write Python expressions inside curly braces or call functions and methods
directly.
>>> name = 'Raj'
>>> age = 23
>>> wt = 47.567
>>> f'After 10 years {name.upper()} will be {age +
10} years old'
'After 10 years RAJ will be 33 years old'
We have called the str method upper and used the expression age +
10. Curly braces are used to hold the variables or expressions; they are not
displayed. If you want to print left and right curly braces, double them up.
>>> f'He is {{ {name}, {age} }}'
'He is { Raj, 23 }'
The double curly braces are displayed as a single curly brace.
You can specify a field width where the given value will be displayed.
>>> f'His name is {name:8} and he is {age:6} years
old'
'His name is Raj      and he is     23 years old'
The numbers 8 and 6 represent the field width, so the variable name is
displayed in a width of 8, and age is displayed in a field width of 6. By
default, the text is left-aligned, and numbers are right-aligned in their field.
We can force left alignment by using less than sign <. Similarly, the right
alignment can be forced using the greater than sign > and center alignment
by caret ^ sign.
>>> f'His name is {name:>8} and he is {age:<6}
years old'
'His name is      Raj and he is 23     years old'
Now name is right-aligned, and age is left-aligned.
>>> f'His name is {name:^8} and he is {age:^6}
years old'
'His name is   Raj    and he is   23   years old'
Now, both name and age are center-aligned in their fields.
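Padding spaces are hard to see in the output, so in the following sketch each field is wrapped in square brackets. The brackets are our own addition and are not part of the format specification.

```python
name = 'Raj'
age = 23

name_default = f'[{name:8}]'   # strings are left-aligned by default
name_right = f'[{name:>8}]'
name_center = f'[{name:^8}]'
age_default = f'[{age:6}]'     # numbers are right-aligned by default
age_left = f'[{age:<6}]'
age_center = f'[{age:^6}]'

print(name_default, name_right, name_center)  # [Raj     ] [     Raj] [  Raj   ]
print(age_default, age_left, age_center)      # [    23] [23    ] [  23  ]
```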
To print an integer in a fixed point format, write :f.
>>> f'Age is {age:f} and weight is {wt}'
'Age is 23.000000 and weight is 47.567'
The variable age is an integer, but since we have included :f, it is printed
with a point. We can also control the number of digits that are displayed.
>>> f'Age is {age:.3f} and weight is {wt}'
'Age is 23.000 and weight is 47.567'
Now, only three decimal digits are displayed. We can also specify the width.
>>> f'Age is {age:<10.3f} and weight is {wt}'
'Age is 23.000     and weight is 47.567'
The number 10 is the field width, and the less than symbol is for left
justification. Now, let us format the float value wt.
>>> f'Age is {age:<10.3f} and weight is {wt:.3}'
'Age is 23.000     and weight is 47.6'
We have specified a colon, a dot, and the number 3. Without the letter f,
this number represents the total number of significant digits displayed. So,
we can see that a total of three digits are displayed. Let us specify a width
for it.
>>> f'Age is {age:<10.3f} and weight is {wt:8.3}'
'Age is 23.000     and weight is     47.6'
Now, eight spaces are reserved to display this value. If you want to control
the number of digits displayed after the decimal, use the letter f.
>>> f'Age is {age:<10.3f} and weight is {wt:8.3f}'
'Age is 23.000     and weight is   47.567'
The number 3 represents the number of digits displayed after the decimal.
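The two meanings of the number after the dot are easy to mix up, so here is a side-by-side sketch. The square brackets are added only to make the field width visible.

```python
wt = 47.567
age = 23

total_digits = f'[{wt:.3}]'      # no f: three digits in total
after_point = f'[{wt:.3f}]'      # with f: three digits after the decimal point
with_width = f'[{wt:8.1f}]'      # width 8, one digit after the decimal point
int_as_fixed = f'[{age:.3f}]'    # an integer shown in fixed point format

print(total_digits)   # [47.6]
print(after_point)    # [47.567]
print(with_width)     # [    47.6]
print(int_as_fixed)   # [23.000]
```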
By default, your output fields will be padded using spaces; if you want a
different character to be used for padding, you can place it just after the
colon, before the alignment specifier. This character fills the unused
positions when the data is narrower than the assigned field width. It is
called the fill character, which can be any character except '{' or '}'.
>>> f'My name is {name:*^10} and age is {age:->12}'
'My name is ***Raj**** and age is ----------23'
The variable name is center-aligned in a field width of 10, while the asterisk
is a fill character. The variable age is right-aligned in a field width of 12,
and the dash is a fill character. The fill character must be specified before the
alignment specifier, and if you want to specify a fill character, it is necessary
to specify an alignment specifier. We know that numbers are right-justified
by default, but we have still specified the right alignment specifier because
we wanted padding done by dashes instead of spaces.
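Here are a few more combinations of fill character and alignment. The variable title is a sample value introduced only for this sketch.

```python
name = 'Raj'
age = 23
title = 'Menu'

left_fill = f'{name:*<8}'     # fill on the right of left-aligned text
right_fill = f'{name:*>8}'    # fill on the left of right-aligned text
zero_fill = f'{age:0>6}'      # 0 used as an ordinary fill character
banner = f'{title:-^12}'      # centered text between dashes

print(left_fill)    # Raj*****
print(right_fill)   # *****Raj
print(zero_fill)    # 000023
print(banner)       # ----Menu----
```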
Escape sequences are interpreted as usual inside f-strings also. If you want to
suppress the escape mechanism, you can write raw f strings.
>>> print(fr'\name: {name}')
\name: Raj
This \n is not considered an escape sequence here.
We can write triple-quoted f-strings that span multiple lines.
>>> s = f'''My name is {name}, I am {age} years old
... and my weight is {wt} kg'''
>>> s
'My name is Raj, I am 23 years old \nand my weight
is 47.567 kg'
>>> print(s)
My name is Raj, I am 23 years old
and my weight is 47.567 kg
An integer can be displayed in hexadecimal, octal, or binary base.
>>> num = 1247
>>> f'{num:x} {num:o} {num:b}'
'4df 2337 10011011111'
We can use lowercase e or uppercase E to display a number in exponential
notation.
>>> num1 = 0.00000082478
>>> num2 = 3345600000000
>>> f'{num1:e} {num2:e} {num1:E} {num2:E}'
'8.247800e-07 3.345600e+12 8.247800E-07
3.345600E+12'
If we have a big number and want to print the thousands separator, we can
write a comma after the colon.
>>> f'{num2:,}'
'3,345,600,000,000'
Many times, in our programs, we need to display the value of variables and
expressions with their names.
>>> name = 'Raj'
>>> age = 23
>>> print(f'name = {name}, age = {age}')
name = Raj, age = 23
>>> a = 14
>>> b = 12
>>> print(f'a + b = {a + b}, a - b = {a - b}')
a + b = 26, a - b = 2
>>> print(f'min(a,b) = {min(a,b)}, max(a,b) =
{max(a,b)}')
min(a,b) = 12, max(a,b) = 14
Instead of duplicating the name of the thing to be printed, we can specify it
once with an equal to sign, inside the curly braces. This feature is
available from Python 3.8 onwards.
>>> print(f'{name = }, {age = }')
name = 'Raj', age = 23
>>> print(f'{a + b = }, {a - b = }')
a + b = 26, a - b = 2
>>> print(f'{min(a,b) = }, {max(a,b) = }')
min(a,b) = 12, max(a,b) = 14
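The equal to sign can be combined with a format specification, which is written after it. This is a small sketch that assumes Python 3.8 or later.

```python
name = 'Raj'
wt = 47.567

compact = f'{name=}'          # no spaces around =, so none appear in the output
rounded = f'{wt = :.1f}'      # the format specification follows the = sign

print(compact)   # name='Raj'
print(rounded)   # wt = 47.6
```

When a format specification is given, the value is formatted with it instead of being shown in its repr form.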
3.20 String formatting using the format() method of string class
f-strings were introduced in Python 3.6. If you are using an older version,
you have to use the format method to format strings.
>>> name = 'Raj'
>>> age = 23
>>> wt = 47.567
>>> s = 'My name is {}, I am {} years old and my
weight is {} kg'.format(name, age, wt)
>>> s
'My name is Raj, I am 23 years old and my weight is
47.567 kg'
When the curly braces are empty, the interpreter will substitute based on the
order of arguments sent in the format method. In the above example, the
first pair of curly braces are replaced with name, the second pair with age,
and the third pair with wt.
We can use index numbers inside curly braces to decide what goes where
while substituting values inside the string.
>>> s = 'My name is {0}, I am {1} years old and my
weight is {2} kg'.format(name, age, wt)
>>> s
'My name is Raj, I am 23 years old and my weight is
47.567 kg'
The value 0 refers to the first argument, 1 refers to the second argument, and
2 refers to the third argument. This way, you can change the order of the
variables and use a data value even more than once.
>>> s = 'Age {1} years, Name {0}, weight {2} kg,
bye from {0}'.format(name, age, wt)
>>> s
'Age 23 years, Name Raj, weight 47.567 kg, bye from
Raj'
In addition to positional arguments, we can send keyword arguments also.
These keyword arguments are referred to by their names inside the curly
braces.
>>> s = '{msg}, my name is {n}, I am {a} years
old'.format(n=name, a=age, msg='Hello')
>>> s
'Hello, my name is Raj, I am 23 years old'
We can mix both positional and keyword arguments in the same string.
>>> s = '{msg}, I am {1} years old and my weight is
{0} kg'.format(wt, age, msg='Hello')
>>> s
'Hello, I am 23 years old and my weight is 47.567
kg'
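Since the string with placeholders is an ordinary string, it can be stored in a variable and filled in more than once. The following sketch shows this; the name template and the sample values are our own.

```python
# One template string, reused with different arguments.
template = '{greeting}, my name is {0} and I am {1} years old'

first = template.format('Raj', 23, greeting='Hello')
second = template.format('Dev', 22, greeting='Hi')

print(first)    # Hello, my name is Raj and I am 23 years old
print(second)   # Hi, my name is Dev and I am 22 years old
```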
We can use the conversion codes s, d, or f: the code s displays the value as
a string, d displays it as a decimal integer (base 10), and f displays it as
a float with decimal places. When using the f conversion, you can limit the
number of digits displayed after the decimal point by adding a dot followed
by the number of digits you want displayed.
>>> num1 = 123
>>> num2 = 345.43678
>>> print('number1 is {:.2f}'.format(num1))
number1 is 123.00
>>> print('number2 is {:.2f}'.format(num2))
number2 is 345.44
The float value will be rounded off if it has more decimal places than the
number of places we want to display.
You can use 0 if you do not want any decimal places to be displayed.
>>> print('number2 is {:.0f}'.format(num2))
number2 is 345
You can specify a width in which a given value is displayed.
>>> name = 'Raj'
>>> age = 23
>>> print('My name is {:8} and I am {:6} years
old'.format(name,age))
My name is Raj      and I am     23 years old
By default, strings are left-justified in their width, and numbers are right-
justified. To change the justification, you can use symbols <, > or ^.
< for left justification
> for right justification
^ for center justification
>>> print('My name is {:^8} and I am {:<6} years
old'.format(name, age))
My name is   Raj    and I am 23     years old
In the following example, a total of four digits of number are displayed in a
width of 10.
>>> number = 78.386367
>>> print('number is {:10.4}'.format(number))
number is      78.39
In the next example, number is displayed in a width of 10 with four
decimal places.
>>> print('number is {:10.4f}'.format(number))
number is    78.3864
If you want, you can specify a fill character for padding within the given
field. By default, this padding is done with spaces. The alignment specifier
should be provided to specify a padding character.
>>> print('My name is {:*^8} and age is
{:.>6}'.format(name, age))
My name is **Raj*** and age is ....23
If there is a sign character, padding is done after that. A 0 preceding the
width performs zero padding.
>>> print('My name is {:*^8} and age is
{:>06}'.format(name, age))
My name is **Raj*** and age is 000023
You can provide a sign for numeric values.
+ Positive numbers have a + sign, and negative numbers have a - sign
- Only negative numbers have a minus sign (this is the default)
<space> Positive numbers are preceded by a space, and negative numbers by a - sign
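The three sign options can be compared with the following sketch. The square brackets are added only to make the leading space visible.

```python
pos, neg = 23, -23

plus = '[{:+}] [{:+}]'.format(pos, neg)     # sign shown for both numbers
minus = '[{:-}] [{:-}]'.format(pos, neg)    # sign shown only for the negative one
space = '[{: }] [{: }]'.format(pos, neg)    # space in place of the + sign
zero = '[{:+06}]'.format(pos)               # zeros are placed after the sign

print(plus)    # [+23] [-23]
print(minus)   # [23] [-23]
print(space)   # [ 23] [-23]
print(zero)    # [+00023]
```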
To specify an output type, you can use any of the following characters.
String - s
Integers - b for binary, d for decimal base 10 notation, x or X for
hexadecimal, o for octal notation
Floating point - e or E for exponential notation, f for fixed point notation
>>> num = 246
>>> print('{:x}'.format(num))
f6
>>> print('{:X}'.format(num))
F6
>>> print('{:o}'.format(num))
366
>>> print('{:b}'.format(num))
11110110
>>> num1 = 0.000000000412
>>> num2 = 124300000000000
>>> print('{:e}'.format(num1))
4.120000e-10
>>> print('{:e}'.format(num2))
1.243000e+14
You can display your numeric data with a comma as the thousands separator.
>>> print('{:,}'.format(num2))
124,300,000,000,000
3.21 Representation of text - character encodings
For beginners in programming, this might seem like a complicated topic. If
you have mastered the string processing and formatting concepts presented
in the chapter, you can skip this part and move on to the next chapter without
losing continuity. However, understanding encodings is important when
sending or receiving data over the internet and dealing with text that includes
symbols, emojis, or different languages like Hindi, Russian, or Korean. This
section will give you a basic understanding of encodings and how computers
handle text as binary data. You can always come back to it later, but make
sure to read it before you dive into the chapter on working with files. If you
are curious about how computers handle text, you might find this section
interesting. Before reading this section, it will be good to have a basic idea
of the binary, decimal, and hexadecimal number systems.
Computers understand only 0s and 1s, so all forms of data, whether
numbers, text, or pictures, are represented and stored in binary form inside
the computer. Textual data is a sequence of characters like letters, digits,
symbols, punctuation marks, etc. Humans understand these characters, but
for a computer, each character is a number represented in binary form. So, to
represent different characters on a computer, each one must be assigned a
unique number. These numbers can be represented and stored as a sequence
of bits (0s and 1s) inside the computer.
Since data has to be transferred between computers, it is essential that
different computers use the same numeric codes for characters. This helps
ensure that text displayed or processed on one system can be correctly
understood and rendered on another. Thus, for effective communication
between devices, there needs to be a uniform and universal way of encoding
characters. To achieve this, the American Standard Code for Information
Interchange (ASCII) was introduced in the 1960s. This standard defines
numeric codes for 128 unique characters. It uses integers from 0 to 127 to
represent different characters like uppercase letters, lowercase letters, digits,
punctuation symbols, spacing characters, and other non-printing control
characters. For example, the ASCII code for uppercase A is 65 (hex 0x41),
for lowercase a is 97 (hex 0x61), and for digit 1 is 49 (hex 0x31).
In ASCII, each character translates to an integer from 0 to 127. These 128
numbers can be represented by using 7 bits: 0000000 (0) to 1111111 (127).
Thus, ASCII is a 7-bit encoding.
The basic storage unit of a computer is a byte, which is a group of 8 bits.
With 8 bits, 256 (2^8) unique characters can be represented: 00000000 (0) to
11111111 (255). The 8th bit is not utilized while using ASCII coding. If that
bit is also used, 128 more characters could be represented. This resulted in
different inconsistent encodings, which used the remaining 128 numbers
(128 to 255) in different ways. Different countries and organizations started
using these spare 128 numbers to represent their own language symbols.
ASCII was a universal standard, but these new encodings clashed and were
not standardized.
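The clash between such 8-bit encodings can be seen in Python by decoding the same byte in two different ways. This is only a sketch; the two codecs chosen here, latin-1 (Western European) and cp1251 (Cyrillic), are our own examples of encodings that use the numbers 128 to 255 differently. It uses the bytes type and the decode method, which turn raw bytes into text.

```python
import unicodedata

raw = bytes([0xE9])                 # a single byte with the value 233

western = raw.decode('latin-1')     # interpreted as a Western European letter
cyrillic = raw.decode('cp1251')     # interpreted as a Cyrillic letter

print(unicodedata.name(western))    # LATIN SMALL LETTER E WITH ACUTE
print(unicodedata.name(cyrillic))   # CYRILLIC SMALL LETTER SHORT I

# Bytes in the ASCII range (0 to 127) mean the same thing in both encodings.
print(b'A'.decode('latin-1') == b'A'.decode('cp1251'))   # True
```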
Thus, there was a need for a universal coding standard that could
accommodate characters from different scripts and languages used in the
world. This led to the development of Unicode in the 1990s. It is
maintained by the Unicode Consortium, and version 15.1, the latest at the
time of writing, contains a total of 149,813 characters, which include
symbols from different
languages of the world and even emojis. The Unicode specifications are
continually updated to add new characters.
Each Unicode character is given a unique name and identification number
called a code point. The Unicode code points are usually written in
hexadecimal notation (4 to 6 hex characters) preceded by U+. For example,
the code point for character A is written as U+0041, and its name is LATIN
CAPITAL LETTER A; the code point for digit 1 is U+0031, and its name is
DIGIT ONE. The hexadecimal number system is used for code points as it
provides a compact representation of large numbers and a more human-
friendly representation of binary data.
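Python can show the code point and name of a character. The sketch below uses the built-in functions ord and chr and the standard library module unicodedata.

```python
import unicodedata

ch = 'A'
code_point = ord(ch)                    # code point as an integer
u_notation = f'U+{code_point:04X}'      # the usual U+ notation
char_name = unicodedata.name(ch)        # official Unicode name

print(code_point)   # 65
print(u_notation)   # U+0041
print(char_name)    # LATIN CAPITAL LETTER A

# Going the other way: from a code point or a name to the character.
print(chr(0x31))                          # 1
print(unicodedata.lookup('DIGIT ONE'))    # 1
```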
The Unicode standard contains many tables that list characters and their
corresponding code points and names. The first 128 characters of the
Unicode standard are the same as in the ASCII table, so ASCII is a subset of
Unicode. You can get the Unicode symbols, their names, and code points
from the Unicode website or the charmap utility in Windows. Note that the
Unicode names are not case-sensitive.
Unicode is a text encoding standard like ASCII; both define unique numbers
for different characters. They do not specify anything about the
implementation, i.e., how these unique numbers should be stored in memory
or transmitted over the network. Implementation of ASCII characters is
simple as they are small in number (only 128), so each character can fit in a
single byte. However, Unicode characters are large in number; thus, a single
byte is not sufficient to represent each Unicode character. There could be
different ways to represent a Unicode character as binary data. Thus, to
represent Unicode characters as bit patterns, different Unicode encoding
schemes are used. Unicode standard specifies the code points for various
characters, while these schemes provide the format for representing a
character in one or more bytes. These schemes specify how a Unicode
character will be represented in memory, files, or during data transmission.
Some schemes are fixed length schemes while others are variable length.
Fixed length schemes use the same number of bytes to represent each
character, while variable length encodings represent different characters with
different numbers of bytes.
The Unicode standard is implemented by different encoding schemes like UTF-8,
UTF-16, and UTF-32. The scheme UTF-32 is a fixed length encoding
scheme that uses four bytes to represent each Unicode character. This
encoding is not efficient in terms of space as characters that could be
represented in one or two bytes also occupy four bytes. This encoding
wastes a lot of space for representing common characters and thus is rarely
used. UTF-16 and UTF-8 are variable-length encoding schemes, and from
these, UTF-8 (Unicode Transformation Format, 8-bit) is more widely used. It is
supported by most programming languages, websites, and operating
systems.
UTF-8 is a variable-length encoding that uses one to four bytes to represent
each Unicode character, depending on the character's code point value. The
first 128 code points are represented with a single byte per character, which
means that the ASCII characters are encoded in the same way in UTF-8,
making it compatible with existing ASCII text. Since UTF-8 is backward
compatible with ASCII, using UTF-8 will not break any software based on
ASCII. Any valid ASCII text is also valid UTF-8 text.
For other non-ASCII characters, UTF-8 uses two, three, or four bytes per
character. Thus, storing ASCII text is efficient since only one byte per
character is taken. Less commonly used characters are represented using
three or four bytes. UTF-8 is popular because it is compatible with ASCII
and requires less space for English text and other Western languages.
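The ASCII compatibility described above can be checked with a short sketch. It relies on the encode method, which is introduced later in this section, and the variable names are only illustrative.

```python
# ASCII-only text produces exactly the same bytes under ASCII and UTF-8.
text = 'Hello'
ascii_bytes = text.encode('ascii')
utf8_bytes = text.encode('utf-8')
print(ascii_bytes == utf8_bytes)   # True

# A non-ASCII character needs more than one byte in UTF-8.
symbol = '\u00A9'                  # the copyright sign
symbol_bytes = symbol.encode('utf-8')
print(len(symbol_bytes))           # 2
```

The reverse does not hold: the two bytes produced for the copyright sign are not valid ASCII, because both values are above 127.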
The str type in Python is a sequence of Unicode characters, so we can
include all characters listed in the Unicode standard in our Python strings. In
the following strings, we have some Unicode characters that are not in
ASCII. You can copy and paste them from somewhere if unavailable on your
keyboard.
>>> s = 'Hello World ☺'
>>> c = 'Copyright © '
>>> greeting = '🙏 नमस्कार मेरा नाम दीपाली है 🙏'
>>> message = 'ನಾನು ಬೆಂಗಳೂರನ್ನು ಪ್ರೀತಿಸುತ್ತೇನೆ 💚'
>>> bday_wish = '생일 축하해 신의 축복이 있기를 🎂💐'
The Unicode characters can also be placed inside string literals with the help
of escape sequences. We can insert a character by its code point by using the
escape sequences \xhh, \uxxxx, \Uxxxxxxxx. Smaller numbers can be
written using \x, and bigger ones using \u and \U. If you write smaller
numbers with \u and \U you must do the left padding with zeros.
Characters can also be included by their Unicode name if we use the escape
sequence \N{name}.
>>> '100\xA5'
'100¥'
>>> '\u2660\u2663\u2665\u2666'
'♠♣♥♦'
>>> '\N{Black Smiling Face} Hello World \N{White Smiling Face}'
'☻ Hello World ☺'
>>> '\U0001F929\U0001F607\U0001F60E\N{rolling on the floor laughing}'
'🤩😇😎🤣'
>>> '\xA9\u00A9\U000000A9\N{Copyright sign}'
'©©©©'
The module unicodedata contains a function named name that takes a
Unicode character and returns its Unicode name in uppercase, and the
function lookup that takes a case-insensitive name and returns a Unicode
character.
>>> import unicodedata
>>> unicodedata.name('♠')
'BLACK SPADE SUIT'
>>> unicodedata.lookup('black spade suit')
'♠'
To see the names of all the characters used in a string, we can write the
following loop. Do not worry about how the loop works. We will study the
details of loops in the coming chapters.
>>> import unicodedata
>>> s = 'नमस्ते Hello 🙏'
>>> for i in range(len(s)):
...     print(unicodedata.name(s[i]))
...
DEVANAGARI LETTER NA
DEVANAGARI LETTER MA
DEVANAGARI LETTER SA
DEVANAGARI SIGN VIRAMA
DEVANAGARI LETTER TA
DEVANAGARI VOWEL SIGN E
SPACE
LATIN CAPITAL LETTER H
LATIN SMALL LETTER E
LATIN SMALL LETTER L
LATIN SMALL LETTER L
LATIN SMALL LETTER O
SPACE
PERSON WITH FOLDED HANDS
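As a variation on the loop above, a string can also be looped over directly, without going through index values. The following is only a sketch; the list named names is an illustrative helper used to collect the results.

```python
import unicodedata

s = 'नमस्ते Hello 🙏'
names = []
for ch in s:                      # ch takes each character of s in turn
    names.append(unicodedata.name(ch))

for name in names:
    print(name)
```

This prints one Unicode name per character of the string, in the same order as the index-based loop.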
In Python 3, a string of type str is a sequence of Unicode characters. There
is no encoding scheme associated with the string. When the string is stored
in memory or disk or passed over a network, it is encoded using an encoding
scheme. The interpreter will do most things for us, and we do not have to
worry about encoding as long as we are doing regular string processing
operations on our computer. When we exchange data with other sources, we
need to be aware of the encoding schemes used by the source and our
system.
Python 3 uses the UTF-8 encoding scheme by default for source code. So,
unless stated otherwise, Python source files (.py files) are read as UTF-8.
You can use another encoding by inserting a comment of this form at the
beginning of your .py file.
# -*- coding: encoding-name -*-
# -*- coding: ascii -*-
# -*- coding: windows-1252 -*-
We can use the built-in functions ord and chr to convert a character to a
code point and vice versa. The function ord returns the Unicode code point
for a one-character string, and the function chr returns a Unicode string of
one character representing the Unicode code point provided to it. The ord
function will raise a TypeError if you send a string of length longer than
one.
>>> ord('A')
65
>>> ord('🙏')
128591
>>> hex(ord('🙏'))
'0x1f64f'
>>> chr(0x1f64f)
'🙏'
>>> chr(65)
'A'
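The two functions can be combined to write a code point in the U+ notation described earlier. This is a small sketch; the names ch, code_point, label and emoji_label are illustrative.

```python
ch = 'A'
code_point = ord(ch)
label = f'U+{code_point:04X}'     # at least four uppercase hex digits
print(label)                      # U+0041
print(chr(code_point) == ch)      # True, chr reverses ord

emoji_label = f'U+{ord(chr(0x1F64F)):04X}'
print(emoji_label)                # U+1F64F

# ord accepts only a string of exactly one character.
try:
    ord('AB')
except TypeError as err:
    print('TypeError:', err)
```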
We know that the str type strings are immutable sequences of Unicode
characters (or code points). Python also supports strings made up of raw
bytes. The type for these strings is bytes, and they are immutable
sequences of plain bytes or 8-bit values. 8-bit values can range from 0 to
255, so each element in a bytes string is an integer in the range 0 to 255.
The bytes type is used to manipulate raw binary data. You can write a
bytes literal like a str literal by enclosing it in single, double, or triple
quotes, but with the letter b prefixed before the opening quote.
>>> y = b'\x44\x35\xC8'
>>> type(y)
<class 'bytes'>
b'\x44\x35\xC8' is a bytes literal that contains three bytes that we
have specified with \x escape sequence in hexadecimal notation.
>>> y
b'D5\xc8'
When a bytes value is displayed, ASCII printable characters and escape
sequences like \n, \t are printed while other bytes are shown with
hexadecimal escape sequence \x. This is why, while displaying the above
bytes string, the ASCII-compatible characters D and 5 are represented as
characters while the last byte is displayed with an escape sequence. This is
the reason why str strings and bytes strings that contain only ASCII
characters will look similar when displayed using print or on an
interactive prompt.
We can also specify ASCII characters in a bytes literal, so we could write
the above literal as:
>>> y = b'D5\xc8'
The len built-in function, when used with bytes type, returns the number
of bytes contained.
>>> len(y)
3
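Since each element of a bytes string is an integer, indexing returns a number rather than a one-character string, while slicing returns another bytes string. A brief sketch, with illustrative variable names:

```python
y = b'D5\xc8'
first = y[0]        # indexing gives an integer
values = list(y)    # all the byte values as integers
part = y[0:2]       # slicing gives a bytes object

print(first)        # 68
print(values)       # [68, 53, 200]
print(part)         # b'D5'
```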
We can convert a regular str string to bytes string by calling the
encode() method on the string. To convert the encoded plain bytes to a
Unicode string of type str we can use the decode() method on a bytes
string. These methods take an encoding argument according to which they
will do the encoding or decoding.
>>> 'AS😄'.encode('utf-8') # bytes representation of string according to utf-8 encoding
b'AS\xf0\x9f\x98\x84'
>>> b'AS\xf0\x9f\x98\x84'.decode('utf-8') # converting encoded bytes back to Unicode string
'AS😄'
>>> 'AS😄'.encode('utf-32')
b'\xff\xfe\x00\x00A\x00\x00\x00S\x00\x00\x00\x04\xf6\x01\x00'
>>> b'\xff\xfe\x00\x00A\x00\x00\x00S\x00\x00\x00\x04\xf6\x01\x00'.decode('utf-32')
'AS😄'
If the encoding is not specified, the default encoding is utf-8, but it is better to
be explicit and always specify the encoding argument.
Attempting to encode a string that contains characters not specified in the
encoding results in a UnicodeEncodeError. For example, we cannot encode
'AS😄' using the ascii or latin-1 encoding as these encodings do not have
the '😄' character.
>>> 'AS😄'.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f604' in position 2: ordinal not in range(128)
>>> 'AS😄'.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\U0001f604' in position 2: ordinal not in range(256)
We can use a second argument to ignore the characters that cannot be
encoded or replace them with a question mark.
>>> 'AS😄'.encode('ascii', 'ignore')
b'AS'
>>> 'AS😄'.encode('ascii', 'replace')
b'AS?'
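Besides 'ignore' and 'replace', the second argument accepts a few other standard error handlers. The sketch below tries two of them, 'backslashreplace' and 'xmlcharrefreplace'; the emoji is written with an escape sequence so that the code does not depend on the keyboard or editor.

```python
text = 'AS\U0001F604'    # 'AS' followed by the smiling face emoji

# Replace the character with a Python-style escape sequence.
escaped = text.encode('ascii', 'backslashreplace')
print(escaped)           # b'AS\\U0001f604'

# Replace the character with an XML/HTML numeric character reference.
xml_ref = text.encode('ascii', 'xmlcharrefreplace')
print(xml_ref)           # b'AS&#128516;'
```

Unlike 'ignore' and 'replace', these two handlers keep enough information to recover the original character later.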
We have seen that the decode method returns a string by decoding the
bytes in the bytes string according to the specified encoding. The decoding
should be done using the same encoding scheme used to encode that data. If
not, you might get wrong, garbled text or UnicodeDecodeError. For
example, if the binary data we get from some source was encoded in UTF-16
and we try to decode it using UTF-8 or any other encoding, we will get an
error or sometimes wrong text.
>>> data = 'AS😄'
>>> binary_data = data.encode('utf-32')
>>> binary_data.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte
0xff in position 0: invalid start byte
>>> data = '±µ'
>>> binary_data = data.encode('utf-8')
>>> binary_data.decode('latin-1')
'Â±Âµ'
>>> data = 'εθ'
>>> binary_data = data.encode('utf-8')
>>> binary_data.decode('utf-16')
'뗎룎'
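The decode method also accepts an error handler as its second argument. The following sketch, which uses try and except to catch the error, shows the strict default next to the 'replace' handler; the names failed and lenient are illustrative.

```python
data = 'AS\U0001F604'
binary_data = data.encode('utf-32')

# Default behaviour: decoding with the wrong scheme raises an error.
failed = False
try:
    binary_data.decode('utf-8')
except UnicodeDecodeError as err:
    failed = True
    print('Decoding failed:', err.reason)

# With 'replace', undecodable bytes become U+FFFD (REPLACEMENT CHARACTER).
lenient = binary_data.decode('utf-8', 'replace')
print('\ufffd' in lenient)    # True
```

The lenient result is still garbled text; the handler only avoids the exception, so the correct encoding should be used whenever it is known.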
The built-in bytes() function can also be used to create a bytes object
from a str string according to the encoding specified.
>>> bytes('AS😄', 'utf-8')
b'AS\xf0\x9f\x98\x84'
The built-in len function, when used on str type strings, counts the
Unicode characters. It does not count bytes.
>>> len('µ')
1
'µ' is a string of 1 character irrespective of the number of bytes that will be
used to store it. The number of bytes will depend on the encoding scheme
used.
>>> len('µ'.encode('latin-1'))
1
>>> len('µ'.encode('utf-8'))
2
>>> len('µ'.encode('utf-16'))
4
>>> len('µ'.encode('utf-32'))
8
The len function, when used on a bytes string, returns the number of
bytes. The following examples show that UTF-8 is a variable length
encoding that uses different numbers of bytes for different characters.
>>> len('A'.encode('utf-8'))
1
>>> len('µ'.encode('utf-8'))
2
>>> len('₹'.encode('utf-8'))
3
>>> len('😄'.encode('utf-8'))
4
There is another type in Python called bytearray which is a mutable
variant of bytes.
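A minimal sketch of the difference between the two types is given below. The variable names buf and frozen are illustrative, and the append method used here is one of several methods that bytearray provides.

```python
buf = bytearray(b'D5\xc8')
buf[0] = 0x45        # overwrite the first byte in place (0x45 is 'E')
buf.append(0x21)     # add one more byte at the end (0x21 is '!')
print(buf)           # bytearray(b'E5\xc8!')

# The same index assignment on a bytes object is not allowed.
frozen = b'D5\xc8'
try:
    frozen[0] = 0x45
except TypeError as err:
    print('TypeError:', err)
```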
Exercise
1. s = 'Morning'
The expression s[len(s)] will:
(A) Give last character of the string
(B) Show error
2. String objects are _______
(A) mutable (B) immutable
3. If s = 'Rainbow', what is s[2]?
(A) 'a' (B) 'i'
4. If s = 'Rainbow', what is s[-2]?
(A) 'o' (B) 'b'
5. What will this code display?
s = 'Hello' + 2
print(s)
(A) Hello2
(B) HelloHello
(C) TypeError
6. Type of 'x' is:
(A) char (B) str
7. The first character of a string s is given by:
(A) s[-1]
(B) s[0]
(C) s[1]
8. The last character of a string of length n is given by
(A) s[-1]
(B) s[n-1]
(C) Both
9. If s = 'rose', then the assignment statement
s[2] = 'p' will:
(A) change the string to 'rope'
(B) change the string to 'rpse'
(C) give error
10. If s = 'hello', what will be the value of s.len()?
(A) 5
(B) 6
(C) Error
11. A variable that is referencing an immutable value cannot be
reassigned to another object.
(A) True (B) False
12. If s = 'hello world', what is s.capitalize()?
(A) 'Hello World'
(B) 'HELLO WORLD'
(C) 'Hello world'
13. s = 'Small gains are better than no gains'
What is the value of s.count('n', -10)
(A) 0 (B) 3
(C) 4 (D) Error
14. s = 'Hello world'
What is the value of s.find('word')
(A) 0 (B) -1
(C) 6 (D) Raises ValueError
15. What is the value of s after this assignment:
s = 'Good' + ' ' * 2 + 'Evening' + '!' * 3
(A) Good*2Evening!*3
(B) 'Good Evening!!!'
(C) Gives Error
16. What will be the output of the following code?
s1 = '<>'
s1 *= 3
print(s1)
(A) <><><>
(B) Gives error, as strings are immutable
17. What is the value of this expression?
'.... Where ??? '.strip('.?')
(A) 'Where' (B) ' Where '
(C) '.... Where'
(D) strip() does not take arguments
18. The expression 'cd' not in 'abcde' returns
(A) True
(B) False
(C) Gives Error
19. The expression s[s.rindex('$'): ] will give a string that
contains everything _____
(A) before the first occurrence of $ in s
(B) after the first occurrence of $ in s
(C) before the last occurrence of $ in s
(D) after the last occurrence of $ in s
20. s.find('Count', 20, 70)
For the above expression, search will be performed in the portion of
string
(A) from index 20 to index 69
(B) from index 20 to index 70
21. Which of these represents a newline character?
(A) '\l'
(B) '\i'
(C) '\n'
22. What is the length of this string?
len('Hi\tthere\n\n')
(A) 10
(B) 13
(C) 14
23. Which statement will display the given text on the screen?
E:\python\numbers.py
(A) print('E:\\python\\numbers.py')
(B) print('E:\python\\numbers.py')
(C) print(r'E:\python\numbers.py')
(D) All of these
24. Which of these will give syntax error:
(A) print("Let's face it")
(B) print('Don't just exist, live')
(C) print('It\'s okay to take a break')
25. By default, text is ____ aligned and numbers are ____ aligned in their
field.
(A) left, right
(B) right, left
26. To perform centre alignment of a value in a field width, which symbol
is used.
(A) < (B) >
(C) ^ (D) &
27. What will this code display?
fruit = 'banana'
price = 154.25
print(f'Price of {fruit} is {price:.6f}')
(A) Price of banana is 154.250
(B) Price of banana is 154.250000
28. n = 23414565755
What will the following statement print?
print(f'{n:,}')
(A) 23414565755
(B) 23,414,565,755
(C) 234,145,657,55
29. Which statement will you write for displaying the following number
in exponential notation?
number = 0.00000000354
(A) print(f'{number:e}')
(B) print(f'{number:E}')
(C) Any of these
30. number = 2455
Which statement will display the above number in hexadecimal base?
(A) print(f'{number, x}')
(B) print(f'{number:x}')
(C) print(f'{number:h}')
For questions 31 to 46, use the following string
s = 'Ideas are easy, execution is hard.'
31. Display the first 5 characters of the string.
32. Display the last 5 characters of the string.
33. Display the 5th character of the string.
34. Display the last character of the string using negative index.
35. Display the reverse of the string.
36. Display the string without the last character.
37. Display the string without the last 5 characters.
38. Display the string without the first 5 characters.
39. What will you get when you write s[100].
40. What will you get when you write s[-40].
41. What will you get when you write s[6:100].
42. What will you get when you write s[-40:5].
43. Make another string s1 that is an exact copy of s.
44. Make another string s2 from s by excluding the last 3 characters.
45. What will you get by writing s[5:5]
46. Display every alternate character of the string, starting from index 4
to index 14.
47. Write a statement to change a string such that its first character and
last characters are exchanged. If the string is 'Hello World', it
becomes 'dello Worlh'.
48. Make a string s3, by concatenating the last 4 characters of a string s1
and first 3 characters of a string s2.
49. Make a string s1 from string s, in which the first 2 characters are
repeated 5 times, and the last character is repeated 3 times. For
example, if the string s is 'Hello World !', then the string s1 is
'HeHeHeHeHello World !!!'
50. Write a program that inputs an email id and extracts the username and
domain name from the email id. For example, if email is
[email protected] then username is myname and domain
name is somesite.com
(Hint : Use index() method)
51. Write a program to extract whatever is enclosed inside asterisks in a
string. For example, if the string is 'Deepa 35 *9/11/1977* Najibabad',
the portion extracted is 9/11/1977.
(Hint : Use index() and rindex() methods)
52. s = ' welcome to bengaluru '
Write a single statement to strip all the whitespaces from left and right
of this string s and convert it to title case.
53. s = 'he he that he that he that that he he that'
Write a single statement to replace all occurrences of 'he' with 'she'
and first 3 occurrences of 'that' with 'this'.
54. Make a new string s1 from a string s, such that the first half of the
string s is changed to uppercase and the second half to lower case.
For example, if string s is 'Hello World', string s1 will be 'HELLO
world'
55. Write a single statement to check whether a string s begins with 'Line'
and ends with 'Done'.
56. Write a statement to create a new string named code from three
strings named name, dob and city. The string code should
contain every alternate character from string name (only up to 8th
character), the first two characters and last 2 characters from string
dob and the first three characters from string city. The string code
will be 11 characters long.
If name = 'Johny Abraham' dob = '09/11/1987' city = 'London' code
will be 'JhyA0987Lon'
If name = 'Marie Claire' dob = '12/04/1991' city = 'Paris' code will be
'MreC1291Par'
57. Write a statement to print a line that contains 80 dashes.
58. Write a statement to print 5 blank lines. ('\n' is the newline character)
59. Write a statement to find the reverse of an integer n.
60. s = ' Python '
Will the following two statements give the same result?
(i) s.rjust(20, '-').strip()
(ii) s.strip().rjust(20, '-')
61. What will be the output of the following code?
s = 'Python'
print(s[len(s)-3], s[-3])
62. What will be the output of the following code?
s = 'And'
letters = '_abcdefghijklmnopqrstuvwxyz'
print(letters.index(s[0].lower()),
letters.index(s[1]), letters.index(s[2]))
63. How will you write a print function call that ends in a colon and a
newline?
64. What will be the output of the following code?
s = "caattt's curiosity killed the cat"
print(s.removesuffix('cat'))
print(s.strip('cat'))
Lists and Tuples 4
Lists are ordered collections of items. They can be considered similar to
arrays in other languages. They are more flexible and powerful as they do
not have fixed sizes and can store elements of different types. Lists are the
most commonly used sequence types in Python. Here are some examples of
list literals:
[27, 13, 14, 26, 19]
[ ]
['papaya', 'apple', 'banana']
[10, 15, 'black', None, 3.5, True, 15,]
The elements of a list are separated by commas and are enclosed in square
brackets. The first example is a list with five elements, and the second
represents an empty list. The elements in a list can be of different types. For
example, the fourth list contains values of type int, str, NoneType,
float, and bool. Although lists allow mixed types, they are often used to
store values of the same type. They are commonly used to represent
collections of similar items, such as a list of names or a list of numbers. By
storing values of the same type in a list, we can conveniently perform the
same operation on all the elements of a particular list.
Values in a list need not be unique; it can have duplicate values. This means
that the same value can appear multiple times at different positions in the
list. For example, in the fourth list, the value 15 occurs twice.
We can place a trailing comma at the end of the values in a list literal. For
example, in our fourth list literal, we have a comma after the last element,
15, just before the closing square bracket. This trailing comma is ignored
and does not cause any syntax error. This can be useful when you want to
add elements to a multiline list or rearrange it while editing your code.
Like integer or string literals, list literals can also be assigned to variables.
list1 = [12, 43, 21, 67, 54, 11]
When this assignment statement executes, Python creates a list object and
makes the name list1 refer to that object.
A list is a referential data structure, which means that it stores references to
its elements. Here is how we can visualize list1.
Figure 4.1: List object
The name list1 refers to the list object, and the list object stores
references to different objects that represent the elements of the list. So,
although we generally say that a list contains elements, it technically
contains references to those elements.
The list type is mutable; this is the first mutable type that we are discussing.
‘Mutable’ means that an object of type list can be changed, and its
contents can be altered. You can add new elements or delete/overwrite
existing elements from the list object. This is why a list can dynamically
contract or expand at runtime; its size is not fixed. The interpreter
dynamically allocates more memory when required and also dynamically
releases the memory no longer required by the list.
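The idea that a list is changed in-place can be sketched with two names that refer to the same list object. The name alias is illustrative, and the built-in id function is used only to show that the object stays the same.

```python
list1 = [12, 43, 21, 67, 54, 11]
alias = list1                   # a second name for the same list object
identity_before = id(list1)

list1[0] = 99                   # change the object in-place

print(alias)                            # [99, 43, 21, 67, 54, 11]
print(alias is list1)                   # True
print(id(list1) == identity_before)     # True
```

The change made through list1 is visible through alias because no new list was created; the existing object was modified.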
We have discussed some properties of a list. Now, before going further, let
us discuss why we need the list data type. The list type provides a way to
combine related data in order. Let us see an example. Suppose we have this
travel itinerary for a 3-week trip:
1. Delhi 2. Bareilly 3. Srinagar 4. Agra 5. Jaipur 6. Mumbai 7. Goa 8.
Bangalore 9. Kolkata 10. Varanasi
The order of the destination cities is important here. If we need to manage
this trip in our program, then without the list type, we would make ten
variables.
destination1 = 'Delhi'
destination2 = 'Bareilly'
destination3 = 'Srinagar'
destination4 = 'Agra'
destination5 = 'Jaipur'
destination6 = 'Mumbai'
destination7 = 'Goa'
destination8 = 'Bangalore'
destination9 = 'Kolkata'
destination10 = 'Varanasi'
Using a list, we can have all of them in only one data structure and access
them using a single name. Since a list is an ordered data structure, the order
is preserved.
trip = [
'Delhi', 'Bareilly', 'Srinagar', 'Agra',
'Jaipur',
'Mumbai', 'Goa', 'Bangalore', 'Kolkata',
'Varanasi'
]
Now suppose we decide to cut 'Agra', 'Jaipur', and 'Mumbai'
from the trip. If we defined 10 variable names, we would have to delete three
variable names. This would create confusion, as now, after the name
destination3, we have the name destination7. In the case of a list,
we can easily delete the items from anywhere inside the list. Similarly, if we
have to add more cities to the trip, it would be easier if we use a list.
Suppose you need to include another trip that involves 20 cities. In that case,
you can just make another list instead of defining 20 other names, which is
obviously tedious and difficult to maintain in the program.
When we use a list, we can easily insert new items, delete items, replace
items, or reorder them. By using a list, we can group related data under one
name. Structuring the data inside a list also makes it easier to process it
using loops, as discussed in the coming chapters.
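As a rough sketch of the itinerary change described above, the three cities can be removed from the list in one step. The del statement used here is one possible way to do it; other ways of deleting items are covered later in this chapter.

```python
trip = [
    'Delhi', 'Bareilly', 'Srinagar', 'Agra', 'Jaipur',
    'Mumbai', 'Goa', 'Bangalore', 'Kolkata', 'Varanasi'
]

# 'Agra', 'Jaipur' and 'Mumbai' are at index 3, 4 and 5.
del trip[3:6]

print(trip)
print(len(trip))    # 7
```

The remaining cities keep their relative order, and there are no gaps in the numbering as there would be with separate variable names.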
Strings, lists, tuples, and range objects are sequences, as they are ordered
collections of items. All the sequence operations like indexing, slicing,
concatenation, and repetition that we have seen for strings are also valid for
lists. However, lists are mutable, so they support other operations that can
make in-place changes. This means that you can make changes in the list
object itself instead of creating a new changed object, as we had to do in
strings.
4.1 Accessing individual elements of a list by
indexing
In our program, we can print the whole list by sending the list’s name to the
print function. On the interactive prompt, we can just write the name of
the list, and it will be printed. Most of the time, in our program, we would
like to access individual elements of the list.
Similar to strings, the elements of a list can be accessed by writing integer
index values enclosed in square brackets. Like strings, lists also use zero-
based indexing. If L is the name of the list, then to access the first element,
we write L[0]; for the second element L[1], and so on. A list of size n has
elements indexed from 0 to n-1. As in strings, we can also give negative
index values to index backward. So, L[-1] represents the last element,
L[-2] the second last element, and so on. For a list of length 6, indices
0,1,2,3,4,5 and -1,-2,-3,-4,-5,-6 are valid indices. Any integer less than -6 or
more than 5 will be invalid. If you try to access a list element at an invalid
index, the interpreter will raise an IndexError.
>>> L = [10, 20, 30, 40, 50, 60]
>>> L
[10, 20, 30, 40, 50, 60]
>>> L[1]
20
>>> L[-1]
60
>>> L[10]
IndexError: list index out of range
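The relationship between positive and negative index values can be checked with a short sketch: for a list of length n, the index -k refers to the same element as the index n-k. The names n and out_of_range are illustrative.

```python
L = [10, 20, 30, 40, 50, 60]
n = len(L)

print(L[-1] == L[n - 1])    # True, both give the last element
print(L[-6] == L[0])        # True, both give the first element

# -7 is one step beyond the smallest valid index, -6.
out_of_range = False
try:
    L[-7]
except IndexError:
    out_of_range = True
print(out_of_range)         # True
```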
4.2 Getting parts of a list by slicing
We can extract a portion of the list by slicing. The slice operations that we
saw for strings work for lists also in the same way. Slicing a list gives us a
part of the list as a new list object. As in strings, we can get a slice of the list
by specifying indices separated by colons inside square brackets. The
detailed syntax of slicing is not repeated here, as it is exactly the same as in
strings. Here are a few examples:
L = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
L[2:7] Gives a list that contains elements from index 2 to index 6
[30, 40, 50, 60, 70]
L[2:77] Gives a list that contains elements from index 2 to index 10 (No IndexError)
[30, 40, 50, 60, 70, 80, 90, 100, 110]
L[:5] Gives a list that contains elements from index 0 to index 4 (first 5 elements)
[10, 20, 30, 40, 50]
L[5:] Gives a list that contains elements from index 5 to index 10
[60, 70, 80, 90, 100, 110]
L[-5:] Gives a list that contains elements from index -5 to index 10 (last 5 elements)
[70, 80, 90, 100, 110]
L[2:9:2] Gives a list that contains every second element from index 2 to index 8
[30, 50, 70, 90]
L[::2] Gives a list that contains every second element starting from first index till last index
[10, 30, 50, 70, 90, 110]
L[:] Gives a list that is an exact copy of the list L
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
L[::-1] Gives a list that is the reverse of the list L
[110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
Table 4.1: Examples of list slicing
If the first number inside the square brackets is omitted, it is considered zero;
if the second is omitted, the slice extends to the end of the list. The third
number represents the step and is optional; if it is omitted, it is considered 1.
When the step is negative, the omitted limits are taken from the opposite ends,
so the slice starts at the last element and moves backward. The slice L[:]
gives an exact copy of the list, and the slice L[::-1] gives the reverse of the list.
You can assign these slices to variable names. For example, if you wish to
make a list L1 that is the reverse of the list L, you can write this:
L1 = L[::-1]
The following statement will make L2 a copy of the list L.
L2 = L[:]
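A slice copy is a new list object, whereas a plain assignment only creates another name for the same list. The sketch below contrasts the two; the name alias is illustrative.

```python
L = [10, 20, 30]
L2 = L[:]        # a new list with the same elements
alias = L        # another name for the same list object

L2[0] = 99       # changes only the copy
alias[1] = 77    # changes the list that L refers to

print(L)         # [10, 77, 30]
print(L2)        # [99, 20, 30]
print(L2 is L)   # False
```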
4.3 Changing an item in a list by index
assignment
Since lists are mutable, it is possible to change a list object in-place. We can
change any element in the list by assigning a new value to its index. In the
following example, we are changing the element at index 1.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[1] = 50
>>> L
[12, 50, 21, 67, 54, 11, 13]
If the index is out of range, then an IndexError will be raised.
>>> L[7] = 100
IndexError: list assignment index out of range
Python performs bounds checking while indexing, so accessing or assigning
off the end of a list is an error. The statement L[7] = 100 will not just
silently grow the list; instead, it throws an error. There are specific methods
to grow a list, which we will see in a while.
4.4 Changing a Portion of the list by slice
assignment
You can modify portions of the list by assigning to slices. When a list
slice is used on the left side of an assignment, the range specified in the slice
will be replaced by what is on the right-hand side. Suppose we have this list:
>>> L = [12, 43, 21, 67, 54, 11, 13]
The following assignment statement replaces the elements at index 2, 3, and
4 with the three elements of the list on the right side:
>>> L[2:5] = [300, 400, 500]
>>> L
[12, 43, 300, 400, 500, 11, 13]
Slice assignment can replace multiple elements of the list in a single step.
The length of the list on the right side need not be equal to the length of the
slice that is being assigned.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[2:5] = ['a', 'b', 'c', 'd', 'e']
>>> L
[12, 43, 'a', 'b', 'c', 'd', 'e', 11, 13]
In this example, the length of the slice is three while five items are being
assigned, and we can see that all five elements are included in the resultant
list. So, the length of the slice and the length of the list that is being assigned
need not be the same; the list will shrink or expand to accommodate the new
values. This flexibility only exists if you do not provide a step in the slice.
When the slice includes a step, the lengths of the slice being assigned to and
the length of the list on the right side should be the same. If the step is not
provided, their lengths can be different.
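The rule for slices that include a step can be sketched as follows. The slice L[::2] of a seven-element list selects four positions, so exactly four values must be supplied; in CPython a mismatch raises a ValueError. The name failed is illustrative.

```python
L = [12, 43, 21, 67, 54, 11, 13]

# L[::2] selects index 0, 2, 4 and 6, so four values are required.
L[::2] = ['a', 'b', 'c', 'd']
print(L)    # ['a', 43, 'b', 67, 'c', 11, 'd']

# Supplying a different number of values is an error.
failed = False
try:
    L[::2] = [1, 2]
except ValueError as err:
    failed = True
    print('ValueError:', err)
```

The failed assignment leaves the list unchanged.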
In our example, we have used a list on the right side of an assignment
statement. You can use any other iterable; for example, you can have a string
or a tuple also on the right side. Let us use a string.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[3:6] = 'abcdef'
>>> L
[12, 43, 21, 'a', 'b', 'c', 'd', 'e', 'f', 13]
Now, the specified portion of the list is occupied by characters of the string
on the right side.
We can delete a portion of the list by assigning an empty list to a slice.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[3:6] = []
>>> L
[12, 43, 21, 13]
Here, all the elements from index 3 to index 5 are deleted from the list.
We know that the slice L[:] represents the whole list, so assigning to it will
replace the whole list with the list on the right side.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[:] = [1, 2, 3, 4]
>>> L
[1, 2, 3, 4]
If you want to clear the whole list, you can write this:
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[:] = []
>>> L
[]
We can insert multiple new elements in a list by squeezing them into an
empty slice at the desired location.
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[3:3] = [10, 20, 30, 40]
>>> L
[12, 43, 21, 10, 20, 30, 40, 67, 54, 11, 13]
The four elements of the list on the right side are inserted in the list L
starting at index 3. This way, you can insert new elements without deleting
any existing ones.
If you want to add some items to the beginning of the list, you can write this:
>>> L = [12, 43, 21, 67, 54, 11, 13]
>>> L[0:0] = [7, 8, 9] # or L[:0] = [7, 8, 9]
>>> L
[7, 8, 9, 12, 43, 21, 67, 54, 11, 13]
When assigning to slices, there should always be an iterable on the right
side, even if it contains only one element or no elements at all.
>>> L[5:5] = 90
TypeError: can only assign an iterable
>>> L[5:5] = [90]
>>> L
[7, 8, 9, 12, 43, 90, 21, 67, 54, 11, 13]
In strings, index assignments and slice assignments are not possible as they
are immutable. List objects are mutable, so they can be changed in-place;
hence, index and slice assignments are allowed.
These slice assignments are not commonly used in practice as there are
specific list methods for performing most insertion and deletion operations.
The names of these methods are self-explanatory; hence, using them is
simpler than using slice assignments in most cases. In the following few
sections, we will explore these methods.
4.5 Adding an item at the end of the list by
using append()
The append() method adds a single item at the end of the list, and it
returns None.
>>> numbers = [10, 20, 30, 40]
>>> numbers.append(50)
>>> numbers
[10, 20, 30, 40, 50]
We have taken a list of four integers and added another integer to it at the
end using the append method.
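Because append changes the list in-place and returns None, its return value should not be assigned back to the list name. A brief sketch, with result as an illustrative name:

```python
numbers = [10, 20, 30, 40]
result = numbers.append(50)

print(result)     # None
print(numbers)    # [10, 20, 30, 40, 50]
```

Writing numbers = numbers.append(50) would therefore lose the list, since the name numbers would end up referring to None.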
4.6 Adding an item anywhere in the list by
using insert()
The append method inserts a new item only at the end of the list. If we
want to add a new item at a particular index in the list, we can use the
insert method. By using this method, we can insert a new element at any
place in the list. Like append, this method also returns None.
In the following list, we have inserted a new element 25 at index 2 using the
insert method.
>>> numbers = [10, 20, 30, 40, 50]
>>> numbers.insert(2, 25)
>>> numbers
[10, 20, 25, 30, 40, 50]
The new item is inserted just before the element that was at the index where
we want to insert. The element 30 was at index 2, and the new element 25 is
inserted before 30, so now 25 is at index 2. All the elements, including 30
and after it, are shifted right to make room for the new value.
Inserting at index 0 inserts the new item at the beginning of the list.
>>> numbers.insert(0, 8)
>>> numbers
[8, 10, 20, 25, 30, 40, 50]
If we provide a big index past the end of the list, the element is inserted at
the end of the list like append. It will not show any error.
>>> numbers.insert(1000, 5)
>>> numbers
[8, 10, 20, 25, 30, 40, 50, 5]
We got no error, and 5 was inserted at the end of the list.
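The insert method also accepts negative indexes. The short sketch below uses a fresh list chosen for illustration (it is not one of the lists above) to show that a negative index counts from the end, and that an index far below the start places the item at the beginning.

```python
numbers = [10, 20, 30]

numbers.insert(-1, 25)     # -1 refers to the last element, so 25 goes before 30
print(numbers)             # [10, 20, 25, 30]

numbers.insert(-100, 5)    # far below the start: inserted at the beginning
print(numbers)             # [5, 10, 20, 25, 30]
```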
Inserting an element in the middle of a list, or removing one from the
middle, is costly because some elements have to be shifted internally. On
insertion, the elements after the insertion point are shifted right to make
room for the new element. On deletion, the elements after the removed one are
shifted left to fill the gap. If the list is large, this shifting can take a lot of
time. Insertion or deletion at the end is more efficient, as no shifting is required.
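The cost difference can be observed with a rough timing sketch. The function names build_at_front and build_at_end and the size 20000 are illustrative assumptions, not part of the text. Actual timings depend on the machine, so no output is shown.

```python
import time

def build_at_front(n):
    # Each insert(0, ...) shifts every existing element one place to the right.
    items = []
    for i in range(n):
        items.insert(0, i)
    return items

def build_at_end(n):
    # append() adds at the end, so no existing element has to move.
    items = []
    for i in range(n):
        items.append(i)
    return items

n = 20000  # illustrative size

start = time.perf_counter()
build_at_front(n)
front_seconds = time.perf_counter() - start

start = time.perf_counter()
build_at_end(n)
end_seconds = time.perf_counter() - start

print('insert at front:', front_seconds, 'seconds')
print('append at end  :', end_seconds, 'seconds')
```

On most machines, the first figure should be noticeably larger, and the gap widens as n grows.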
4.7 Adding multiple items at the end by using
extend() or +=
You can add multiple items at the end of the list by using the extend
method. This method takes an iterable as an argument, and it will add all
elements of this iterable to the end of the list. This method also returns
None. An iterable is an object that can supply its items one at a time.
All three sequence types (lists, strings, and tuples) are iterables.
Dictionaries and sets are also iterables.
In the following example, we have called the extend method on the
numbers list and sent another list nums as the argument.
>>> numbers = [10, 20, 30]
>>> nums = [1, 2, 3, 4]
>>> numbers.extend(nums)
>>> numbers
[10, 20, 30, 1, 2, 3, 4]
All the elements of nums list are added at the end of the numbers list. The
method append will add only one item at the end of the list, while you can
use extend to add multiple items at the end of the list. So, instead of
multiple calls to append, you can use the extend method as a shorthand.
A single extend call is more efficient than repeated append calls.
If we call append and send a list as an argument, that list will be added as
one item.
>>> numbers = [10, 20, 30]
>>> nums = [1, 2, 3, 4]
>>> numbers.append(nums)
>>> numbers
[10, 20, 30, [1, 2, 3, 4]]
We can use other iterable types also in extend, like tuple, or string.
>>> numbers = [10, 20, 30]
>>> numbers.extend('abcd')
>>> numbers
[10, 20, 30, 'a', 'b', 'c', 'd']
All characters of the string argument are added at the end of this list.
The augmented assignment operator += can also be used to add items from an
iterable.
>>> numbers += [98, 99, 100]
>>> numbers
[10, 20, 30, 'a', 'b', 'c', 'd', 98, 99, 100]
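The following sketch, with values chosen only for illustration, shows extend and += accepting a tuple, a range object, and a string. It also shows one difference worth knowing: the plain + operator, unlike +=, requires both operands to be lists.

```python
numbers = [10, 20, 30]

numbers.extend((40, 50))            # a tuple
numbers.extend(range(60, 80, 10))   # a range object that yields 60 and 70
numbers += 'ab'                     # += accepts any iterable, like extend
print(numbers)   # [10, 20, 30, 40, 50, 60, 70, 'a', 'b']

# The + operator does not accept an arbitrary iterable on the right.
try:
    result = numbers + 'cd'
except TypeError as error:
    print('TypeError:', error)
```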
4.8 Removing a single element or a slice by
using the del statement
We can use the del statement to delete a single element or a slice from the
list. del is a keyword in Python; it is not a list-specific method like
append or extend.
del L[i] Removes the element at index i
del L[i:j] Removes elements from index i to index j-1
del L[i:j:k] Removes elements from index i to index j-1 with a stride of k
Table 4.2: del statement
>>> numbers = [10, 20, 30, 40, 50, 60]
>>> del numbers[4] # Deletes element at index 4
>>> numbers
[10, 20, 30, 40, 60]
>>> numbers = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
>>> del numbers[2:7] # Deletes elements from index 2 to index 6
>>> numbers
[10, 20, 80, 90, 100]
All the elements after the deleted element are shifted left to fill any gap
made by the deleted element. The statement del numbers[:] deletes all
the elements from the list.
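The stride form from Table 4.2 is not demonstrated above, so here is a small sketch of it, together with a negative index. The list letters is an illustrative name introduced only for this example.

```python
numbers = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
del numbers[1:8:2]     # removes the elements at indexes 1, 3, 5 and 7
print(numbers)         # [10, 30, 50, 70, 90, 100]

letters = ['a', 'b', 'c', 'd']
del letters[-1]        # a negative index works too: removes the last element
print(letters)         # ['a', 'b', 'c']
```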
4.9 Removing an element by index and
getting it by using pop()
If you want to remove an item from the list and also get the removed item,
you can use the method pop.
L.pop(i) Removes and returns the element at index i in the list
L.pop() Removes and returns the last element of the list
If we do not specify any index as an argument, then this method removes
and returns the last element of the list. So, pop() without any argument is
the same as pop(-1).
>>> numbers = [10, 20, 30, 40, 50, 60, 70, 80]
>>> x = numbers.pop(4) # removes the element at index 4
>>> x
50
>>> numbers
[10, 20, 30, 40, 60, 70, 80]
>>> y = numbers.pop(0) # removes the first element
>>> y
10
>>> numbers
[20, 30, 40, 60, 70, 80]
>>> z = numbers.pop() # removes the last element
>>> z
80
>>> numbers
[20, 30, 40, 60, 70]
If you try to give a non-existent index as an argument, then an
IndexError will be raised.
The object returned by pop is generally assigned to a variable so that it can
be used later. If the returned object is not assigned to any variable and
nothing else refers to it, it becomes unreachable, and the interpreter
reclaims the memory it occupied.
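The IndexError case can be sketched as follows. The try and except statements used here belong to exception handling, so treat this as a preview under that assumption. The list names are illustrative.

```python
numbers = [10, 20, 30]

try:
    numbers.pop(5)                 # a three-element list has no index 5
except IndexError as error:
    print('IndexError:', error)    # IndexError: pop index out of range

empty = []
try:
    empty.pop()                    # there is nothing to remove
except IndexError as error:
    print('IndexError:', error)    # IndexError: pop from empty list

# Testing the list first avoids the exception altogether.
if numbers:
    last = numbers.pop()
    print(last)                    # 30
```

An empty list is treated as false in a condition, so the if test is a simple guard before calling pop.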
4.10 Removing an element by value using
remove()
If you want to remove an element from the list but do not know its index in
the list, then you can use the remove method. L.remove(x) will remove
the first occurrence of item x from the list L, and it returns None. If x is
not found in the list, then it raises ValueError.
>>> numbers = [10, 20, 30, 40, 50, 60, 70, 80]
>>> numbers.remove(20)
>>> numbers
[10, 30, 40, 50, 60, 70, 80]
>>> numbers.remove(25)
ValueError: list.remove(x): x not in list
numbers.remove(20) removes the first occurrence of item 20 from the
list. If there are multiple occurrences of the item and you want to remove all
occurrences, you can use a loop or a list comprehension. We will see how to
do this in the coming chapters.
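Until then, here is a minimal sketch of the loop approach, assuming a small list with repeated values. The name target is illustrative. The membership test also shows one way to avoid the ValueError when the value may be absent.

```python
numbers = [12, 7, 12, 99, 12, 67]
target = 12

# remove() deletes only the first match, so repeat while any match remains.
while target in numbers:
    numbers.remove(target)
print(numbers)   # [7, 99, 67]

# Checking membership first avoids the ValueError for a missing value.
if 25 in numbers:
    numbers.remove(25)
print(numbers)   # [7, 99, 67]
```

This is simple rather than fast: every pass searches the list again, so it suits small lists.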
4.11 Removing all the elements by using
clear()
The method clear will remove all items from the list, making it empty.
>>> numbers = [10, 20, 30, 40, 50, 60, 70, 80]
>>> numbers.clear()
>>> numbers
[]
Let us summarise all the removal methods. If you want to delete an item by
index, you can use the del statement. If you want to delete an item by index
and also want to use the deleted item, use the pop method. If you want to
delete an item by value, use the remove method. To delete all the elements
from the list, use the clear method.
The method clear was introduced in Python 3.3. Before that, a list could be
cleared in-place only by using the del statement or slice assignment.
>>> del numbers[:]
>>> numbers[:] = []
Note that if we assign an empty list to the list name, it does not clear the list
in-place.
>>> numbers = [] # assigns a new empty list, not an in-place clearing
Clearing the list in-place is important when there are other references
referencing the list. For example, when we send the list as an argument to a
function, the in-place approach should be used.
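The difference can be sketched with two small functions. The names empty_by_rebinding, empty_in_place, cart and alias are hypothetical and used only for illustration.

```python
def empty_by_rebinding(items):
    items = []        # rebinds the local name only; the caller's list is untouched

def empty_in_place(items):
    items.clear()     # empties the very object that the caller passed in

cart = ['pen', 'ink', 'pad']
alias = cart          # a second reference to the same list object

empty_by_rebinding(cart)
print(cart)           # ['pen', 'ink', 'pad']

empty_in_place(cart)
print(cart)           # []
print(alias)          # []
```

Only the in-place version is visible to the caller and to every other reference to the list.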
4.12 Sorting a List
The elements of a list can be sorted by using the list sort method. It sorts
the list in-place, which means that it will change your list object. The
elements are sorted in ascending order, i.e., they are arranged from smallest
to largest. If the elements are strings, they are sorted according to the
Unicode code points of their characters, which are the same as the ASCII
values for English letters. This method returns None.
>>> L = [23, 76, 34, 12, 89, 14]
>>> L.sort()
>>> L
[12, 14, 23, 34, 76, 89]
To change the sorting order, add the argument reverse=True.
>>> L = [23, 76, 34, 12, 89, 14]
>>> L.sort(reverse=True)
>>> L
[89, 76, 34, 23, 14, 12]
The numbers are now sorted from largest to smallest.
Now, let us use the sort method to sort a list of strings.
>>> L = ['Cow', 'Zebra', 'Ant', 'Bear', 'Crow',
'Wolf']
>>> L.sort()
>>> L
['Ant', 'Bear', 'Cow', 'Crow', 'Wolf', 'Zebra']
>>> L.sort(reverse=True)
>>> L
['Zebra', 'Wolf', 'Crow', 'Cow', 'Bear', 'Ant']
We get the results in alphabetical order, but this order will be disturbed if the
list contains strings in both lower case and upper case.
>>> L = ['Cow', 'Zebra', 'Ant', 'bat', 'crow',
'Wolf']
>>> L.sort()
>>> L
['Ant', 'Cow', 'Wolf', 'Zebra', 'bat', 'crow']
In the sorted list, we first have all the uppercase strings and then the
lowercase strings. This is because the strings are sorted according to their
ASCII values. ASCII values of uppercase letters are less than those of
lowercase letters, so uppercase letters come before lowercase letters.
Therefore, the sort method performs a case-sensitive sort in the case of
strings. To perform case insensitive sort, i.e., to ignore the case while
sorting, you can send str.lower as the argument for the key parameter.
>>> L = ['Cow', 'Zebra', 'Ant', 'bat', 'crow',
'Wolf']
>>> L.sort(key=str.lower)
>>> L
['Ant', 'bat', 'Cow', 'crow', 'Wolf', 'Zebra']
Now, the sorting is done in regular alphabetical order, and this is because the
sorting is not done on original strings. The str.lower function is applied
to each string to get a key, and then the sorting is done on those keys. So the
sorting is done on these values: 'cow', 'zebra', 'ant', 'bat',
'crow', 'wolf'. The original values of the list remain unchanged; they
are not changed to lowercase.
We could do the same thing by sending the str.upper function as the
argument for the key parameter.
>>> L = ['Cow', 'Zebra', 'Ant', 'bat', 'crow',
'Wolf']
>>> L.sort(key=str.upper)
>>> L
['Ant', 'bat', 'Cow', 'crow', 'Wolf', 'Zebra']
Now the sorting is done on these values: 'COW', 'ZEBRA', 'ANT',
'BAT', 'CROW', 'WOLF'
You can send any one-argument function for the key parameter, which will
be applied to each element of the list to produce its key. The produced key
will be used for sorting. In the next example, we will use the len function
for the key parameter.
>>> L = ['Cow', 'Zebra', 'Ant', 'bat', 'crow',
'Wolf']
>>> L.sort(key=len)
>>> L
['Cow', 'Ant', 'bat', 'crow', 'Wolf', 'Zebra']
Now, sorting is done on the following values:
len('Cow')->3, len('Zebra')->5, len('Ant')->3,
len('bat')->3, len('crow')->4, len('Wolf')->4
Now, the strings are sorted according to their length.
The sort method will not work if the list contains elements of mixed types.
If the list contains all strings or all numbers, it is fine, but when a list
contains elements of unrelated types, you will get an error. For example, a
list of integers and floats will be sorted, but sorting a list of integers and
strings will give an error.
>>> L1 = [12.4, 12, 13.77, 88, 9.2]
>>> L1.sort()
>>> L1
[9.2, 12, 12.4, 13.77, 88]
>>> L2 = ['Seven', 'Five', 12, 'Six', 'Two', 300,
99]
>>> L2.sort()
TypeError: '<' not supported between instances of 'int' and 'str'
The list L1 that contains integers and floats is sorted but the list L2 that
contains strings and integers gives TypeError because the types are
unrelated.
The sort method will change the list object in-place, so the original order
of the list elements will be lost. If you do not want to modify your original
list and want just a sorted copy of the original list, you can use the
sorted() built-in function. This function does not sort the list in-place,
which means that it does not change your list object. It just returns a new list
object that is a sorted copy of the list. The returned list object can be
assigned to another name.
>>> L = [81, 2, 13, 99, 7]
>>> L1 = sorted(L)
>>> L1
[2, 7, 13, 81, 99]
>>> L
[81, 2, 13, 99, 7]
We can see that the list L has not changed. The arguments for reverse and
key parameters can be used with the sorted function also.
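As a brief sketch of that last point, the example below combines key and reverse in calls to sorted. The variable names are illustrative. It also shows that sorted accepts any iterable, not only a list, and always returns a list.

```python
words = ['Cow', 'Zebra', 'Ant', 'bat', 'crow', 'Wolf']

# Case-insensitive, largest to smallest.
by_name_desc = sorted(words, key=str.lower, reverse=True)
print(by_name_desc)     # ['Zebra', 'Wolf', 'crow', 'Cow', 'bat', 'Ant']

# Longest first; strings of equal length keep their original relative order.
by_length_desc = sorted(words, key=len, reverse=True)
print(by_length_desc)   # ['Zebra', 'crow', 'Wolf', 'Cow', 'Ant', 'bat']

print(words)            # ['Cow', 'Zebra', 'Ant', 'bat', 'crow', 'Wolf']

# sorted() works on a string too and returns a list of characters.
letters = sorted('python')
print(letters)          # ['h', 'n', 'o', 'p', 't', 'y']
```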
4.13 Reversing a List
The reverse method reverses the order of the elements of the list in-place.
It returns None.
>>> L = [2, 5, 3, 1, 7, 4]
>>> L.reverse()
>>> L
[4, 7, 1, 3, 5, 2]
If you do not want your list to be changed, use the reversed built-in
function. This function does not return a list. It returns an iterator, which
has to be converted to a list if a list is needed.
>>> L = [2, 5, 3, 1, 7, 4]
>>> L1 = list(reversed(L))
>>> L1
[4, 7, 1, 3, 5, 2]
>>> L
[2, 5, 3, 1, 7, 4]
We have converted the return value of reversed function to a list and
assigned it to L1. The list L1 contains the elements of list L in reversed
order, and the list L remains unchanged.
As we have seen before, we can get a reversed copy of the list by using the
slice L[::-1]
>>> L = [2, 5, 3, 1, 7, 4]
>>> L1 = L[::-1]
>>> L1
[4, 7, 1, 3, 5, 2]
>>> L
[2, 5, 3, 1, 7, 4]
4.14 Finding an item in the list
The membership operators in and not in can be used to check whether
an element is present in the list. If we want to know the index of an element,
then we can use the index method. It returns the index of the first
occurrence of the item in the list. If the item is not present, then it raises
ValueError. The search can be restricted by providing optional start and
end values.
item in L Returns True if item present in list L, otherwise False
item not in L Returns True if item not present in list L, otherwise False
L.index(item) Returns the index of the first occurrence of item in the list
L.index(item,i,j) Returns the index of the first occurrence of item in a portion of the
list starting from index i to index j-1
Table 4.3: Finding an item in the list
First, let us check the presence of an item in a list using the in operator.
>>> numbers = [82, 31, 55, 12, 7, 56, 99, 12, 99,
67, 12]
>>> 31 in numbers
True
>>> 31 not in numbers
False
>>> 100 not in numbers
True
Now, let us use the index method.
>>> numbers.index(12)
3
We get the index of the first occurrence of 12 in the list. Let us specify a
start value for searching.
>>> numbers.index(12, 4)
7
Now, item 12 was searched in the portion of the list starting from index 4 till
the end of the list. We can specify an end value also.
>>> numbers.index(12, 4, 10)
7
The search was done from index 4 to index 9. If the searched item is not
present in the list, then ValueError is raised.
>>> numbers.index(100)
ValueError: 100 is not in list
To count the number of occurrences of an item, we can use the method
count. If the item is not present in the list, it will return 0.
>>> numbers.count(12)
3
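The index and count methods can be combined to find every position of an item. The following is a minimal sketch of one way to do it, assuming the same list of numbers. The names target, positions, start and found are illustrative.

```python
numbers = [82, 31, 55, 12, 7, 56, 99, 12, 99, 67, 12]
target = 12

positions = []
start = 0
# count() tells us how many matches exist, so index() never raises ValueError here.
for _ in range(numbers.count(target)):
    found = numbers.index(target, start)   # search from 'start' onwards
    positions.append(found)
    start = found + 1                      # continue just after this match
print(positions)   # [3, 7, 10]
```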
4.15 Comparing Lists
The == and != operators can be used to compare two lists for value equality.
The == operator will evaluate to True if the lists have the same content,
while the != operator will evaluate to True if the contents of the list are
different. The lists are compared element by element, starting from the first
index till the last index.
>>> L1 = [1, 2, 3]
>>> L2 = [1, 2, 3]
>>> L3 = [1, 20, 30]
>>> L1 == L2
True
>>> L1 != L3
True
>>> L2 == L3
False
If you want to check whether the two lists refer to the same object, you can
use the is and is not operators.
>>> L4 = L1
>>> L1 is L2
False
>>> L1 is L4
True
>>> L1 is not L4
False
You can also use <, <=, >, and >= operators with lists. These operators will
work only if the lists contain compatible types of data that support greater-
than and less-than comparisons.
>>> L1 = [1, 2, 3, 4, 5, 6,7]
>>> L2 = [1, 2, 3, 7, 8]
>>> L1 < L2
True
The two lists are compared element by element till there is a mismatch in the
elements being compared. The result will be the result of comparing the two
mismatched elements. For example, here, mismatched elements are 4 and 7;
since 4 is smaller, L1 is considered smaller than L2.
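A few more cases may help, including what happens when there is no mismatch at all. The values below are illustrative. When one list is a prefix of the other, the shorter list is considered smaller.

```python
prefix_case = [1, 2, 3] < [1, 2, 3, 4]     # no mismatch; the shorter list is smaller
print(prefix_case)     # True

mismatch_case = [1, 2, 9] < [1, 3]         # decided by the first mismatch: 2 < 3
print(mismatch_case)   # True

string_case = ['ant', 'bee'] > ['ant', 'Bee']   # 'bee' > 'Bee' (lowercase is larger)
print(string_case)     # True

# Incompatible element types fail only when they actually get compared.
try:
    [1, 2] < [1, 'two']
    incompatible = False
except TypeError:
    incompatible = True
print(incompatible)    # True
```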
4.16 Built-in functions used on lists
We have already seen how the built-in functions sorted and reversed
can be used with lists. Here are some more built-in functions that can work
with lists.
len(L) Returns the size of the list
min(L) Returns the smallest value of the list
max(L) Returns the largest value of the list
sum(L) Returns the sum of all the elements of the list if the elements are of numeric type
Table 4.4: Built-in functions
>>> numbers = [82, 31, 55, 12, 7, 56, 99, 12, 99,
67, 12]
>>> len(numbers)
11
>>> max(numbers)
99
>>> min(numbers)
7
>>> sum(numbers)
532
>>> average = sum(numbers)/len(numbers)
>>> average
48.36363636363637
4.17 Concatenation and Replication
Like strings, we can perform concatenation and repetition in lists using the +
and * operators.
L1 + L2 Returns a new list object which has all elements of lists L1 and L2
n * L, L * n Returns a new list object in which all elements of list L are repeated n times
Table 4.5: Concatenation and replication in lists
The + operator combines two lists, and the * operator can be used with a list
and an integer to replicate the list a specified number of times. If n<=0, the
result is an empty list.
>>> L1 = [1, 2, 3]
>>> L2 = [6, 7, 8]
>>> L3 = L1 + L2
>>> L3
[1, 2, 3, 6, 7, 8]
The expression L1 + L2 returned a new list object which contained all the
elements of the first list and then all the elements of the second list, and we
have assigned this list object to name L3.
>>> L1 * 4
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> 4 * L1
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
The expressions L1 * 4 and 4 * L1 both give a list object in which the
elements of the list L1 are repeated 4 times.
The augmented assignment statement syntax is also available for these
operators.
>>> L1 = [1, 2, 3]
>>> L1 += [10, 11, 12, 13]
>>> L1
[1, 2, 3, 10, 11, 12, 13]
>>> L2 = [6, 7, 8]
>>> L2 *= 3
>>> L2
[6, 7, 8, 6, 7, 8, 6, 7, 8]
These statements make in-place changes in the list object, so in the above
two examples, list objects L1 and L2 are changed in-place.
There is a difference between augmented assignment syntax and simple
assignment when used with lists.
Augmented Assignment Simple Assignment
L1 += L2 L1 = L1 + L2
L *= n L = L * n
Let us try to understand this difference with an example:
>>> L1 = [1, 2, 3]
>>> id(L1)
1750909251072
>>> L1 = L1 + [5, 6, 7]
>>> L1
[1, 2, 3, 5, 6, 7]
>>> id(L1)
1750909253056
>>> L2 = [1, 2, 3]
>>> id(L2)
1750909253952
>>> L2 += [5, 6, 7]
>>> L2
[1, 2, 3, 5, 6, 7]
>>> id(L2)
1750909253952
We took two lists L1 and L2; with L1, we used the simple assignment, and
with L2, we used the augmented assignment syntax. We can see that the id
of L1 has changed, but that of L2 has not changed. This means that in the
case of simple assignment, a new object was created which was assigned to
L1, while in the case of augmented assignment syntax, in-place changes
were made to the list object L2.
The result is the same whether we use augmented or simple assignments, but
the implementation is different. The augmented assignment is more efficient
since it makes in-place changes, while in the case of a simple assignment, a
new object is created. If you are dealing with large lists, the creation of a
new object will require a lot of temporary space. Also, if there are many
references referring to the list object, then making in-place changes is the
correct approach.
As we have seen in section 4.7, the augmented assignment syntax (L1 += L2)
is like the extend method (L1.extend(L2)). It appends all the items of
the iterable to the end of the list.
In the case of strings and tuples, the augmented assignment statements work
like simple assignment statements since strings and tuples are immutable,
and in-place changes are not possible.
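The effect on other references can be sketched as follows. The names a, alias, t and t_alias are illustrative.

```python
a = [1, 2, 3]
alias = a            # second reference to the same list

a += [4]             # in-place change: visible through alias
print(alias)         # [1, 2, 3, 4]

a = a + [5]          # new object: alias still refers to the old list
print(a)             # [1, 2, 3, 4, 5]
print(alias)         # [1, 2, 3, 4]
print(a is alias)    # False

t = (1, 2)
t_alias = t
t += (3,)            # tuples are immutable, so a new tuple is created
print(t)             # (1, 2, 3)
print(t_alias)       # (1, 2)
```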
4.18 Using a list with functions from the
random module
To select a random item from the list or shuffle the list, you can use the
choice and shuffle functions from the random module of the standard
library.
The random.choice() function returns a randomly selected element
from the list.
>>> import random
>>> colors = ['red', 'blue', 'green', 'yellow']
>>> random.choice(colors)
'blue'
>>> random.choice(colors)
'green'
The random.shuffle() function reorders the elements in the list.
>>> cities = ['Etah', 'Kasganj', 'Dhampur',
'Najibabad', 'Bareilly', 'Chennai', 'Bangalore']
>>> random.shuffle(cities)
>>> cities
['Bangalore', 'Kasganj', 'Najibabad', 'Chennai',
'Bareilly', 'Etah', 'Dhampur']
>>> random.shuffle(cities)
>>> cities
['Bareilly', 'Najibabad', 'Chennai', 'Kasganj',
'Dhampur', 'Bangalore', 'Etah']
This function modifies the list in-place; it does not return a new list.
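If the original order must be kept, one option is random.sample, which returns a new list. The sketch below is illustrative: the seed value 42 is an arbitrary assumption used to make runs repeatable, and no shuffled output is shown because it depends on the random number generator.

```python
import random

random.seed(42)   # arbitrary seed, only to make repeated runs give the same result

colors = ['red', 'blue', 'green', 'yellow']

# shuffle() works in-place and returns None, so do not assign its result.
working = colors.copy()
result = random.shuffle(working)
print(result)     # None

# sample() with the full length gives a shuffled copy; colors is untouched.
shuffled_copy = random.sample(colors, len(colors))
print(shuffled_copy)
print(colors)     # ['red', 'blue', 'green', 'yellow']
```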
4.19 Creating a list
We know that the simplest way of creating a list is to write the list literal and
make a variable name refer to it.
L = [11, 22, 33, 44]
You would often like to construct your list dynamically at run time. You can
do this by starting with an empty list and adding items at run time using the
append or extend methods.
L = []
item = input('Enter an item : ')
L.append(item)
item = input('Enter another item : ')
L.append(item)
If you have an existing iterable that you want in list form, then you can use
the list function. This function can be used to convert other iterables to a
list.
>>> L = list('blue')
>>> L
['b', 'l', 'u', 'e']
The function call list('blue') produces a list of individual characters
of the string 'blue'. The list function can take any object of iterable
type so that you can use other collections like dictionaries, tuples, or sets.
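Here is a short sketch of the list function applied to other iterables. The variable names and values are illustrative. Note that a dictionary yields only its keys.

```python
from_tuple = list((1, 2, 3))
print(from_tuple)     # [1, 2, 3]

from_dict = list({'a': 1, 'b': 2})   # only the keys are taken
print(from_dict)      # ['a', 'b']

from_range = list(range(3))
print(from_range)     # [0, 1, 2]

# A set has no defined order, so sort the result if the order matters.
from_set = sorted(list({30, 10, 20}))
print(from_set)       # [10, 20, 30]
```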
You can also make a new list by making a copy of an existing list. We will
discuss copying in detail in the following sections.
4.20 Using range to create a list of integers
Lists containing a range of integers are very common. We can use the built-
in range function to create these types of lists. The range function
generates a sequence of integers.
range(3,10) 3, 4, 5, 6, 7, 8, 9
range(2,7) 2, 3, 4, 5, 6
The call range(3,10) generates integers from 3 up to, but not including, 10.
The first number is included, but the second number is excluded. Similarly,
the call range(2,7) generates integers from 2 up to 7 (excluding 7). If we
place these calls inside the list function, we will get a list of integers.
>>> list(range(3, 10))
[3, 4, 5, 6, 7, 8, 9]
>>> list(range(2, 7))
[2, 3, 4, 5, 6]
We can use a step as the third argument to the range function, as we had
used in slice notation.
>>> list(range(1, 20, 2))
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
>>> list(range(20, 100, 10))
[20, 30, 40, 50, 60, 70, 80, 90]
>>> list(range(10, 2, -1))
[10, 9, 8, 7, 6, 5, 4, 3]
>>> list(range(100, 20, -10))
[100, 90, 80, 70, 60, 50, 40, 30]
In the call range(1, 20, 2), 2 is the step, so we get a list of odd
numbers from 1 to 19. In the second example, we have used 10 as the step
value. The step can be negative also, as we have in the last two examples.
If there is only one argument in the range function, then we get a list from
0 to that number minus 1.
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(6))
[0, 1, 2, 3, 4, 5]
We will also use the range function in for loops, so it is good to become
familiar with it.
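A few edge cases of range are sketched below with illustrative values. When the step cannot move from the start towards the stop, the result is empty rather than an error.

```python
empty_up = list(range(5, 2))         # start is already past stop, default step 1
print(empty_up)      # []

empty_down = list(range(2, 5, -1))   # a negative step cannot reach a larger stop
print(empty_down)    # []

uneven = list(range(0, 10, 3))       # stops before 10; 9 is the last value
print(uneven)        # [0, 3, 6, 9]

negatives = list(range(-3, 3))       # negative start values are allowed
print(negatives)     # [-3, -2, -1, 0, 1, 2]
```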
4.21 Using the repetition operator to create a
list of repeated values
The repetition operator can be used to initialize a list with the same initial
value for all the elements. Here are some examples:
>>> [0] * 15
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> [''] * 5
['', '', '', '', '']
>>> [None] * 4
[None, None, None, None]
The expression [0] * 15 gives a list of 15 elements, all initialized to 0.
The expression [''] * 5 creates a list that contains 5 empty strings. In the
last example, we get a list of size 4 with all its elements None.
There is another Pythonic way of creating a list called List Comprehension.
We will discuss that later in a separate chapter.
4.22 Creating a list by splitting a string
We have seen that the list function can break a string into individual
characters.
>>> L = list('I love Python')
>>> L
['I', ' ', 'l', 'o', 'v', 'e', ' ', 'P', 'y', 't',
'h', 'o', 'n']
If you want to break the string into words and make a list of words in the
string, you can use the split method.
s.split(sep) splits the string using sep as the separator string.
The split method of str type splits a string on a separator into a list of
substrings. If the separator is not specified or is None, then any whitespace
acts as a separator. Whitespace can be space, tab, or a newline.
>>> L = 'I love Python'.split()
>>> L
['I', 'love', 'Python']
Here are some more examples:
>>> phone = '011-395-343343'
>>> phone.split('-')
['011', '395', '343343']
>>> date = '22/11/1987'
>>> date.split('/')
['22', '11', '1987']
>>> student = 'Sam 23 Mechanical A+'
>>> student.split()
['Sam', '23', 'Mechanical', 'A+']
In the call student.split(), we have not provided any separator, so
splitting is done on whitespace characters.
We can limit the number of splits by specifying a second argument.
>>> phone = '011-395-343343'
>>> phone.split('-',1)
['011', '395-343343']
We have sent 1 as the second argument, so now only one split is done.
If we have a multiline string and we want to break it into single line strings,
then we can use either the split method with the newline character ('\n') as
the separator or we can use the splitlines method. The method
splitlines() splits a multiline string into a list of single-line strings. In
the next example, we have a multiline string enclosed in triple quotes.
>>> quote = '''When failure knocks you down,
... rise again,
... keep moving
... never give up'''
>>> quote.split('\n')
['When failure knocks you down,', 'rise again,',
'keep moving', 'never give up']
>>> quote.splitlines()
['When failure knocks you down,', 'rise again,',
'keep moving', 'never give up']
If we call the split method on this multiline string without any argument,
then splitting will be done on every run of whitespace (spaces as well as
newlines) instead of only on newline characters.
>>> quote.split()
['When', 'failure', 'knocks', 'you', 'down,',
'rise', 'again,', 'keep', 'moving', 'never',
'give', 'up']
Note that split() and splitlines() are string methods, not list
methods. They are called on string objects but return list objects.
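Two details are easy to miss and are sketched below with illustrative strings. Calling split() with no argument treats a run of spaces as one separator, while split(' ') splits at every single space. Similarly, splitlines() ignores a final newline, while split('\n') produces a trailing empty string.

```python
text = 'red  green   blue'      # two spaces, then three spaces

no_arg = text.split()
print(no_arg)        # ['red', 'green', 'blue']

one_space = text.split(' ')
print(one_space)     # ['red', '', 'green', '', '', 'blue']

lines = 'one\ntwo\n'            # note the newline at the end

by_newline = lines.split('\n')
print(by_newline)    # ['one', 'two', '']

by_lines = lines.splitlines()
print(by_lines)      # ['one', 'two']
```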
4.23 Converting a list of strings to a single
string using join()
The string method named join() is the reverse of the split() method.
It takes a list of strings as an argument and returns a string in which the
string elements of the list have been joined by a separator string. The method
is called on the separator string, and the list of strings is sent as the
argument. Here are some examples:
>>> L = ['15', 'May', '2005']
>>> '/'.join(L)
'15/May/2005'
We have called the join method on the string '/' and sent list L as the
argument. This call gave us a string object in which the elements of the list
have been joined by '/'. Let us try calling this method on different strings.
>>> '.'.join(L) # joined by dots
'15.May.2005'
>>> ' '.join(L) # joined by spaces
'15 May 2005'
>>> ''.join(L) # called on empty string
'15May2005'
The list sent as the argument should be a list of strings only, not a list of
integers or floats or any other type. Instead of a list, we can use any other
iterable that contains strings. So, we can even use a tuple of strings or a set
of strings.
If we send a string as the argument, then all the characters of the string are
joined.
>>> '-'.join('Python')
'P-y-t-h-o-n'
>>> ' '.join('Python')
'P y t h o n'
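The strings-only requirement can be sketched as follows. The names parts and text_parts are illustrative. Joining a list of integers raises TypeError, so each item is first converted with str.

```python
parts = [15, 5, 2005]

try:
    '/'.join(parts)              # the items are integers, not strings
    join_failed = False
except TypeError as error:
    join_failed = True
    print('TypeError:', error)

# Convert each item to a string, then join.
text_parts = []
for part in parts:
    text_parts.append(str(part))

date = '/'.join(text_parts)
print(date)                      # 15/5/2005

# split() with the same separator reverses the join.
print(date.split('/'))           # ['15', '5', '2005']
```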
4.24 List of Lists (Nested lists)
Lists can contain elements of any type, including lists. We get a nested list
when a list appears as an element in another list. Here is an example of a
nested list:
L = ['blue', [3,4,5], 34]
The inner list [3,4,5] is the second element of the list L, so to access it,
we can write L[1], and to access the first element of the inner list, we can
write L[1][0]. To access the second element of the inner list, we can write
L[1][1] and so on. In the next example, we have a list with all its
elements as lists.
listA = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [9, 10,
11]]
The nested list structure is often used to represent matrices. For example, the
following matrix of three rows and four columns can be represented by a
nested list of three elements where each element is a list of size 4.
Figure 4.2: Matrix of size 3 X 4
>>> A = [
... [1, 4, 8, 3],
... [2, 5, 6, 3],
... [1, 9, 5, 8]
... ]
We can extract a single row using a single index, and a single element of the
matrix using double indexes.
>>> A[0]
[1, 4, 8, 3]
>>> A[1]
[2, 5, 6, 3]
>>> A[1][2]
6
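Building on this, the sketch below finds the size of the matrix, collects one column with a loop, and changes a single element. The names rows, cols and column_2 are illustrative.

```python
A = [
    [1, 4, 8, 3],
    [2, 5, 6, 3],
    [1, 9, 5, 8],
]

rows = len(A)        # number of inner lists
cols = len(A[0])     # length of the first row
print(rows, cols)    # 3 4

# A column is not stored as a list, so collect it row by row.
column_2 = []
for row in A:
    column_2.append(row[2])
print(column_2)      # [8, 6, 5]

A[1][2] = 60         # double indexing also works for assignment
print(A[1])          # [2, 5, 60, 3]
```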
4.25 Copying a list
If we have a list L, the statement L1 = L does not make an independent
copy of the list, and no new object is created. This assignment only makes an
alias: both L and L1 refer to the same object. If we make any changes
through L or L1, the changes are visible through the other name as well.
This is called aliasing.
Figure 4.3: Variables L and L1 refer to the same list object
When dealing with objects of immutable types, such as integers or strings,
aliasing does not matter much, because neither variable can change the
shared object. But with mutable types like lists and dictionaries, aliasing can
lead to unexpected and undesirable behavior, because a change made
through one name is visible through every other name that refers to the
same object. With immutable objects, there is no such problem. That is why
Python itself can safely alias some immutable objects, such as small strings,
as an optimization.
If we need to make an independent copy of a list, there are three ways. The
first is to use the slice notation, which we have already seen. The second is
to use the list function, and the third is to use the list copy method. In
all three ways, a new list object is created.
>>> L = [1, 2, 3, 4]
>>> L1 = L # makes an alias
>>> L2 = L[:] # makes a copy by slice notation
>>> L3 = list(L) # makes a copy by list function
>>> L4 = L.copy() # makes a copy by copy method
>>> L1
[1, 2, 3, 4]
>>> L2
[1, 2, 3, 4]
>>> L3
[1, 2, 3, 4]
>>> L4
[1, 2, 3, 4]
Let us see the ids of objects that L, L1, L2, L3, L4 are referring to.
>>> id(L)
1453383532736
>>> id(L1)
1453383532736
>>> id(L2)
1453383547200
>>> id(L3)
1453340650560
>>> id(L4)
1453383543744
We can see that the ids of L and L1 are the same, which shows that they
refer to the same object and, hence, are aliases. ids of L2, L3 and L4 are
all different from the id of L, which shows that they are separate independent
copies and not aliases. So, in all these three cases, new objects are created.
Any changes you make to any of these copies will not be reflected in the
original object.
>>> L2[0] = 35
>>> L2
[35, 2, 3, 4]
>>> L
[1, 2, 3, 4]
We changed the first element of L2 to 35, but L remains unchanged. Now,
let us make some changes in L3 and L4.
>>> L3.append(45)
>>> L3
[1, 2, 3, 4, 45]
>>> L
[1, 2, 3, 4]
>>> L4[1] += 100
>>> L4
[1, 102, 3, 4]
>>> L
[1, 2, 3, 4]
Any changes made to L2, L3, or L4 do not affect L. However, any changes
made to the alias L1 will affect the original object to which L is referring.
>>> L1[0] = 99
>>> L1
[99, 2, 3, 4]
>>> L
[99, 2, 3, 4]
We can see that L has changed now. Similarly, any changes made to L will
be reflected in the alias L1.
>>> L[1] = 1000
>>> L
[99, 1000, 3, 4]
>>> L1
[99, 1000, 3, 4]
We changed list L, and the alias L1 also changed. This change in L will not
change L2, L3, or L4 since they are independent copies.
>>> L2
[35, 2, 3, 4]
>>> L3
[1, 2, 3, 4, 45]
>>> L4
[1, 102, 3, 4]
4.26 Shallow copy and deep copy
We saw three ways of copying a list. The copy created in these three ways is
a shallow copy; it is just a top-level copy. Let us see what it means.
L = [1, 2, 3, 4]
L2 = L[:] # shallow copy
L2 = list(L) # shallow copy
L2 = L.copy() # shallow copy
Figure 4.4: Variables L and L2 refer to different list objects
We have a list L, and if we create a copy L2 using any of the three ways we
have seen, a new list object is created. This list object contains references to
elements from the original list, meaning the contained objects are not copied.
This is just a one-level copy. This shallow copy will not create any problems
if your list contains only immutable objects, but if your list contains mutable
objects, then this shallow copy can produce unwanted results. Let us see
how.
Now, suppose our list L contains two integers and a list, and we make a copy
L2 by using the copy method. We get a new list object that contains
references to the three contained objects.
>>> L = [12, 13, ['a','b']]
>>> L2 = L.copy()
>>> L2
[12, 13, ['a', 'b']]
Figure 4.5: Copying the list using the copy method
L and L2 refer to different list objects since we have used the copy
method. Now, suppose we make in-place changes to the contained list
through L2.
>>> L2[2].append('c')
>>> L2
[12, 13, ['a', 'b', 'c']]
>>> L
[12, 13, ['a', 'b', 'c']]
Figure 4.6: In-place changes made to the contained list through L2
L2[2] refers to the inner list, so L2[2].append('c') calls append on
the inner list. This call gives us a new element in the inner list. Since the
contained list object is shared by both L and L2, changes made through one
are reflected in the other also.
The inner list changed for L also because the nested list was not copied; only
the reference to it was copied. Immutable contained objects cannot pose any
such problem because they cannot be changed in-place. For example, in our
list, the integer object is immutable; it cannot be changed in-place, so there
is no problem in sharing it. If we write L2[0] = 22, there will be no side
effect; a new object will be created, and L2[0] will refer to this new object
now.
>>> L2[0] = 22
>>> L2
[22, 13, ['a', 'b', 'c']]
>>> L
[12, 13, ['a', 'b', 'c']]
When you have only immutable objects inside your list, a shallow copy is
sufficient. When you have mutable objects inside your list, you must
perform a deep copy to avoid surprises.
To get a deep copy, you need to use the deepcopy() function from the
module copy. It will give you a complete and independent copy of a deeply
nested data structure. It will recursively traverse objects to copy all their
parts. The deepcopy function will not do just one level copying; it extends
the copying to the last level.
>>> L = [12, 13, ['a','b']]
>>> L2 = L.copy()
>>> from copy import deepcopy
>>> L3 = deepcopy(L)
We made a shallow copy using the copy method, then imported the
deepcopy function from the copy module and made a deep copy using
this function. Now, let us see the id of the inner list for all three lists.
>>> id(L[2])
2038538420416
>>> id(L2[2])
2038538420416
>>> id(L3[2])
2038538490816
We can see that L[2] and L2[2] are referring to the same list object, but
L3[2] is referring to a new list object. Let us make some changes in the
inner list through L3.
>>> L3[2].append('c')
>>> L3
[12, 13, ['a', 'b', 'c']]
>>> L
[12, 13, ['a', 'b']]
Now, there was no change in L.
So, we saw the difference between shallow copying and deep copying. In a
shallow copy, only object references are copied; the objects themselves are
not copied. This leads to the aliasing of contained objects. Most of the time,
shallow copying will be fine; deep copying is required only when you have
nested structures like lists within lists or dictionaries within lists.
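The comparison can also be written as a small script. This is only a minimal sketch; the names grid, shallow and deep are made up for illustration.

```python
from copy import deepcopy

# A list whose elements are themselves lists (mutable objects)
grid = [[1, 2], [3, 4]]

shallow = grid.copy()    # new outer list, same inner lists
deep = deepcopy(grid)    # new outer list and new inner lists

# In-place change made through the shallow copy
shallow[0].append(99)

print(grid)   # [[1, 2, 99], [3, 4]]  (inner list is shared with shallow)
print(deep)   # [[1, 2], [3, 4]]      (deep copy is independent)

print(shallow[0] is grid[0])   # True
print(deep[0] is grid[0])      # False
```

The is operator checks identity, so it confirms what the id function showed above: the shallow copy shares its inner lists with the original, while the deep copy does not.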
4.27 Repetition operator with nested lists
When the repetition operator is used with nested lists, we can get unexpected
results. Let us understand this with an example.
>>> L = [12, ['a', 'b']]
>>> L1 = L * 3
>>> L1
[12, ['a', 'b'], 12, ['a', 'b'], 12, ['a', 'b']]
We have a list L, and we made another list L1 by repeating this L three
times. Now, in the list L1, we will make some changes.
>>> L1[1][0] = 'z'
>>> L1
[12, ['z', 'b'], 12, ['z', 'b'], 12, ['z', 'b']]
>>> L
[12, ['z', 'b']]
We had changed only L1[1][0] to 'z', but L1[3][0] and L1[5][0]
also have been changed to 'z'. Even our list L has been changed.
This is because when a new list is built using the repetition operator, Python
copies each item by reference; it will not create new objects. It just creates
references to the same objects. For immutable objects, it is not a problem,
but it can be a problem for mutable objects. So, if the list contains mutable
objects, using the repetition operator on a list can produce unexpected side
effects.
Here is the figure for the example that we have seen:
Figure 4.7: List L1 contains references to the objects of list L
We can see that the new list L1 that we created using the repetition operator
contains references to the objects of the original list. The repetition operator
does not create any new object. The inner list object has four references
referring to it, so any changes made to it through any of the references will
be reflected in all four places. We also have four references to the integer
object, but this will not create any problem as this is an immutable object, so
it cannot be changed in-place.
We can confirm the fact that we have seen in the figure by using the id
function.
>>> id(L[1])
53567698
>>> id(L1[1])
53567698
>>> id(L1[3])
53567698
>>> id(L1[5])
53567698
The ids are the same, which means that all of them refer to the same list
object.
Let us see one more case where this can create problems. We have seen that
we can use the repetition operator to create lists in which all elements have
the same initial values. Suppose we want to create a list of empty lists:
[[], [], [], []]
To get this list, we write the following statement:
>>> L = [[]] * 4
>>> L
[[], [], [], []]
We get a list containing four empty lists, but if we make in-place changes to
any of these inner lists, all the inner lists will be affected since they all refer
to the same object. Let us append an item to the first sublist.
>>> L[0].append(12)
>>> L
[[12], [12], [12], [12]]
We appended a value to the first sublist of L, but that value has been
appended to all the sublists of L. This is because we have only one list
object, and all the sublists refer to that same object. Let us see one more
example:
Suppose we want to create a matrix of size 4 X 3 (four rows and three
columns) with all its elements initialized to 0.
[ [0,0,0],
[0,0,0],
[0,0,0],
[0,0,0]
]
To represent this matrix, we create a nested list using the repetition operator.
>>> L = [[0] * 3] * 4
>>> L
[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
We get the properly initialized nested list, but changing any inner list will
result in unexpected results.
>>> L[1][2] = 34
>>> L
[[0, 0, 34], [0, 0, 34], [0, 0, 34], [0, 0, 34]]
To avoid these surprising side effects, you should not use the repetition
operator with nested lists. You can write the list directly, or if the desired list
is big, you can use list comprehensions, which we will discuss in a separate
chapter.
Now, let us write the list directly.
>>> L = [[], [], [], []]
>>> L
[[], [], [], []]
>>> L[0].append(12)
>>> L
[[12], [], [], []]
There is no problem now, as all the sublists refer to different objects.
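As a brief preview of the list comprehension approach mentioned above, here is a minimal sketch. The names rows, cols, matrix and empty_lists are made up for illustration, and the comprehension syntax itself is explained in a later chapter.

```python
rows, cols = 4, 3

# The expression [0] * cols is evaluated again for every row,
# so each row is a separate list object
matrix = [[0] * cols for _ in range(rows)]

matrix[1][2] = 34
print(matrix)                   # [[0, 0, 0], [0, 0, 34], [0, 0, 0], [0, 0, 0]]
print(matrix[0] is matrix[1])   # False

# The same idea gives a list of independent empty lists
empty_lists = [[] for _ in range(4)]
empty_lists[0].append(12)
print(empty_lists)              # [[12], [], [], []]
```

Using [0] * cols for the inner rows is safe because integers are immutable; it is only the repetition of the mutable rows that has to be avoided.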
4.28 Tuples
Like lists, tuples are ordered sequences of elements, but they are immutable,
which means that once a tuple is defined, it cannot be changed. You cannot
dynamically add or remove elements as you do in lists. All the elements have
to be defined at the time of creation. The word ‘tuple’ can be pronounced as
either ‘toople’ or ‘tupple’. A tuple allows mixed types and can have duplicate
values. It is a referential data structure like a list, which means that it
contains just references to objects. So, a tuple is like a list, but unlike a list, a
tuple is immutable, which means that a tuple object cannot be changed in-
place. A tuple object, once created, cannot be modified. For example, the
following tuple will always contain the four references in the same order.
They will always refer to the same objects. You cannot make these
references refer to some other object, nor can you add or remove any item
from this tuple.
Figure 4.8: Tuple object
So, a tuple is a fixed-length data structure whose items cannot be changed.
When you have data that needs to be ordered and will not change, put it
inside a tuple. Here are some examples of tuple literals.
('Joe', 22, 15000)
('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat')
()
(2,)
A tuple literal is written as a comma-separated series of values enclosed in
parentheses. An empty pair of parentheses denotes an empty tuple, and a
tuple of only one element should contain a comma following that element
before the closing parenthesis. If you write the tuple in the last example as
(2) instead of (2,), it will be wrong because without the trailing comma,
the expression is considered a simple parenthesized numeric expression. The
expression (2,) is of type tuple, but the expression (2) is of type int.
So, for a tuple of size 1, you need the trailing comma.
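A quick check with the type and len functions shows the effect of the trailing comma. This is a small illustrative sketch; the variable names are arbitrary.

```python
a = (2)      # just the integer 2 inside parentheses
b = (2,)     # a tuple with one element
empty = ()   # an empty tuple

print(type(a))      # <class 'int'>
print(type(b))      # <class 'tuple'>
print(len(b))       # 1
print(len(empty))   # 0
```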
Sometimes, the parentheses enclosing the elements of a tuple can be omitted.
For example, the tuple ('Joe', 22, 15000) can be written as
'Joe', 22, 15000 also. It is better to put the parentheses as it improves
clarity and makes the tuple more visible. In some cases, you are not allowed
to omit these parentheses, as we will discuss later in the functions chapter
when you send a tuple as a function argument.
Tuples are sequences like strings and lists, supporting usual sequence
operations like indexing and slicing. Elements of a tuple can be accessed by
writing an index inside square brackets or by using slices.
>>> days = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu',
'Fri', 'Sat')
>>> days[2]
'Tue'
>>> days[-1]
'Sat'
>>> days[2:5]
('Tue', 'Wed', 'Thu')
Since tuples are immutable, index and slice assignments are not allowed.
>>> days[2] = 'Tuesday'
TypeError: 'tuple' object does not support item
assignment
We can create a tuple by writing the tuple literal, and the parentheses may be
omitted, as we have discussed, or we can use the tuple function to convert
any iterable to a tuple.
>>> numbers = (10, 20, 30, 40)
>>> numbers
(10, 20, 30, 40)
>>> days = 'Sun', 'Mon', 'Tue'
>>> days
('Sun', 'Mon', 'Tue')
>>> t1 = tuple([1, 2, 3])
>>> t1
(1, 2, 3)
>>> t2 = tuple('yes')
>>> t2
('y', 'e', 's')
>>> t3 = tuple(range(3,7))
>>> t3
(3, 4, 5, 6)
Tuples support concatenation and repetition like other sequence types.
>>> t1 = (1, 2, 3)
>>> t2 = (4, 5, 6)
>>> t1 + t2
(1, 2, 3, 4, 5, 6)
>>> t1 * 3
(1, 2, 3, 1, 2, 3, 1, 2, 3)
The expression t1 + t2 gives us a tuple in which we have elements of
both t1 and t2, and the expression t1*3 gives us a tuple in which
elements of the tuple t1 are repeated three times. We can use the augmented
assignment syntax also.
>>> t1 += t2
>>> t1
(1, 2, 3, 4, 5, 6)
The statement t1 += t2 does not change the tuple object referred to by
t1. It just rebinds the name t1 to a different object. A new object will be
created, and that will be assigned to the name t1. It is actually equivalent to
writing:
t1 = t1 + t2
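One way to see the rebinding is to keep a second name for the original tuple before using +=. The following is a minimal sketch; the names alias, list1 and list_alias are made up for illustration.

```python
t1 = (1, 2, 3)
alias = t1          # second name for the same tuple object
t1 += (4, 5, 6)     # builds a new tuple and rebinds the name t1

print(t1)           # (1, 2, 3, 4, 5, 6)
print(alias)        # (1, 2, 3)  (the original tuple is untouched)
print(t1 is alias)  # False

# For comparison, += on a list changes the list object in-place
list1 = [1, 2, 3]
list_alias = list1
list1 += [4, 5, 6]

print(list_alias)           # [1, 2, 3, 4, 5, 6]
print(list1 is list_alias)  # True
```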
Tuples can be compared for their values and identities, and we can use the
in and not in operators with tuples to check the membership of items.
>>> t1 = (1, 2, 3, 'black')
>>> t2 = (1, 2, 3, 'black')
>>> t1 == t2
True
>>> t1 != t2
False
>>> t1 is t2
False
>>> t1 is not t2
True
>>> 2 in t1
True
>>> 2 not in t1
False
There are only two methods available for a tuple - count and index. The
call T.count(x) returns the number of occurrences of x in tuple T, and
T.index(x) returns the index of the first occurrence of x in tuple T. As with
the index method of lists, you can pass optional start and end arguments to
index to restrict your search.
Since a tuple is immutable, the delete, append, or insert operations are not
defined for tuples. There is no copy method for a tuple, so if you want to
copy a tuple, you can use the copy and deepcopy functions of the copy
module.
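Here is a short sketch of the two tuple methods in use. The tuple marks is a made-up example.

```python
marks = (55, 32, 55, 67, 3, 55)

print(marks.count(55))         # 3
print(marks.index(55))         # 0, index of the first occurrence

# Optional start and end arguments restrict the search
print(marks.index(55, 1))      # 2, search starts at index 1
print(marks.index(55, 3, 6))   # 5, search in indices 3 to 5
```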
Tuples are immutable, so we cannot change a tuple, but if the tuple contains
a mutable object, for example, a list, we can change that referenced object.
So, a tuple itself cannot be changed, but what it contains can be changed if
mutable. Let us see an example:
>>> student = ('Ted', 25, [88, 70, 92])
>>> student[1] = 90
TypeError: 'tuple' object does not support item
assignment
We have a tuple named student that contains a string, an integer, and a
list. It contains references to three objects. Since a tuple is immutable, it
cannot be changed. Its length will always be 3, and it will always contain
references to these objects. You cannot make these references refer to any
other object. So that is why when we write student[1] = 90, we get an
error.
Figure 4.9: Tuple containing a mutable object
From the three objects whose references are contained inside the tuple, the
first two (str and int) are immutable, but the third one, which is a list, is
mutable, so it can be changed in-place. So, we can write:
>>> student[2][1] = 90
>>> student
('Ted', 25, [88, 90, 92])
This is valid because although student refers to an immutable object,
student[2] refers to a mutable object. So, we can make any in-place
changes in student[2]. Thus, the second reference in the list now refers
to a new integer object with a value of 90. When we printed the tuple, we
could clearly see a change in it. If a tuple contains a mutable type, we might
see a change in it.
The other immutable type that we have seen is str. A string can never be
changed in any way because it is not a referential structure and does not
contain references to characters. It physically holds the characters in
contiguous memory.
We have seen that tuples are like lists except for the fact that they are
immutable. They have only two methods available. You must be wondering
why we need tuples when the list type is already there. The answer is that we
need tuples because of their immutability. Since they are immutable, they
provide a sort of safety to your data. If you have a sequence of items, and
you create a list out of them and pass that list in the program, chances are
that it might be modified at some point in your program because lists are
mutable and can be changed. However, if you put your data inside a tuple, it
cannot be changed. So, it is safe to use a tuple if you do not want your data
to be changed. As long as the elements of a tuple are themselves immutable,
there can be no aliasing problems with it.
Tuples are processed faster than lists. This is because their contents do not
change, so Python can implement some optimizations, which make tuples a
little faster than lists.
Tuples allow a function to return multiple values. We will discuss this later
when we learn about functions. Some built-in methods and functions like
enumerate, divmod, zip use this feature and return multiple values in
the form of tuples. So, even if you do not create your own tuple, you might
have to use tuples that are returned by functions or methods that you use
from the standard library or other packages.
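The following sketch shows the tuples produced by the three built-in functions named above. The variable names are arbitrary, and the results of enumerate and zip are converted to lists only so that they can be printed.

```python
# divmod returns the quotient and the remainder as a tuple
result = divmod(17, 5)
print(result)          # (3, 2)
print(type(result))    # <class 'tuple'>

# enumerate produces (index, item) tuples
pairs = list(enumerate(['a', 'b', 'c']))
print(pairs)           # [(0, 'a'), (1, 'b'), (2, 'c')]

# zip produces tuples of corresponding items
zipped = list(zip([1, 2, 3], 'xyz'))
print(zipped)          # [(1, 'x'), (2, 'y'), (3, 'z')]
```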
Tuples can be used as keys in a dictionary. We will learn about dictionaries
in the next chapter. Only immutable types like strings and integers can be
used as keys of a dictionary. We cannot use a list as a key as it is mutable. A
tuple can be used as a key if it contains only immutable elements; if it
contains any mutable element directly or indirectly, it cannot be used as a
key.
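As a small preview, the sketch below uses one tuple as a dictionary key and then tries a tuple that contains a list. The dictionary capitals and the other names are made up for illustration; dictionaries are covered in the next chapter, and the try and except statements used here to catch the error are explained later in the book.

```python
# A tuple that contains only immutable objects can be a key
capitals = {('India', 'IN'): 'New Delhi'}
print(capitals[('India', 'IN')])   # New Delhi

# A tuple that contains a list cannot be used as a key
message = ''
try:
    bad = {('Ted', [88, 70]): 'student'}
except TypeError as err:
    message = str(err)
    print('TypeError:', message)
```

The exact wording of the error message depends on the Python version, but it reports that the list inside the tuple is unhashable.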
We have seen that tuples are safer and faster than lists, allow us to return
more than one thing from a function, and can be used as dictionary keys. So,
suppose you have an ordered sequence of values that you are sure will not
change. In that case, it is better to use a tuple for better performance and
safety. Using a tuple also conveys the message to the reader of your program
that you do not intend the sequence of values to be changed.
Although both lists and tuples allow data of mixed type, lists are usually
homogeneous, while tuples are usually heterogeneous. In the real world,
tuples are mostly used to store records. Lists are generally iterated over
using loops, while tuple elements are usually accessed using unpacking. In
the next section, we will discuss tuple packing and unpacking.
4.29 Tuple packing and unpacking
The following assignment statement packs data into a tuple.
>>> employee = ('Raj', 20, 'Delhi', 15000)
The four values are packed into a tuple, and this tuple is assigned to the
name employee. We could write this statement without the parentheses
also.
>>> employee = 'Raj', 20, 'Delhi', 15000
This is called packing a tuple. Unpacking is the reverse of packing. We can
use tuple unpacking to extract data from it.
>>> name, age, city, salary = employee
In this statement, we are assigning a single tuple to multiple variables. So
here, the first value of employee tuple is assigned to name, second to
age, third to city, and fourth to salary. The packing and unpacking can
be done at the same time in a single line.
>>> name, age, city, salary = ('Raj', 20, 'Delhi',
15000)
Here, first, the 4 values that are there on the right side are packed into a
tuple, and then they are unpacked. The variable name is bound to the string
'Raj', age is bound to 20, city is bound to 'Delhi' and salary is bound
to 15000. Parentheses are not necessary, so you can write it like this also.
>>> name, age, city, salary = 'Raj', 20, 'Delhi',
15000
This is why you can do multiple assignments in a single statement in Python.
>>> a, b, c = 2, 30, 1
When we write a statement like this, multiple assignments are being done.
This is also called simultaneous assignment; a is assigned value 2, b is
assigned value 30, and c is assigned value 1. We have seen this in the second
chapter. What actually happens is that the three values on the right-hand side
are automatically packed into a tuple. Then, that tuple is automatically
unpacked, with its elements assigned to the three variables on the left-hand
side. So, now you know that behind this multiple assignment technique of
Python, there is tuple packing and unpacking going on.
One application of tuple unpacking is swapping the values of two variables
without using a temporary variable. In other languages, you would swap the
values of two variables, x and y, like this.
temp = x
x = y
y = temp
In Python, you can do it in a single statement by using a tuple assignment.
x, y = y, x
This is the Pythonic way of swapping two values. There was no need to
create any temporary variable to hold the data temporarily while swapping
the values. The right-hand side is evaluated first, so the two values are
packed in a tuple, and then that tuple is unpacked. The first value is assigned
to x, and the second value is assigned to y. So, the old value of y is assigned
to x, and the old value of x is assigned to y. The unnamed tuple that is
automatically packed and unpacked implicitly serves as the temporary
variable.
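The same technique extends to more than two variables. Here is a minimal sketch with made-up values that swaps two variables and then rotates three.

```python
x, y = 5, 10
x, y = y, x          # right-hand side is evaluated first, then unpacked
print(x, y)          # 10 5

# Rotating three values in one statement
a, b, c = 1, 2, 3
a, b, c = b, c, a
print(a, b, c)       # 2 3 1
```

Because the whole right-hand side is packed into a tuple before any name on the left is rebound, none of the old values is lost during the assignment.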
The unpacking works not only for tuples. It can work for any iterable type.
>>> x, y, z = [1, 2, 3]
>>> print(x, y, z)
1 2 3
>>> first, second, third = 'not'
>>> print(first, second, third)
n o t
>>> d, m, y = '22/11/1987'.split('/')
>>> print(d, m, y)
22 11 1987
>>> a, b, c, d = range(3, 7)
>>> print(a, b, c, d)
3 4 5 6
In the first example, we are unpacking a list; x gets the value 1, y gets 2, and
z gets 3. Next, we have unpacked a string so the variables first, second
and third get values 'n', 'o' and 't' respectively. In the next example,
the split method returns a list of strings, so the variables d, m, and y get
the values '22', '11', and '1987', respectively. In the last example, we are using
unpacking with the range function, so variable a is 3, b is 4, c is 5, and d
is 6. We can use this trick to assign names to a range of values.
>>> black, white, green, blue, red, yellow =
range(1,7)
>>> (MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY,
SATURDAY, SUNDAY) = range(7)
This example that defines integer constants for days of the week is from the
built-in calendar module.
We can call the split method on the string returned by input function,
and unpack the list returned by the split method.
>>> c1, c2, c3 = input('Enter three colours :
').split()
Enter three colours : red blue green
>>> print(c1, c2, c3)
red blue green
This way, we can break the input and, therefore, ask the user to enter values
for multiple variables using a single input call.
Unpacking will give you an error if the number of variables on the left side
is not equal to the number of elements in the right-side collection.
>>> x, y = [1, 2, 3]
ValueError: too many values to unpack (expected 2)
>>> w, x, y, z = [1, 2, 3]
ValueError: not enough values to unpack (expected
4, got 3)
In the next example, we have a list that contains a string, an integer, and a
tuple. If we try to unpack the list with five variables on the left, we get an
error because the number of values on the left-hand side is not equal to the
number of values in the right-hand side collection.
>>> L = ['Dev', 10, (29, 4, 2013)]
>>> name, age, d, m, y = L
ValueError: not enough values to unpack (expected
5, got 3)
The correct way to unpack is by enclosing variables d, m, and y in
parentheses.
>>> name, age, (d, m, y) = L
>>> print(name, age, d, m, y)
Dev 10 29 4 2013
So, name gets the value 'Dev', age gets the value 10, and variables d, m,
and y get the values 29, 4, and 2013 respectively. This is one of the
examples of situations where parentheses of a tuple cannot be omitted.
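The pattern on the left-hand side can mirror the nesting of the data to any depth, and the inner collection does not have to be a tuple. The following is a minimal sketch with a made-up record.

```python
# A record with a nested tuple and a nested list
record = ('A', (3, 4), ['red', 'solid'])

# The shape of the left-hand side mirrors the shape of the data
label, (x, y), (colour, style) = record

print(label)            # A
print(x, y)             # 3 4
print(colour, style)    # red solid
```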
While unpacking, we can ignore some values from the tuple if we do not
need them. In the following example, we have a tuple named employee,
and we are unpacking it.
>>> employee = ('Raj', 25, 'Delhi', '[email protected]',
'XY289', 15000)
>>> name, age, city, email, id, salary = employee
Suppose we do not want the last two values of this tuple. We want to unpack
only the first four values. We have seen that if we write only four variables
on the left side, we will get an error as the number of variables on the left is
not equal to the number of values on the right side.
>>> name, age, city, email = employee
ValueError: too many values to unpack (expected 4)
The solution to this problem is to give any dummy name to satisfy the
syntax. The convention is to use an underscore, which is a valid name in
Python.
>>> name, age, city, email, _ , _ = employee
>>> name, _ , city, email, _ , salary = employee
In the first statement, we have ignored the last 2 values; in the second
statement, we have ignored the second and fifth values. This way, we can
ignore some values and satisfy the interpreter.
You could use any other variable name here instead of the underscore. There
is nothing special about this underscore. For example, you can use the name
dummy here.
>>> name, dummy, city, email, dummy, salary =
employee
However, using an underscore is a convention and it is easier to type a single
underscore than typing any other variable name. If you want to ignore
multiple adjacent values, you can use an asterisk before a variable name.
>>> name, *_ , salary = employee
Here, name will get the first value of the tuple, salary will get the last
value, and all other values in between are ignored. Again, using an
underscore here is the convention. You can use any other variable also. For
example, we have used the name skip here.
>>> name, *skip, salary = employee
All the values that we have ignored will actually be collected in a list named
skip. In the previous statement where we used *_, the name of the list will
be _, which is a valid name.
You might not always want to throw away the values, so in that case, you can use
a meaningful variable name instead of the throwaway variable _.
>>> record = ('Ted', 25, 'Paris', 'Java', 'C++',
'C', 'Python')
>>> name, age, city, *languages = record
>>> languages
['Java', 'C++', 'C', 'Python']
Here, we know that the first element is name, the second is age, the third is
city, and after that, every element is a language. So, we have collected the
remaining elements in the list named languages. Here is another
example:
>>> author = ('Learn C', 'Python Programming',
'Data structures', 'Alex', '[email protected]')
>>> *books, name, email = author
>>> books
['Learn C', 'Python Programming', 'Data
structures']
Here, we know that the second last element is name, the last is email, and
before that, everything is the name of a book. So, we have placed the starred
variable in the beginning.
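The starred variable always receives a list, even when nothing is left over for it, and it works with any iterable. Here is a minimal sketch with made-up values.

```python
# The starred name gets an empty list when no values remain
first, *rest = [10]
print(first, rest)           # 10 []

# The starred name can also sit in the middle, with any iterable
head, *middle, tail = 'python'
print(head, middle, tail)    # p ['y', 't', 'h', 'o'] n
```

Only one starred variable is allowed on the left-hand side; a statement with two of them is rejected as a syntax error.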
Exercise
What will be the output of the code given in Questions 1 to 48?
1. listA = [11, 22, 33, 44]
print(listA[2.0])
2. listA = [1, 2, 3, 4]
listA[3] = 100
print(listA)
3. listA = [4, 5, 6, 7, 8, 9, 10, 11,
12, 1, 3, 14, 15, 16, 17]
print(listA[2:9:2])
4. listB = [10, 20, 30]
listB[3] = 40
print(listB)
5. listA = [1, 2, 3, 4, 5, 6, 7, 8, 9]
listA[2:4] = [10, 20, 30, 40, 50]
print(listA)
6. listA = [1, 2, 3, 4, 5, 6, 7, 8, 9]
listA[3:5] = []
print(listA)
7. listA = [1, 2, 3, 4, 5, 6, 7, 8, 9]
listA[3] = []
print(listA)
8. listA = [1, 2, 3, 4, 5, 6]
print(listA[4:4])
9. listA = [1, 2, 3, 4, 5]
listA[3:3] = 'abcd'
print(listA)
10. numbers = [2, 4, 11, 6, 3, 9, 19]
print(10 not in numbers)
11. listA = [1, 2, 3]
listA = listA * 3
print(listA)
12. listA = ['ab', 'cd', 'ef', 'gh']
x = sum(listA)
print(x)
13. L = [''] * 3
print(L)
14. listA = list('Welcome')
print(listA)
15. L = list(range(5))
print(L)
16. L = list(range(100, 0, 10))
print(L)
17. L = list(range(3, 15, 3))
print(L)
18. avengers = 'Thor,Iron man,Hulk,Ant-Man'
listA = avengers.split(',')
print(listA)
19. print('ab-cd-de-fg-hi-jk'.split('-',3))
20. listA = [1, 2, 3]
numbers = [10, listA, 20]
del listA
print(numbers)
21. a = 1
b = 2
c = 3
list1 = [a, b, c]
b = 100
print(list1)
22. L = [[]] * 3
L[2].append('x')
print(L)
23. names = ['Ami', 'Sam', 'Amitabh', 'Jim']
print(names[-2][-3])
24. names = ['Ami', 'Jim', 'Tim', 'Ron']
names.append(['Dev', 'Raj', 'Sam'])
print(len(names))
25. listX = [0] * 5
listX[1] = 45
print(listX)
26. listA = [[0]] * 4
listA[1].extend([4,5])
listA[2].append(9)
print(listA)
27. x = [[11, 2, 6], [5, 9, 1]] * 3
x[0].sort()
x[1] = sorted(x[1])
print(x)
28. t = (6, 7, 8)
t = t * 2
print(t)
29. t = (1, 2, 3, 4)
x, y, z = t
print(x, y, z)
30. t = (1, 2, 3, 4, 5, 6, 7, 8)
x, _, y, *_ = t
print(x, y, _)
31. listA = [4, 5, 6, 7, 8, 9, 10]
listA[2:5] = []
print(listA, end=' ')
listA[2] = []
print(listA)
32. listA = [4, 3, 2, 6]
listA = listA.sort(reverse=True)
print(listA, end=' ')
listB = [9, 4, 3]
listB = listB.append(5)
print(listB)
33. numbers = [1, 2, 3]
numbers.extend([4, 5, 6])
print(len(numbers), end=' ')
numbers.append([7, 8, 9])
print(len(numbers))
34. x = [1, 2, 3]
y = [x] * 4
z = x * 4
print(y, z)
35. date = '09/08/1973'
print('-'.join(date.split('/')))
36. t2 = 4, 5, 6
print(type(t2))
37. t1 = ('hello')
t2 = ('hello',)
print(type(t1), type(t2))
38. a, b, c = range(1, 3)
print(a, b, c)
39. t = (1, 2, 3, 4, 5, 6)
a, b, _, c, d, e = t
print(_)
40. t = (1, 2, 3, 4, 5, 6)
a, b, *_, e = t
print(_)
41. numbers = [1, 2, 3, 4]
print(numbers[:], numbers[::-1])
42. x = list(range(1, 6, 2))
y = list(range(1, 7, 2))
print(x == y)
43. print([10, 20, 30, 40, 50, 60][2:4][1])
44. x = 1, 2, 3
a, b, c = 1, 2, 3
print(x, a, b, c)
45. L1 = [1, 2, 3, 4]
L1.append([])
L2 = [1, 2, 3, 4]
L2.extend([])
print(L1, L2)
46. L1 = [3, 2, 5]
L2 = [6, 8, 1, 9]
x = sorted(L1) + sorted(L2)
y = sorted(L1 + L2)
print(x, y)
47. L1 = [1, 2, 3]
L1 += 100
L2 = [1, 2, 3]
L2[1] += 100
print(L1, L2)
48. numbers = [98, 11, 22, 9, 6, 32, 5]
print(sorted(numbers)[2:4])
49. What are the valid indices for a list of length 4?
(A) 1, 2, 3, 4 (C) 0, 1, 2, 3, -1, -2, -3
(B) 0, 1, 2, 3 (D) 0, 1, 2, 3, -1, -2, -3, -4
50. fruits = ['fig', 'apple', 'mango', 'orange']
What is the result of fruits.index('banana') ?
(A) Returns -1 (C) Raises ValueError
(B) Returns None (D) Raises IndexError
51. marks = [86, 93, 93, 67, 92, 89, 92, 93, 52,
92, 91]
What is the value of marks.count(max(marks)) ?
(A) 93 (C) 3
(B) 92 (D) 0
52. Which of these expressions will search for element 12 in last 5
elements of a list L?
(A) L.index(12, 5) (B) L.index(12, -5)
53. listA = [3, 4, 5, 6]
The expression listA += [10]
(A) reassigns listA to a different object (B) makes in-place changes
in listA
54. What is the value of the following expression?
[1,2,3] + 'abc'
(A) [1, 2, 3, a, b, c]
(B) '123abc'
(C) Raises TypeError
55. Which one of these will create an empty list?
(A) listA = []
(B) listA = list()
(C) Both
56. Which of these is not a tuple?
(A) (23) (C) (23,)
(B) (23,5)
57. t = (1, 2, 3, 4)
Which of these are valid operations for tuple t?
(i) t[1] = 100 (ii) t = t + (100,)
(A) only (i) is valid (C) both (i) and (ii) valid
(B) only (ii) is valid (D) both (i) and (ii) invalid
58. student = ('Dev', 32, [12, 13, 14], (88,98))
Which one of these is a valid operation?
(A) student[0] = 'Joseph' (C) student[2][1] = 34
(B) student[0][1] = 'r' (D) student[3][1] = 34
59. Will this code give an error?
L = ['Dev', 25, (12,)]
name, age, d = L
(A) Yes (B) No
In questions 60 to 77, write statements to perform the given
operations on the following list.
numbers = [1, 2, 3, 4, 5, 6, 7, 8]
60. Change the second last element of the list to 200.
61. Replace the elements 3,4,5,6 with elements
30,40,50,60,70,80
62. Replace all the elements from index 3 onwards with the characters of
the string 'pqr'.
Resulting list should be [1, 2, 3, 'p', 'q', 'r']
63. Insert new elements 10, 20, 30, 40, 50 starting at index 5.
Resulting list should be [1, 2, 3, 4, 5, 10, 20, 30,
40, 50, 6, 7, 8]
64. Delete all elements from index 2 to index 5. Resulting list should be
[1, 2, 7, 8]
65. Make a new list named cpy that is a copy of the numbers list.
66. Make a new list named rev that is reverse of the numbers list
67. Add 100 at the end of the list
68. Add 200 in the beginning of the list
69. Add 150 at index 3
70. Add 12,13,14,15 at the end of the list in one step.
71. Delete element 5 from the list.
72. Delete the last element from the list
73. Delete the element at index 5 and store it in a variable
74. Delete the first element from the list
75. Delete all the elements of the list
76. Use the del keyword to delete the element at index 5.
77. Use the del keyword to delete the last 3 elements.
Use the following list for questions 78 to 92
numbers = [12, 32, 55, 67, 3, 55, 68, 22, 55,
89, 55, 1, 19, 32]
Write code to perform the following operations:
78. Find the number of occurrences of 55 in the list.
79. Find the index of the first occurrence of 55 in the list.
80. Find the index of last occurrence of 55 in the list.
81. Find the index of the first occurrence of 55 in a portion of the list,
starting from index 4 to index 9.
82. Find the index of the smallest element of the list.
83. Replace the largest element of the list with 1000.
84. Find the second largest and third smallest elements from the list.
85. Make a new list that contains the three largest elements of the list.
86. Find the sum of the five smallest elements of the list.
87. Find the minimum value of the first half of the list.
88. Find the average of all the elements of the list.
89. Make a new list that contains the 5 largest elements from the list.
90. Make a new list that contains the 5 smallest elements from the list.
91. Sort the list in descending order.
92. Make a new list that contains all the elements of the numbers list in
ascending order. The original list should not change.
93. Sort this list of strings based on their length.
fruits = ['banana', 'fig', 'Mango',
'pomegranate', 'Apple']
94. Perform case insensitive sort on this list of strings.
fruits = ['banana', 'fig', 'Mango',
'pomegranate', 'Apple']
95. Write a statement to create a list of size 20 with all elements
initialized to None.
96. Create the following list by using the range function.
[1000, 900, 800, 700, 600, 500, 400, 300, 200,
100]
97. Create a list of all multiples of 7 greater than 50 and less than 150,
using the range function.
98. listD = ['Pluto', 'Goofy', 'Donald Duck',
'Alice']
Create a string using the join method in which all these strings are
joined by a comma
99. Write an expression that will give you the reverse of the string present
at index 2 of the following list.
fruits = ['apple', 'banana', 'grapes',
'guava']
100. What will be the output of this code?
student = ('John', 25, [88, 90, 92])
student[2].extend([89, 98])
print(student)
101. What will be the output?
L = [1, 2, 3]
X = ['a', L]
X[1][0] = 100
print(L)
What can you do to avoid the side effect that is seen in this code?
102. How would you write this code in Pythonic way?
x = 3
y = 2
temp = x # save old value of x
x = y * x # change x
y = temp # set y to old value of x
print(x, y)
103. What is the difference between L1 = L1 + L2 and
L1.extend(L2) ?
104. What is the difference between listA.clear(), del listA
and del listA[:] ?
105. What is the difference between L1 = L.sort() and L1 =
sorted(L) ?
106. What is the difference between L[:3] = [], L[3]=[] and
L[3:]=[] ?
107. Rewrite the following code using tuple unpacking.
employee = ('Ken', 'London', 26, 4000)
name = employee[0]
city = employee[1]
age = employee[2]
salary = employee[3]
108. Write code to swap first and last values of a list L.
109. Use input function and split method to input 5 colours, separated
by hyphens(-). Collect the input in a list.
5 Dictionaries and Sets
In the previous chapter, we discussed how to store data using lists and tuples.
In this chapter, we will discuss two more data structures named dictionaries
and sets. Dictionaries help you organize and structure your data in a better
way. It is easier to represent real-world data using a dictionary. Both
dictionaries and sets are internally implemented in such a way that they
perform very fast searching.
5.1 Dictionaries
The dictionary data structure is a collection of key-value pairs. Each element
of a dictionary is a key-value pair which is also known as an item. Here is an
example of a dictionary literal:
countries = {'IN': 'India', 'GR': 'Germany', 'MX':
'Mexico', 'JP': 'Japan'}
This dictionary contains four key-value pairs. The strings 'IN', 'GR',
'MX', and 'JP' are keys, and the strings 'India', 'Germany',
'Mexico', and 'Japan' are the corresponding values. The key-value
pairs are separated by commas and are enclosed inside curly braces. In each
pair, the key and the value are separated by a colon. The dictionary literal
has been assigned to the name countries. Typing the name of the
dictionary on the shell prompt or printing it by using the print function
will display all its contents.
>>> countries
{'IN': 'India', 'GR': 'Germany', 'MX': 'Mexico',
'JP': 'Japan'}
In our example dictionary, both keys and values are of str type. They can
be of other types also, but there is a restriction on the type of keys. A key
must be of an immutable type (more precisely, a hashable type); you cannot
have a key of a mutable type such as a list. Therefore, a key can be a string,
an integer, a tuple of immutable values, or any other immutable type; most of
the time, however, it is a string. There is no such restriction on values;
they can be of mutable or immutable types. So, a value in a dictionary could
be a string, integer, list, tuple, dictionary, or any other type.
The other restriction on keys is that they must be unique; duplicate keys are
not allowed. Again, there is no such restriction on values. They can be
duplicated, and the same value can be associated with any number of keys.
So, you cannot have key-value pairs where the keys are the same, but you
can have key-value pairs where the values are the same. A key can appear
only once, while a value can occur many times.
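The two restrictions on keys can be tried out with a short sketch. The names points, bad_key, and grades are made up for this illustration, and the try statement used to catch the error is only a convenience for showing the message.

```python
# A tuple of immutable values is hashable, so it can be used as a key.
points = {(0, 0): 'origin', (1, 0): 'unit x'}
print(points[(0, 0)])          # origin

# A list is mutable and therefore unhashable; using it as a key fails.
bad_key = [0, 0]
try:
    points[bad_key] = 'not allowed'
except TypeError as err:
    print('TypeError:', err)

# Values may repeat freely; only keys must be unique.
grades = {'Ann': 'A', 'Bob': 'A', 'Cy': 'B'}
print(len(grades))             # 3
```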
Like lists and tuples, you can have a trailing comma in a dictionary literal
also.
countries = {'IN': 'India', 'GR': 'Germany', 'MX':
'Mexico', 'JP': 'Japan',}
Dictionaries are mutable data structures like lists, so a dictionary can shrink
or grow at run time, and its elements can be changed. Like a list, a dictionary
is also a referential data structure which means that it contains references to
objects; both keys and values are object references.
Searching in dictionaries is performed by keys. You can provide the name of
the key to retrieve the value associated with that key. For example, in our
countries dictionary, we can get the name of a country from its
abbreviation, which is used as the key. Dictionaries are highly optimized, so
this lookup is very fast. If we try to structure our data of country names and
abbreviations by using a list, it would be difficult to implement and also
would be inefficient.
Now let us discuss how we can access a value corresponding to a given key.
In lists and strings, we use an integer index inside the square brackets to
access a value; in dictionaries, we will use a key inside the square brackets
to retrieve a value. For example, the expression countries['IN'] will
give us the value associated with the key 'IN'.
>>> countries['IN']
'India'
>>> countries['MX']
'Mexico'
Let us discuss some more examples where dictionaries can be used. You will
generally need to create a dictionary when you have some data that is in
tabular form. In Figure 5.1, we have some data samples written in tables.
The first one is the record of a student; the left column is the field name, and
the right column is the value of that field. In the second table, the left
column is the product name, and the right column is its price, and in the third
one, the left column contains the designation, and the right column
represents the associated salary.
Figure 5.1: Data in tabular form
First, let us represent the student data using a list.
student = ['John', 'M', 'Paris', 21, [89,78,91],
True]
When we need to access a student’s name, we will write student[0], and
when we need the student’s age, we will write student[3]. The problem
with this representation is that we must remember that the name is at index
location 0, the gender is at index location 1, and so on. All the values are
there in the list, but there is no information about the values, so a list is
possibly not the best choice here. When the values are identified by their
names, we need to use a dictionary. Let us put the same data in a dictionary.
>>> student = {'name': 'John',
'gender': 'M',
'city': 'Paris',
'age': 21,
'marks': [89, 78, 91],
'is_sporty': True
}
Now there is more information, and we know what each value represents.
The keys are used to describe the data, and values represent the actual data.
To increase the readability of our dictionary, we have placed each key-value
pair on a separate line.
As we have seen before, we can get the value associated with a key by
writing the dictionary name with the key inside the square brackets.
>>> student['name']
'John'
>>> student['city']
'Paris'
>>> student['grade']
KeyError: 'grade'
>>> student['marks'][1]
78
If the key we specify is not present in the dictionary, then we get a
KeyError. 'grade' is not a key in the dictionary, so the expression
student['grade'] raises an error. The last expression will give us a
value of 78 since student['marks'] is a list, and we can use numeric
indexing on that list.
We can see that by using a key as the index instead of an integer index, the
code becomes more readable and self-documenting. Therefore, when item
names are more meaningful than item positions, it makes more sense to have
the items in a dictionary.
Here is the dictionary for the next data sample:
>>> prices = {'pencil': 10,
'pen': 22,
'eraser': 12,
'sharpener': 13,
'marker': 32
}
Here, the product name is the key, and its associated price is the value. If you
need to access the price of a marker, you can write prices['marker'].
If you need to find the total price of 2 pencils, 3 markers, and 5 erasers, you
can write this:
>>> total = 2 * prices['pencil'] + 3 *
prices['marker'] + 5 * prices['eraser']
We can use the built-in function len to find the length of the dictionary.
>>> len(prices)
5
The length of the dictionary is 5, and the len function returns the number of
items in a dictionary, i.e., the number of key-value pairs.
The following dictionary is for the last table of the figure.
>>> salary = {'programmer': 10000,
'manager': 20000,
'accountant': 15000
}
Here, keys represent the designation names, and values are the associated
salaries.
So, when your data is in the form of a two-column table of names and values,
a dictionary is usually the most natural data structure for it. We can extract
any value from a dictionary by using the associated key inside the square
brackets. The data structure gets its name from its resemblance to a real-life
dictionary, in which each word has an associated definition; here, each key
has an associated value. Because it associates keys with values, a dictionary
is also called an associative data structure. It is also known as a mapping
type since it maps keys to their associated values.
Information lookup is faster in dictionaries than in lists. They allow faster
access to values as there is no need to go through each item sequentially as
in a list. A value can be located by going directly to its key. This is because
dictionaries are implemented using a highly optimized hash table, in which the
hash of a key determines where its value is stored. The hash of a key must
therefore never change, which is the reason why keys of mutable types are not
allowed.
Before version 3.7, dictionaries were unordered structures, which means that
the items in a dictionary would not necessarily be in the same order in which
you defined or inserted them. When printing a dictionary, the items would
not necessarily be displayed in the order in which they were defined. From
Python 3.7 onwards, dictionaries are ordered data structures, and dictionary elements
are guaranteed to be in insertion order. When you print a dictionary or iterate
over it in a loop, you will see that the order of elements is the same in which
they were defined or added to the dictionary.
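The insertion-order guarantee can be observed with a small sketch, assuming Python 3.7 or later. The dictionary d here is a throwaway example. Note that updating the value of an existing key does not change the position of that key, whereas deleting a key and adding it again moves it to the end.

```python
d = {'b': 2, 'a': 1}
d['c'] = 3            # a new key goes to the end
d['b'] = 20           # updating a value keeps the key's original position
print(list(d))        # ['b', 'a', 'c']

del d['b']
d['b'] = 2            # deleting and re-adding moves the key to the end
print(list(d))        # ['a', 'c', 'b']
```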
Built-in functions like max, min, and sorted work for dictionaries also, but
by default they operate on the keys only. If you need to apply them to the
values, you can pass d.values() to them, or supply a key function such as a
lambda function, which is discussed later in this book.
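As a rough illustration of how these functions treat a dictionary, the sketch below uses the prices dictionary from this section. Passing key=prices.get is one possible alternative to a lambda function; it is shown here as an assumption about what is convenient, not as the only approach.

```python
prices = {'pencil': 10, 'pen': 22, 'eraser': 12, 'sharpener': 13, 'marker': 32}

# Applied to the dictionary itself, these functions look only at the keys.
print(max(prices))                 # sharpener (alphabetically last key)
print(sorted(prices))              # keys in alphabetical order

# Applied to the values, they work on the values.
print(max(prices.values()))        # 32

# key=prices.get compares the keys by their associated values.
costliest = max(prices, key=prices.get)
print(costliest)                   # marker
```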
5.2 Adding new key-value pairs
The following assignment statement will add a new key-value pair to the
dictionary.
d[k] = val
This will insert the key k with the value val in the dictionary d. Let us
insert a new key-value pair in our prices dictionary.
>>> prices['ruler'] = 30
>>> prices
{'pencil': 10, 'pen': 22, 'eraser': 12,
'sharpener': 13, 'marker': 32, 'ruler': 30}
We know that duplicate keys are not allowed in a dictionary. Let us see what
happens when we try to add a new key-value pair and the key already exists
in the dictionary.
>>> prices['pencil'] = 15
>>> prices
{'pencil': 15, 'pen': 22, 'eraser': 12,
'sharpener': 13, 'marker': 32, 'ruler': 30}
We do not get any error; assigning a value to an existing dictionary key
replaces the old value with the new value.
If a key is specified more than once in a dictionary literal, the interpreter
will not complain in this case either; the key keeps the position of its first
occurrence, and the value from its last occurrence is assigned to it.
>>> d = {'x': 1, 'y': 2, 'z': 3, 'x': 100}
>>> d
{'x': 100, 'y': 2, 'z': 3}
5.3 Modifying Values
In the previous section, we already saw how to change the value associated
with a particular key. The following assignment will replace the old value
associated with key k with the new value.
d[k] = val
Let us change the price of a pen in our prices dictionary.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':
12, 'sharpener': 13, 'marker': 32}
>>> prices['pen'] = 25
>>> prices
{'pencil': 10, 'pen': 25, 'eraser': 12,
'sharpener': 13, 'marker': 32}
The old value 22 is replaced with the new value 25. The syntax
for adding a new key-value pair and modifying existing values is the same.
If the key is not present, the assignment statement d[k] = val will insert
the key and the value in the dictionary, and if the key is present, it will
update the value.
You can also use augmented assignment statements to change the values. For
example, in the salary dictionary that we had written, suppose you want
to increase the salary of the programmer and decrease the salary of the
manager. You can write the following augmented assignment statements:
>>> salary = {'programmer': 10000, 'manager':
20000, 'accountant': 15000}
>>> salary['programmer'] += 1000
>>> salary['manager'] -= 1000
>>> salary
{'programmer': 11000, 'manager': 19000,
'accountant': 15000}
5.4 Getting a value from a key by using the
get() method
We have seen that we can access individual values in a dictionary using the
key as the index. If we have a dictionary named d, we can write d[k] to
access the value associated with the key k. The problem with this approach
is that if the key k is not present in the dictionary d, then a KeyError will
be raised. To avoid this error, you can use the get() method. This method
returns the associated value like d[k], but if the key is not found, instead of
raising an error, it returns None. You can specify another value to be
returned instead of None if the key is not present. So, if you think there are
any chances of the key not existing in the dictionary, it is better to use the
get method instead of the square bracket notation.
d.get(k)       Returns the value that is associated with key k
               If k is not present, returns None
d.get(k,val)   Returns the value that is associated with key k
               If k is not present, returns val
Table 5.1: The get method
The get method takes a key as the argument and returns the value
associated with it. It takes an optional second argument, which is the value
to be returned when the key does not exist.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':12,
'sharpener':13, 'marker':32}
>>> prices['pen']
22
>>> prices.get('pen')
22
>>> prices['stapler']
KeyError: 'stapler'
>>> prices.get('stapler')
When we used the get method on a non-existent key, nothing was printed
on the prompt. If we use the print function, we can see that it returns
None.
>>> print(prices.get('stapler'))
None
We can specify any other value to be returned instead of None.
>>> prices.get('stapler', 0)
0
>>> prices.get('stapler', 5)
5
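A typical use of the second argument is to supply a neutral fallback in a calculation. The following is a minimal sketch; the choice of 0 as the price of an unlisted item is an assumption made only for this example.

```python
prices = {'pencil': 10, 'pen': 22, 'eraser': 12, 'sharpener': 13, 'marker': 32}

# 'stapler' is not in the dictionary, so get() supplies the fallback 0
# instead of raising a KeyError.
total = 2 * prices.get('pencil', 0) + 1 * prices.get('stapler', 0)
print(total)                       # 20

# get() never modifies the dictionary.
print('stapler' in prices)         # False
```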
5.5 Getting a value from a key by using the
setdefault() method
The setdefault method also accesses the value from a key, but if the key
is missing, it will add that key to the dictionary. The value for that key is set
to None or you can provide your own value also.
d.setdefault(k)       Returns the value that is associated with key k
                      If k is not present, returns None and adds the key k to
                      the dictionary with value None
d.setdefault(k,val)   Returns the value that is associated with key k
                      If k is not present, returns val and adds the key k to
                      the dictionary with value val
Table 5.2: The setdefault method
The setdefault method can take two arguments. The first argument is
the key for which you want to retrieve the value. The second argument is
optional. It is the value that will be assigned to the key instead of the default
None.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':
12, 'sharpener': 13, 'marker': 32}
>>> prices.setdefault('pen')
22
>>> prices.setdefault('stapler') # Returns None
>>> prices
{'pencil': 10, 'pen': 22, 'eraser': 12,
'sharpener': 13, 'marker': 32, 'stapler': None}
We can see that the key is added with the value None. If we want the key to
be added with a value other than None, we can specify that value.
>>> prices.setdefault('gum',10)
10
>>> prices
{'pencil': 10, 'pen': 22, 'eraser': 12,
'sharpener': 13, 'marker': 32, 'stapler': None,
'gum': 10}
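One common use of setdefault is to build a dictionary whose values are lists. The sketch below groups names by city; the dictionary by_city and its contents are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical example: group student names by city.
by_city = {}

# If the city is missing, setdefault inserts it with an empty list and
# returns that list; otherwise it returns the list already stored.
by_city.setdefault('Paris', []).append('John')
by_city.setdefault('London', []).append('Dev')
by_city.setdefault('Paris', []).append('Ken')

print(by_city)   # {'Paris': ['John', 'Ken'], 'London': ['Dev']}
```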
5.6 Getting all keys, all values, and all key-
value pairs
The following three methods return special list-like iterable objects called
dictionary views. These objects are dynamic, so any changes in the
dictionary are reflected in these objects.
d.keys() Returns an object providing a view on keys of the dictionary d
d.values() Returns an object providing a view on values of the dictionary d
d.items() Returns an object providing a view on keys and values of the dictionary d
Table 5.3: Methods to get all keys, values, and key-value pairs
To get all the keys, use the d.keys() method; to get all the values, use the
d.values() method; and to get all the key-value pairs, use the
d.items() method.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':
12, 'sharpener': 13, 'marker': 32}
>>> prices.keys()
dict_keys(['pencil', 'pen', 'eraser', 'sharpener',
'marker'])
>>> prices.values()
dict_values([10, 22, 12, 13, 32])
>>> prices.items()
dict_items([('pencil', 10), ('pen', 22), ('eraser',
12), ('sharpener', 13), ('marker', 32)])
The methods keys(), values(), and items() return a dict_keys
object, dict_values object, and dict_items object. You can use the
list function to convert these objects to list type if required.
>>> list(prices.keys())
['pencil', 'pen', 'eraser', 'sharpener', 'marker']
>>> list(prices.values())
[10, 22, 12, 13, 32]
>>> list(prices.items())
[('pencil', 10), ('pen', 22), ('eraser', 12),
('sharpener', 13), ('marker', 32)]
These three methods do not return lists to save the time and memory used in
creating a list that might have no use. For large dictionaries, lists also will be
large and hence will consume more space. These methods return a view
object, and if you want a list, you can convert explicitly. The dictionary view
objects are iterable, and we can use them in a for loop to process all the
items of a dictionary. We will discuss this in Chapter 7 that covers loops.
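The difference between a live view and a list made from it can be seen in a short sketch. The names keys_view and keys_list are chosen only for this illustration.

```python
prices = {'pencil': 10, 'pen': 22}

keys_view = prices.keys()          # a live view, not a copy
keys_list = list(prices.keys())    # a snapshot taken right now

prices['ruler'] = 30               # change the dictionary afterwards

print(keys_view)   # dict_keys(['pencil', 'pen', 'ruler'])
print(keys_list)   # ['pencil', 'pen']
```

The view shows the new key because it always reflects the current state of the dictionary, while the list remains as it was when it was created.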
From Python 3.8 onwards, dictionaries and their views are reversible. If we use
the built-in reversed function, the keys and values will be iterated over in the
reverse order of insertion.
>>> d = {'a': 10, 'b': 20, 'c': 30}
>>> d
{'a': 10, 'b': 20, 'c': 30}
>>> list(reversed(d))
['c', 'b', 'a']
>>> list(reversed(d.keys()))
['c', 'b', 'a']
>>> list(reversed(d.values()))
[30, 20, 10]
>>> list(reversed(d.items()))
[('c', 30), ('b', 20), ('a', 10)]
We can also use the sorted function on these views.
>>> sorted(d.keys())
['a', 'b', 'c']
>>> sorted(d.values())
[10, 20, 30]
>>> sorted(d.items())
[('a', 10), ('b', 20), ('c', 30)]
5.7 Checking for the existence of a key or a
value in a dictionary
In the previous chapters, we saw that the in and not in operators can
check whether a value exists in a list, tuple, or string. These operators can
also check whether a key or a value exists in a dictionary. These membership
operators can be used with the dictionary view objects to check for
membership of keys and values.
x in d                 Returns True if x is present as a key in the dictionary d,
x in d.keys()          otherwise False
x in d.values()        Returns True if x is present as a value in the dictionary d,
                       otherwise False
(k,val) in d.items()   Returns True if (k,val) pair is present in the dictionary d,
                       otherwise False
Table 5.4: Checking for the existence of a key or a value in a dictionary
If you want to know whether a key is present in the dictionary, you can
simply write x in d or x in d.keys(). To check if x is present in the
dictionary as a value, you can write x in d.values(). To check for a
key-value pair, you can use the items method.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':
12, 'sharpener': 13, 'marker': 32}
>>> 'pen' in prices
True
>>> 'pen' in prices.keys()
True
>>> 100 in prices.values()
False
>>> 100 not in prices.values()
True
>>> 22 in prices.values()
True
>>> ('pencil',10) in prices.items()
True
>>> ('pencil',12) in prices.items()
False
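A point that is easy to overlook is that x in d looks at the keys only. The sketch below also shows one way of guarding a lookup with the in operator; the variable names item and message are assumptions made for this example.

```python
prices = {'pencil': 10, 'pen': 22}

print(10 in prices)            # False: 10 is a value, not a key
print(10 in prices.values())   # True

# Check for the key before indexing to avoid a KeyError.
item = 'stapler'
if item in prices:
    message = prices[item]
else:
    message = item + ' is not available'
print(message)                 # stapler is not available
```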
5.8 Comparing dictionaries
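The equality operators == and != can be used to compare two dictionaries.
The expression d1==d2 will return True if the two dictionaries contain the
same key-value pairs, regardless of insertion order. We can also use these
operators with the views returned by the keys() and items() methods. The
views returned by values() do not define equality: comparing two such views
with == gives False even when the values are the same, so convert them to
lists before comparing. The other comparison operators (<, >, <=, >=) are not
defined for dictionaries.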
>>> d1 = {'x': 1, 'y': 2, 'z': 3}
>>> d2 = {'x': 1, 'y': 2, 'z': 3}
>>> d3 = {'x': 100, 'y': 200, 'z': 300}
>>> d1 == d2
True
>>> d1 == d3
False
>>> d1.keys() == d3.keys()
True
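The following sketch shows that equality ignores insertion order, and that value views behave differently from key and item views when compared. Sorting the values before comparing is just one possible way of comparing them without regard to order.

```python
d1 = {'x': 1, 'y': 2}
d2 = {'y': 2, 'x': 1}              # same pairs, different insertion order

print(d1 == d2)                    # True: order is ignored
print(d1.keys() == d2.keys())      # True
print(d1.items() == d2.items())    # True

# Two values views never compare equal with ==.
print(d1.values() == d2.values())  # False

# Compare sorted lists of the values instead.
print(sorted(d1.values()) == sorted(d2.values()))   # True
```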
5.9 Deleting key-value pairs from a dictionary
The del statement can be used to delete a key-value pair from the
dictionary. del d[k] will remove both the key k and its associated value
from the dictionary. If the key is not present, then a KeyError will be
raised.
If you want to delete a key-value pair and store the deleted value in a
variable, you can use the pop method. This method will delete the key-value
pair, and it will return the value associated with the key. If the key is not
present, then a KeyError will be raised. If you do not want the
KeyError to be raised in case of missing key, you can send a second
argument to the pop function, which will be returned if the key is not
present. For example, the call d.pop(k,-1) will return -1 if key k is not
present.
del d[k]       Removes key k and its associated value from the dictionary d
d.pop(k)       Removes key k and its associated value from the dictionary d, and
               returns the value d[k]
d.pop(k,val)   Returns val if key k is not present in the dictionary
Table 5.5: Deleting key-value pairs
In lists, you could use the pop() method without any argument, and it
would remove and return the last element, but in dictionaries, you cannot use
pop() without an argument.
>>> prices = {'pencil': 10, 'pen': 22, 'eraser':
12, 'gum': 13, 'marker': 32, 'ruler': 30}
>>> del prices['marker']
>>> prices
{'pencil': 10, 'pen': 22, 'eraser': 12, 'gum': 13,
'ruler': 30}
The key-value pair corresponding to the key 'marker' has been removed.
>>> x = prices.pop('pencil')
>>> x
10
>>> prices
{'pen': 22, 'eraser': 12, 'gum': 13, 'ruler': 30}
This call to method pop removed the key-value pair ('pencil', 10),
and it also returned the value 10, which we stored in the variable x.
We will get a KeyError if the key is not present in the dictionary.
>>> prices.pop('book')
KeyError: 'book'
To avoid this KeyError, we can send a second argument.
>>> prices.pop('book',0)
0
Now, 0 is returned from the pop method for the non-existent key 'book'.
The method popitem() removes the most recently inserted key-value pair from
the dictionary and returns it as a tuple. (Before Python 3.7, an arbitrary
pair was removed.)
>>> prices
{'pen': 22, 'eraser': 12, 'gum': 13, 'ruler': 30}
>>> prices.popitem()
('ruler', 30)
>>> prices
{'pen': 22, 'eraser': 12, 'gum': 13}
>>> prices.popitem()
('gum', 13)
>>> prices
{'pen': 22, 'eraser': 12}
The method clear() removes all key-value pairs from the dictionary and
makes it empty.
>>> prices.clear()
>>> prices
{}
If you try to empty the dictionary by assigning an empty dictionary, then
there can be problems if other variables are referring to the dictionary.
>>> prices = {}
This will not delete all the items from the dictionary. This will create a new
empty dictionary and make the name prices refer to that empty dictionary.
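The difference between rebinding a name and calling clear() becomes visible when a second name refers to the same dictionary. In this sketch, other_ref is a hypothetical second reference introduced only for the demonstration.

```python
prices = {'pen': 22, 'eraser': 12}
other_ref = prices         # second name for the same dictionary object

prices = {}                # rebinds prices; the original object is untouched
print(other_ref)           # {'pen': 22, 'eraser': 12}

prices = other_ref         # point prices at the original object again
prices.clear()             # empties the object itself
print(other_ref)           # {}
```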
5.10 Creating a Dictionary at run time
We have seen how to create dictionaries by writing dictionary literals. All
key-value pairs are written inside the curly braces, with keys and values
separated by colons.
prices = {'pen': 22, 'eraser': 12, 'gum': 13,
'ruler': 30}
Creating a dictionary this way is fine if you know the initial data beforehand.
If you want to create your dictionary dynamically at run time, you can start
by creating an empty dictionary and adding key-value pairs to it. Let us start
with an empty dictionary and add key-value pairs to it.
prices = {}
fruit = input('Enter name of a fruit: ')
price = int(input('Enter its price: '))
prices[fruit] = price
fruit = input('Enter name of a fruit: ')
price = int(input('Enter its price: '))
prices[fruit] = price
fruit = input('Enter name of a fruit: ')
price = int(input('Enter its price: '))
prices[fruit] = price
print(prices)
Sample run -
Enter name of a fruit: Apple
Enter its price: 50
Enter name of a fruit: Banana
Enter its price: 25
Enter name of a fruit: Guava
Enter its price: 27
{'Apple': 50, 'Banana': 25, 'Guava': 27}
In this program, we must repeat the code for entering key-value pairs. In
Chapter 7, we will learn how to avoid this code repetition and input multiple
keys and values in a dictionary using loops. We can let the users enter the
keys and values, or the input can be taken from a file and stored in the
dictionary at run time.
5.11 Creating a dictionary from existing data
by using dict()
We can create dictionaries from existing data that is present in other data
structures like lists or tuples. The dict() function can be used to convert a
sequence of two-item sequences into a dictionary. The first item of each
inner sequence is used as the key, and the second item is the value. For
example, suppose we have a list of two-item lists.
>>> list1 = [['a', 1], ['b', 2], ['c', 3]]
>>> d1 = dict(list1)
>>> d1
{'a': 1, 'b': 2, 'c': 3}
We sent the list to the dict function and got a dictionary. The first item in
each inner list is taken as the key, and the second item is taken as the value.
Instead of a list of lists, we can have a list of tuples, a tuple of tuples, or a
tuple of lists. The main thing is that the length of inner sequences should be
exactly 2, as they represent the key-value pairs.
>>> t1 = ('x', 4), ('y', 5), ('z', 6)
>>> d2 = dict(t1)
>>> d2
{'x': 4, 'y': 5, 'z': 6}
>>> t2 = ['x', 4], ['y', 5], ['z', 6]
>>> d3 = dict(t2)
>>> print(d3)
{'x': 4, 'y': 5, 'z': 6}
>>> d4 = dict((['x', 4], ['y', 5], ['z', 6]))
>>> d4
{'x': 4, 'y': 5, 'z': 6}
While defining the two tuples, t1 and t2, we have omitted the enclosing
parentheses, but when we send the tuple literal directly inside the dict
function, we have to put the parentheses.
We can send a list or tuple of strings of length 2, as strings are also
sequences.
>>> d5 = dict(['X1', 'Y2', 'Z3'])
>>> d5
{'X': '1', 'Y': '2', 'Z': '3'}
The first character from the string is taken as the key, and the second
character as the value.
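If any inner sequence does not have exactly two items, the dict function raises a ValueError. The sketch below demonstrates this; the names pairs_ok and pairs_bad are made up, and the try statement is used only to display the error message without stopping the program.

```python
pairs_ok = [('a', 1), ('b', 2)]
print(dict(pairs_ok))                 # {'a': 1, 'b': 2}

pairs_bad = [('a', 1), ('b', 2, 3)]   # second inner tuple has three items
try:
    dict(pairs_bad)
except ValueError as err:
    print('ValueError:', err)

# Strings must also have exactly two characters each.
try:
    dict(['X1', 'Y22'])
except ValueError as err:
    print('ValueError:', err)
```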
We can also create a new dictionary by passing keyword arguments to the
dict function.
>>> d6 = dict(pencil=12, eraser=45, sharpener=30)
>>> d6
{'pencil': 12, 'eraser': 45, 'sharpener': 30}
The names will become the keys, and the values will become the
corresponding values in the dictionary. With this form, however, the keys
are always strings, and each of them must be a valid Python identifier.
Dictionaries can also be created by zipping together two sequences. For
example, suppose we have the following two lists: the first one contains
country names, and the other one contains corresponding capitals at the same
offsets.
>>> countries = ['France', 'Austria', 'Japan',
'India']
>>> capitals = ['Paris', 'Vienna', 'Tokyo',
'New Delhi']
We can create a dictionary from these two lists using the zip function. This
function walks through multiple sequences and creates tuples from items at
the same offsets.
>>> d7 = dict(zip(countries, capitals))
>>> d7
{'France': 'Paris', 'Austria': 'Vienna', 'Japan':
'Tokyo', 'India': 'New Delhi'}
The two lists were sent to the zip function and its return value was sent to
the dict function, and we get a dictionary with keys from the first list and
values from the other list.
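One thing to keep in mind is that zip stops at the end of the shorter sequence. The sketch below deliberately leaves out one capital to show the effect; the shortened capitals list is an assumption made for this example.

```python
countries = ['France', 'Austria', 'Japan', 'India']
capitals = ['Paris', 'Vienna', 'Tokyo']          # one capital is missing

# zip stops at the end of the shorter list, so 'India' is silently dropped.
d = dict(zip(countries, capitals))
print(d)         # {'France': 'Paris', 'Austria': 'Vienna', 'Japan': 'Tokyo'}
print(len(d))    # 3
```

From Python 3.10 onwards, calling zip with strict=True raises a ValueError when the lengths differ, which can help catch such mismatches.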
An empty dictionary can also be created by using the dict function,
although using empty braces is the preferred style.
>>> d8 = dict()
>>> d8
{}
5.12 Creating a dictionary by using the
fromkeys() method
dict.fromkeys(I, value) creates a new dictionary with keys from
iterable I and values set to value. If value is not provided, then the
values for all the keys are set to None.
Suppose we have a list named stationery, and we send it to the
fromkeys method, with the second argument as 0.
>>> stationery = ['pencil', 'marker', 'eraser',
'sharpener']
>>> prices = dict.fromkeys(stationery, 0)
>>> print(prices)
{'pencil': 0, 'marker': 0, 'eraser': 0,
'sharpener': 0}
We get this dictionary, in which keys are taken from the list, and the value
for all the keys is set to 0. This method is generally used to create
dictionaries in which every key starts with the same default value. If we do
not provide the second argument, all values will be None.
>>> d1 = dict.fromkeys(stationery)
>>> print(d1)
{'pencil': None, 'marker': None, 'eraser': None,
'sharpener': None}
>>> d2 = dict.fromkeys(range(7))
>>> print(d2)
{0: None, 1: None, 2: None, 3: None, 4: None, 5:
None, 6: None}
This method is usually directly called as dict.fromkeys() rather than
being called on an existing dictionary. It can also be called using an empty
dictionary literal.
>>> prices = {}.fromkeys(stationery, 0)
>>> print(prices)
{'pencil': 0, 'marker': 0, 'eraser': 0,
'sharpener': 0}
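Some care is needed when the value given to fromkeys is mutable, because every key then refers to the same object. The following sketch illustrates the pitfall with hypothetical dictionaries named shared and separate.

```python
subjects = ['Maths', 'Physics']

# Every key refers to the SAME list object.
shared = dict.fromkeys(subjects, [])
shared['Maths'].append(89)
print(shared)        # {'Maths': [89], 'Physics': [89]}

# Giving each key its own list avoids the surprise.
separate = {}
separate['Maths'] = []
separate['Physics'] = []
separate['Maths'].append(89)
print(separate)      # {'Maths': [89], 'Physics': []}
```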
There is another way of creating a dictionary, called a dictionary
comprehension, which we will discuss later in a separate chapter.
5.13 Combining dictionaries
We can copy the key-value pairs of a dictionary into another dictionary by
using the update method. The call d.update(d1) merges all entries of
dictionary d1 into dictionary d. If there is a key that is present in both
dictionaries, the value in dictionary d is overwritten by the value in
dictionary d1.
>>> prices1 = {'apple': 10, 'mango': 15, 'banana':
20}
>>> prices2 = {'grapes': 25, 'banana': 17,
'papaya': 12}
>>> prices1.update(prices2)
>>> prices1
{'apple': 10, 'mango': 15, 'banana': 17, 'grapes':
25, 'papaya': 12}
All the entries of prices2 are added to prices1. The key 'banana'
was present in both dictionaries, and we can see that the value in prices1
was overwritten by the value in prices2.
The update method can also accept an iterable object of key-value pairs.
>>> L = [['guava', 23], ['fig', 30], ['mango', 25]]
>>> prices1.update(L)
>>> prices1
{'apple': 10, 'mango': 25, 'banana': 17, 'grapes':
25, 'papaya': 12, 'guava': 23, 'fig': 30}
The update method can accept keyword arguments also.
>>> prices1.update(lemon=15, melon=65)
>>> prices1
{'apple': 10, 'mango': 25, 'banana': 17, 'grapes':
25, 'papaya': 12, 'guava': 23, 'fig': 30, 'lemon':
15, 'melon': 65}
From Python 3.9 onwards, the merge operator | and the in-place update
operator |= are also available for the dict type.
>>> d1 = {'x': 1, 'y': 2, 'c': 8}
>>> d2 = {'a': 3, 'b': 4, 'c': 7}
>>> d1 | d2
{'x': 1, 'y': 2, 'c': 7, 'a': 3, 'b': 4}
The expression d1 | d2 returns a new dictionary with the merged keys
and values of d1 and d2. The values of d2 get priority if d1 and d2 have
any keys in common.
>>> d2 | d1
{'a': 3, 'b': 4, 'c': 8, 'x': 1, 'y': 2}
>>> d1 |= d2
>>> d1
{'x': 1, 'y': 2, 'c': 7, 'a': 3, 'b': 4}
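On versions of Python older than 3.9, a similar merge can be obtained without these operators. The sketch below shows two possible alternatives, unpacking inside a dictionary literal and copy() followed by update(); the names merged and merged2 are chosen only for this illustration.

```python
d1 = {'x': 1, 'y': 2, 'c': 8}
d2 = {'a': 3, 'b': 4, 'c': 7}

# Unpack both dictionaries into a new literal; later entries win.
merged = {**d1, **d2}
print(merged)    # {'x': 1, 'y': 2, 'c': 7, 'a': 3, 'b': 4}

# Neither operand is modified.
print(d1)        # {'x': 1, 'y': 2, 'c': 8}

# copy() followed by update() gives the same result.
merged2 = d1.copy()
merged2.update(d2)
print(merged == merged2)   # True
```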
5.14 Nesting of dictionaries
The values in a dictionary can be of any type; they can be of type dict also.
When we have a dictionary as a value inside a dictionary, we get a nested
dictionary. Let us understand this with the help of an example. We have seen
the following dictionary that was used to describe a student record. In this
dictionary, we have used a list to represent the marks.
student = {'name': 'John',
'gender': 'M',
'city': 'Paris',
'age': 21,
'marks': [89, 78, 91],
'is_sporty': True
}
We use the following expressions to access marks:
student['marks'][0] -> Marks in first subject
student['marks'][1] -> Marks in second subject
student['marks'][2] -> Marks in third subject
Suppose we want to make the marks field more informative and want to
store marks with the name of the subject. For that, we can use a dictionary
instead of using a list.
In Figure 5.2, we have a dictionary inside a dictionary. In the student
dictionary, the value for the key named 'marks' is again a dictionary. In
the inner dictionary, keys are subject names, and values are marks in those
subjects. To access the inner dictionary, we will write
student['marks'] because this dictionary is the value corresponding to
the 'marks' key. To access values inside this dictionary, we will use keys
'Maths', 'Physics', and 'Chemistry' as indexes.
So, to get marks in Maths, we will write
student['marks']['Maths']. Similarly, to get Physics marks, we can write
student['marks']['Physics'].
Figure 5.2: Nested dictionary
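Written as a literal, a nested dictionary consistent with the values used in this section might look like the following sketch. The subject 'Biology' in the last line is a hypothetical key used only to show a safe lookup with get.

```python
student = {'name': 'John',
           'gender': 'M',
           'city': 'Paris',
           'age': 21,
           'marks': {'Maths': 89, 'Physics': 78, 'Chemistry': 91},
           'is_sporty': True
          }

# The first index selects the inner dictionary, the second selects a subject.
print(student['marks']['Physics'])                   # 78

# get() on the inner dictionary avoids a KeyError for a missing subject.
print(student['marks'].get('Biology', 'not taken'))  # not taken
```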
>>> student['marks']
{'Maths': 89, 'Physics': 78, 'Chemistry': 91}
>>> student['marks']['Maths']
89
>>> student['marks']['Physics']
78
>>> student['marks']['Chemistry']
91
The dictionary we have just defined represents the record of a single student.
There could be records of many students that we would want to store in our
program. Instead of giving a name to each record, like student1, student2, or
student3, we can store all these records in a collection type. Storing them in
a list will result in a longer time for value retrieval, so we can store them in a
dictionary, as a dictionary search is more efficient. All the students have a
unique student id, so we can have a collection in the form of another
dictionary, where the keys are the id numbers and values are the dictionaries
that represent student records.
Figure 5.3: Nested dictionary
Now, we can access a student’s details by using the student id. The
expression students[105416] will give us the first dictionary,
students[144547] will give us the second dictionary, and
students[132399] will give us the third dictionary. To access the data
of student with the id 144547, we can write:
>>> students[144547]
{'name': 'Dev', 'gender': 'M', 'city': 'London',
'age': 23, 'marks': {'Maths': 88, 'Physics': 77,
'Chemistry': 98}, 'is_sporty': False}
To access the name of the student with the id 105416, we can write:
>>> students[105416]['name']
'John'
The following expression gives the chemistry marks of the student with the
id 132399:
>>> students[132399]['marks']['Chemistry']
88
These types of dictionaries will generally not be written in the literal form in
the program. The details will be entered by the user or taken from a file. To
print these types of complex nested structures in a readable form, we can use
the pp function from the pprint module (available from Python 3.8 onwards).
>>> import pprint
>>> pprint.pp(students)
>>> pprint.pp(students[144547])
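To experiment with these expressions, a students dictionary is needed. The sketch below builds one under stated assumptions: the record for id 144547 is taken from the output shown in this section, the record for 105416 reuses the earlier details of John, and the record for 132399 is a placeholder in which only the Chemistry marks come from the text. The id 999999 is a made-up value used to show a lookup that fails safely.

```python
import pprint

students = {
    105416: {'name': 'John', 'gender': 'M', 'city': 'Paris', 'age': 21,
             'marks': {'Maths': 89, 'Physics': 78, 'Chemistry': 91},
             'is_sporty': True},
    144547: {'name': 'Dev', 'gender': 'M', 'city': 'London', 'age': 23,
             'marks': {'Maths': 88, 'Physics': 77, 'Chemistry': 98},
             'is_sporty': False},
    # Placeholder record: only the Chemistry marks (88) come from the text.
    132399: {'name': 'Placeholder', 'gender': 'F', 'city': 'Unknown', 'age': 0,
             'marks': {'Maths': 0, 'Physics': 0, 'Chemistry': 88},
             'is_sporty': False},
}

# Prints the record in a readable multi-line layout.
pprint.pp(students[144547])

# Chained get() calls avoid a KeyError for an unknown id.
name = students.get(999999, {}).get('name', 'no such student')
print(name)                               # no such student
```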
5.15 Aliasing and Shallow vs. Deep Copy
We have discussed aliasing, shallow copying, and deep copying in the
previous chapter. Dictionaries are also mutable structures like lists, so you
need to be careful about aliasing, and when you have a nested dictionary
structure, you need to perform a deep copy instead of a shallow copy. In this
section, we will take some examples to understand these concepts in the
context of dictionaries.
Suppose you have a fruit shop, and this dictionary stores the fruit names and
prices available in your shop.
>>> shop1_prices = {'apple': 200, 'mango': 250,
'banana': 100, 'grapes': 90}
Now, you open another shop and the same items are available in that shop
also, so you decide to copy this dictionary.
>>> shop2_prices = shop1_prices
This dictionary shop2_prices stores the prices of fruits in shop2.
>>> shop2_prices
{'apple': 200, 'mango': 250, 'banana': 100,
'grapes': 90}
Since shop2 is a new shop, sales are low and there is a large unsold stock of
apples and bananas, so you decide to drop the price of apples and bananas in
shop2.
>>> shop2_prices['apple'] -= 40
>>> shop2_prices['banana'] -= 20
>>> shop2_prices
{'apple': 160, 'mango': 250, 'banana': 80,
'grapes': 90}
We can see that the prices are reduced in shop2. Now there is a customer in
shop1, your old shop, who wants to buy 2 kg of apples and 3 kg of bananas. This
is how you calculate the amount to be paid:
>>> bill = 2 * shop1_prices['apple'] + 3 * shop1_prices['banana']
>>> bill
560
The customer pays ₹560, and you suffer a loss of ₹140 because of aliasing. In
shop1, 2 kg of apples and 3 kg of bananas should cost ₹700, but you got only ₹560.
The culprit here is the statement shop2_prices = shop1_prices
which made an alias instead of an independent copy.
We can check the ids of the two dictionaries.
>>> id(shop1_prices)
1688566781824
>>> id(shop2_prices)
1688566781824
No new object was created; there is only one dictionary object, and both
shop1_prices and shop2_prices refer to it.
>>> shop1_prices
{'apple': 160, 'mango': 250, 'banana': 80,
'grapes': 90}
The prices were reduced for shop1 also. Instead of the assignment
statement, we should have used the dictionary copy method, as that would
give us an independent copy of the dictionary.
>>> shop1_prices = {'apple': 200, 'mango': 250,
'banana': 100, 'grapes': 90}
>>> shop2_prices = shop1_prices.copy()
>>> shop2_prices['apple'] -= 40
>>> shop2_prices['banana'] -= 20
>>> bill = 2 * shop1_prices['apple'] + 3 * shop1_prices['banana']
>>> bill
700
We can also use the dict function to get an independent copy.
>>> shop2_prices = dict(shop1_prices)
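The difference between an alias and an independent copy can also be checked with the is and == operators instead of comparing ids by hand. The following is a minimal sketch that reuses the shop1_prices dictionary from above; the names alias_prices, copy_prices, and dict_prices are introduced only for illustration.

```python
shop1_prices = {'apple': 200, 'mango': 250, 'banana': 100, 'grapes': 90}

alias_prices = shop1_prices          # assignment: a second name for the same object
copy_prices = shop1_prices.copy()    # copy(): a new dictionary with the same contents
dict_prices = dict(shop1_prices)     # dict(): also a new dictionary

print(alias_prices is shop1_prices)  # True, both names refer to one object
print(copy_prices is shop1_prices)   # False, a different object
print(copy_prices == shop1_prices)   # True, the contents are equal

copy_prices['apple'] -= 40           # change only the copy
print(shop1_prices['apple'])         # 200, the original is unaffected
```

The is operator compares identity, while == compares contents, so a fresh copy is equal to the original but is not the same object.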
Now, suppose you have a software company also, and the following nested
dictionary structure stores the salary of the employees. The salaries of
programmers working with different languages are different.
>>> office1_salary = {'manager': 6000,
... 'web designer': 3000,
... 'programmer': {'Python': 5000,
'Java': 4000, 'C#': 4500}
... }
You open another office, and from your fruit business experience, you know
what problems aliasing can cause, so you do not make that mistake again.
You make an independent copy by using the copy method.
>>> office2_salary = office1_salary.copy()
To be sure that you have independent objects, you can check the ids also.
>>> id(office1_salary)
2081864102592
>>> id(office2_salary)
2081828847232
The ids are different, which means that we have separate dictionary objects.
Python programmers in office1 are performing very well, so you decide to
increase their salary.
>>> office1_salary['programmer']['Python'] += 500
You do not want anything to go wrong, so before printing the salary slips,
you can check the two dictionaries. We will use the pp function from the
pprint module to print these nested dictionaries in a readable form.
>>> import pprint
>>> pprint.pp(office1_salary)
{'manager': 6000,
 'web designer': 3000,
 'programmer': {'Python': 5500, 'Java': 4000, 'C#': 4500}}
Python programmers in office1 now get 5500 instead of 5000 which is what
we wanted.
>>> pprint.pp(office2_salary)
{'manager': 6000,
 'web designer': 3000,
 'programmer': {'Python': 5500, 'Java': 4000, 'C#': 4500}}
The salary for Python programmers in office2 has also changed. How is this
possible when we have made an independent copy using the copy method?
The problem is that we have a nested structure, so we need a deep copy
instead of a shallow copy. The copy method gives us a shallow copy. We
can check the ids of inner dictionaries.
>>> id(office1_salary['programmer'])
2081827346304
>>> id(office2_salary['programmer'])
2081827346304
The same dictionary is shared by both objects. To perform a deep copy, we
need to use the deepcopy function from the copy module.
>>> from copy import deepcopy
>>> office2_salary = deepcopy(office1_salary)
>>> id(office2_salary['programmer'])
1975443905984
>>> id(office1_salary['programmer'])
1975399634816
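The whole comparison can be put into one short script. This is only a sketch that reuses the office1_salary data from above; the names shallow and deep are chosen here for illustration.

```python
from copy import deepcopy

office1_salary = {
    'manager': 6000,
    'web designer': 3000,
    'programmer': {'Python': 5000, 'Java': 4000, 'C#': 4500},
}

shallow = office1_salary.copy()      # copies only the outer dictionary
deep = deepcopy(office1_salary)      # copies the inner dictionary as well

office1_salary['programmer']['Python'] += 500   # change inside the nested dictionary
office1_salary['manager'] += 1000               # change at the top level

print(shallow['programmer']['Python'])   # 5500, the inner dictionary is shared
print(deep['programmer']['Python'])      # 5000, the deep copy is independent
print(shallow['manager'])                # 6000, top-level entries are independent
print(shallow['programmer'] is office1_salary['programmer'])   # True
print(deep['programmer'] is office1_salary['programmer'])      # False
```

A shallow copy is enough when all values are immutable, as in the fruit shop example; a deep copy is needed only when the values are themselves mutable objects that may change.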
5.16 Introduction to sets
Searching a list or tuple takes a long time if it is large, and if it has
to be searched many times, this can lead to poor performance.
Another constraint with lists is that they allow duplicate values. In some
cases, we might need to store only unique values. We can make our list store
only unique values, but that will not be efficient since whenever we insert a
new value, we have to sequentially scan all the values to check whether the
value is already present.
The set data structure is suitable for these types of situations. When you
want to store a collection of unique values for faster lookup, you can use a
set. Sets are internally implemented in such a way that they can be searched
very quickly and they automatically eliminate duplicate entries. However,
there are some limitations of sets; the values will not be stored in any
particular order, and you can store only values of immutable type. Let us
discuss the definition and syntax of defining sets.
A set is an unordered mutable collection of immutable and unique objects.
Sets are unordered, so they do not maintain any order among their elements
and are therefore not of sequence type. Sets are mutable, meaning an object of
type set can be changed: we can add new elements or remove existing elements.
A set is a collection of immutable objects, meaning it can contain objects of
only immutable types like integers, strings, or tuples (more precisely, the
elements must be hashable, so a tuple that contains a list is not allowed). It
cannot contain mutable type elements like lists or dictionaries. The elements
need not be of the same type; a set can contain elements of different types.
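These rules can be tried out in a few lines. The sketch below uses made-up values; the flag error_raised is a hypothetical name used only to record what happened.

```python
mixed = {10, 'ten', (1, 0), 10.5}    # different immutable types in one set
print(len(mixed))                    # 4

dupes = {1, 1, 2, 2, 2}              # duplicates in a literal are dropped
print(len(dupes))                    # 2

error_raised = False
try:
    bad = {10, [1, 0]}               # a list is mutable, so it is rejected
except TypeError as exc:
    error_raised = True
    print('TypeError:', exc)
```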
The most important point about a set is that it is a collection of unique
objects, meaning duplicate elements are not allowed in a set. So, you can see
that elements of a set are like keys of a dictionary. They have to be
immutable and unique. Here are a few examples of set literals.
>>> big_cities = {'London', 'Paris', 'Bangalore',
'Tokyo'}
>>> primes = {2, 3, 5, 7, 11, 13, 17, 19}
>>> colors = {'red', 'blue', 'yellow', 'black',
'white'}
On the right side of the assignment, we have a set literal that is assigned to a
variable name. The elements of a set are placed inside curly braces and are
separated by commas.
Like lists, tuples, and dictionaries, sets are also referential structures, which
means that they contain references to objects. The elements are not in any
order; there is nothing like the first element or the second element. You can
think of a set as just a bag of unique values.
In sequences like strings, lists, and tuples, the elements are ordered so they
can be identified by their position; we could access an individual element by
applying a numeric index. In dictionaries, elements are identified by keys, so
there we can access an element by using a key as the index. But in sets,
elements are neither ordered nor associated with keys, so we cannot use
indexing to access an individual element of a set. Sets do not support
indexing or slicing as they do not have an inherent order.
The most common operation performed on a set is testing the existence of an
item. For that, we can use the membership operators in and not in.
>>> 'Paris' in big_cities
True
>>> city = 'Perth'
>>> city not in big_cities
True
>>> number = 11
>>> number not in primes
False
You can write these types of expressions, and they will return True or
False depending on whether the given item is present in the set or not.
Testing for membership is faster in a set than in a list or tuple.
In our example sets, primes holds the first 8 prime numbers, so we can
check whether a given number exists in this set or not. If we want to do
something with, say, the fifth prime number, we cannot do it because the set
does not store its elements in order, so we do not know which is the fifth
prime number. If we have such a requirement, we must use a list or tuple,
which are ordered structures.
Now the question is, how will you know that you need to create a set in your
program? You can create a set when you have a collection of values whose
order does not matter, and in your program, you will just need to know
whether a value belongs to that collection or not. So, when you want to store
some unique values whose order does not matter but search efficiency
matters, you can use a set.
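The speed difference can be observed with the timeit module from the standard library. This is a rough sketch, not a benchmark: the collection size and the repeat count are arbitrary choices, and the actual timings depend on the machine.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999                      # last item, the slowest case for the list

# Each call runs the membership test 200 times and returns the total seconds.
list_time = timeit.timeit(lambda: target in numbers_list, number=200)
set_time = timeit.timeit(lambda: target in numbers_set, number=200)

print(f'list: {list_time:.5f} s')
print(f'set:  {set_time:.5f} s')
```

The list has to be scanned element by element, while the set finds the item through hashing, so the gap grows as the collection gets larger.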
5.17 Creating a set
A call to the set() function creates an empty set.
s = set()
We have only one way to create an empty set because empty curly braces {}
are used to create an empty dictionary.
s = {} # an empty dictionary will be created
Dictionaries were introduced in Python before sets, so the {} syntax was
already taken by dictionaries, and the only way to make an empty set is to
call the set() function.
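A quick check with type() confirms this. The variable names below are chosen only for illustration.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))   # <class 'dict'>
print(type(empty_set))    # <class 'set'>
print(empty_set)          # set(), which is how an empty set is displayed
```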
We can use the set function to create sets from other types like strings,
lists, tuples, and dictionaries. The duplicate values are discarded in this
process as a set can have only unique values.
>>> print(set('HELLO'))
{'E', 'L', 'H', 'O'}
>>> L = [1, 2, 3, 1, 2, 3, 4, 5, 4, 3, 2, 1]
>>> print(set(L))
{1, 2, 3, 4, 5}
>>> t = (20, 30, 40, 30, 20)
>>> print(set(t))
{40, 20, 30}
In all these examples, we can see that the duplicate values are discarded, and
only unique values are placed in the set. The original order is not necessarily
preserved, as sets are unordered structures.
If you try to convert a dictionary to a set, you get only a set of keys; the
values are lost. To get the set of values, you have to use the values method
of dict type.
>>> d = {1:'a', 2:'b', 3:'c', 4:'a', 5:'c'}
>>> set(d)
{1, 2, 3, 4, 5}
>>> set(d.values())
{'a', 'b', 'c'}
Sets can be created by using the range function also.
>>> odds = set(range(1, 20, 2))
>>> odds
{1, 3, 5, 7, 9, 11, 13, 15, 17, 19}
If you have an existing list or tuple that you need to search efficiently and
duplicates do not matter, you can convert it to a set. If you want to filter out
duplicates from a list, you can convert it to a set and then back to a list,
but the original order will be lost in this process.
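If the original order matters, one possible alternative (not the set-based approach described above) is the fromkeys method of dict, since dictionary keys are unique and dictionaries keep insertion order in Python 3.7 and later. The list letters is a made-up example.

```python
letters = ['b', 'a', 'b', 'c', 'a']

unordered_unique = list(set(letters))           # duplicates gone, order arbitrary
ordered_unique = list(dict.fromkeys(letters))   # duplicates gone, first-seen order kept

print(sorted(unordered_unique))   # ['a', 'b', 'c']
print(ordered_unique)             # ['b', 'a', 'c']
```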
We can use sets for performing order-neutral equality tests. You can convert
a list to a set before testing for equality.
>>> L1 = [1, 2, 3, 4]
>>> L2 = [3, 2, 4, 1]
>>> print(L1 == L2)
False
>>> print(set(L1) == set(L2))
True
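One caveat with this test is that converting to a set also discards duplicates, so two lists with different repeat counts compare as equal. The sketch below shows the effect; comparing sorted lists is offered here as one possible alternative when the counts matter.

```python
L1 = [1, 1, 2]
L2 = [1, 2, 2]

print(set(L1) == set(L2))         # True, duplicates are ignored
print(sorted(L1) == sorted(L2))   # False, sorted lists keep the duplicates
```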
5.18 Adding and Removing elements
Here are some methods for adding and removing elements from a set.
s.add(x) Adds a new item x to the set s
s.pop() Removes an arbitrary element from s
s.remove(x) Removes x from set s, raises KeyError if x not present
s.discard(x) Removes x from set s, no effect if x not present
s.clear() Removes all elements from s
Table 5.6: Adding and removing elements from a set
The add() method is used to add a new item to the set, and if this item x is
already present in the set, then there is no effect. The item that is to be added
should be of immutable type. The pop() method removes an arbitrary
element from s. If the set is empty, then it raises a KeyError. To remove a
specified item, use either remove() or discard(). Both will remove the
element x from the set; they just differ in their behavior when x is not
present. remove() will raise KeyError, while discard() will have no
effect if the element to be removed is not present. The clear() method
removes all elements from the set, and the copy() method returns a copy
of set s. Here are some examples of these methods:
>>> cities = {'Cairo', 'Mumbai', 'Agra',
'Bengaluru', 'Rome', 'Perth', 'Bareilly', 'Bern'}
>>> cities.add('Delhi')
>>> cities
{'Bern', 'Agra', 'Mumbai', 'Cairo', 'Perth',
'Bengaluru', 'Bareilly', 'Rome', 'Delhi'}
>>> cities.remove('Bern')
>>> cities
{'Agra', 'Mumbai', 'Cairo', 'Perth', 'Bengaluru',
'Bareilly', 'Rome', 'Delhi'}
>>> cities.remove('Tokyo')
KeyError: 'Tokyo'
>>> cities.discard('Tokyo')
>>> print(cities)
{'Agra', 'Mumbai', 'Cairo', 'Perth', 'Bengaluru',
'Bareilly', 'Rome', 'Delhi'}
>>> city = cities.pop()
>>> print(city)
Agra
>>> print(cities)
{'Mumbai', 'Cairo', 'Perth', 'Bengaluru',
'Bareilly', 'Rome', 'Delhi'}
The copy method of the set type returns a shallow copy of the set. The
built-in functions like len(), sum(), max(), min(), sorted(), all(), and
any() work on sets also.
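The following sketch tries some of these built-in functions on a small set, and also shows the KeyError raised by pop() on an empty set. The names backup and pop_failed are introduced only for illustration.

```python
primes = {2, 3, 5, 7, 11}
print(len(primes), sum(primes), min(primes), max(primes))   # 5 28 2 11
print(sorted(primes))             # [2, 3, 5, 7, 11], sorted() returns a list

backup = primes.copy()            # independent copy of the set
primes.clear()                    # primes is now empty
print(len(primes), len(backup))   # 0 5

pop_failed = False
try:
    primes.pop()                  # nothing left to remove
except KeyError as exc:
    pop_failed = True
    print('KeyError:', exc)
```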
Some special operations can be performed on sets. These operations
correspond to the set theory of mathematics, where you might have studied
union, intersection, and difference. Python sets support these operations,
which are not available for other collections like lists or tuples. These
special set operations can make your code shorter and more readable. We will
discuss them in the next two sections.
5.19 Comparing sets
In mathematics, two sets are considered to be disjoint if they have no
element in common. In Python, we have the method isdisjoint to check
whether two sets have any elements in common. The expression
s1.isdisjoint(s2) returns True if sets s1 and s2 have no elements
in common, otherwise it returns False.
>>> s1 = {1, 2, 3, 4}
>>> s2 = {5, 6, 7, 3}
>>> s3 = {10, 20, 30, 50}
>>> s1.isdisjoint(s2)
False
>>> s1.isdisjoint(s3)
True
Two sets are considered equal if every element of each set is contained in
the other set. We can use the equality operators == and != with sets to check
their equality.
>>> s1 = {1, 2, 4, 3, 6}
>>> s2 = {6, 4, 3, 2, 1}
>>> s1 == s2
True
Sets s1 and s2 have identical elements, so they are equal.
We know that the elements of a set are not in any particular order, so the
meaning of the operators <, >, <=, and >= is different from what it is
for lists and tuples. These operators are based on the mathematical notion of
subsets and supersets. In set theory, a set s1 is a subset of another set s2 if
every element of s1 is present in set s2. In Python, the method issubset
and the operator <= are used to check for a subset relationship.
If s1 is a subset of s2, then we can say that s2 is a superset of s1.
So, a set s1 is a superset of another set s2 if s1 contains every element of
s2; it can contain extra elements also. In Python, the method
issuperset() and the operator >= are used to check for a superset
relationship.
s1.issubset(s2) or s1 <= s2 Returns True if s1 is a subset of s2, otherwise False
s1.issuperset(s2) or s1 >= s2 Returns True if s1 is a superset of s2, otherwise False
Table 5.7: Comparing sets
The method issubset() or the expression s1 <= s2 returns True if
every item in set s1 is also present in set s2. Otherwise, it returns False.
The method issuperset() or the expression s1 >= s2 returns True
if every item in set s2 is also present in set s1. Otherwise, it returns
False. If two sets s1 and s2 are equal, then s1 is a subset of s2, and it is
also a superset of s2.
The methods issubset() and issuperset() can also accept other iterables,
such as lists, tuples, or strings, as arguments. But if we use the operators
<= and >=, then both operands must be sets. Here are some examples:
>>> s1 = {'x', 'y', 'z', 'a', 'b'}
>>> s2 = {'x', 'y', 'z', 'a', 'b', 'c', 'd'}
>>> s3 = {'a', 'b', 'x', 'y', 'z'}
>>> s1.issubset(s2)
True
>>> s1 <= s2
True
>>> s2.issuperset(s1)
True
>>> s2 >= s1
True
Sets s1 and s3 are equal, so both are considered subset and superset of each
other.
>>> s1 <= s3
True
>>> s1 >= s3
True
Every set is considered a subset and superset of itself.
>>> s1 <= s1
True
>>> s1 >= s1
True
The operators < and > are used to check for proper subset and proper
superset. A proper subset is like a subset, but the two sets cannot be equal.
Similarly, a proper superset is like a superset, but the two sets cannot be
equal. There are no equivalent methods corresponding to these operators.
>>> s1 < s2
True
>>> s1 < s3
False
>>> s3 > s1
False
The expression s1 < s3 returns False because s1 is not a proper subset
of s3. A proper subset is any subset that is not equal to the set. The set s1 is
a subset of s3, but since it is equal to s3, it is not a proper subset. The
expression s3 > s1 returns False because s3 is not a proper superset of
s1. A proper superset is any superset that is not equal to the set. The set s3
is a superset of s1, but since it is equal to s1, it is not a proper superset.
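The difference between the methods and the operators shows up when the other operand is not a set. In this minimal sketch, the list letters and the flag operator_failed are hypothetical names used for illustration.

```python
s1 = {'x', 'y', 'z'}
letters = ['x', 'y', 'z', 'a']            # a list, not a set

print(s1.issubset(letters))               # True, the method accepts a list
print(s1.isdisjoint(['p', 'q']))          # True, no common elements

operator_failed = False
try:
    s1 <= letters                         # the operator needs a set on both sides
except TypeError as exc:
    operator_failed = True
    print('TypeError:', exc)

print(s1 <= set(letters))                 # True after converting the list to a set
print(s1 < set(letters))                  # True, s1 is a proper subset
```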
5.20 Union, intersection, and difference of
sets
The Python set type is different from other types that we have seen because it
supports all standard mathematical set operations like union, intersection,
and difference. These mathematical operations can be used in different types
of programming situations. We can use either a method or an equivalent
operator for any of these set operations.
s1.union(s2, s3, …) or s1 | s2 | s3
Returns a new set containing all items of sets s1, s2, s3, …
s1.intersection(s2, s3, …) or s1 & s2 & s3
Returns a new set containing only the common items of sets s1, s2, s3, …
s1.difference(s2) or s1 - s2
Returns a new set containing all items of s1 that are not in s2
s1.symmetric_difference(s2) or s1 ^ s2
Returns a new set containing items that are in set s1 or s2, but not both; that
is, the elements of both sets that are not in their intersection.
Let us see some examples. We have taken three sets named
python_programmers, java_programmers and
c_programmers.
>>> python_programmers = {'Nick', 'Sam', 'Peter',
'Mary', 'Alan', 'Rose', 'Zara', 'Max'}
>>> java_programmers = {'Ted', 'Sandy', 'Peter',
'Alan', 'Ross', 'Max', 'Ruby'}
>>> c_programmers = {'Nick', 'Ted', 'Peter',
'Abbie', 'Julie', 'Jack', 'Jill'}
Suppose we want a set of programmers who can work both in Java and in
Python. So, we need names that are common to both the sets
python_programmers and java_programmers, and for that, we can
use the intersection method or the & operator.
>>> python_programmers.intersection(java_programmers)
{'Alan', 'Max', 'Peter'}
>>> python_programmers & java_programmers
{'Alan', 'Max', 'Peter'}
If we want a set of programmers who can work in all the three languages,
Java, Python, and C, then we need names common to all three sets, so we
can write this:
>>> python_programmers.intersection(java_programmers, c_programmers)
{'Peter'}
>>> python_programmers & java_programmers & c_programmers
{'Peter'}
Now, suppose we want a set of those programmers who can program either
in C or in Python. So now we need a union of the two sets
c_programmers and python_programmers.
>>> c_programmers.union(python_programmers)
{'Julie', 'Max', 'Jill', 'Rose', 'Sam', 'Peter',
'Abbie', 'Nick', 'Jack', 'Zara', 'Ted', 'Alan',
'Mary'}
>>> c_programmers | python_programmers
{'Julie', 'Max', 'Jill', 'Rose', 'Sam', 'Peter',
'Abbie', 'Nick', 'Jack', 'Zara', 'Ted', 'Alan',
'Mary'}
Now suppose we need a set of programmers who can program in Python but
who do not know Java. So, we want the names of all those programmers
who are in the python_programmers set but not in
java_programmers set. For getting this set we can use the
difference method or the equivalent operator.
>>> python_programmers - java_programmers
{'Rose', 'Sam', 'Nick', 'Zara', 'Mary'}
>>> python_programmers.difference(java_programmers)
{'Rose', 'Sam', 'Nick', 'Zara', 'Mary'}
The following two expressions will give the names of those programmers
who are in the python_programmers set but not in
java_programmers set or c_programmers set.
>>> python_programmers.difference(java_programmers, c_programmers)
{'Rose', 'Sam', 'Zara', 'Mary'}
>>> python_programmers - java_programmers - c_programmers
{'Mary', 'Rose', 'Zara', 'Sam'}
Now, let us use the symmetric_difference method on the two sets
python_programmers and java_programmers.
>>> python_programmers.symmetric_difference(java_programmers)
{'Sandy', 'Rose', 'Nick', 'Zara', 'Ruby', 'Mary',
'Ross', 'Sam', 'Ted'}
>>> python_programmers ^ java_programmers
{'Sandy', 'Rose', 'Nick', 'Zara', 'Ruby', 'Mary',
'Ross', 'Sam', 'Ted'}
This symmetric difference gives us a set of those programmers who can
program either in Java or in Python but not both. So, this set contains all
names in set python_programmers and in set java_programmers
minus the names that are common to both sets.
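This relationship between the operations can be verified directly. The sketch below uses two small sets of numbers made up for illustration.

```python
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6}

sym = s1 ^ s2
print(sorted(sym))                      # [1, 2, 5, 6]

# Union minus intersection gives the same set.
print(sym == (s1 | s2) - (s1 & s2))     # True

# So does the union of the two one-sided differences.
print(sym == (s1 - s2) | (s2 - s1))     # True
```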
All these methods and operators are non-mutating. They do not make any
in-place changes to the set on which they are called; they always return a
new set. We can see that the original sets have not been changed.
>>> python_programmers
{'Max', 'Rose', 'Sam', 'Peter', 'Nick', 'Zara',
'Alan', 'Mary'}
>>> java_programmers
{'Ross', 'Sandy', 'Max', 'Ted', 'Ruby', 'Alan',
'Peter'}
>>> c_programmers
{'Julie', 'Abbie', 'Nick', 'Jack', 'Ted', 'Jill',
'Peter'}
The four non-mutating methods that we have seen have mutating equivalents
also. Here are the equivalent mutating methods and their equivalent
operators:
s1.update(s2) s1 |= s2
s1.intersection_update(s2) s1 &= s2
s1.difference_update(s2) s1 -= s2
s1.symmetric_difference_update(s2) s1 ^= s2
Table 5.8: Mutating methods and operators for sets
These mutating methods perform the same operation as their non-mutating
counterparts, but they perform the operation in-place, which means that they
change the set on which they are called instead of returning a new set. All
these methods return None. These four mutating methods are also accessible
using the augmented assignment syntax.
>>> python_programmers.intersection_update(java_programmers)
>>> python_programmers
{'Alan', 'Max', 'Peter'}
We can see that the set python_programmers has been changed, and it
now contains the intersection of the two sets. The same effect can be
achieved by using the augmented assignment syntax.
>>> java_programmers &= c_programmers
>>> java_programmers
{'Peter', 'Ted'}
Now the set java_programmers has changed, and it contains the
intersection of the two sets java_programmers and
c_programmers. Similarly, the mutating equivalents of other
methods also make in-place changes.
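The in-place behaviour matters when a set has an alias. The following sketch contrasts the augmented assignment |= with the plain | operator; the names team and same_team are hypothetical.

```python
team = {'Peter', 'Alan'}
same_team = team                 # alias: a second name for the same set

team |= {'Max'}                  # in-place update, the alias sees the change
print(sorted(same_team))         # ['Alan', 'Max', 'Peter']

team = team | {'Ted'}            # builds a new set and rebinds the name team
print(sorted(same_team))         # ['Alan', 'Max', 'Peter'], unchanged
print(sorted(team))              # ['Alan', 'Max', 'Peter', 'Ted']

result = same_team.update({'Rose'})
print(result)                    # None, mutating methods do not return the set
```

Because the mutating methods return None, a statement such as s = s.update(t) would lose the set; the method should be called on its own line.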
If you want to perform these operations on other types like lists, strings, or
tuples, you can do so by converting them to sets.
>>> s1 = 'Welcome'
>>> s2 = 'Come here'
>>> set(s1) - set(s2)
{'l', 'W', 'c'}
>>> s3 = 'What is in a name'
>>> s4 = 'There are letters in a name'
>>> set(s3.split()) - set(s4.split())
{'is', 'What'}
>>> x = [1, 2, 3, 4, 5]
>>> y = [3, 4, 5, 6, 7]
>>> set(x) | set(y)
{1, 2, 3, 4, 5, 6, 7}
The set operations for finding union, intersection, and difference can also be
used on the view objects that are returned by the dictionary methods keys() and
items(). For example, if you have two dictionaries d1 and d2, the
expression d1.items() & d2.items() will give the key-value pairs
that are common to both dictionaries. The expression d1.keys() -
d2.keys() will give you the keys that are in d1 but not in d2.
>>> d1 = {'a': 15, 'b': 22, 'c': 35, 'd': 24}
>>> d2 = {'a': 15, 'b': 20, 'x': 29, 'd': 24}
>>> d1.items() & d2.items()
{('d', 24), ('a', 15)}
>>> d1.keys() - d2.keys()
{'c'}
You do not need to convert the output of these methods to set and then
perform these operations. This facility is not available for the values()
method of the dictionary.
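The restriction on values() can be seen in a short sketch that reuses the dictionaries d1 and d2 from above. The names values_failed and common_values are introduced only for illustration.

```python
d1 = {'a': 15, 'b': 22, 'c': 35, 'd': 24}
d2 = {'a': 15, 'b': 20, 'x': 29, 'd': 24}

print(sorted(d1.keys() & d2.keys()))     # ['a', 'b', 'd']

values_failed = False
try:
    d1.values() & d2.values()            # the values view does not support &
except TypeError as exc:
    values_failed = True
    print('TypeError:', exc)

# Converting the values to sets first makes the operation possible.
common_values = set(d1.values()) & set(d2.values())
print(sorted(common_values))             # [15, 24]
```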
5.21 Frozenset
A frozenset is the immutable version of a set. Once a frozenset is created, it
cannot be changed. Since they are immutable, they can be used as members
in other sets and as dictionary keys. You can think of a frozenset as a
read-only set. Frozensets support the same operations as sets, except the
operations that change the contents. So, methods like add, remove, pop,
and update are not available for frozensets. You can create a frozenset by
passing an iterable to the frozenset function. In the following examples,
we have created frozensets from a set, list, and string.
>>> weekdays = frozenset({'Monday', 'Tuesday',
'Wednesday', 'Thursday', 'Friday'})
>>> weekend = frozenset(['Saturday', 'Sunday'])
>>> vowels = frozenset('aeiou')
>>> type(weekdays)
<class 'frozenset'>
>>> weekdays
frozenset({'Thursday', 'Monday', 'Tuesday',
'Wednesday', 'Friday'})
>>> weekend
frozenset({'Saturday', 'Sunday'})
>>> vowels
frozenset({'a', 'i', 'o', 'e', 'u'})
When you need an immutable version of a set, you can use a frozenset.
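The sketch below shows a frozenset used as a dictionary key and as a member of another set, which is not possible with an ordinary set. The dictionary opening_hours and its value are hypothetical data made up for illustration.

```python
weekend = frozenset(['Saturday', 'Sunday'])

# A frozenset can be a dictionary key or a member of another set.
opening_hours = {weekend: '10:00 to 16:00'}
groups = {weekend, frozenset({'Monday'})}
print(opening_hours[frozenset(['Sunday', 'Saturday'])])   # 10:00 to 16:00
print(len(groups))                                        # 2

# Non-mutating operations work and return a new frozenset.
longer = weekend | {'Friday'}
print(type(longer))                  # <class 'frozenset'>

add_failed = False
try:
    weekend.add('Friday')            # frozensets have no add method
except AttributeError as exc:
    add_failed = True
    print('AttributeError:', exc)
```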
Exercise
1. Which of these cannot be used as a key in a dictionary?
(A) String
(B) Integer
(C) List
2. Only immutable types can be used as values in a dictionary.
(A) True (B) False
3. Which one of these will make changes in the dictionary object
referenced by name d?
(A) d.clear() (B) d = {}
(C) Both (D) None of these
4. A tuple can be used as a key of a dictionary if it contains references to
(A) Only mutable objects
(B) Only immutable objects
(C) Both mutable and immutable objects
5. What is wrong with this dictionary?
{5: 'a', 2: 'j', 9: 'y', 6: 'y', 5: 's'}
(A) int type cannot be used as a key
(B) There is a duplicate key
6. d = {'apple': 100, 'banana': 75,
'mango': 80}
What is the value of len(d)?
(A) 3 (B) 6
7. d = {'apple': 100, 'banana': 75,
'mango': 80}
What will be the value of the following expression?
d.get('grapes', -1)
(A) None
(B) -1
(C) Only single argument allowed in get()
(D) KeyError is raised
8. d = {'apple': 100, 'banana': 75,
'mango': 80}
What happens when you misspell a key while changing its value?
d['aple'] = 95
(A) KeyError
(B) new key 'aple' is added to the dictionary
9. As in strings and lists, the expression d[:] represents a copy of a
dictionary d.
(A) True (B) False
10. Which one of these will give all key value pairs of a dictionary?
(A) d.elements()
(B) d.items()
(C) d.pairs()
11. How will you check whether a value v is present in a dictionary d?
(A) v in d
(B) v in d.values()
(C) Both
12. d = {123: 'Dev', 342: 'Raj', '567': 'John',
898: 'Sam'}
What will the following expression return?
(123, 'Raj') in d.items()
(A) True
(B) False
(C) Error is raised
13. In a dictionary, the method pop() cannot be used without an
argument.
(A) True (B) False
14. Which key-value pair does the method popitem() remove?
(A) First pair
(B) Last pair
(C) Random pair
15. If you want to delete a key-value pair from a dictionary and print the
deleted value, what will you use?
(A) del statement
(B) pop() method
(C) Either of these
16. d = dict(zip('xyz', [4, 5, 6]))
Dictionary d is -
(A) {'xyz': [4, 5, 6]}
(B) {'x': 4, 'y': 5, 'z': 6}
17. Which one of these cannot be used to create a dictionary using
dict()?
(A) [['a',11], ['b',6], ['c',7]]
(B) [['a','x',4], ['b','y',5], ['c','z',6]]
(C) ['AB', 'CD', 'EF']
18. d = {'a': 1, 'b': 2, 'c': 3}
What will be the dictionary d after d.update({})?
(A) d becomes empty (B) d is not changed
19. What is the length of this dictionary?
d = dict.fromkeys('HELLO', None)
(A) 1
(B) 4
(C) 5
20. What does {} create?
(A) empty dictionary
(B) empty set
(C) empty frozenset
21. If you want to create a dictionary from an iterable, such that all the
values in the dictionary are same, which method will you use?
(A) items()
(B) setdefault()
(C) fromkeys()
22. Is it possible to create a set of sets?
(A) Yes (B) No
23. ____ are very commonly used to test for membership of an item.
(A) Dictionaries
(B) Sets
(C) Tuples
24. Is it possible to create a set of frozensets?
(A) Yes (B) No
25. Which method is used to remove an element randomly from a set?
(A) pop
(B) popitem
(C) remove
26. What is the length of the following set s?
s = set('cookbook')
(A) 4
(B) 6
(C) 8
27. s = set(1, 2, 3, 1, 3)
What will be the value of s?
(A) {1, 2, 3}
(B) {1, 2, 3, 1, 3}
(C) this assignment statement raises TypeError
28. Which method will remove an element from a set without giving any
error if the element is not present?
(A) remove (B) delete
(C) pop (D) discard
29. Which data structure will you use when you want to store things and
order is important? Contents might change.
(A) list (B) tuple
(C) set (D) dictionary
30. If you want to store unique values and do not care about the order in
which they are stored, you can use a _______.
(A) list
(B) tuple
(C) set
31. When you have some ordered data that you know will not change,
you can store it in a _______
(A) list (B) tuple
(C) set (D) dictionary
32. Use ________ when you want to attach some information to values
and want to access that value by the information not by a numeric
index.
(A) list (B) tuple
(C) set (D) dictionary
33. Which of these does not allow duplicate values?
(A) tuples (B) frozensets
34. Which of these is not a sequence?
(A) list
(B) tuple
(C) set
35. When you have table-like data, which data structure would you use?
(A) list (B) tuple
(C) set (D) dictionary
36. Which one of these cannot be used as a key in a dictionary?
(A) string
(B) list
(C) tuple
37. _______ should be used for static sequences of elements.
(A) list
(B) tuple
(C) set
38. ___________ are generally used when the data is labelled.
(A) Dictionaries (B) Lists
39. Dictionaries and sets can retrieve a value in constant time regardless
of the number of entries.
(A) True (B) False
40. String is a mutable sequence of characters.
(A) True (B) False
41. V = 'aeiou'
L = ['a', 'e', 'i', 'o', 'u']
S = {'a', 'e', 'i', 'o', 'u'}
Which of the following expressions is the most efficient?
(A) ch in V
(B) ch in L
(C) ch in S
42. Set is ________ unordered collection of unique _______ objects.
(A) immutable, immutable
(B) mutable, mutable
(C) immutable, mutable
(D) mutable, immutable
What will be the output of the code given in questions 43 to 55?
43. d = {(3, 4): 100, (5, 3): 20, (4, 5): 32}
print(d[5, 3])
44. d = dict(zip('good', range(4)))
print(d)
45. d = {'x': 10, 'y': 20, 'x': 33, 'z': 40}
print(d['x'])
46. s = {1, 2, 3, 4}
print(s[1])
47. s1 = {3, 2, 4}
s2 = {3, 2, 4}
print(s1 < s2)
48. s1 = {3, 2, 4}
s2 = {3, 2, 4}
print(s1 <= s2)
49. d = {'a': 1, 'b': 2, 'c': 2}
s = set(d)
print(s)
50. x = {'hello'}
y = set('hello')
print(x, y)
51. d = {'a': [1, 2, 3], 'b': 10, 'c': 12}
d2 = d
d['a'][1] = 55
d['b'] = 99
print(d2)
52. d = {'a': [1, 2, 3], 'b': 10, 'c': 12}
d2 = d.copy()
d['a'][1] = 55
d['b'] = 99
print(d2)
53. a = 5
D = {'k1': a, 'k2': 60, 'k3': 70}
a = 10
print(D['k1'])
54. x = frozenset(['a', 'b', 'c'])
y = {'d', 'e'}
x |= y
print(x)
55. x = y = z = 0
x = 2
print(x, y, z, end=' ')
d1 = d2 = d3 = {}
d1['a'] = 2
print(d1, d2, d3)
56. On the interactive prompt, create an empty dictionary named
currency and then add these key-value pairs to it.
'India': 'Rupee'
'UK': 'Pound'
'Japan': 'Yen'
'Austria': 'Euro'
'Bangladesh': 'Taka'
57. From the currency dictionary created in the previous question,
delete the entry related to key 'UK'
58. Delete the entry related to key 'Japan' and store the return value
in another variable named c.
59. Add a new entry in the dictionary with the key 'Switzerland'
and the value 'Swiss Franc'.
60. Change the value for key 'India' from 'Rupee' to 'Indian
Rupee'
61. Delete a random key-value pair from the dictionary.
62. Use appropriate methods to get lists of all keys, all values, and all
key-value pairs of the currency dictionary.
63. Given the following dictionary:
fruits_prices = {'apple': 100, 'banana': 75,
'mango': 80}
Use the appropriate method to access the values associated with keys
'apple' and 'grapes'. If the key is not present in the dictionary,
then it should be added with value 0.
64. Create a dictionary named login from the following list named
names.
names = ['John', 'Sam', 'Marie', 'Anne']
The elements of this list should become the keys of the dictionary, and
values associated with all keys should be None.
65. Given these 2 lists:
designation = ['programmer', 'manager',
'accountant']
salary = [4000, 5000, 3000]
Create the following dictionary from the above two lists.
{'programmer':4000, 'manager':5000,
'accountant':3000}
66. Given these 3 lists:
python_books = ['Learn Python', 'Programming
in Python', 'Python for beginners']
cplusplus_books = ['C++ in depth', 'C++
Programming']
java_books = ['Java Programming', 'Learn
Java']
Write a dictionary named books with the strings 'python',
'c++' and 'java' as keys and these lists as values. Thus, when
you write books['java'] you get the list of java books and
similarly for other keys.
67. Given these 2 dictionaries:
book_prices = {'Learn ABC': 150, 'Learn 123':
200, 'Rhymes': 300, 'Cursive Writing': 250}
new_stock = {'Stories': 350, 'Poems': 290,
'Spellings': 200}
Add all the key-value pairs of new_stock to book_prices.
68. Create this dictionary by using range() function and
fromkeys() method.
{1000: None, 2000: None, 3000: None, 4000:
None, 5000: None, 6000: None, 7000: None,
8000: None, 9000: None}
69. In the following nested dictionary, how will you access the last name
of the student?
student = {'name': {'first': 'John',
'last': 'Mark'
},
'marks': 98,
'age': 20
}
70. From this dictionary d, create a list that contains all the keys in sorted
order.
d = {2: 300, 8: 900, 7: 800, 1: 100}
71. In the following dictionary, key is an integer which represents the
student id, and value is list type which contains marks of the student
in three subjects.
marks = {2234: [99, 23, 56], 2135: [67, 56,
68], 2199: [78, 89, 66] }
Write an expression to get total marks of student with student id 2135.
72. In the previous chapter, we saw how to use a list of lists to represent a
matrix. We used two indices to access an element of the matrix (for
example, matrix[1][4]). If a matrix is sparse, then we can save
space by using a dictionary to implement it. A matrix is sparse if it
has many zero values in it. For example, this is a sparse matrix.
Figure 5.4: Sparse matrix
Create a dictionary named matrix which stores only non-zero
values of this matrix. Use a tuple of row and column numbers as the
key.
73. In the implementation of the matrix that we did in the last question, if
we try to access any element of matrix that is zero, we will get an
error. For example, if we write matrix[1,2] or matrix[2,0],
we will get an error. This is because there is no key in the dictionary
corresponding to zero elements of the matrix. Tuples (1,2) or
(2,0) are not present as keys in the dictionary. How will you solve
this problem?
74. Input two strings s1 and s2, and then create a list that contains all the
common characters of the two input strings.
75. From the following two strings, find all words common in both
strings. Extract words from the string by splitting on spaces.
string1 = 'Life has no remote, get up and
change it yourself'
string2 = 'Life has no ctrl+Z'
76. How will you count the number of unique items in a list?
77. Create a new list by filtering out all the duplicates from the following
list by using the set function.
L = [12, 44, 46, 32, 12, 43, 55, 86, 43]
Will the order of the original list be preserved if you use this approach
to filter out duplicates?
78. Enter a string and create 2 sets named v and c, where v is a set of
vowels present in the string and c is a set of consonants present in the
string.
79. How can you perform order-neutral equality tests in lists and strings
using sets? The following two lists L1 and L2 have the same
elements, only the order is different, so when you perform an order
neutral equality test on these two lists, they are considered equal.
L1 = [1, 2, 3, 4]
L2 = [2, 3, 1, 4]
This test just checks whether both of them contain the same elements.
80. How will you find all the elements of list L1 that are not in L2?
L1 = [1, 2, 3, 7]
L2 = [2, 3, 4, 5]
81. How will you find all the common characters in three strings s1, s2,
and s3?
Use the following two sets for questions 82 to 87
toppers = {'id11', 'id23', 'id34', 'id45',
'id77', 'id12', 'id89', 'id56', 'id55',
'id19'}
champions = {'id19', 'id23', 'id78', 'id99',
'id79', 'id13', 'id56', 'id45', 'id80'}
The set toppers is a set of roll numbers of academic toppers of the
school, and champions is a set of roll numbers of sports champions
of the school.
82. From the set of toppers, remove the student with roll number
'id11'.
83. To the set of champions, add two students with roll numbers
'id46' and 'id20'.
84. Find a set of all the toppers who are not champions.
85. Find a set of all the champions who are not toppers.
86. Find a set of all students who are champions as well as toppers.
87. Find a set of all students who are either champions or toppers.
6 Conditional Execution
The control flow of a program is the order in which the code written in the
program executes. Normally, the program executes from top to bottom with
one statement executed at a time. This is called sequential control. All the
programs we have written have been executed this way: top to bottom and
one statement at a time. This normal flow of control is changed by control
structures, which can be either selection control structures or iterative
control structures.
In Python, selection is supported by the if statement, and iteration is
supported by the while statement and for statement. The if statement is
a conditional statement, meaning we can use it to process our code
conditionally. The two iterative structures, while statement and for
statement are called loops as they are used to repeatedly execute a section of
code.
In this chapter, we will learn about the if statement, and in the next two
chapters, we will learn about loops.
6.1 if statement
While solving a problem in real life, we often need to make decisions and
act accordingly. Similar situations arise in programming: we want
our program to make decisions and perform different operations based on
those decisions. As in most other languages, decision making or conditional
execution in Python is done with the help of an if statement. By using
an if statement, you can make your program behave differently in different
situations. It gives your program the ability to make decisions and perform
actions based on those decisions.
When you need to execute some statements only if a certain condition holds,
you can use an if statement. Here is the syntax and flowchart of an if
statement:
Figure 6.1: if statement
We have the if keyword followed by a test expression, and then we have a
colon. The test expression is a Boolean expression, and therefore, it can be
either True or False; it is often called the if condition. After the colon, we
have the statement block, which will be executed when the test expression is
True. Each statement in this block should be indented by the same length
from the if line. We have seen earlier that Python uses indentation to
identify a block.
If the test expression evaluates to True, the statements inside the block will
be executed, and then the next statement after the if statement will be
executed. If the test expression evaluates to False, the statements inside the
block will be skipped, and the next statement will be executed. The
flowchart clarifies why the if statement is also called a branching statement.
Let us see an example:
n1 = int(input('Enter a number : '))
n2 = int(input('Enter a number : '))
print(n1 + n2, end = ' ')
print(n1 - n2, end = ' ')
print(n1 * n2, end = ' ')
print(n1 / n2, end = ' ')
print(n1 // n2, end = ' ')
print(n1 % n2, end = ' ')
print(n1 ** n2, end = ' ')
Sample Run-
Enter a number : 14
Enter a number : 4
18 10 56 3.5 3 2 38416
This code executes sequentially; two numbers are entered, and then all the
statements are executed in order, one by one. We want the three statements
that print n1/n2, n1//n2, and n1%n2 to be executed only when the value
of n2 is not equal to zero because if the value of n2 is zero, we will get a
division by zero error. We want the three statements to be executed
conditionally, so we will write them inside an if statement.
n1 = int(input('Enter a number : '))
n2 = int(input('Enter a number : '))
print(n1 + n2, end=' ')
print(n1 - n2, end=' ')
print(n1 * n2, end=' ')
if n2 != 0:
    print(n1 / n2, end=' ')
    print(n1 // n2, end=' ')
    print(n1 % n2, end=' ')
print(n1 ** n2, end=' ')
Sample Run 1-
Enter a number : 10
Enter a number : 5
15 5 50 2.0 2 0 100000
Sample Run 2-
Enter a number : 10
Enter a number : 0
10 10 0 1
In the if statement written in this program, n2 != 0 is the test expression,
and the three indented statements form the if block. When the program is
executed and 5 is entered for the variable n2, the condition n2 != 0 is True,
so the three statements inside if block are executed. When 0 is entered for
n2, the condition n2 != 0 becomes False, so the three statements inside the
if block are not executed. The execution of the if block depends on the if
condition. The rest of the statements outside the if statement will always
execute.
This is the first time we have seen a block. The syntax for defining blocks is
common for all the control structures and even for functions. A block is also
called a suite in Python, and it is a group of statements grouped together
through indentation. To specify the boundaries of a block, Python uses
indentation instead of curly braces or some keywords like begin or end that
are used in other languages.
In most languages, indentation is used just to enhance readability; it is not
compulsory and does not affect the logic of the program. Python uses
indentation for grouping together statements, so Python actually forces the
programmer to write uniform and readable code.
You can have any number of statements inside a block; there is no limit, but
there should be at least one statement. The colon marks the start of the
statement block, and the first unindented statement marks the end of the
block. The block finishes when the indentation decreases. The exact amount
of indent may vary, but the indentation should be consistent. The
recommended indent is 4 spaces. It is not a good idea to use tabs or mix tabs
and spaces while indenting. Mixing tabs with spaces can result in errors,
even though it might look correct on the screen.
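The indentation rules described above can be checked without running a whole program. The following is a minimal sketch, not part of the original listings: it passes small source strings to the built-in compile function and reports which exception, if any, is raised. The helper name check_indentation and the four sample strings are assumptions made for this illustration.

```python
def check_indentation(source):
    """Return 'ok' if source compiles, else the name of the exception raised."""
    try:
        compile(source, '<demo>', 'exec')
    except IndentationError as err:  # TabError is a subclass of IndentationError
        return type(err).__name__
    return 'ok'


# Both statements in the block are indented by the same amount.
consistent = "if True:\n    x = 1\n    y = 2\n"
# The second statement is indented more than the first one.
uneven = "if True:\n    x = 1\n      y = 2\n"
# The block is missing: nothing is indented after the colon.
empty_block = "if True:\nx = 1\n"
# One line is indented with a tab, the next with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"

samples = [('consistent', consistent), ('uneven', uneven),
           ('empty_block', empty_block), ('mixed', mixed)]
for name, source in samples:
    print(name, '->', check_indentation(source))
```

In CPython 3, the tab-and-space case is reported as TabError, which is a subclass of IndentationError; that is why a single except clause is enough to catch all three failures in this sketch.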
So, we have seen how the if statement supports conditional execution; the
statements inside the if block will be executed only if the condition is True.
Otherwise, they will be skipped. Now let us see some small programs.
The following program uses an if statement to test whether a number n1 is
divisible by another number n2.
n1 = int(input('Enter a number : '))
n2 = int(input('Enter a number : '))
if n1 % n2 == 0:
    print('n1 is divisible by n2')
n1 is divisible by n2 if dividing n1 by n2 leaves a remainder of
zero. When the number n1 is divisible by n2, the condition n1 % n2
== 0 will be True, and the print call will execute. When n1 is not
divisible by n2, the condition n1 % n2 == 0 will be False, and the
print call will not execute.
While typing the if statement, you will notice that when you put a colon
after the condition and then press Enter, the cursor goes to the next line
leaving some space. This is because IDLE knows that a colon means a new
block is going to start, so it automatically indents the next line. Most of the
IDEs will do this automatic indenting.
Instead of n2, if we write 2 in the condition, we are checking divisibility by
2, and if a number is divisible by 2, it is an even number.
n1 = int(input('Enter a number : '))
if n1 % 2 == 0:
    print('n1 is even')
You can combine multiple conditions using the three logical operators and,
or, and not. The following if condition uses the and operator to check if
both n1 and n2 are even.
if n1 % 2 == 0 and n2 % 2 == 0:
    print('Both n1 and n2 are even')
The test expression will be True when both the expressions in it are True. So,
the print call will be executed only when both n1 and n2 are even.
Similarly, we can use or and not operators in our conditions.
If we want to perform some action when any one of the two conditions is
True, we can combine the conditions using the or operator.
age = int(input('Enter age : '))
if age < 5 or age > 80:
    print('Entry prohibited')
To check whether a value is present in a list, tuple, string, or dictionary, we
can use the in and not in operators.
athletes = ['Ram', 'Sam', 'Shyam', 'Abhi', 'Adi']
student = input('Enter student name : ')
if student in athletes:
    print('You are awarded a scholarship')

failed_students = ['Pam', 'Sam', 'Ron', 'Ted']
student = input('Enter student name: ')
if student not in failed_students:
    print('You are promoted')
If you are checking the equality of a variable multiple times, you can replace
the or operators with the in operator and a set. Here is an example:
if error_code == 400 or error_code == 404 or error_code == 301:
    print('Bad error')

if error_code in {400, 404, 301}:
    print('Bad error')
The second version is more concise than the first one. We could have used a
list here, but using a set is better as searching is more efficient in it.
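To confirm that the two forms of the condition always agree, both can be wrapped in small functions and compared over a range of values. This is only a sketch: functions are covered later, and the names is_bad_with_or and is_bad_with_set are assumptions made for this illustration.

```python
def is_bad_with_or(error_code):
    """Equality tests chained with the or operator."""
    return error_code == 400 or error_code == 404 or error_code == 301


def is_bad_with_set(error_code):
    """The same test written with the in operator and a set."""
    return error_code in {400, 404, 301}


# Print the result of both versions for a few sample codes.
for code in (200, 301, 400, 404, 500):
    print(code, is_bad_with_or(code), is_bad_with_set(code))
```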
In the next program, we will check whether a string is a palindrome. A
palindrome is a word or a phrase that reads the same forward and backward;
for example, ‘madam’, ‘refer’, and ‘level’ are all palindromes.
s = input('Enter a string : ')
if s == s[::-1]:
    print(f'{s} is a palindrome')
A string will be a palindrome if the reverse of the string is the same as the
string. In Python, we can easily find out the reverse of a string by writing the
expression s[::-1].
If you want to execute compound statements like if statement and for
statement on the interactive prompt, you need to enter a blank line after
entering the code. This means that you have to press Enter twice to execute
the compound statement.
>>> s = 'madam'
>>> if s == s[::-1]:
...     print(f'{s} is a palindrome')
...
madam is a palindrome
As we have seen in Chapter 2, when we enter a multiline statement on the
interactive prompt, the prompt changes from >>> to three dots (...), which is
the line continuation prompt.
6.2 else clause in if statement
Figure 6.2: if statement with else clause
In the if statement, you can also add an else clause in which you can
write the statements that you want to be executed when the test expression is
False.
The else keyword is followed by a colon and should be aligned with the
keyword if. All the statements in the else block should be indented by the
same amount.
If the test expression is True, then the if block is executed; otherwise, the
else block is executed. We have seen these two if statements in the
previous section.
if n % 2 == 0:
    print('n is even')

if s == s[::-1]:
    print(s, 'is a palindrome')
Let us write the else clause for both of them.
n = int(input('Enter a number : '))
if n % 2 == 0:
    print('n is even')
else:
    print('n is odd')
Sample Run 1-
Enter a number : 3
n is odd
Sample Run 2-
Enter a number : 8
n is even
s = input('Enter a string : ')
if s == s[::-1]:
    print(f'{s} is a palindrome')
else:
    print(f'{s} is not a palindrome')
Sample Run 1-
Enter a string : refer
refer is a palindrome
Sample Run 2-
Enter a string : learn
learn is not a palindrome
6.3 Nested if statements
if statements can be nested, which means that you can have an if
statement inside another if statement. We have seen that this is the syntax
of an if statement with an else clause.
if test-expression:
    statement1
    statement2
    statement3
else:
    statementA
    statementB
    statementC
Next statement
Inside the if block or the else block, we can have any type of Python
statement; it can be an if statement also.
if test-expression:
    if test-expression2:
        blockA
    else:
        blockB
else:
    statementA
    statementB
    statementC
Next statement
Here, we have another if statement inside the if block. In the else block
also, we could write the if statement. Let us see an example program.
s = input('Enter a string : ')
if s == s[::-1]:
    print(f'{s} is a palindrome')
else:
    print(f'{s} is not a palindrome')
We have seen this program before. Now, suppose we do not want to print
only the message that s is a palindrome; we also want to check whether it is
a big palindrome or a small palindrome. If the length is less than 5, we will
call it a small palindrome; otherwise, we will call it a big palindrome. Now,
in the if block, instead of the statement that includes a print call, we will
write another if statement.
s = input('Enter a string : ')
if s == s[::-1]:
    if len(s) < 5:
        print(f'{s} is a small palindrome')
    else:
        print(f'{s} is a big palindrome')
else:
    print(f'{s} is not a palindrome')
We have two if statements with else clauses. The first else goes with
the inner if, and the second else goes with the outer if; the indentation
makes it all clear. Nested statements have different levels of indentation.
Here are some sample runs of this program:
Sample Run 1-
Enter a string : malayalam
malayalam is a big palindrome
Sample Run 2-
Enter a string : maths
maths is not a palindrome
Sample Run 3-
Enter a string : noon
noon is a small palindrome
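To try the nested if statement on several strings without typing them in one by one, the same logic can be wrapped in a function. This is only a sketch: functions are covered later, and the name classify_palindrome is an assumption made for this illustration. The word 'level' is included because its length is exactly 5, the boundary used in the inner condition.

```python
def classify_palindrome(s):
    """Return the message that the nested if statement would print for s."""
    if s == s[::-1]:
        # Inner if statement: runs only when s is a palindrome.
        if len(s) < 5:
            return f'{s} is a small palindrome'
        else:
            return f'{s} is a big palindrome'
    else:
        return f'{s} is not a palindrome'


for word in ['noon', 'level', 'malayalam', 'maths']:
    print(classify_palindrome(word))
```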
Let us see one more example. We have this piece of code where we enter the
marks of a student and decide whether the student has got an A grade.
marks = int(input('Enter marks : '))
if marks >= 70:
    print('Well done, you have got A grade')
else:
    print('Try to get A grade next time')
Now, we will add an if statement in both the if block and the else
block.
marks = int(input('Enter marks : '))
if marks >= 70:
    print('Well done, you have got A grade')
    if marks >= 90:
        print('You are awarded a scholarship')
else:
    print('Try to get A grade next time')
    if marks < 40:
        print('You really need to work hard')
If marks are greater than or equal to 70, the student gets an A grade, and if he
gets an A grade and his marks are greater than or equal to 90, he gets a
scholarship. If a student does not get an A grade and his marks are less than
40, another print call will be executed. Here are some sample runs of this
program:
Sample Run 1-
Enter marks : 95
Well done, you have got A grade
You are awarded a scholarship
Sample Run 2-
Enter marks : 80
Well done, you have got A grade
Sample Run 3-
Enter marks : 35
Try to get A grade next time
You really need to work hard
Sample Run 4-
Enter marks : 45
Try to get A grade next time
6.4 Multiway selection by using elif clause
Let us write a program in which we have to assign different grades to
students depending on their marks. These are the criteria for assigning
grades.
Assign grade A if marks >= 70
Assign grade B if marks >= 60 and marks < 70
Assign grade C if marks >= 50 and marks < 60
Assign grade D if marks >= 40 and marks < 50
Assign grade E if marks < 40
Table 6.1
Here is the program: first, we enter the marks, then write simple if
conditions to assign these grades, and then print the grade.
marks = int(input('Enter marks - '))
if marks >= 70:
    grade = 'A'
if marks >= 60 and marks < 70:
    grade = 'B'
if marks >= 50 and marks < 60:
    grade = 'C'
if marks >= 40 and marks < 50:
    grade = 'D'
if marks < 40:
    grade = 'E'
print(f'Student gets {grade} grade')
This program works, but it is inefficient, as it makes the interpreter do
unnecessary work. Let us discuss how.
Suppose the student scores 89 marks. The first condition marks >= 70
evaluates to True, resulting in the grade being set to A. Since the conditions
for the grades are mutually exclusive, there is no need to check the
remaining conditions, as only one grade can be assigned. However, the
interpreter will execute all the if statements one by one, even though the
subsequent conditions are guaranteed to be False. The grade will remain A
and will be printed at the end. We can avoid unnecessary checks done by the
interpreter by conditionally executing the rest of the if statements. This can
be done by using the nested if statements.
By using nested if statements, we can conditionally execute the
subsequent checks based on the result of the first condition. If the first
condition is True, we skip the other checks and directly assign the grade. If it
is False, we proceed to the next condition until we find the appropriate
grade. By doing this, we minimize the number of checks needed, thus
making the program more efficient.
marks = int(input('Enter marks - '))
if marks >= 70:
    grade = 'A'
else:
    if marks >= 60:
        grade = 'B'
    else:
        if marks >= 50:
            grade = 'C'
        else:
            if marks >= 40:
                grade = 'D'
            else:
                grade = 'E'
print(f'Student gets {grade} grade')
If marks >= 70, grade is set to A. If this condition is False, it means
that marks will be less than 70, and so in the else part, we will assign
grades B, C, D or E. In the else part, we have written the if statement
with condition marks >= 60. We need not check for marks < 70 here
because we will come here only when the condition marks >= 70 fails,
so marks will be less than 70. After this, we have an else clause for this if
statement. In the else part, we will assign grades C, D, or E.
If marks >= 50, we assign the grade C. Again, we need not check the
condition marks < 60 because we will come here only if marks are less
than 60. Next, we assign the grade D if marks >= 40. In the else part of
this if statement, the grade will be E because control will come here when
the condition marks >= 40 fails, i.e., when marks are less than 40.
Now, let us see how this code is more efficient than the previous one. If a
student gets 89 marks, then the condition marks >= 70 will be True, and
the statement grade = 'A' is executed. The whole else part is skipped
and then the grade is printed. In the previous version, all the if statements
were tested in this case.
Now, let us see what happens if the student gets 56 marks. The condition
marks >= 70 is False, so the else block will be executed, and in the
else block, we have the if statement with the condition marks >= 60.
This condition is also False, so the else block will be executed. In the
else clause, we have the if statement with the condition marks >= 50.
This condition is True, so the statement grade = 'C' is executed and the
else part is skipped.
This structure is like an else-if chain or else-if ladder; it is used when we
have multiple mutually exclusive conditions. There is excessive indentation
involved here, which makes it difficult to read. Each time we add a nested
if, we need to increase the indentation. Python has a solution for this in the
form of an alternative syntax. You can replace else and the following if
by the elif keyword.
marks = int(input('Enter marks - '))
if marks >= 70:
    grade = 'A'
elif marks >= 60:
    grade = 'B'
elif marks >= 50:
    grade = 'C'
elif marks >= 40:
    grade = 'D'
else:
    grade = 'E'
print(f'Student gets {grade} grade')
Each elif keyword should align with the if keyword and the final else
keyword. The keyword elif is just a shortcut for else if. This code is
similar to the previous one but is definitely more readable due to less
indentation. In the previous code, we had many if statements, but here, we
have only one if statement with multiple elif clauses and an else
clause. That is why, in the previous code, all the blocks are indented
differently, while here, all the blocks are at the same level of indentation.
The elif clause helps in multiway selection and reduces the amount of
indentation that is to be done when we use the nested if else statements.
The working of this construct is simple: each condition is checked in order;
if the first condition is True, the statement block under it is executed, and
other conditions are not checked. If the first condition is False, the second is
checked; if the second is False, the third is checked, and so on. If any of
them is True, the block under it executes, and the control comes out of the
whole if statement. The final else block will be executed when none of
the conditions is True, so it acts as the default case. Here is the syntax and
flowchart of an if statement with elif and else clauses.
Figure 6.3: if statement with elif clauses
This if..elif..else statement implements multiway branching. From
all the blocks, exactly one block will be executed.
If expression1 is True, then statementblockA is executed, the if
statement ends, and then the Next statement is executed. If
expression1 is False, then expression2 is checked.
If expression2 is True, then statementblockB is executed, the if
statement ends, and then the Next statement is executed. If
expression2 is False, expression3 is checked.
If expression3 is True, then statementblockC is executed, the if
statement ends, and then the Next statement is executed. If
expression3 is False, then statementblockD is executed, the if
statement ends, then the Next statement executes.
So, when any one of the test expressions evaluates to True, the
corresponding block is executed, the rest of the elif clauses are
automatically skipped, and the whole if statement ends, and the execution
resumes after the if statement. If none of the conditions evaluates to True,
the block in the else clause will be executed.
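The saving in condition checks can be made visible by counting how many test expressions are evaluated for a given value of marks. The following is a rough sketch rather than part of the grading program: the helper note and the two function names are assumptions made for this illustration, and functions themselves are covered later.

```python
def note(log, result):
    """Record that one test expression was evaluated and pass its result on."""
    log.append(result)
    return result


def grade_separate_ifs(marks):
    """Five independent if statements: every test expression is evaluated."""
    log = []
    grade = None
    if note(log, marks >= 70):
        grade = 'A'
    if note(log, marks >= 60 and marks < 70):
        grade = 'B'
    if note(log, marks >= 50 and marks < 60):
        grade = 'C'
    if note(log, marks >= 40 and marks < 50):
        grade = 'D'
    if note(log, marks < 40):
        grade = 'E'
    return grade, len(log)


def grade_elif_chain(marks):
    """One if statement with elif clauses: testing stops at the first True."""
    log = []
    if note(log, marks >= 70):
        grade = 'A'
    elif note(log, marks >= 60):
        grade = 'B'
    elif note(log, marks >= 50):
        grade = 'C'
    elif note(log, marks >= 40):
        grade = 'D'
    else:
        grade = 'E'
    return grade, len(log)


# Each call returns the grade and the number of test expressions evaluated.
for marks in (89, 56, 35):
    print(marks, grade_separate_ifs(marks), grade_elif_chain(marks))
```

For 89 marks, the separate if statements evaluate five test expressions while the elif version evaluates one; for 56 marks, the counts are five and three.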
You can have many elif clauses, but there can be only one else clause,
and it is optional. The else clause actually acts as the default or “catch-all”
condition. When all the conditions are False, the block under else will be
executed. Although the else clause is optional, it is a good idea to write a
final else in the elif ladder to ensure all the cases are covered.
While writing if statements with elif clauses, try to write those
conditions first that are more likely to be True, as long as reordering does
not change which block runs. The conditions less likely to be True should be
towards the end.
Let us discuss one more example that uses elif clause. We have to enter a
number and display if it is less than 100, more than 100, or equal to 100.
These three are mutually exclusive conditions, which means that only one of
them can be True at a time, and so we can use an if statement with elif
clauses.
n = int(input('Enter a number : '))
if n < 100:
    print('Number is less than 100')
elif n > 100:
    print('Number is more than 100')
else:
    print('Number is equal to 100')
In our next program, we enter a single character, and the program prints
what type of character it is. We have used the else clause to handle all the
possibilities left.
ch = input('Enter a single character : ')
if len(ch) != 1:
    print('You did not enter a single character')
elif ch.isupper():
    print('Uppercase letter')
elif ch.islower():
    print('Lowercase letter')
elif ch.isnumeric():
    print('Number')
elif ch.isspace():
    print('Space')
else:
    print('Special character')
print('Bye')
We can also use elif clauses to create simple menu-based programs.
x = int(input('Enter a number : '))
y = int(input('Enter another number : '))
print('1. Add the two numbers')
print('2. Subtract first from second')
print('3. Subtract second from first')
print('4. Multiply the two numbers')
print('5. Divide first by second')
print('6. Divide second by first')
choice = input('Enter your choice : ')
if choice == '1':
    print(x + y)
elif choice == '2':
    print(y - x)
elif choice == '3':
    print(x - y)
elif choice == '4':
    print(x * y)
elif choice == '5':
    print(x / y)
elif choice == '6':
    print(y / x)
else:
    print('Wrong choice')
We enter two numbers and then display a menu, and then we ask the user to
enter a choice. Depending on the choice entered by the user, we perform a
particular operation using the if statement with elif clauses.
The else clause acts as the catch-all case, so if anything other than 1 to
6 is entered as the choice, the message ‘Wrong choice’ will be displayed.
We have to run this program again and again to execute different cases. We
will discuss how to do this repeatedly in one run when we study loops.
6.5 Truthiness
We have seen that Python has a Boolean data type (bool), with only two
values, True and False. Here are some expressions that evaluate to either
True or False.
3 < 5 a >= b a is b not x x in listA
We know that we can use these expressions in a boolean context, for
example, in the test expression of an if statement. In Python, we can also
use a non-boolean value in a boolean context. For example, we could write
if statements of this type.
if listA:
    print('Do something')
if dictA:
    print('Do something')
if x:
    print('Do something')
We are using non-boolean values in boolean context. Boolean context means
a boolean value is needed from the expression. The if statement needs to
know whether the test expression is True or False. So, there have to be rules
for deciding what values are considered True and False. This brings us to the
concept of truthiness. In Python, every value is either a truthy value or a
falsy value. Truthy values are values that evaluate to True when used in a
boolean context, and falsy values are values that evaluate to False when used
in a Boolean context.
These values are considered falsy values in Python.
False None 0 0.0 0.0+0.0j '' []
() {} set()
The Boolean value False, None, and zero of any numeric type (integer, float,
or complex) are considered falsy. Empty containers are also falsy, so an
empty string, empty list, empty tuple, empty dictionary, and empty set are all
falsy values. Everything else is truthy; any non-zero number or non-empty
container evaluates to True. So, individual values or objects in Python
have an inherent truthiness; they can be either truthy or falsy. User-defined
objects can customize their truth value by providing
a __bool__() method. We will discuss that later on.
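The list of falsy values given above can be verified by passing each value to the bool function. The sketch below also includes a small class with a __bool__() method, only to show the idea; classes are discussed later, and the class name Basket and the sample values are assumptions made for this illustration.

```python
falsy_values = [False, None, 0, 0.0, 0.0 + 0.0j, '', [], (), {}, set()]
truthy_values = [True, 1, -2.5, 'False', [0], (None,), {'a': 0}, {0}]

# Every value in the first list evaluates to False in a boolean context.
for value in falsy_values:
    print(repr(value), bool(value))

# Non-zero numbers and non-empty containers evaluate to True,
# even when the items inside the container are themselves falsy.
for value in truthy_values:
    print(repr(value), bool(value))


class Basket:
    """Hypothetical class: a basket is truthy only when it holds items."""

    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):
        return len(self.items) > 0


print(bool(Basket([])))
print(bool(Basket(['apple'])))
```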
In the if statement if listA:, if the list is empty, the condition will be
considered False, and if it is not empty, it will be considered True. The same
applies to the dictionary in the if statement if dictA:.
In the statement if x:, if the value of x is zero, the condition will be False;
if it is anything non-zero, it will be considered True.
When a non-boolean value is used in a boolean context, Python evaluates the
truthiness of that expression which means that it evaluates the value to either
True or False. Thus, truthiness is the boolean meaning of a value.
You can explicitly check the truthiness of a value by using the bool built-in
function. Pass the value to the bool function to see whether it evaluates to
True or False.
>>> bool(0)
False
>>> bool(90)
True
>>> bool('')
False
>>> bool('ab')
True
>>> bool([])
False
>>> bool([1,2,3])
True
>>> bool('False')
True
The last one is True because 'False' is a non-empty string, not the
Boolean value False. If we remove the quotes, it will be False.
>>> bool(False)
False
Similarly, bool('0') will be True as '0' is a non-empty string.
Whenever you have to perform an action when some container is non-empty,
a number is non-zero, or a Boolean variable is True, you can just write a
condition of the form if x: that contains only the variable. There is no
need to write the full condition.
Figure 6.4: Concise way of writing if condition
This is a concise and more Pythonic way and is generally used by
programmers. Similarly, when you have to do something when a container is
empty or a number is zero, or a Boolean variable is False, or a variable is
None, you can write the condition as if not x:
Figure 6.5: Concise way of writing if condition
Let us discuss some examples:
name = input('Enter a name : ')
if name:
    print('Hello', name)
else:
    print('You did not enter anything')
Here, we are entering a string and assigning it to the variable name. If name
is a non-empty string, the if condition will be True, and
print('Hello', name) will execute, and if name is an empty string,
the if condition will be False and print('You did not enter
anything') will execute. Here is another example:
if listA:
    print('Not empty')
else:
    print('Empty')
Here we have a list, and we want to check if it is empty or not. If the list is
not empty, the condition will be True and print('Not empty') will
execute, and if the list is empty, print('Empty') will execute.
If we assign None to listA, then also print('Empty') will be
executed, as None is considered falsy. To be more specific, we can write the
conditions explicitly.
if listA is None:
    print('None')
elif listA == []:
    print('Empty')
else:
    print('Non Empty')
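The difference between the truthiness test and the explicit conditions can be seen by running both on None, an empty list, and a non-empty list. This is a minimal sketch; the two function names are assumptions made for this illustration, and functions are covered later.

```python
def describe_by_truthiness(listA):
    """Relies on truthiness, so None and [] are treated alike."""
    if listA:
        return 'Not empty'
    else:
        return 'Empty'


def describe_explicitly(listA):
    """Uses explicit conditions, so None and [] are told apart."""
    if listA is None:
        return 'None'
    elif listA == []:
        return 'Empty'
    else:
        return 'Non Empty'


for value in (None, [], [1, 2]):
    print(repr(value), describe_by_truthiness(value), describe_explicitly(value))
```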
There are two built-in functions named any and all that can be used to
check the truthiness of values inside an iterable like a list or tuple.
all(x) Returns True if all elements in the iterable x are Truthy
any(x) Returns True if any item in the iterable x is Truthy
Table 6.2: Built-in functions any and all
>>> help(all)
all(iterable, /)
Return True if bool(x) is True for all values x
in the iterable.
If the iterable is empty, return True.
>>> help(any)
any(iterable, /)
Return True if bool(x) is True for any x in the
iterable.
If the iterable is empty, return False.
>>> L = [1, 2, 0, 3]
>>> all(L)
False
>>> any(L)
True
>>> L = [0, 0, 0]
>>> any(L)
False
>>> L = [1, 2, 3]
>>> all(L)
True
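The help text for both functions mentions the result for an empty iterable, which the session above does not show. The short sketch below covers that case and also a list of strings, where an empty string counts as falsy. The variable names empty and names are assumptions made for this illustration.

```python
empty = []
print(all(empty))   # no element is falsy, so all() returns True
print(any(empty))   # no element is truthy, so any() returns False

names = ['John', '', 'Marie']   # the empty string in the middle is falsy
print(all(names))   # False, because one element is falsy
print(any(names))   # True, because at least one element is truthy
```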
6.6 Short circuit behavior of operators and and or
If the value of an expression containing and or or can be determined by the
first operand only, the second operand is not evaluated. Here is the truth
table of and operator.
Figure 6.6: Truth table of and operator
We can see that if the first operand is False, the result is False regardless of
the value of the second operand. This is why the interpreter does not evaluate
the second operand when the first one is False. When the first operand is True,
the result can be either True or False depending on the second operand, so when
the first operand evaluates to True, the interpreter has to evaluate the second
operand.
A similar explanation goes for the or operator. Here is the truth table of or
operator.
Figure 6.7: Truth table of or operator
We can see that if the first operand is True, the result is True regardless of
the value of the second operand. If the first operand evaluates to True, the
interpreter will not evaluate the second operand and will consider the whole
expression as True. If the first operand is False, the result can be True or
False depending on the second operand, so when the first operand evaluates
to False, the interpreter has to evaluate the second operand.
Therefore, in the case of and operator, if the first operand is False, the
second operand is not evaluated, and in the case of or operator, if the first
operand is True, the second operand is not evaluated.
This is called the short circuit evaluation of these operators. This feature not
only makes the interpreter do less work but sometimes it can also be used to
prevent certain types of errors. Here are some examples:
if x != 0 and 1/x > n:
    print('Do something')
if i < len(data) and data[i] == item:
    print('Do something')
if x >= 0 and x**0.5 > 4:
    print('Do something')
if 'city' in d and d['city'] == 'Paris':
    print('Do something')
In these cases, we want the second condition to be checked only if the first
condition is True. If the first condition is False, we do not want the second
condition to be checked because, in that case, it will give an error. For
example, in the first code snippet, we want the comparison 1/x > n to be
done only when x is not equal to 0 because otherwise, it will give a divide
by zero error.
x != 0 and 1/x > n are the two operands of and operator; the
interpreter will evaluate the second operand only if the first one is True. This
means that it will evaluate 1/x > n only if x is non-zero. If x != 0 is
False, 1/x > n will not be evaluated, so there is no chance of getting
a divide by zero error. If the interpreter were to evaluate both operands,
we would get a divide by zero error when x is zero. Similarly, in the third
example, we have avoided taking the square root of a negative number. The
operand x**0.5 > 4 will be evaluated only when x is not negative.
This short circuit evaluation is also useful with sequences and dictionaries.
In a sequence, before checking the data at a certain index, we can ensure that
the index is valid. Similarly, in a dictionary, we can check that a key is
present before accessing the value associated with it. This way, we avoid
IndexError in sequences and KeyError in dictionaries, as we have done in the
second and fourth examples. Thus, the first operand can act as a guard for the
second operand.
Equivalently, we could have written these constructs using two if
statements. For example, we can write the first example like this:
if x != 0:
    if 1/x > n:
        print('Do something')
This one works in the same way, but the one with the and operator is more
readable and is a common trick used by programmers.
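One way to observe short circuit evaluation directly is to make each operand leave a trace when it is evaluated. The following is a minimal sketch; the helper function check and the list calls are hypothetical names introduced only for this demonstration.

```python
calls = []   # records which operands were actually evaluated

def check(label, value):
    """Hypothetical helper: note that it was called, then return value."""
    calls.append(label)
    return value

# First operand of and is False, so the second is skipped
result_and = check('first', False) and check('second', True)
print(result_and, calls)   # False ['first']

calls.clear()
# First operand of or is True, so the second is skipped
result_or = check('first', True) or check('second', False)
print(result_or, calls)    # True ['first']

calls.clear()
# First operand of and is True, so the second has to be evaluated
result_both = check('first', True) and check('second', False)
print(result_both, calls)  # False ['first', 'second']
```

In the first two cases the label 'second' never appears in calls, which shows that the second operand was not evaluated at all.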
6.7 Values returned by and and or operators
Unlike some other languages, in Python, the logical operators and and or
do not always return the Boolean values True or False; they return the last
evaluated operand. We generally use these operators in if and while
conditions, so we do not get to know what they return exactly because, in
those cases, only their truth value is used. Let us see what they actually
return.
>>> 0 and 4
0
>>> 4 and 8
8
>>> 0 or 4
4
>>> 4 or 8
4
For and operator, the second operand is not evaluated if the first one is
False.
In the expression 0 and 4, the interpreter evaluates the first operand; it is
False, so there is no need to evaluate the second operand. 0 is the last
evaluated operand and so it is returned.
In the expression 4 and 8, the first operand is True, so the second operand
has to be evaluated, and so here, 8 is the last evaluated operand, and it is
returned.
For or operator, the second operand is not evaluated if the first one is True.
In the expression 0 or 4, the first operand is False, so the second operand
has to be evaluated. 4 is the last evaluated operand, so it is returned.
In the expression 4 or 8, the first operand is True; there is no need to
evaluate the second one. Thus, 4 is the last evaluated operand, so it is
returned.
The expression operand1 and operand2 first evaluates operand1;
if it is False, its value is returned; otherwise, operand2 is evaluated and its
value is returned.
The expression operand1 or operand2 first evaluates operand1;
if it is True, its value is returned; otherwise, operand2 is evaluated, and its
value is returned.
These operators actually return operands, but most of the time, they are used
in a Boolean context, so only their truth value is used. The fact that they
return the last evaluated argument can be used by programmers in certain
situations.
Suppose we have a string s, and if it is empty, it has to be replaced by a
default value, 'NA'. We can write this:
s = s or 'NA'
If the string s is empty, the first operand s will be False, so the second
operand will be evaluated, and it becomes the value of the expression. If the
string s is not empty, the second operand will not be evaluated, and the
value of the expression s or 'NA' will be s only. Here is another
example:
average = count!=0 and total/count
Here, we are finding the average and guarding our division by using the and
operator. If count is not equal to zero, the first operand will be True, so the
second operand will be evaluated, and its value will be returned and assigned
to average.
If count is equal to 0, the first operand will be False, so the second operand
will not be evaluated, thus avoiding a divide by zero error. The value of the
first operand will be assigned to average. So, average will be assigned
False. In Python, False has the numeric value 0, and True has the numeric value
1. So, when we use this average in a mathematical context, the value 0 is used.
The operator not always returns Boolean value True or False; True if its
argument is falsy, False if its argument is Truthy.
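One caution when using or to supply a default, as in s = s or 'NA': every falsy value is replaced, not just an empty string. The following sketch, with made-up variable names and a made-up default of 10, shows the difference when 0 is a legitimate value and the default should apply only to None.

```python
quantity = 0                  # hypothetical value; 0 is a valid quantity here

# or replaces every falsy value, so the valid 0 is lost
with_or = quantity or 10
print(with_or)                # 10

# An explicit check replaces only None
if quantity is None:
    explicit = 10
else:
    explicit = quantity
print(explicit)               # 0

# not always gives a bool, while and/or give back one of their operands
print(not quantity)           # True
print('' or 'NA')             # NA
print([] and 'unused')        # []
```

The or idiom is convenient when every falsy value should be treated as missing; when 0, an empty string or an empty list is meaningful, an explicit comparison is safer.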
6.8 if else operator
We know that unary operators operate on one operand, and binary operators
act on two operands. The operator that we are going to see now is a ternary
operator, as it acts on three operands. Here is what the if-else operator looks
like with its three operands.
expression1 if test-expression else expression2
The 3 expressions are the 3 operands. The keywords if and else form the
operator. Let us see how this operator works.
The test-expression or the condition is evaluated, and if it is True, the
left expression is evaluated and its value is the value of the whole
expression, and if the condition is False, then the right expression is
evaluated and its value is the value of the whole expression. So, this operator
checks the condition and then returns the value of either of the two
expressions, depending on the truth value of the condition. The condition
will always be evaluated, while only one of the two expressions will be
evaluated. Here is an example:
x = 6
y = x+5 if x%2==0 else x+10
First, the condition x%2==0 is checked, the value of x is 6, 6%2 is 0, and
the condition x%2==0 is True, so the first expression is evaluated, and the
value 6+5 is assigned to y.
If the value of x is 7, the condition x%2==0 will be False, so the second
expression will be evaluated, and the value 7+10 will be assigned to y.
We can read the statement as - y will be equal to x+5 if x is even, else y
will be equal to x+10. The statement could be written using an if
statement.
if x % 2 == 0:
y = x + 5
else:
y = x + 10
We can see that the ternary operator is just a shorthand operator that reduces
a 4-line if else code to a simple one-line code, which is quite readable. Let us
see some more examples that will make things clearer.
remarks = 'Pass' if marks >= 40 else 'Fail'
If marks >= 40, the string 'Pass' is assigned to remarks; otherwise,
'Fail' is assigned.
discount = 5 if items < 10 else 15
If a customer buys fewer than 10 items, the discount is 5 percent; otherwise,
the discount is 15 percent.
greater = x if x > y else y
Here, the variable greater is assigned the value of x if x is greater than y;
otherwise, it is assigned the value of y.
average = total/count if count else None
If count is non zero, total/count value is assigned to average;
otherwise, None is assigned to average.
print('Sir' if gender == 'male' else 'Madam')
Here, we have used the ternary operator inside a print function call.
voter_id = 'NA' if age < 18 else input('Enter voter id')
If age is less than 18, 'NA' is assigned to voter_id; otherwise, the value
returned by input is assigned.
z = 10 + (x if x > y else y)
Here, we add 10 to the greater of the two values x and y. If we do not put
the parentheses, Python will interpret it differently, taking 10 + x as the
first expression.
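The effect of the parentheses can be checked with sample values. In this sketch, the values 3 and 8 and the names with_parens and without_parens are chosen only for illustration.

```python
x, y = 3, 8   # sample values chosen for illustration

with_parens = 10 + (x if x > y else y)    # 10 + 8
without_parens = 10 + x if x > y else y   # read as (10 + x) if x > y else y

print(with_parens)      # 18
print(without_parens)   # 8
```

Since x > y is False here, the version without parentheses gives just y, and 10 is not added at all.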
b = 100 * (a if a>=0 else -a)
Here, we are multiplying 100 by the absolute value of a.
For these simple cases, a full 4-line if-else statement would be overkill, while
the ternary operator is concise and more readable. There is no efficiency
difference between an if-else statement and if-else operator code,
but the code with this operator is shorter.
Let us see an example in which one conditional expression is placed inside
another.
remarks = 'Excellent' if marks >= 90 else ('Pass' if marks >= 40 else 'Fail')
If marks >=90, remarks will be assigned 'Excellent'; otherwise,
if marks >= 40, remarks will be assigned 'Pass'; otherwise,
remarks will be assigned 'Fail'.
The equivalent code using the if statement would be:
if marks >= 90:
    remarks = 'Excellent'
elif marks >= 40:
    remarks = 'Pass'
else:
    remarks = 'Fail'
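A quick way to gain confidence that the nested conditional expression and the if-elif-else statement agree is to try both on values around the boundaries. This sketch uses a for loop only to repeat the check for several sample marks; the names nested and chained are made up for the comparison.

```python
for marks in [95, 90, 89, 40, 39, 0]:   # sample values around the boundaries
    # Nested conditional expression
    nested = 'Excellent' if marks >= 90 else ('Pass' if marks >= 40 else 'Fail')

    # Equivalent if-elif-else statement
    if marks >= 90:
        chained = 'Excellent'
    elif marks >= 40:
        chained = 'Pass'
    else:
        chained = 'Fail'

    # The last value printed on each line shows whether both forms agree
    print(marks, nested, nested == chained)
```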
Exercise
What will be the output of questions 1 to 7?
1. n = 2
   if n = 2:
       print('X')
   else:
       print('Y')
2. units = 95
   if units < 100:
       bill = units * 1
   else:
       bill = uniiits * 1.5
   print(bill)
3. s = None
   if s is 'None':
       print('this')
   else:
       print('that')
4. x = 9.7
   if x:
       print('Hello')
   else:
       print('Hi')
5. listA = [1, 2, 3, 4]
   if not listA:
       print('Good Morning')
   else:
       print('Good Evening')
6. m = 10
   n = 50 if m < 0 else 20
   print(n)
7. y = 402
   x = 2 if y % 2 == 0 else 1
   print(x)
8. When will the following code print C as the output?
if expression1:
    print('A')
elif expression2:
    print('B')
elif expression3:
    print('C')
else:
    print('D')
(A) when expression1, expression2 and expression3 all are True
(B) when expression1 and expression2 are False and expression3 is
True
9. a = 10
b = 20
Which of the following expressions will be True?
(i) a > 0 and b % 2 == 0 (ii) a % 2 == 0 and b < 0
(A) Only (i)
(B) Only (ii)
(C) Both (i) and (ii)
10. x = 3
Which of the following expressions is False?
(A) x < 0
(B) not x
(C) not x % 2 != 0
(D) All are False
11. if n > 0:
        if n < 10:
Which of these is equivalent to the above nested if conditions?
(A) if 0 < n < 10:
(B) if 0 < n and n < 10:
(C) Both
12. Write a program that inputs a string and prints whether it is a
palindrome. Ignore case and spaces, so that strings like ‘Nurses
run’ and ‘Was it a rat I saw’ are considered palindromes.
13. Write a program that inputs the length of three sides of a triangle, and
prints the perimeter of the triangle. It should also print whether the
triangle is equilateral, isosceles or scalene. If there is no triangle
possible with the given sides, then instead of printing the above things
it should print ‘No triangle possible with these sides’.
In an equilateral triangle, all sides are equal. In an isosceles triangle,
any two sides are equal. In a scalene triangle, all three sides are
unequal. For a triangle to be possible with given sides, sum of any
two sides should be greater than the third side.
Here are two sample runs of the program.
Sample run 1-
Enter first side : 2
Enter second side : 3
Enter third side : 4
Perimeter of the triangle is 9
Scalene Triangle
Sample run 2-
Enter first side : 2
Enter second side : 1
Enter third side : 4
No triangle possible with these sides
14. Write a program that prompts the user to input his/her weight in kg
and height in cm, and calculates the body mass index (BMI). BMI is
calculated by dividing body weight in kg by square of height in
meters. For example, if the weight is 70 kg and the height is 170 cm, then the
BMI is 70/(1.7*1.7) = 24.2. Display the BMI and an appropriate message
according to the BMI.
< 18.5 - Underweight
18.5 to 24.9 - Normal weight
25 to 29.9 - Overweight
>= 30 - Obese
15. In the previous program, give the user an option to enter the height in
inches or cm and the weight in kg or pounds.
1 inch = 2.54 cm, 1 pound = 0.4535924 kg
16. Write a program to check whether a given sentence is a pangram or
not. A pangram is a sentence that uses every letter of the alphabet at
least once. Some examples of pangrams are - “The quick brown fox
jumps over the lazy dog.” “Pack my box with five dozen liquor jugs.”
“Waltz, nymph, for quick jigs vex Bud.”
17. Write a program to find whether two phrases are anagrams. An anagram
is a word or phrase formed by rearranging the letters of a different
word or phrase. Some examples of anagrams are -
“binary” and “brainy”, “silent” and “listen”, “forty five” and “over
fifty”, “Madam Curie” and “Radium came”
18. Write a program to find whether a year is a leap year or not. A year is
a leap year if it is divisible by 4, but not every year that is a multiple
of 4 is a leap year. If a year is divisible by 100, then it is not a leap
year unless it is divisible by 400.
Years 1980, 2040 are leap years as they are divisible by 4.
Years 2000, 2400, 1800, 1900, 2500 are divisible by 4, but since they
are divisible by 100 we cannot say that all of them are leap years.
Only those which are divisible by 400 will be leap years.
Years 2000 and 2400 are leap years as they are divisible by 400, while
1800, 1900, 2500 are not leap years.
19. We have seen the following program in the chapter.
marks = int(input('Enter marks - '))
if marks >= 70:
    grade = 'A'
elif marks >= 60:
    grade = 'B'
elif marks >= 50:
    grade = 'C'
elif marks >= 40:
    grade = 'D'
else:
    grade = 'E'
print(grade)
Suppose most of the students get grade E or grade D and very few
students get grade A. It would be more efficient to rewrite this code
the other way round. Refactor this code so that the more frequent
conditions are written at the top of the if statement.
20. Rewrite the following piece of code using the ternary operator.
if bill_amount > 2000:
    free_home_delivery = 'Available'
else:
    free_home_delivery = 'Not Available'
21. Write a more efficient version of this code.
if x < y:
    print('x is less than y')
if x > y:
    print('x is greater than y ')
if x == y:
    print('x is equal to y')
22. A list named L contains some integer values. Write a line of code to
find the average of the list elements using the ternary operator.
23. We have seen that we can use the get method to avoid errors while
accessing a non-existent key.
D = {'a': 23, 'd': 34, 'j': 56}
val = D['b'] # Raises Error
val = D.get('b', 0) # Returns 0
Instead of using get(), write a line of code that uses a ternary
operator to return a default value when the key is not present in the
dictionary.
24. Rewrite these expressions by eliminating the not operator so that the
new expressions are more readable.
(i) if not grade == 'A':
        print('Work Hard')
(ii) if not age < 18:
        print('You can vote')
(iii) if not n % 2 == 0:
        print('n is odd')
(iv) if not (marks > 0 and marks <= 100):
        print('Out of range')
(v) if not (age < 18 or weight > 60):
        print('Allowed to play the game')
You can avoid using the not operator by using the opposite relational
operator.
For (iv) and (v), you can use De Morgan’s laws to distribute the
not operator over Boolean expressions.
1. NOT (a AND b) = (NOT a) OR (NOT b)
2. NOT (a OR b) = (NOT a) AND (NOT b)
25. What will be the output of the following code? Will it show
TypeError?
print(True + 4)
print(False - 3)
26. Write a program using if..elif..else for printing the name of
the day depending on the value of a variable.
value Action
0 Print Sunday
1 Print Monday
2 Print Tuesday
3 Print Wednesday
4 Print Thursday
5 Print Friday
6 Print Saturday
Any other value Print Invalid
Write the same code using a dictionary.
7 Loops
Statements written in a program are executed sequentially, and each statement is
executed only once. However, many tasks are repetitive in nature, so there can be
situations when we need to execute a statement or a block of statements multiple
times. Python provides two control statements, called loops, that can be used to
repeatedly execute a piece of code.
A loop or an iterative control statement is a control statement that is used for
repeated execution of a block of statements. We need loops when we want to do
something more than once. Instead of writing the same statements repeatedly in
our code, we can use loops to automate the repetition. In Python, we have two
loops: while loop and for loop.
7.1 while loop
In while loop, there is repeated execution of a block of statements while a given
condition is True. Here is the syntax and flowchart of a while loop.
Figure 7.1: Syntax and flowchart of while loop
The while keyword is followed by a test expression, also called the loop
condition, which is followed by a colon. This colon marks the start of the
statement block, which is to be executed repeatedly while the test expression is
true. The statement block is also called the loop body, and the indentation
separates this loop body from the header line.
Let us understand how this loop works. First, the test expression is evaluated; if it
is True, the statement block executes, and then the control returns to the test
expression. If it is still True, the block executes again, and then the test
expression is checked once more. This process continues as long as the test
expression is True. When it becomes False, the loop terminates, and the first
statement after the loop is executed. So, this loop keeps running while the test
expression at the top is True. One complete execution of the loop body is called
an iteration. Here is a simple example to show how the while loop works:
n = 1
while n <= 3:
    print('Hello ' * n)
    n += 1
print('Bye')
Output-
Hello
Hello Hello
Hello Hello Hello
Bye
We have an integer variable n initialized to 1. Then, we have the while loop, and
after the while loop, we have a print call that prints Bye. The test expression
or the loop condition is n <= 3.
When the control enters the loop, the loop condition is checked. It is True because
the value of n is 1, so the loop body executes. Hello is printed one time, and
then the value of n is incremented. So, n now becomes 2. The condition is
checked again. It is True, so the loop body executes again; Hello is printed two
times, and the value of n becomes 3. The condition is checked again. It is True, so
the loop body executes again; Hello is printed three times, and the value of n
becomes 4.
Once again, the condition is checked. Now, it is False since the value of n is 4. So,
the loop terminates, and the control comes to the next statement out of the while
loop. This statement prints Bye. In simple English, this loop means that “while n
is less than or equal to 3, keep executing the block of statements”.
If the condition is False the first time through the loop, the statements inside the
loop are never executed. For example, in our loop, if the initial value of n is 4
instead of 1, the condition will be False the first time, and so the loop body will
not execute even once.
n = 4
while n <= 3:
    print('Hello ' * n)
    n += 1
print('Bye')
Output-
Bye
There should be a statement inside the loop body that makes the loop condition
False at some point; otherwise, the loop condition will always be True, and the
loop will keep on executing infinitely.
In our example loop, the statement n += 1 is the statement that will make the
loop condition False eventually. If we delete this statement, then the value of n
will remain 1 always, and the condition will never become False, and the loop will
never end.
n = 1
while n <= 3:
    print('Hello ' * n)
print('Bye')
If you execute this loop, you will see Hello being printed continuously. This is
an infinite loop; we can press Ctrl-C to break it, or we have to close the
window. To avoid an infinite loop, remember to place an update statement inside
the loop body and make sure that the condition becomes False eventually at some
point.
You can type this example loop and play around with it to understand how it
works. For example, you can change the condition to n <= 10 or n >= 3 and
see how the output is affected. If you change the update statement n+=1 to n+=2,
n will be incremented by 2 each time.
Here is another example of a while loop:
total = 0
while total <= 100:
    num = int(input('Enter a number : '))
    total += num
print(total)
This loop adds the numbers input by the user and stops when the total exceeds
100.
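To experiment with this loop without typing numbers each time, the inputs can be simulated. The sketch below is a variant built on assumptions: the list entries and the index variable i are invented for illustration and stand in for what the user would type.

```python
entries = [40, 35, 30, 50]   # hypothetical user inputs; 50 is never read
i = 0                        # position of the next simulated input
total = 0

while total <= 100:
    num = entries[i]         # stands in for int(input('Enter a number : '))
    i += 1
    total += num
    print('Running total:', total)

print(total)                 # 105
```

The loop stops after the third number because the running total 105 makes the condition total <= 100 False. If the simulated list ran out before the total exceeded 100, entries[i] would raise an IndexError, which is a limitation of this simulation and not of the original loop.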
7.1.1 Indentation matters
In Python, indentation determines the body of the loop. Unlike other languages,
there are no curly braces or keywords to mark the beginning and end of the loop
body. If you make any mistakes in indenting, you might get an unexpected output.
Consider the following while loop that calculates the sum of digits of an integer:
n = int(input('Enter a number : '))
sum_digits = 0
while n > 0:
    digit = n % 10
    sum_digits += digit
    n //= 10
print(sum_digits)
Sample Run-
Enter a number : 3214
10
First, let us understand how this loop works, and then we will see how indentation
can affect the output. Inside the loop, we extract the digits of the number from
right to left, and the extracted digits are added to the variable sum_digits. The
statement n % 10 extracts the rightmost digit from the number, the next
statement adds the extracted digit to the variable sum_digits, and the
statement n //= 10 divides n by 10 so that the next digit comes at the
rightmost place and can be extracted in the next iteration. Here is the dry run of
the loop for a value of n equal to 3214.
sum_digits = 0    n = 3214
n > 0 is True     digit = 3214 % 10 = 4    sum_digits = 0 + 4 = 4     n = 3214 // 10 = 321
n > 0 is True     digit = 321 % 10 = 1     sum_digits = 4 + 1 = 5     n = 321 // 10 = 32
n > 0 is True     digit = 32 % 10 = 2      sum_digits = 5 + 2 = 7     n = 32 // 10 = 3
n > 0 is True     digit = 3 % 10 = 3       sum_digits = 7 + 3 = 10    n = 3 // 10 = 0
n > 0 is False    Loop terminates
This is how the loop works and gives the desired output.
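The dry run can be reproduced by temporarily adding a print call inside the loop body. In this sketch, the value 3214 is fixed in the code instead of being read from the user, and the extra print is there only for tracing.

```python
n = 3214            # same value as in the dry run
sum_digits = 0

while n > 0:
    digit = n % 10
    sum_digits += digit
    n //= 10
    # Show the state at the end of each iteration
    print('digit =', digit, ' sum_digits =', sum_digits, ' n =', n)

print(sum_digits)   # 10
```

Each printed line corresponds to one row of the dry run. Once the loop is understood, the tracing print can be removed.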
If you have a loop that has multiple lines in the loop body, it is possible that
mistakenly the last one or two lines do not get indented. The interpreter will not
complain in this case as it is satisfied with just a single line in the loop body. So,
there will be no syntax error, but the lines that are not indented will not be part of
the loop and hence will not be repeated. For example, in the previous loop,
suppose the last line is not indented.
while n > 0:
    digit = n % 10
    sum_digits += digit
n //= 10
print(sum_digits)
This will result in an infinite loop because the update statement (n //= 10) is
now outside the loop, with only 2 lines inside the loop body. So, make sure that
you indent all the lines that you intend to be inside the loop body. Whatever is not
indented will be considered out of the loop and will not be repeated.
Now, suppose the statement print(sum_digits), which was supposed to be
outside the loop, is indented by mistake.
while n > 0:
    digit = n % 10
    sum_digits += digit
    n //= 10
    print(sum_digits)
Now the statement print(sum_digits) is a part of the loop body, so in each
iteration of the loop, this statement will also be executed. This will result in some
extra undesired output on our screen.
Improper indentation can lead to such logical errors in our code, which the
interpreter cannot detect but they make the program give unexpected results.
7.1.2 Removing all occurrences of a value from the list using the while loop
We have seen in an earlier chapter that the remove method of list type
removes only the first occurrence of the given item.
L = [1, 5, 2, 3, 9, 4, 3, 2, 4, 2, 1, 2]
n = 2
L.remove(n)
print(L)
Output-
[1, 5, 3, 9, 4, 3, 2, 4, 2, 1, 2]
We can see many 2s in our original list, but only the first occurrence of 2 was
removed from the list. Now, we will use a while loop to remove all the
occurrences.
L = [1, 5, 2, 3, 9, 4, 3, 2, 4, 2, 1, 2]
n = 2
while n in L:
    L.remove(n)
print(L)
Output-
[1, 5, 3, 9, 4, 3, 4, 1]
We have written the statement L.remove(n) inside a while loop. This
statement will keep executing as long as the loop condition n in L is True. So,
the remove method will be called repeatedly while the value n is present in the list L.
When n in L returns False, the loop will end. The value of n is 2, so this code
will remove all the 2s from the list L.
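Each pass of this loop scans the list twice, once for n in L and once for remove, so for long lists a single pass may be preferable. One possible alternative, shown here only as a sketch, builds a new list with a list comprehension; the name filtered is made up for this example.

```python
L = [1, 5, 2, 3, 9, 4, 3, 2, 4, 2, 1, 2]
n = 2

# Keep every element that is not equal to n; L itself is not modified
filtered = [item for item in L if item != n]

print(filtered)   # [1, 5, 3, 9, 4, 3, 4, 1]
print(len(L))     # 12, the original list still has all its elements
```

Unlike the while loop, this version leaves L unchanged and creates a second list, so it suits cases where the original list is still needed.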
7.1.3 while loop for input error checking
We can use the while loop to validate input, which means that we can ensure
that the user enters valid input. Here is a small piece of code where the user is
expected to enter a student id in the range of 1000-9999.
student_id = input('Enter student id (1000-9999) : ')
print(student_id)
If the user enters something that is not an integer or is not in the valid range
(1000-9999), the program will not complain. We have instructed the user to enter
the correct id, but we are not checking the input.
We can use an if statement here.
student_id = int(input('Enter student id (1000-9999) : '))
if student_id >= 1000 and student_id <= 9999:
    print(student_id)
If the entered input is in the correct range, the condition will be True, and the id
will be printed. This if statement checks the input, but we want to give the user
another chance to enter the input in the correct form. We want to keep asking
for the id until the user enters it in the correct form. For that, we can use a
while loop.
student_id = int(input('Enter student id (1000-9999) : '))
while student_id < 1000 or student_id > 9999:
    student_id = int(input('Enter student id (1000-9999) : '))
print(student_id)
The first input statement executes and then the control goes to the while loop.
If the entered id is not in the valid range, then the loop condition is True, and the
loop body executes and keeps executing until the user enters a valid id. When the
correct id is entered, the loop will terminate and the program will continue. If the
id is entered in the correct form the first time itself, the loop condition will be
False, so the loop body will not execute even once. So, we can use the while
loop to ensure the user enters the correct input. In the next chapter, we will discuss
a better way of writing this loop.
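The loop above still stops with an error if the user types something that is not an integer, because int raises a ValueError for such text. One possible guard, sketched below, checks the text with the string method isdecimal before converting it, relying on the short circuit behavior of and. The list responses and the index i are hypothetical stand-ins for what the user would type.

```python
responses = ['abc', '123', '10000', '4521']   # hypothetical user inputs
i = 0

text = responses[i]          # stands in for input('Enter student id (1000-9999) : ')
i += 1
# isdecimal() is checked first, so int(text) runs only on strings of decimal digits
while not (text.isdecimal() and 1000 <= int(text) <= 9999):
    print('Invalid id :', text)
    text = responses[i]      # ask again
    i += 1

student_id = int(text)
print(student_id)            # 4521
```

Note that isdecimal rejects signs and spaces, which is acceptable here because a valid id consists of digits only.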
7.1.4 Storing user input in a list or dictionary
We can use the while loop to get data from the user and store it in a list or
dictionary. In the following example, we have a dictionary with a few items. By
using a while loop, we are letting the user enter some more items in this
dictionary.
fruit_prices = {'apple': 210, 'banana': 100, 'grapes': 90}
done = False
while not done:
    fruit = input('Enter fruit name : ')
    price = int(input('Enter price : '))
    fruit_prices[fruit] = price
    if input('Want to enter more(y/n) : ') == 'n':
        done = True
print(fruit_prices)
We have taken a Boolean variable named done and initialized it to False. We
have made it True inside the loop when the user is done entering all the items.
This loop executes as long as the variable done is False because we have used
the not operator in the loop condition. When done is False, the loop condition is
True, and when done is True, the loop condition is False. So, when done
becomes True, the loop terminates.
Inside the loop, we are asking the user to enter the fruit name and then the price,
and in the next statement, we are entering the pair into the dictionary. After that,
we are asking whether the user wants to enter more pairs. If the user types 'n', which
stands for no, the variable done is set to True, and the loop terminates. Otherwise,
the loop keeps executing, and the user can enter several pairs of fruits and prices,
which will be added to the dictionary.
7.2 for loop
The while loop of Python is similar to the while loop of most other
programming languages. However, the syntax of for loop differs from the
standard three-expression for loop in languages like C++ or Java. The for loop in
Python is more like a for each loop available in some other languages.
Like the while loop, the for loop is also used to repeatedly execute a block of
code, but unlike a while loop, it is not based on a condition. It is a collection-
controlled loop, and it iterates once for each element in the collection. Here is the
syntax of a for loop:
for item in iterable:
    statement1
    statement2
    statement3
We have the keyword for, then a variable name, another keyword in, and then
an iterable name. This iterable can be any iterable structure like a string, list, tuple,
set, dictionary, or even a file. The elements in this iterable are assigned to the
variable named item one by one, and the statement block is executed once for
each item. Here is an example of for loop in action.
data = [3, 5, 9, 8]
for number in data:
    print(number)
Output-
3
5
9
8
This loop prints each element of the list on a separate line. Let us discuss how this
loop is working. When the loop starts, the first element in the list is assigned to
the iterating variable named number, and the statement block is executed. On
the next iteration, the second element of the list is assigned to the variable
number, and the statement block is executed. This process continues until the
entire list is exhausted. So, the loop terminates when this loop body has been
executed for each element of the list. In simple English, this