Python Programming

Author
Rupesh Nasre.
Associate Professor, Indian Institute of Technology (IIT)
Madras, Adyar, Chennai,
Tamil Nadu

Reviewed by
Dr. Mahendran Botlagunta
Associate Professor, VIT Bhopal University
Madhya Pradesh

All India Council for Technical Education


Nelson Mandela Marg, Vasant Kunj,
New Delhi, 110070

BOOK AUTHOR DETAILS
Dr. Rupesh Nasre., Associate Professor, Indian Institute of Technology (IIT) Madras, Adyar,
Chennai, Tamil Nadu
Email ID: [email protected]

BOOK REVIEWER DETAILS


Dr. Mahendran Botlagunta, Associate Professor, VIT Bhopal University, Sehore, Bhopal,
Madhya Pradesh
Email ID: [email protected]
BOOK COORDINATOR (S) – English Version
1. Dr. Amit Kumar Srivastava, Director, Faculty Development Cell, All India Council for
Technical Education (AICTE), New Delhi, India
Email ID: [email protected]
Phone Number: 011-29581312
2. Mr. Sanjoy Das, Assistant Director, Faculty Development Cell, All India Council for
Technical Education (AICTE), New Delhi, India
Email ID: [email protected]
Phone Number: 011-29581339

October, 2022
© All India Council for Technical Education (AICTE)
ISBN : 978-81-959863-5-4
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or
any other means, without permission in writing from the All India Council for Technical
Education (AICTE).
Further information about All India Council for Technical Education (AICTE) courses may be
obtained from the Council Office at Nelson Mandela Marg, Vasant Kunj, New Delhi-110070.
Printed and published by All India Council for Technical Education (AICTE), New Delhi.
Laser Typeset by:
Printed at:
Disclaimer: The website links provided by the author in this book are placed for informational,
educational & reference purpose only. The Publisher does not endorse these website links or the
views of the speaker / content of the said weblinks. In case of any dispute, all legal matters to be
settled under Delhi Jurisdiction, only.
ACKNOWLEDGEMENT
I am grateful to the authorities of AICTE, particularly Prof. M. Jagadesh Kumar, Chairman;
Prof. M. P. Poonia, Vice-Chairman; Prof. Rajive Kumar, Member-Secretary and Dr Amit
Kumar Srivastava, Director, Faculty Development Cell for their planning to publish this
book on Python Programming. I am deeply indebted to Prof. Mahendran Botlagunta of
VIT Bhopal University who promptly and thoroughly reviewed this book. His review
significantly improved the presentation. The publication application in Unit 4 as well as
Windows related commands are also due to his suggestions and guidance. I am thankful to
Chinmay Nasre and AICTE Graphics Team for designing the cover page nicely.
I wrote this book from Nagpur, attending to my father’s health, away from my wife and
son. It was tough to meet the deadlines under the tight schedule amid hospital visits. But I
am happy with the outcome. This book is dedicated to those context-switches between
technical and non-technical matters.
This book is an outcome of various suggestions of AICTE members, experts and authors
who shared their opinion and thought to further develop the engineering education in our
country. Acknowledgements are due to the contributors and different workers in this field
whose published books, review articles, papers, photographs, footnotes, references and
other valuable information enriched us at the time of writing the book.

Rupesh Nasre.

Preface
This book is an introduction to Python Programming. Python is a popular programming
language today and this book covers its basics. It is divided into five units, each of which
builds upon the previous ones. The units span basics of variables and assignments all the
way to developing web applications.
The first three units are required by all Python users. The units consist of the basic
constructs and types, control constructs, and modular programming with functions and
modules. Together, the three units equip a reader with the basic knowledge of Python to
write simple programs. The fourth unit delves deeper into string functions, file handling,
and regular expressions. Its programs are a little more complex and need more
practice. I encourage readers to try out the programs in their own way before looking at
the solutions. The fifth unit explores the Django framework to build web applications. Its
configuration is also complicated. But I do hope you get the joy and satisfaction after you
see your first web application running.
Each program explained in this book is available online on the book’s webpage. The
readers should (i) write the program themselves, (ii) then consult the readymade program,
(iii) enhance the program with additional functionality.
Python has several libraries and frameworks, and more are developed regularly. The goal
of this book is to make you hungry to learn more. I am sure you will have a good time
learning through various carefully-crafted examples and illustrations. If you have any
feedback to improve the book, feel free to write to me directly.

Rupesh Nasre.
OUTCOME BASED EDUCATION
For the implementation of an outcome based education the first requirement is to develop
an outcome based curriculum and incorporate an outcome based assessment in the
education system. By going through outcome based assessments, evaluators will be able to
evaluate whether the students have achieved the outlined standard, specific and measurable
outcomes. With the proper incorporation of outcome based education there will be a
definite commitment to achieve a minimum standard for all learners without giving up at
any level. At the end of the programme running with the aid of outcome based education,
a student will be able to arrive at the following outcomes:
Programme Outcomes (POs) are statements that describe what students are expected
to know and be able to do upon graduating from the program. These relate to the skills,
knowledge, analytical ability, attitude and behaviour that students acquire through the
program. The POs essentially indicate what the students can do from subject-wise
knowledge acquired by them during the program. As such, POs define the professional
profile of an engineering diploma graduate.
National Board of Accreditation (NBA) has defined the following seven POs for an
Engineering diploma graduate:
PO1. Basic and Discipline specific knowledge: Apply knowledge of basic mathematics,
science and engineering fundamentals and engineering specialization to solve the
engineering problems.
PO2. Problem analysis: Identify and analyse well-defined engineering problems using
codified standard methods.
PO3. Design/ development of solutions: Design solutions for well-defined technical
problems and assist with the design of systems components or processes to meet
specified needs.
PO4. Engineering Tools, Experimentation and Testing: Apply modern engineering
tools and appropriate technique to conduct standard tests and measurements.
PO5. Engineering practices for society, sustainability and environment: Apply
appropriate technology in context of society, sustainability, environment and ethical
practices.
PO6. Project Management: Use engineering management principles individually, as a
team member or a leader to manage projects and effectively communicate about well-
defined engineering activities.
PO7. Life-long learning: Ability to analyse individual needs and engage in updating in
the context of technological changes.
COURSE OUTCOMES

By the end of the course the students are expected to learn:


CO-1: The necessary background in programming to solve problems using
computation.
CO-2: Understand a given Python program and find what it does.
CO-3: Coding simple solutions to numeric and string problems.
CO-4: Using regular expressions to search for patterns in strings and files.
CO-5: Using Django to create simple web applications.
Mapping of Course Outcomes with Programme Outcomes to be done according to
the matrix given below:
Expected Mapping with Programme Outcomes
(1- Weak Correlation; 2- Medium correlation; 3- Strong Correlation)

Course Outcomes   PO-1  PO-2  PO-3  PO-4  PO-5  PO-6  PO-7
CO-1              3     3     3     3     1     1     3
CO-2              3     2     2     2     1     1     3
CO-3              3     2     2     2     1     1     3
CO-4              3     2     2     3     1     1     3
CO-5              3     3     3     3     1     1     3

GUIDELINES FOR TEACHERS
To implement Outcome Based Education (OBE), the knowledge level and skill set of the
students should be enhanced. Teachers should take a major responsibility for the proper
implementation of OBE. Some of the responsibilities (not limited to) for the teachers in
OBE system may be as follows:
● Within reasonable constraint, they should manoeuvre time to the best advantage of
all students.
● They should assess the students only on certain defined criteria, without letting
any other potential ineligibility discriminate against them.
● They should try to grow the learning abilities of the students to a certain level before
they leave the institute.
● They should try to ensure that all the students are equipped with the quality
knowledge as well as competence after they finish their education.
● They should always encourage the students to develop their ultimate performance
capabilities.
● They should facilitate and encourage team work to consolidate newer approach.
● They should follow Bloom’s taxonomy in every part of the assessment.

Bloom’s Taxonomy

Level        Teacher should check                        Student should be able to      Possible mode of assessment
Create       Student's ability to create                 Design or Create               Mini project
Evaluate     Student's ability to justify                Argue or Defend                Assignment
Analyse      Student's ability to distinguish            Differentiate or Distinguish   Project/Lab Methodology
Apply        Student's ability to use information        Operate or Demonstrate         Technical Presentation/Demonstration
Understand   Student's ability to explain the ideas      Explain or Classify            Presentation/Seminar
Remember     Student's ability to recall (or remember)   Define or Recall               Quiz
GUIDELINES FOR STUDENTS
Students should take equal responsibility for implementing the OBE. Some of the
responsibilities (not limited to) for the students in OBE system are as follows:
● Students should be well aware of each UO before the start of a unit in each and
every course.
● Students should be well aware of each CO before the start of the course.
● Students should be well aware of each PO before the start of the programme.
● Students should think critically and reasonably with proper reflection and action.
● Learning of the students should be connected and integrated with practical and real
life consequences.
● Students should be well aware of their competency at every level of OBE.

CONTENTS

Unit 1: Introduction, Variables, and Data Types
1.1 History
1.2 Features
1.3 Installation and Execution
1.4 Hello World!
1.5 Input and Output
1.6 Basic Data Types and Operators
1.7 Strings
1.8 Compound Data Types

Unit 2: Control Structures
2.1 Conditionals
2.2 Loops

Unit 3: Functions, Modules, and Packages
3.1 Functions
3.2 Modules
3.3 Packages

Unit 4: Files and Regular Expressions
4.1 File Input/Output
4.2 Text Processing
4.3 Pattern Matching and Regular Expressions
4.4 Application: Querying Publication Data

Unit 5: Django Framework
5.1 Installing and Running Django
5.2 Creating and Running a Web Application
5.3 Parameter Passing with GET

References for Further Learning
Index
1 Introduction, Variables, and Data Types

UNIT SPECIFICS
Through this unit we discuss the following aspects:
● History of the Python Programming Language
● Overall set of features supported by Python
● Basic setup and installation
● Basic data types, operators, input and output

RATIONALE
This introductory unit gets readers acquainted with the basics of Python programming. It starts
with a brief history of Python’s creation. The unit then lists the language’s overall set of features
at a high level and motivates why it has become so popular. The unit then delves into basic language
syntax, variables, and solves a few known problems in Python. We then introduce various data
types such as numbers and strings and solve problems using those. We end this unit by introducing
aggregate data types such as lists, tuples, and dictionaries. Care has been taken to introduce the
readers to interesting problems despite the limitation of not having a conditional statement or a
loop.

PRE-REQUISITES
None
UNIT OUTCOMES
List of outcomes of this unit is as follows:
U1-O1: Realize the history of Python
U1-O2: Implement simple program and execute it
U1-O3: Use different data types
U1-O4: Perform basic input and output

EXPECTED MAPPING WITH COURSE OUTCOMES


(1- Weak Correlation; 2- Medium correlation; 3- Strong Correlation)

Unit-1 Outcomes   CO-1  CO-2  CO-3  CO-4  CO-5  CO-6
U1-O1             3     3     3     -     3     1
U1-O2             1     1     2     2     1     -
U1-O3             2     1     3     1     2     1
U1-O4             -     -     3     1     2     2

1.1 History
Python is a widely used programming language today. From first-semester students to final-year
projects to industry professionals, Python is used to develop programs for graphics applications,
text processing, and data analysis, among others.

Fig. 1.1: “Two snakes” logo of Python

Python was developed by Guido van Rossum, a Dutch programmer, and released in 1991. The name is
inspired by the BBC comedy show Monty Python’s Flying Circus. Python is a successor of the ABC
programming language. At the time of this writing, Python 3 is the latest major release of Python
(on your computers, you may notice the program python3).

For the last 20 years, Python has been among the Top 10 most popular programming languages. At the
time of this writing (July 2022), Python is the most popular language, surpassing C and Java
(according to the TIOBE index).

1.2 Features
Python is a general-purpose programming language and supports multiple paradigms or ways of
programming. For instance, we can write procedural (sequence of steps) as well as object-oriented
programs (entities as objects and communication using messages across them) in it. It can also be used
to write functional programs (applying and composing functions), among others. Unlike C and C++,
Python is interpreted. This means Python programs are not compiled ahead of time and stored in a binary
code file (e.g., an executable). Instead, the interpreter translates the source into an internal form
(bytecode, in the standard CPython implementation) and executes it directly, without us seeing any
executable code. Thus, on your computers, python3 is an interpreter (and gcc is a compiler for C programs).

Python is also dynamically-typed. This means that the type of a variable need not be specified in the
source code, and is identified when the program executes. Python also relies on garbage collection,
which reclaims the allocated memory of the variables which are no longer needed (referenced, to be
precise). This relieves the programmer of the task of memory deallocation, similar to Java. Python also
has a large variety of standard libraries, which allow us to write complex code quickly, improving our
productivity.
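
To make dynamic typing concrete, here is a minimal sketch (the variable names are chosen only for illustration). The same name is bound first to an integer and then to a string, and the built-in type() function reports the type at each point.

```python
# The same variable can refer to values of different types over time.
value = 42
first_type = type(value).__name__    # type is decided at run time: 'int'

value = "forty-two"
second_type = type(value).__name__   # the same name now holds a 'str'

print(first_type, second_type)
```

No type was written in the source code; the interpreter worked it out from the value each time.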

1.3 Installation and Execution


If you have access to the internet, you can write and execute Python programs online (e.g., Replit,
Codecademy, Codevny). If you wish to install it on your laptop or iPad, you can download the
appropriate installer for Windows or Linux or iPadOS or other operating systems from Python’s
official website www.python.org.

For instance, the following screenshot shows a Linux installation and running of Python.

Once installed, you can invoke the Python interpreter (using graphical user interface or via a command
line) to execute a Python program. Python programs can be written in your favorite editor (e.g., VS
Code or Sublime or gedit or even notepad) apart from the Python IDE, called IDLE on Windows.

On the command line, you can use certain basic commands to navigate through the file system. For
instance, if your home directory is /home/user, you can create a directory for Python programs as:
$ cd                  # go to home directory
$ mkdir python        # create directory
$ cd python           # go into that directory
$ cat >hello.py       # create a new file hello.py (type the line below, then press Ctrl+D)
print("Hello World!")
$ python3 hello.py    # run the Python interpreter
$ cd ..               # come back to home directory

Python programs are stored in files typically with extension .py (e.g., hello.py). On a command-line, we
can execute the program as below.

$ python3 hello.py

In the above command-line, $ indicates the command-prompt, python3 is the interpreter, and hello.py
is a text-file containing your program. On your computer, the interpreter name may differ depending
upon your installation. For instance, on Windows, the interpreter binary is named python. Further, with
new versions of Python, the binary name may change to python4.
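
Since the interpreter name differs across installations, it can help to ask the interpreter itself which version it is. The following is a small sketch using the standard sys module, which this unit does not otherwise cover.

```python
import sys

# sys.version_info describes the interpreter that is running this script.
major = sys.version_info.major
minor = sys.version_info.minor
print("Running Python %d.%d" % (major, minor))
```

Saving this in a file and running it with python or python3 shows which version each command invokes on your machine.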

1.4 Hello World!


A typical first program in a programming language prints Hello World! to the screen. Below is our first
program in Python.

print("Hello World!")

The program uses a function print() to output a message to the screen. The message is given as an
argument to print() and is specified in double-quotes ("). It can also be specified in single-quotes (') or
triple-quotes (''' or """). When executed using the interpreter, it outputs the message.

$ python3 hello.py
Hello World!
$

The last $ indicates that the command-prompt is displayed again for the next command.
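
The quoting styles mentioned in this section all produce the same string. A short sketch to confirm this (the variable names are illustrative only):

```python
# Three ways of writing the same string literal.
double_quoted = "Hello World!"
single_quoted = 'Hello World!'
triple_quoted = """Hello World!"""

all_equal = (double_quoted == single_quoted == triple_quoted)
print(double_quoted)
print(all_equal)
```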

1.5 Input and Output


The function print() is used for output, while the function input() is used for taking input from the user.
Consider the following functionality where we wish to take the user’s name as input and greet the user.

$ python3 greet.py
What is your name?
Guido van Rossum
Hello Guido van Rossum
$

One problem here is that we would like to greet exactly the name that was entered. This requires us to
store the name during input and retrieve it during output. The entity that stores it is called a variable. A
variable is capable of storing a value which can be retrieved later. In fact, a variable can hold different
values at different times. Our program achieving the above functionality looks like this.

1. print("What is your name?")
2. name = input()
3. print("Hello", name)

Note that the program is shown with line numbers, which are not part of the program. Line 1 prints our
message asking for the user’s name. Line 2 takes the input name and stores it in a variable called name.
In Line 3, we use the same variable to greet the user.
You must have noticed that Lines 1 and 3 both use the print() function but pass a different number of
arguments. In fact, the print() function can be invoked with an arbitrary number of arguments (such
functions are called variadic).
Another noteworthy point is that the printing on Line 3 automatically separates the two strings “Hello”
and “Guido van Rossum” by a space (see the output above). This is the default behavior of print(). As
another example, consider the following program.

print("Hello", "World!")
print("Bye", "World!")

What will be the output of the above program?

Hello World!
Bye World!

We specified neither the space nor the newline in our program. It is the default behavior of print(). This
behavior can be changed with additional arguments to print().

1. print("Hello", "World!", end='####')
2. print("Bye", "World!", sep='$$')
3. print("With sep", "and end", sep=' ', end='.\n')

The output of the above program is

Hello World!####Bye$$World!
With sep and end.

Line 1 of the source code above prints the two strings separated by the default separator space ' ' but
ends with '####'. Line 2 then prints the two strings separated by '$$' and ends with the default
end-of-line character '\n'. Line 3 prints the two strings separated by space, which is explicitly specified,
and ends with a full-stop followed by a newline. Thus, individual strings are separated by sep and the line
is ended by end, whose default values are ' ' and '\n' respectively. '\n' is the newline character. Also
note that strings are specified in this program with single as well as double quotes.
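
To see exactly which characters sep and end contribute, the sketch below sends the output of the first two print() calls into an in-memory buffer instead of the screen. It assumes the file argument of print() and the standard io module, neither of which is covered in this unit.

```python
import io

# Collect what print() writes, so the exact characters can be inspected.
buffer = io.StringIO()
print("Hello", "World!", end='####', file=buffer)
print("Bye", "World!", sep='$$', file=buffer)

captured = buffer.getvalue()
print(repr(captured))   # repr() makes the trailing newline visible as \n
```

The captured text contains one space from the default sep, the custom end '####', the custom sep '$$', and a final '\n' from the default end.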

Examples

Program                                        Output

print('1', '+', '2', '=', '3')                 1 + 2 = 3

print(1+2, 3/2)                                3 1.5

print("a", "b", "c", "d", sep=",")             a,b,c,d

print("a" "b" "c" "d", sep=",")                abcd

id = "rupesh"
domain = "cse.iitm.ac.in"
print(id, domain, sep='@')                     rupesh@cse.iitm.ac.in

id = "rupesh"
print(id, end='@')
print("cse", "iitm", "ac", "in", sep='.')      rupesh@cse.iitm.ac.in

Consider a program wherein we wish to ask for a name and year of birth from a user, and then display
the age of that user. We can write the initial part of such a program as follows.

print('Enter your name: ')
name = input()
print('Enter your year of birth: ')
yob = input()

When executed, the program works as follows.

Enter your name:
Guido van Rossum
Enter your year of birth:
1956

It will be nice if the input can be provided on the same line. This can be achieved as follows.

print('Enter your name: ', end='')
name = input()
print('Enter your year of birth: ', end='')
yob = input()

Another way to achieve it is by specifying the string as an argument to input().

name = input('Enter your name: ')
yob = input('Enter your year of birth: ')

When executed, the program works as expected.

Enter your name: Guido van Rossum
Enter your year of birth: 1956

Now let’s get back to the task of calculating the age of the user. Thus, the program should be able to
print the following (in the year 2023).

Enter your name: Guido van Rossum
Enter your year of birth: 1956
Hey Guido van Rossum you are 67 years old!

This can be achieved as follows.

name = input('Enter your name: ')


yob = input('Enter your year of birth: ')
print('Hey', name, "you are", (2023 - yob), "years old!")

Unfortunately, this simple program does not work. Why?


The Python interpreter considers variable yob to be of type string. Therefore, it cannot be directly used
in arithmetic expressions (such as 2023 - yob). This means that the string “1956” is treated differently
from the integer value 1956. To be used in an arithmetic expression, we need to convert the string
“1956” into integer 1956. This can be done using the int() function.

name = input('Enter your name: ')


yob = input('Enter your year of birth: ')
print('Hey', name, "you are", (2023 - int(yob)), "years old!")

Future connect: Instead of hardcoding 2023 or the current year in the program, it would be nice if our
program can find out the current year. This can be done using the datetime module, which we will study
a little later.
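
As a preview, one possible way to avoid the hardcoded year is sketched here. It assumes the standard datetime module and wraps the computation in a helper function, age_in_current_year, which is a hypothetical name introduced only for this illustration (functions are covered in Unit 3).

```python
import datetime

def age_in_current_year(year_of_birth):
    """Return the age reached this year for a year of birth given as text or number."""
    current_year = datetime.date.today().year   # e.g., 2023 when run in 2023
    return current_year - int(year_of_birth)

print("Hey Guido van Rossum you are", age_in_current_year("1956"), "years old!")
```

The int() conversion is still needed because the year of birth would normally arrive as a string from input().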

Similar to string and int, there are other data types supported by Python. Let’s take a look.

1.6 Basic Data Types and Operators


Python supports numeric types such as integer, floating point, as well as complex numbers. It also
supports boolean values and strings. Further, the value of a variable can be converted from one type to
another (as we saw in the last example).

Example: A card is drawn at random from a deck of well-shuffled cards. Find the probability of it being
neither a king nor a spade.
We know that the deck has a total of 52 cards, split into 4 suits, each containing 13 cards. We can then
compute the desired probability as follows.

ncards = int(52)
nkings = int(4)
nspades = int(13)
nspadeking = int(1)
nnonspadenonking = ncards - (nkings + nspades - nspadeking)
probnonspadenonking = nnonspadenonking / ncards
print('Probability of nonking, nonspade is', probnonspadenonking)

The output of this computation is:

Probability of nonking, nonspade is 0.6923076923076923
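
The counting step deserves a second look: the king of spades is both a king and a spade, so it is subtracted once to avoid counting it twice, leaving 52 - (4 + 13 - 1) = 36 favourable cards. As a cross-check, the sketch below repeats the computation with exact fractions. It assumes the standard fractions module, which the book has not introduced, and uses its own variable names.

```python
from fractions import Fraction

n_cards = 52
n_kings = 4
n_spades = 13
n_spade_kings = 1   # the king of spades is counted in both groups above

favourable = n_cards - (n_kings + n_spades - n_spade_kings)
exact = Fraction(favourable, n_cards)   # reduced automatically to lowest terms

print(favourable, exact, float(exact))
```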

We can restrict the output to a few decimal digits using format specification (similar to C).

print('Probability is %f' %probnonspadenonking,
      '%.2f' %probnonspadenonking)

The output of these two lines alone (Lines 8 and 9) is:

Probability is 0.692308 0.69

Let’s understand this formatted printing. Format specifier “%f” formats the value that follows the %
operator (here, probnonspadenonking) as a floating-point number, which by default shows six digits
after the decimal point (unlike the full double-precision value printed earlier). Format specifier
“%.2f” restricts it further to two digits after the decimal point. Note that there is no comma between
the format specifier and %variable.
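
The following sketch isolates the two specifiers so that their effect can be checked without printing. The last line uses an f-string, an alternative formatting syntax that this book's text does not discuss; it is included only for comparison.

```python
prob = 36 / 52

default_six = "%f" % prob      # six digits after the decimal point
two_digits = "%.2f" % prob     # two digits after the decimal point
with_fstring = f"{prob:.2f}"   # same result using an f-string (not covered here)

print(default_six, two_digits, with_fstring)
```

In each case the value is rounded for display only; the variable prob itself is unchanged.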
Another non-technical point this program reveals is about the variable names. As you may notice, the
variables are difficult to read. One can use camelCase to improve it.

nCards = int(52)
nKings = int(4)
nSpades = int(13)
nSpadeKing = int(1)
nNonSpadeNonKing = nCards - (nKings + nSpades - nSpadeKing)
probNonSpadeNonKing = nNonSpadeNonKing / nCards
print('Probability of nonKing, nonSpade is', probNonSpadeNonKing)

Good Programming Practice : Use variable names that are easier to read and understand.

Example: Find the sum of the first n natural numbers.


We know the formula to compute the sum of the first n numbers: n * (n + 1) / 2. Let us use this to write
our program.

nstr = input()
n = int(nstr)
sum = n * (n + 1) / 2
print(sum)

We now understand that input() will return a string, which we need to convert to an integer (Line 2).
After this, we apply the formula (Line 3) and print the sum. For input 10, the program should print 55.

10
55.0

From the output, it is clear that the computation is reasonable, but the sum is printed as a real number.
This happens because the division operator (/) uses a floating point division. We can convert it to an
integer in multiple ways.

n = int(input())
sum = n * (n + 1) / 2
print(int(sum))         # using explicit conversion
print("%d" %sum)        # using format specifier
sum = n * (n + 1) // 2  # using integer division
print(sum)

Line 1 reads the input string, converts it into a number, and stores it in variable n. Line 2 computes the
sum, which is by default a real value. We print it in Line 3 using the int() function. There is also a format
specifier “%d” to print the value as an integer. We use that in Line 4. Alternatively, Python provides an
integer division operator (//). On Line 5, we use it (now the same sum variable stores an integer value).
We print the integer on Line 6.

The program highlights multiple points. First, types can be interconvertible. Second, a variable does not
have a fixed type; it can have values of different types at different points in the program (e.g., variable
sum). Hence, we do not declare them in Python. Third, notice the comments on Lines 3 and 4.
Comments start with # and last till the end of that line. Comments are for improving the readability of
the program and are not executed. Multiline comments in Python are often written using triple quotes.
Good Programming Practice: Use comments judiciously to improve the code readability.
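
The change of type described here can be observed directly. The sketch below fixes n to 10 instead of reading it from the user, and uses the name total rather than sum so that Python's built-in sum() function stays available for an independent check; both choices are assumptions made for this illustration.

```python
n = 10

total = n * (n + 1) / 2
type_after_division = type(total).__name__          # / always gives a float

total = n * (n + 1) // 2
type_after_integer_division = type(total).__name__  # // on integers gives an int

# Independent check of the formula by adding 1, 2, ..., n.
check = sum(range(1, n + 1))

print(type_after_division, type_after_integer_division, total, check)
```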

Examples

Program                                        Output

print(5 + 3 / 4 - 2)                           3.75

print(3.1 * 3.2 ** 3.3)                        143.99805374858647
# ** is an exponentiation operator
# similar to Fortran

print("Last digit of 1234 is", 1234 % 10)      Last digit of 1234 is 4
# % is a modulus operator
# which gives the remainder

Python has inbuilt support for complex numbers. The following example conveys its syntax.

Program                                        Output

1. x = 1 + 2j               # create
2. y = complex(1, -2)       # create
3.
4. z = x - y                # arithmetic
5. print(z)                                    4j
6.
7. z = x * y                # arithmetic
8. print(z)                                    (5+0j)
9. print(z.real, '+', z.imag, 'j')             5.0 + 0.0 j

Lines 1 and 2 present two ways to create complex numbers (we can also read a complex
number as a user input). Lines 4 and 7 illustrate simple arithmetic involving complex numbers.
Finally, Line 9 shows how to separate its real and imaginary parts.
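
The remark about reading a complex number as user input can be sketched as follows. To keep the example self-contained, the text "1+2j" stands in for what input() would return; the variable names are illustrative.

```python
# complex() can parse text such as what input() returns (no spaces allowed inside).
text = "1+2j"
z = complex(text)

conjugate = z.conjugate()          # flips the sign of the imaginary part
product = z * conjugate            # a number times its conjugate is real
magnitude = abs(complex(3, 4))     # abs() gives the distance from the origin

print(z, conjugate, product, magnitude)
```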

1.7 Strings
String is a fundamental data type in Python and the language provides many ways of manipulating
strings. Strings are enclosed in single-, double-, or triple-quotes. Triple-quoted strings can span multiple
lines. Certain characters such as ‘\n’ have special meaning and are not treated as two characters, but
one. These are called escape characters or escape sequences. Strings can be concatenated readily by
using the + operator. Note that concatenation does not add a space between the two strings.
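
That an escape sequence counts as a single character can be verified with len(). The sketch below also shows that + joins strings with nothing in between; the variable names are chosen for illustration.

```python
newline_length = len('\n')       # one character, although typed as two
raw_length = len(r'\n')          # in a raw string, backslash and n stay separate
joined = "Hello" + "World"       # no space is inserted by +

print(newline_length, raw_length, joined)
```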

Examples

Program                                                  Output

oneline = 'one line'
twoline = '''two
lines'''  # triple-quotes
print(oneline, twoline)                                  one line two
                                                         lines

errorline = "error                                       Syntax error
line"  # double-quotes

print("My drive is c:\nasre")   # note \n                My drive is c:
                                                         asre
print("My drive is c:\\nasre")  # escaped backslash      My drive is c:\nasre
print(r"My drive is c:\nasre")  # raw string             My drive is c:\nasre

print("Tab\tseparated\ttext\non the second line")        Tab     separated       text
                                                         on the second line

print('I don\'t understand quotes.')                     I don't understand quotes.
print("I don't" 'understand' '''quotes.''')              I don'tunderstandquotes.
# quoted strings can be concatenated.

oneline = 'one line'
print(oneline, oneline)   # space-separated              one line one line
print(oneline + oneline)  # concatenated                 one lineone line

print("My answer is", 3 * "No! ")                        My answer is No! No! No!
n = int(input())                                         5
print(n * "Python ")                                     Python Python Python Python Python

As the final example shows, strings can also be repeated easily with a multiplier.

Python provides a large set of operators and functions to manipulate strings. We illustrate this with a few
examples below.

Examples
For the following set of examples, assume this initialization:
name = "Python Programming"

ID  Program                                          Output

1   print(name[0], name[7], len(name))               P P 18

2   name[0] = 'C'                                    Error (TypeError: strings cannot be modified)

3   print(name[-1], name[-11], name[-len(name)])     g P P

4   first = name[1:5]
    second = name[7:len(name)]
    print(first, second, sep=',')                    ytho,Programming

5   first = name[:6]
    second = name[7:]
    print(first, second, sep=',')                    Python,Programming
    name = second + " " + first
    print(name)                                      Programming Python

6   print(name[1::2])                                yhnPormig

The first program with ID 1 indexes into the string name and extracts individual characters
(which are actually one-length strings). Indexes start with 0. Thus, name[0] prints the first P.
To find the length of a string, that is, the number of characters in it, len() function can be used.
The last letter is at index len(name) - 1.
The program with ID 2 tries to modify the string by writing to name[0]. This is
disallowed in Python (strings are immutable). If we want to modify a string, we
need to assign the complete string. This is shown in the program with ID 5 where name is
reassigned to a different string.
The program with ID 3 strangely uses negative indices! Python supports negative indexing
whose meaning is counting backwards. Thus, name[-1] indexes the last character. Since this
reverse indexing starts with -1, the first letter will be at index -len(name).
The program with ID 4 extracts substring. This can be done by specifying the index range [x:y]
where all the characters starting with index x all the way upto-but-not-including y form the
substring. Thus, name[1:5] includes name[1], name[2], name[3], and name[4] (ytho); it does
not include name[5]. We can extract the second word with name[7:len(name)], which prints
Programming. Note again that len(name) is an index one past the last letter’s index.
The program with ID 5 extracts the two words from name. Not specifying the first index of the
range defaults to 0, while not specifying the second index defaults to the length of the string.
The last program with ID 6 prints letters at odd indices. It starts from index 1 (letter y), goes up
to the end of the string, and increments the index by 2.
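
The indexing and slicing rules described above can be collected in one short sketch. It starts from the same initialization of name; the other variable names are introduced only for this illustration.

```python
name = "Python Programming"

last_by_length = name[len(name) - 1]   # last letter via a non-negative index
last_by_negative = name[-1]            # same letter via negative indexing
first_word = name[:6]                  # missing start defaults to 0
second_word = name[7:]                 # missing end defaults to len(name)
odd_letters = name[1::2]               # start at 1, step by 2

# Strings are immutable, so a "modified" string is built as a new one.
changed = 'C' + name[1:]

print(last_by_length, last_by_negative, first_word, second_word, odd_letters, changed)
```

Note that name itself is untouched at the end; only changed holds the new string.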

Example: Extract fields from a roll number.


Consider an institute with roll numbers of the following format. An example roll number is
CS23B010.
● The roll number is exactly eight letters long.
● The first two letters indicate the department (e.g., CS, ME, EE, AE).
● The next two digits indicate the admission year (e.g., 23, 22, 21, 20).
● The next letter is for degree (e.g., B for BTech, M for MTech, S for MS, P for PhD).
● The last three digits indicate the position within the class.
Our task is to extract these different fields from an input roll number. The solution program is
given below.

rollno = input()              # CS23B010

branch = rollno[0:2]          # CS
year = int(rollno[2:4])       # 23
degree = rollno[4]            # B
position = int(rollno[5:8])   # 010 as text, 10 after int()

print(branch, "20%02d" % year, degree, position, sep=',')

Example: Intelligent word shuffle


Find out what the below program does. The comments should help you decipher.

original = "eleven plus two"

first = original[:6]       # eleven
second = original[6:12]    # " plus " (including the surrounding spaces)
third = original[12:]      # two

first2 = first[:2]         # el
first3 = first[2]          # e
first45 = first[3:5]       # ve
first6 = first[-1]         # n

third2 = third[:2]         # tw
third3 = third[-1]         # o

original = first2 + first3 + first45 + first6 + second + third2 + third3
modified = third2 + first2 + first45 + second + third3 + first6 + first3

print(original, "==", modified)

Python provides a variety of functions to manipulate strings (e.g., converting to upper-case,
splitting based on a delimiter, substring search, etc.).

1.8 Compound Data Types


We now introduce the compound data types such as lists, tuples, and dictionaries. All these
are aggregates of several elements (which could themselves be aggregates). We will make
use of these later in an elaborate manner.

Data type       List                      Tuple                       Dictionary

Properties      Ordered, allows           Ordered, allows             Associative, unique
                duplicates, mutable       duplicates, immutable       keys, mutable

Create          L = [0, 1, 2, "T", 4]     T = 0, 1, 2, "T", 4, 5      D = {'IN': 91, 'US': 1, 'AU': 7}
                                          or                          or
                                          T = (0, 1, 2, "T", 4, 5)    D = dict(IN=91, US=1, AU=7)
                                                                      or
                                                                      D = dict([('IN', 91), ('US', 1), ('AU', 7)])

Index           L[0], L[-1]               T[0], T[-1]                 D['IN'], D['AU']

Range / Slice   L[1:5]                    T[1:5]                      Not supported

Length          len(L)                    len(T)                      len(D)

Concatenation   L + L                     T + T                       D1 | D2 (Python 3.9 onward)

Mutation        L[0] = 10                 Not supported               D['US'] = 2

Append          L.append(6)               Not supported               Use mutation

Remove          L[2:12] = []              Not supported               del(D['US'])

Unpack          e1, e2, e3 = L            e1, e2, e3 = T              e1, e2, e3 = D.items()

Others          Empty list as []          Empty tuple as ()           Empty dictionary as {}
                One length list as [1]    One length tuple as (1,)    One length dictionary as {0: 'zero'}

Note that in case of a list, L[2] = [] will not remove L[2]. Instead, it will replace L[2] with an
empty list. Also note that one-length tuple is indicated with an extra comma, to distinguish it
from a value, since (1) will be treated as parenthesized value 1.
A dictionary should be viewed as a storage for key-value pairs. The mapping from keys to
values is captured in a dictionary.
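The operations listed for the three compound types can be exercised in a short session. This is only a sketch: the initial values mirror the Create row, while the slice bounds and the 'JP' entry are illustrative choices and not part of the original table.

```python
# Values mirror the "Create" row of the table.
L = [0, 1, 2, "T", 4]
T = (0, 1, 2, "T", 4, 5)
D = {'IN': 91, 'US': 1, 'AU': 7}

L[0] = 10            # mutation: lists are mutable
L.append(6)          # L is now [10, 1, 2, 'T', 4, 6]
L[2:4] = []          # slice assignment removes L[2] and L[3]
print(L)             # [10, 1, 4, 6]

L[2] = []            # replaces the element with an empty list, removes nothing
print(L)             # [10, 1, [], 6]

try:
    T[0] = 10        # tuples are immutable
except TypeError as err:
    print("Tuple mutation failed:", err)

D['US'] = 2          # update the value of an existing key
D['JP'] = 81         # "append" by assigning to a new key (illustrative entry)
del D['AU']          # remove a key together with its value
print(D)             # {'IN': 91, 'US': 2, 'JP': 81}

single = (1,)        # one-length tuple; (1) would just be the integer 1
print(type(single).__name__, type((1)).__name__)   # tuple int

# Unpacking needs exactly as many names as there are elements.
e1, e2, e3 = D.items()
print(e1)            # ('IN', 91)
```

Unpacking raises a ValueError when the number of names on the left does not match the number of elements, so e1, e2, e3 = L works only for a list of exactly three elements.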

Future connect : The following programs can be implemented in a generic way using loops.

Example: Store and print all the vowels from the word Mississippi.
Here, we will use a list to store the vowels.

word = "Mississippi"
vowels = [] # empty list
vowels.append(word[1:2])
vowels.append(word[4:5])
vowels.append(word[7:8])
vowels.append(word[10:11])
print(vowels)

Example: Given a postal address, extract its fields, and print.


Here, we will make use of a string function called split() to divide a string into different parts, and make
use of a tuple containing all the parts.

address = '670, New Nandanvan Layout, Near Sham Dham Temple, Nagpur, Maharashtra, 440024'

(plotNo, addrLine1, addrLine2, city, state, pincode) = address.split(', ')

print('Pincode =', pincode)
print('City =', city)

Example: Extract the phone number of a friend from a directory.


We will implement a directory using a dictionary, with its key as the friend’s name and its value as the
friend’s phone number.

phoneof = {'Rajesh':9432492125, 'Somesh':8793932633, 'JK':3283728272}


friend = input('Enter a friend\'s name: ')
print("The friend's phone number is", phoneof[friend])

UNIT SUMMARY
We touched upon the historical aspects of Python’s genesis and delved into writing simple programs
using numbers and strings. We also introduced aggregates such as lists, tuples, and dictionaries.

EXERCISES

Multiple Choice Questions


1. What is the output of the following program?

print("p", "q", "r", sep='=')

A. p=q=r
B. p=q=r=
C. =p=q=r
D. =p=q=r=

2. Replace X in the following program such that the output is pqr#pqr.


print('pqr', X)
print('pqr')

A. '#'
B. """#"""
C. sep='#'
D. end='#'

3. What is the output of the following program if the input is 7.5?

cgpa = input("Enter your CGPA: ")


perc = 10 * cgpa
print(perc)

A. 75
B. Error
C. 7.57.57.57.57.57.57.57.57.57.5
D. 75.0

4. What is the output of the following program?

T = ("1", "1"+"1", "1"+"1"+"1", "1"+"1"+"1"+"1")


print(T[2])

A. 111
B. "3"
C. 3
D. 111

5. Replace X in the following program such that the output is 6.

M = {0:0, 1:1, 2:4, 3:9, 4:16}


M[X] = 9
M[5] = 25
print(len(M))

A. 5
B. 6
C. 7
D. 12

Answers of Multiple Choice Questions
1. A: p=q=r
2. D: end='#'
3. C: 7.57.57.57.57.57.57.57.57.57.5
4. D: 111
5. A: 5

PRACTICAL
1. Write a program that reads marks of three quizzes and outputs the total out of 100.
2. Read a string from the user. Assume these to be unique letters in a set. Find out the
size of the powerset of this set. For instance, if the string is “abc”, the set is {a, b, c}
and its powerset is {{}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}} whose size is 8. For
string “python”, the output should be 64.
3. Write a program to find the probability of a card drawn from a standard deck to be
neither a Spade nor a face card (Jack, Queen, King).
4. Given a tuple of five coefficients e.g. (1, 2, -3, 0, 5), write the corresponding polynomial
in x. For instance, 1x^4 + 2x^3 + -3x^2 + 0x + 5.
5. Find the sum of the geometric series 1 + x + x^2 + x^3 + … + x^n given the values of x and
n.


2 Control Structures

UNIT SPECIFICS
Through this unit we discuss the following aspects:
● Conditional processing using if, if-else, elif
● Looping constructs for, while
● Control-flow alteration using break, continue, pass, and else

RATIONALE
Assignment statements alone cannot enable us to write arbitrary programs. To write general-
purpose codes, we need to alter the straight-line execution path, as well as execute certain
statements repeatedly. This leads to conditional statements and looping constructs, which we study
in this unit. Python also supports specialized variants, which allow us to systematically specify the
desired control pattern.

PRE-REQUISITES
Unit 1

UNIT OUTCOMES
List of outcomes of this unit is as follows:
U2-O1: Use conditional constructs
U2-O2: Use looping constructs
U2-O3: Apply various control structures to solve problems

EXPECTED MAPPING WITH COURSE OUTCOMES
(1: Weak correlation; 2: Medium correlation; 3: Strong correlation)

Unit-2 Outcomes    CO-1   CO-2   CO-3   CO-4   CO-5   CO-6
U2-O1              3      3      3      -      3      1
U2-O2              1      1      2      2      1      -
U2-O3              2      1      3      1      2      1

2.1 Conditionals
Consider two millionaires: you and your friend. You want to identify who is richer. What
do you do? While there are multiple ways to find this out, a direct way is to ask your
friend the amount of property owned. You compare it with your property and find out the
answer.

If we have to write this as steps of an algorithm (called pseudocode), we can do so
using Python comments:

1. # ask your friend for the amount of property owned
2. # friend’s property = read / listen from the friend
3. # my property = ….
4. # if my property > friend’s property then
5. #     I am richer
6. # otherwise
7. #     Nevermind, I am not richer than my friend

Note how the outcome can differ depending upon the situation. Thus, the conditional
check on Line 4 decides whether Line 5 is reached or Line 7. Also note that only one of
the two is possible.

The above pseudocode is essentially computing the larger of two numbers, which could
be two property evaluations, two digits, two ages, two times, or even two speeds.
Irrespective of what we compare, the program pattern remains the same. Python allows
us to write such a code using an if construct.

1. friendProperty = int(input('What is your property in millions? '))
2. myProperty = 10  # million
3. if myProperty > friendProperty:
4.     print('I am richer.')
5. else:
6.     print('Nevermind, I am not richer than you.')

Lines 3–6 present the syntax of an if-else statement. It evaluates the condition
myProperty > friendProperty. Depending upon the value entered, this condition may
evaluate to True or False. If the condition is True, Line 4 gets executed. If the condition
is False, then Line 6 gets executed. Thus, the following two executions of the above
program reveal the conditional execution of statements.

What is your property in millions? 2
I am richer.

What is your property in millions? 15
Nevermind, I am not richer than you.

Thus, when executed with input 2, the program evaluates the condition 10 > 2 to True
and hence executes Line 4. In contrast, with input 15, the program evaluates the
condition 10 > 15 to False and hence executes Line 6.

Syntactically, the following points should be noted.


1. The condition of the if statement ends with a colon(:).
2. The else clause must end with another colon.
3. The else clause is optional. Thus, Lines 5 and 6 need not be present in the program and it is
still a syntactically valid program.
4. There could be one or more lines in the if block and the else block. For instance, Line 4 forms
the if-block, while Line 6 forms the else-block in the above program.
5. The if and the else blocks should be indented compared to the indentation of if-else.
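Points 3 and 4 above can be seen in a small sketch. The values are hypothetical (the earlier program reads friendProperty from the user), and the variable difference is introduced only for illustration.

```python
# Hypothetical values; the earlier program reads friendProperty using input().
myProperty = 10
friendProperty = 2
difference = 0

# An if statement without an else clause: nothing extra happens when the condition is False.
if myProperty > friendProperty:
    # Both indented lines belong to the if-block.
    difference = myProperty - friendProperty
    print('I am richer by', difference, 'million.')

# This line is not indented, so it runs regardless of the condition.
print('Comparison done.')
```

Changing friendProperty to 15 makes the condition False; the if-block is then skipped entirely and only the last line is printed.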

Example: Matching blood group.


Let us say that the user enters two blood groups and we want to write a program to find
out if the two blood groups match. While in reality, there are more matchings possible
(e.g., A+ can donate blood to AB+), in our program, we will say that a blood group
matches with only itself. Sample executions of such a program are as below.

Enter the two blood groups: A+ A+
The two blood groups match.

Enter the two blood groups: A+ B-
It is a mismatch.

1. bg1, bg2 = input("Enter the two blood groups: ").split()
2. if bg1 == bg2:
3.     print("The two blood groups match.")
4. else:
5.     print("It is a mismatch.")

We use the split() function to read two blood groups separated by spaces, and assign
those to two variables in the same assignment statement (Line 1). We then check if the
two blood groups entered are the same. This is done using the == operator, for checking
equality. If the two blood groups are the same, the condition evaluates to True, leading to
execution of Line 3. Otherwise, Line 5 gets executed.

The following table shows various comparison operators available in Python.

Op    Meaning                     Usage
<     Less than                   if a < b: print('a is smaller')
>     Greater than                if a > b: print('a is larger')
<=    Less than or equal to       if a <= b: print('a is less than or equal to b')
>=    Greater than or equal to    if a >= b: print('a is greater than or equal to b')
==    Equal to                    if a == b: print('a and b are equal')

The above table also illustrates that the else clause is optional.
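Each comparison is itself an expression that evaluates to True or False, which can be printed directly. The sketch below uses illustrative values for a and b. It also shows the != (not equal to) operator, which is not listed in the table but is used later in this unit.

```python
a, b = 3, 5   # illustrative values

print(a < b)    # True
print(a > b)    # False
print(a <= b)   # True
print(a >= b)   # False
print(a == b)   # False
print(a != b)   # True: "not equal to", the opposite of ==

# Comparisons also work on strings.
print('CS' == 'CS')   # True
print('CS' == 'cs')   # False, since the comparison is case-sensitive
```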

Example: Find the student from your department.


Say you are in the Computer Science (CS) Department with roll number as CS23B001.
Using the roll number of a student, you should be able to find out if the student is from CS
or not. It should work as follows.

Your roll number: CS23B010
Hi Bro!

Your roll number: ME23B111
Excuse me?

Your roll number: CH23B002
Excuse me?

Your roll number: BS23B002
Excuse me?

Let’s write a program for it. The input is a string holding the roll number. We need to extract the first
two characters (how?) and check if those are ‘C’ and ‘S’. This can be done using nested if-else
statements.

1. rollNum = input("Your roll number: ")
2. if rollNum[0] == 'C':
3.     if rollNum[1] == 'S':
4.         print("Hi Bro!")
5. else:
6.     print("Excuse me?")

Note the extra indentation for Line 4, due to the nesting. If the first character is ‘C’, the program checks
the second character on Line 3, and if it is ‘S’, it executes Line 4. Otherwise, it executes Line 6. Try
running this program with the four inputs given above. For which inputs does the program give
the expected output?

You would notice that this program gives the correct output for CS23B010, ME23B111, BS23B002,
but does not produce any output for CH23B002. Why? Well, because we did not ask it to. The else
clause on Line 5 corresponds to the if clause on Line 2. As far as the if clause of Line 3 is concerned, it
does not have an associated else clause. Hence, when the condition on Line 3 is False (CH….), the
control comes out of the nested-if statement without any output. How do we fix it?

1. rollNum = input("Your roll number: ")
2. if rollNum[0] == 'C':
3.     if rollNum[1] == 'S':
4.         print("Hi Bro!")
5.     else:
6.         print("Excuse me?")

The code is still wrong. For what input does it produce no output?
Out of the four inputs, the program produces correct output for CS23B010 and CH23B002. But it does
not produce any output for ME… and BS… roll numbers. This is because, due to indentation, the else
clause now corresponds to the if clause on Line 3. Line 2 does not have any corresponding else. So let’s
add it.

1. rollNum = input("Your roll number: ")
2. if rollNum[0] == 'C':
3.     if rollNum[1] == 'S':
4.         print("Hi Bro!")
5.     else:
6.         print("Excuse me?")
7. else:
8.     print("Excuse me?")

Now the program works across all the inputs. But note that there is a duplicate processing on Lines 6
and 8. Can we avoid this duplication?

Conjuncts
The duplication can be avoided if we can combine the if conditions on Lines 2 and 3. Python allows us
to do that using conjuncts such as and and or.

1. rollNum = input("Your roll number: ")
2. if rollNum[0] == 'C' and rollNum[1] == 'S':
3.     print("Hi Bro!")
4. else:
5.     print("Excuse me?")

With the and conjunct, the if-block is executed only if both the conditions are true. Otherwise, the else-block is executed.
Line 2 can also be written succinctly as:

2. if rollNum[0:2] == 'CS':

Example: Find the student from your department where the roll number may be in capital or small-case
letters.

To address this, we need to augment our if condition to include small-case letters too.

1. rollNum = input("Your roll number: ")
2. if (rollNum[0] == 'C' or rollNum[0] == 'c') and (rollNum[1] == 'S' or rollNum[1] == 's'):
3.     print("Hi Bro!")
4. else:
5.     print("Excuse me?")

Thus, the or conjunct evaluates to True if any one of the conditions evaluates to True. That is, it
evaluates to False only if all the conditions are False.
Note the use of parentheses to combine the clauses by the and conjunct. This is required because or has
a lower precedence than the and conjunct. Thus, in absence of parentheses, the meaning of Line 2 would
be:
2. if rollNum[0] == 'C' or (rollNum[0] == 'c' and rollNum[1] == 'S') or rollNum[1] == 's':

Can you find out the inputs for which this modified program would produce wrong results?
One may write the above program by reordering the conditions.

1. rollNum = input("Your roll number: ")
2. if (rollNum[0] == 'C' and rollNum[1] == 'S') or \
3.    (rollNum[0] == 'c' and rollNum[1] == 's'):
4.     print("Hi Bro!")
5. else:
6.     print("Excuse me?")

Are the last two programs equivalent? The answer is no. The last program allows roll numbers ‘CS…’
and ‘cs…’, but does not allow a mixed-case ‘Cs…’ or ‘cS…’, which is permitted by the earlier program.

Also note how the long statement on Line 2 is split using a backslash at the end of the line. Without the
backslash, the program exhibits a syntax error.
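To compare the three variants of the condition discussed above side by side, one may wrap each in a function and try a few roll numbers. This is a sketch only: the function names are hypothetical, and functions themselves are covered in a later unit.

```python
def either_case(roll):
    # Parenthesized version: each character may independently be upper or lower case.
    return (roll[0] == 'C' or roll[0] == 'c') and (roll[1] == 'S' or roll[1] == 's')

def without_parentheses(roll):
    # Same text with the parentheses dropped; and binds tighter than or.
    return roll[0] == 'C' or roll[0] == 'c' and roll[1] == 'S' or roll[1] == 's'

def same_case_only(roll):
    # Reordered version: accepts only 'CS...' and 'cs...'.
    return (roll[0] == 'C' and roll[1] == 'S') or \
           (roll[0] == 'c' and roll[1] == 's')

for roll in ['CS23B010', 'cS23B010', 'CH23B002', 'Bs23B002']:
    print(roll, either_case(roll), without_parentheses(roll), same_case_only(roll))
```

The version without parentheses accepts CH23B002 (the first test alone is True) and Bs23B002 (the last test alone is True), although neither is a CS roll number. The mixed-case cS23B010 is accepted by the parenthesized version but rejected by the reordered one.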

Yet another way to make the above check is to use the string function upper(), which
returns an upper-case copy of the string. The following program illustrates this.

1. rollNum = input("Your roll number: ")
2. if rollNum[0:2].upper() == 'CS':
3.     print("Hi Bro!")
4. else:
5.     print("Excuse me?")

Is it possible to print “Excuse me?” in the if-block and “Hi Bro!” in the else-block? This demands
reversing the conditions. Python provides the not keyword to alter the truth-value of a condition. Thus,
the functionally equivalent program would be:

1. rollNum = input("Your roll number: ")
2. if not (rollNum[0:2].upper() == 'CS'):
3.     print("Excuse me?")  # note the exchange of print() statements
4. else:
5.     print("Hi Bro!")

When the program execution finds out that rollNum[0] is not ‘C’, does it need to check if rollNum[1]
is ‘S’ or not? It does not need to, since the truth value of the condition is anyway going to be False. This
way of evaluating conditions is called short-circuiting.

Short‐circuiting
In short-circuiting, only as many sub-conditions as required to find the truth value of the whole condition
are evaluated. This can lead to some sub-conditions not getting evaluated. We explain this using an
example: a == 0 and (b < 3 or c <= 5). In the below table, ✔indicates that the corresponding sub-
condition is evaluated. An empty cell indicates that the sub-condition is short-circuited and, therefore,
not evaluated.

Values                 a == 0   b < 3   c <= 5   Truth value
a = 0, b = 3, c = 6    ✔        ✔       ✔        False
a = 0, b = 3, c = 5    ✔        ✔       ✔        True
a = 0, b = 2, c = 5    ✔        ✔                True
a = 0, b = 2, c = 6    ✔        ✔                True
a = 1, b = 3, c = 5    ✔                         False

Thus, when a = 0, b = 2, c = 6, the sub-condition a == 0 is evaluated and is True. However, we cannot
deduce whether the whole condition is True or not, due to the presence of the and conjunct. Therefore,
we must evaluate (b < 3 or c <= 5). The sub-condition b < 3 is True. At this stage, we can say that
the condition (b < 3 or c <= 5) is also True, without evaluating c <= 5. Thus, the last sub-condition
gets short-circuited and is not evaluated. The whole condition evaluates to True.

Note that a row such as the following is not possible for this example.

a = …, b = …, c = …    ✔        ✔                False

One may wonder how such a phenomenon affects you, since this seems to be only an optimization and
not affecting any output. The output gets affected when sub-conditions can have side-effects. For
instance, if a, b, c are replaced with function calls, and each function has a print() statement, then the
output will depend upon whether the short-circuiting happens or not.

Future connect : We will study functions in detail in another unit. But to illustrate short-circuiting, we
present an example using functions.

1. def a():
2.     print("a")
3.     return True
4. def b():
5.     print("b")
6.     return True
7. def c():
8.     print("c")
9.     return True
10.
11. if a() and (b() or c()):
12.     print("Whole condition is true.")
13. else:
14.     print("Whole condition is false.")

The above program defines three functions a(), b() and c(), which print a message and return True. The
main program starts from Line 11 (similar to our programs so far), and evaluates the condition. After
evaluating function a() and function b(), the condition is guaranteed to be True, hence, function c() is
not executed. Therefore, the output of the above code is:

a
b
Whole condition is true.

Without short-circuiting, c would also have been printed.

Does this mean short-circuiting is useful only while using functions? Not at all. Short-circuiting can be
very useful to guard against an invalid access. For instance, consider the condition below:
if index < len(mystring) and mystring[index] == 'x':
With short-circuiting, if the index is beyond the string then the first sub-condition is False. Therefore,
the second sub-condition will not be evaluated, avoiding the out-of-bound access.
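The effect of the guard can be observed by swapping the two sub-conditions. The sketch below uses hypothetical function names and assumes that index is non-negative.

```python
def has_x_at(mystring, index):
    # The bounds check comes first; when it is False, mystring[index] is never evaluated.
    return index < len(mystring) and mystring[index] == 'x'

def has_x_at_unguarded(mystring, index):
    # Same test with the operands swapped: the indexing happens first.
    return mystring[index] == 'x' and index < len(mystring)

print(has_x_at('taxi', 2))     # True
print(has_x_at('taxi', 10))    # False, and no error

try:
    has_x_at_unguarded('taxi', 10)
except IndexError as err:
    print('Unguarded access failed:', err)
```

The order of the sub-conditions therefore matters: the cheap validity check should appear to the left of the access it protects.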

Example: Lucky cards!


To expand on our understanding, let’s implement the game of lucky cards. We know that a standard
deck of 52 cards is grouped into four suits (Club, Diamond, Heart, and Spade), each containing 13 cards:
Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, and King. Some of these cards are considered lucky. Our task
is to read a card as input and output whether the card is lucky or not.

The predefined set of lucky cards is as follows:


● Ace of Spade
● Any Heart
● Queen of Diamond
● King of Diamond
● Any 7
The corresponding program will have a list of conditions separated by or. Each condition can then take
care of one category of lucky cards. A sample run of such a program can be as follows.

Enter your card: 7 of Heart
Lucky you!

Enter your card: Queen of Spade
Better luck next time.

Enter your card: 8 of Spade
Better luck next time.

Enter your card: Ace of Spade
Lucky you!

Let’s write the program using conditionals for lucky cards.

1. card, of, suite = input('Enter your card: ').split()
2.
3. if (suite == 'Spade' and card == 'Ace') or \
4.    (suite == 'Heart') or \
5.    (suite == 'Diamond' and card == 'Queen') or \
6.    (suite == 'Diamond' and card == 'King') or \
7.    (card == '7'):
8.     print('Lucky you!')
9. else:
10.     print('Better luck next time.')

Line 1 uses the split() method to separate the three words, and accordingly, find the suite and the card
in it. It then runs a large if condition, separated by or, split on multiple lines connected by backslash (\).
If any one of the conditions is satisfied, it is a lucky card. Note that Line 7 should use string ‘7’ and not
integer 7.

We would like to now enhance this program to check for invalid cards. Thus, inputs such as ‘seven of
Spade’ or ‘7 of spade’ or ‘7 Spade’ or ‘11 of Spade’ should be marked as invalid.

1. card, of, suite = input('Enter your card: ').split()
2.
3. if (suite != 'Spade' and suite != 'Heart' and \
4.     suite != 'Club' and suite != 'Diamond') or \
5.    (card != 'Ace' and card != '2' and card != '3' and card != '4' and \
6.     card != '5' and card != '6' and card != '7' and card != '8' and \
7.     card != '9' and card != '10' and card != 'Jack' and card != 'Queen' and card != 'King'):
8.     print('Invalid card')
9.
10. elif (suite == 'Spade' and card == 'Ace') or \
11.      (suite == 'Heart') or \
12.      (suite == 'Diamond' and card == 'Queen') or \
13.      (suite == 'Diamond' and card == 'King') or \
14.      (card == '7'):
15.     print('Lucky you!')
16. else:
17.     print('Better luck next time.')

In the above program, Line 1 and Lines 10–17 remain the same as before, except that the if is replaced
with elif on Line 10. In addition, we create another long condition to check if the card is invalid. This is
done with subconditions to check the suite (Lines 3 and 4) and subconditions to check the card (Lines 5–7).
The two groups of subconditions are joined with an or (end of Line 4; note the parentheses).

The condition can be simplified using De Morgan’s laws, which state:
(not x) and (not y) == not (x or y)
(not x) or (not y) == not (x and y)
Thus, Lines 3–8 can be rewritten as:

3. if not ( (suite == 'Spade' or suite == 'Heart' or \
4.           suite == 'Club' or suite == 'Diamond') and \
5.          (card == 'Ace' or card == '2' or card == '3' or card == '4' or \
6.           card == '5' or card == '6' or card == '7' or card == '8' or \
7.           card == '9' or card == '10' or card == 'Jack' or card == 'Queen' or card == 'King') ):
8.     print('Invalid card')

Carefully note the application of the laws. The inner subconditions now contain ==, which are easier to
decipher. The conditions of the suite and the card are joined using an and while the subconditions of a
suite are joined using or. If the big condition is true, it means that it is a valid card. Therefore, to check
invalidity, a negation using not is required for the whole condition (note the parentheses).
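De Morgan’s laws can be verified mechanically by trying every combination of truth values. The sketch below does that, and then shows one possible alternative to the long chains of or: a membership test with the in operator on a tuple. This is a different technique from the one used in the program above; the names SUITS, CARDS, and is_valid are hypothetical.

```python
# Exhaustively check both of De Morgan's laws over all truth values.
for x in (True, False):
    for y in (True, False):
        assert ((not x) and (not y)) == (not (x or y))
        assert ((not x) or (not y)) == (not (x and y))

# Hypothetical names; membership tests with `in` replace the long or-chains.
SUITS = ('Spade', 'Heart', 'Club', 'Diamond')
CARDS = ('Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'Jack', 'Queen', 'King')

def is_valid(text):
    words = text.split()
    # Exactly three words are expected: <card> of <suit>.
    if len(words) != 3 or words[1] != 'of':
        return False
    return words[0] in CARDS and words[2] in SUITS

print(is_valid('7 of Heart'))      # True
print(is_valid('11 of Spade'))     # False
print(is_valid('7 of spade'))      # False
print(is_valid('7 Spade'))         # False
```

Checking the number of words first matters for an input such as ‘7 Spade’: unpacking the result of split() into three variables would otherwise fail with a ValueError before any condition is evaluated.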

2.2 Loops
Some programs cannot be written without the ability to repeat. For instance, consider printing
‘Hello World!’ 100 times. One can write 100 print() statements easily. However, if you are now
asked to write a program that takes a number from the user and prints ‘Hello World!’ that many
times, you would be stuck!

Loops allow us to repeat an arbitrary piece of code an arbitrary number of times. Python
supports two types of loops: while and for. The while loop repeats (iterates through) a
sequence of code as long as a given condition is True. The for loop iterates over the items of a given
sequence. We will study both in detail now.

While Loop
Let’s first write our program to print a message a certain user-defined number of times.

1. n = int(input())
2. i = 0
3. while i < n:
4.     print('Hello World!')

Note the similarity of the loop’s structure with that of an if statement. The body of the loop (in this case,
Line 4) is repeated as long as the condition on Line 3 is true.

If you enter 10 as input, how many ‘Hello World!’s does our program print? 9 or 10?
Well, it continues to print ‘Hello World!’ an unbounded number of times. Why? Because we asked it
to. The condition i < n continues to remain true, as the value of i never changes in the loop. To get the
expected result, we need to increment i.

1. n = int(input())
2. i = 0
3. while i < n:
4.     print('Hello World!')
5.     i = i + 1  # progress

Remember : A loop must make progress towards its terminating condition.

Example: Find pass-percentage of a class.


A teacher is entering the marks of students. A student passes a course if the marks are at least 40 (out
of 100). The teacher wants to know the percentage of students passed.

To find the pass percentage, we need the number of students P passing the course, and the total number
of students N in the class. The output then is P * 100 / N. Finding P needs us to go through all the marks.
How do you find N?

There are two ways. One, we ask the teacher to enter N at the start of our program. Two, we derive it
based on a special end-of-class marker (e.g., -1 for marks). Let’s write the program both ways.

1. n = int(input('Number of students: '))
2. i = 0
3. passed = 0  # pass is a reserved keyword in Python and cannot be used as a variable name
4.
5. while i < n:  # predefined number of iterations
6.     marks = int(input('Enter marks: '))
7.
8.     if marks >= 40:
9.         passed = passed + 1
10.
11.     i = i + 1
12.
13. print(passed * 100 / n, '% passed')

Note how we added blank lines 4, 7, and 10, which makes the code tidy and often easier to understand.
The code can be enhanced to check for the divide-by-zero error at Line 13.
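One possible form of the divide-by-zero check is sketched below. To keep the example self-contained, the marks come from a list instead of input(); the function name, the pass_mark parameter, and the choice of returning None for an empty class are assumptions made for illustration.

```python
def pass_percentage(marks_list, pass_mark=40):
    # marks_list stands in for the values the teacher would type in.
    n = len(marks_list)
    passed = 0
    i = 0

    while i < n:
        if marks_list[i] >= pass_mark:
            passed = passed + 1
        i = i + 1

    if n == 0:
        return None          # no students: a percentage is undefined
    return passed * 100 / n

print(pass_percentage([5, 10, 50, 32, 23, 40, 89, 100, 99, 89]))   # 60.0
print(pass_percentage([]))                                          # None
```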

Good Programming Practice : Add blank lines to improve readability of the code. A typical
convention is to have a blank line prior to and after a control construct (and functions), especially for
large programs.

Let’s now write it with an end-of-class marker. Assuming marks are always non-negative (no negative
marking by the teacher), we can use -1 as a special marker. As soon as that number is entered, our
program can know that marks of all the students are entered and we can now compute the percentage.

A sample execution of such a program would be as follows.

Enter marks: 5
Enter marks: 10
Enter marks: 50
Enter marks: 32
Enter marks: 23
Enter marks: 40
Enter marks: 89
Enter marks: 100
Enter marks: 99
Enter marks: 89
Enter marks: -1
60.0 % passed

The output was computed by counting the number of non-negative inputs as N. The corresponding
program is as follows.

1. n = 0
2. passed = 0  # pass is a reserved keyword in Python and cannot be used as a variable name
3. marks = int(input('Enter marks: '))
4.
5. while marks >= 0:  # number of iterations based on data
6.     if marks >= 40:
7.         passed = passed + 1
8.
9.     n = n + 1
10.     marks = int(input('Enter marks: '))
11.
12. print(passed * 100 / n, '% passed')

Note how the condition changed at Line 5. It is now based on the marks entered. The students are
counted at Line 9. Note also the usage of the same statement at Lines 3 and 10. You will find this to be
a recurrent pattern.

One can use tricks to avoid the repetition, for instance, by initializing n to -1 and marks to 0, and removing
the input() statement before the loop. The code will be functionally equivalent to the above code. But
it is important to make sure that the code remains readable.
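The trick can be sketched as follows. The list entries is a stand-in for the successive input() values (an assumption made so that the example runs without a user), and the function name is hypothetical.

```python
def pass_percentage_sentinel(entries):
    # entries stands in for successive input() values and must end with -1.
    values = iter(entries)
    n = -1          # compensates for the extra first iteration
    passed = 0
    marks = 0       # dummy value: enters the loop, is not counted as a pass

    while marks >= 0:
        if marks >= 40:
            passed = passed + 1
        n = n + 1
        marks = next(values)    # the only place where a value is read

    return passed * 100 / n

print(pass_percentage_sentinel([5, 10, 50, 32, 23, 40, 89, 100, 99, 89, -1]))   # 60.0
```

The single read statement comes at the cost of two unusual initial values, which a reader must decode. As in the original program, entering -1 right away leaves n at 0 and causes a division by zero.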

Example: Print Fibonacci sequence.


Fibonacci sequence is defined as follows.
Fib(0) = 0, Fib(1) = 1
Fib(n) = Fib(n - 1) + Fib(n - 2)

If we expand the Fib function, we get a sequence as 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, … Since it is unending,
we specify how many terms we wish to print, that is, the value of n.

Future Connect : Fibonacci sequence is a classic example of recursion.

While Fibonacci numbers have a strong presence in Mathematics, a few natural phenomena also follow
Fibonacci numbers (e.g., how branches emerge in trees or how petals of certain flowers are arranged).

Can we use loops to print this sequence? We can get started like this (as a pseudocode).

n = int(input())
first number = 0
second number = 1

while n > 0:
    print the first number
    Shall I print the second number?
    sum = first number + second number
    What happens in the second iteration?
    n = n - 1

The issues need to be resolved as follows.


● We should print only one number per iteration.
● Somehow, the first number should take the next value to be printed in the next iteration.
● The other two variables (second number and sum) should be computed but not printed in the
current iteration.
Essentially, the Fibonacci sequence should slide through the three variables.

first number    second number    sum
0               1                1
1               1                2
1               2                3
2               3                5
3               5                8
5               8                13
…               …                …

This sliding through can be implemented by:


● discarding the printed value of the first number,
● storing second number into the first number, and
● storing sum into the second number.
The overall program now becomes:

1. n = int(input())
2. first = 0
3. second = 1
4.
5. while n > 0:
6.     print(first)
7.     sum12 = first + second
8.     first = second
9.     second = sum12
10.     n = n - 1

Note the order of statements 7, 8, and 9. Reordering of these statements would result in a wrong output.
By the way, we can use sum instead of sum12 and the code would work. But since sum() is a predefined
function in Python, we would like to avoid using the same name.

The same code can be succinctly written using the multiple-assignment form, which gets rid of the
sum12 variable.

1. n = int(input())
2. first, second = 0, 1
3.
4. while n > 0:
5.     print(first)
6.     first, second = second, first + second
7.     n = n - 1

It is important to understand that for the program to work correctly, all the expressions on the right hand
side of = must be evaluated before any assignment to the left hand side variables happens. This allows
you to swap two variables easily: x, y = y, x
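The evaluation order of a multiple assignment can be checked with a short sketch. Here the Fibonacci loop is wrapped in a hypothetical function that collects the terms in a list instead of printing them.

```python
def fibonacci(n):
    # Returns the first n Fibonacci numbers as a list instead of printing them.
    terms = []
    first, second = 0, 1
    while n > 0:
        terms.append(first)
        # The right-hand side (second, first + second) is computed from the old
        # values before either name on the left is rebound.
        first, second = second, first + second
        n = n - 1
    return terms

print(fibonacci(10))   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Two separate assignments are not equivalent: first is overwritten too early.
first, second = 0, 1
first = second
second = first + second
print(first, second)   # 1 2, whereas the simultaneous form gives 1 1

x, y = 'left', 'right'
x, y = y, x
print(x, y)            # right left
```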

Example: Collatz sequence.


The mathematician Collatz made a conjecture which is still unproven. Start from any positive
integer. If it is even, halve it; otherwise, triple it and add one. Now repeat this process. The conjecture
states that the sequence will finally reach 1. It is also known as the 3n+1 problem. For instance, for input 7, the
sequence becomes 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.

Let’s implement this sequence given an input number.

1. n = int(input())
2.
3. while n != 1:
4.     if n % 2 == 1:
5.         n = 3 * n + 1
6.     else:
7.         n = n // 2
8.     print(n)
9.
10. print('How do I always get printed?')
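The same loop can be packaged so that the whole sequence, including the starting number, is returned as a list. This is a sketch with a hypothetical function name; it assumes that the input is a positive integer.

```python
def collatz(n):
    # Assumes n is a positive integer; returns the sequence including the start value.
    sequence = [n]
    while n != 1:
        if n % 2 == 1:
            n = 3 * n + 1
        else:
            n = n // 2
        sequence.append(n)
    return sequence

print(collatz(7))
print(len(collatz(7)) - 1, 'steps')   # 16 steps
```

For input 7, the list matches the sequence given in the text. Since the conjecture is unproven, termination of the loop for every positive integer is not guaranteed by any known proof, although no counterexample is known.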

For Loops
For loops work with an ordered sequence, wherein the loop variable assumes the value of each element
in the sequence from left to right in different iterations. For instance, the simple form of a for loop can
be used to print numbers from 0 to 9.

1. for i in range(10):
2.     print(i)

In the above program, i is the loop variable, going over the sequence returned by range(). The function
could also be used to step through the sequence in a strided manner. For instance, consider the following
code.

1. for i in range(10, 20, 3):
2.     print(i)

Its output is

10
13
16
19

The same code can be equivalently written using a while loop as follows.

1. i = 10
2. while i < 20:
3.     print(i)
4.     i = i + 3

Since the for loop iterates through a finite sequence, we usually do not have to
worry about the termination condition, unlike with the while loop.
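The sequences produced by range() can be inspected by converting them to lists. The following sketch also checks that the while loop above visits the same values as range(10, 20, 3); the negative-step example is an extra illustration not discussed in the text.

```python
print(list(range(10)))          # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(10, 20, 3)))   # [10, 13, 16, 19]
print(list(range(5, 0, -1)))    # [5, 4, 3, 2, 1]

# The while-loop equivalent of range(10, 20, 3), collected into a list.
values = []
i = 10
while i < 20:
    values.append(i)
    i = i + 3

print(values == list(range(10, 20, 3)))   # True
```

In each case, the start value is included and the stop value is excluded, just as with string slices.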

Example: Print the negative numbers from a given sequence.


We can use a for loop to go over a given sequence and print the number if it is negative.

1. numbers = [65, 32, -3, 32, 43, 32, -22, 2, -5, -7, 0, -9]
2. for n in numbers:
3.     if n < 0:
4.         print(n)

Example: Find the continent with the longest name.

We can store the continents in a list and iterate through them using a for loop. To find the length of a
string, we can use the len() function. Finding the maximum length requires maintaining a maxlen
variable. However, to print the continent with the maximum length, we also need to store it in the
maxcont variable – whenever we update maxlen. The corresponding code is below.

1. continents = ["Africa", "Antarctica", "Asia", "Australia", "Europe", "North America",
                 "South America"]
2. maxlen = 0 # this initialization is important
3.
4. for cont in continents:
5.     if maxlen < len(cont):
6.         maxlen = len(cont)
7.         maxcont = cont
8.
9. print(maxcont)

Since there are two answers to this question, think about what our code would output. If we want the
other answer to be output, what small change would you make to the code?

Example: Check if a given number is prime. The number is larger than 2.


We will use a simple algorithm which checks divisibility of a given number n with numbers from 2 to
square root of n. sqrt() is a function for finding the square root defined in a module named math. For
this, we will have to tell the Python interpreter that we want to use this module. This is done using an
import statement.

1. import math, sys
2.
3. n = int(input())
4.
5. for i in range(2, 1 + int(math.sqrt(n))): # note 1 + …
6.     if n % i == 0: # if i divides n
7.         print('Composite')
8.         sys.exit()
9.
10. print('Prime')

The above program works for a number larger than 2. Line 1 imports math and sys modules. The former
is for sqrt function (Line 5) while the latter is for exit function (Line 8), which terminates the program.
The loop goes over all the integers from 2 to sqrt(n) and if any of them divides n (Line 6), then the
number is not prime. What happens if we comment out Line 8?

Future Connect : We will learn about modules in a later unit. Consider a module to be a library of
useful functions related to a specific domain.

In the above program, only 2 is an even divisor which needs to be checked, but we are unnecessarily
checking against 4, 6, 8, … This can be easily avoided as follows.

1. import math, sys
2.
3. n = int(input())
4.
5. if n % 2 == 0:
6.     print('Composite')
7.     sys.exit()
8.
9. for i in range(3, 1 + int(math.sqrt(n)), 2):
10.     if n % i == 0: # if i divides n
11.         print('Composite')
12.         sys.exit()
13.
14. print('Prime')
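
The same check can be sketched as a reusable function, which makes it easy to try on many inputs. The name is_prime is our own, and the sketch makes two assumptions not in the program above: it handles n below 3 as well, and it uses math.isqrt (available from Python 3.8), which returns the exact integer square root and so avoids floating-point rounding for very large n.

```python
import math

def is_prime(n):
    """Trial division; returns True if n is prime."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2          # 2 is the only even prime
    # Only odd candidates up to the integer square root need to be tried.
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True

print([n for n in range(2, 30) if is_prime(n)])
```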

Nested Loops and Loop Modifiers


Loops can be nested, that is, one loop can be inside another. Consider that we want to find all the prime
numbers from 3 to 100. How can we modify the above program, which works for one number, to work
for many numbers?

1. import math, sys
2.
3. for n in range(3, 100+1):
4.     print(n)
5.     # code from the above program (Lines 5 – 14)

Unfortunately, the output of the above code consists of only two numbers.

3
Prime
4
Composite

This happens because we have sys.exit() after printing Composite. Exiting the program was okay for a
single number, but not for a sequence of numbers. Ideally, we want that after finding out that a number
is composite, we should not print Prime, but still go to the next number for primality checking.

Python provides loop modifiers to enable such a control-flow. The keyword break brings the control
out of the currently enclosing nearest loop, while the keyword continue skips the rest of the iteration
and goes to the next one. The whole program would now be as follows.

1. import math
2.
3. for n in range(3, 100+1):
4.     print(n)
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue # go to the next number
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0: # if i divides n
12.             print('Composite')
13.             break # come out of the loop at Line 10
14.
15.     print('Prime')

After Line 7 finds out that a number is even, Line 8 goes back to the next number in the range specified
at Line 3, skipping Lines 9 – 15. Similarly, after finding a number composite at Line 12, Line 13 brings
the control to Line 14, out of the for loop at Line 10. Note that Line 13 is enclosed in two loops (Line 3
and Line 10), but it comes out of the currently enclosing nearest loop, that is, Line 10’s loop. The
program now prints all the numbers from 3 to 100, and most of the outputs are correct, but some are
wrong. For instance, the initial few lines are:

3
Prime
4
Composite
5
Prime
6
Composite
7
Prime
8
Composite
9
Composite
Prime
10
Composite

Notice the output for number 9. Our code prints both Composite and Prime. Why does this happen?
This is because break at Line 13 brings the control to Line 14. After that Line 15 continues to print
Prime.
How do we correct this? We can keep track of whether Composite was printed or not. If it was, then we
need not print Prime; otherwise we should. The modified code looks like this.

1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue # go to the next number
9.
10.     composite = False
11.
12.     for i in range(3, 1 + int(math.sqrt(n)), 2):
13.         if n % i == 0: # if i divides n
14.             print('Composite')
15.             composite = True
16.             break
17.
18.     if composite == False: # number is not composite
19.         print('Prime')

The above code works well and produces the expected output.

Future Connect : Such a reuse of functionality (checking primality of one number) can be nicely
encoded using functions. We can then call such a function in a loop.

Python provides a special construct to support the functionality we implemented using the variable
composite. It is the else clause of a loop (for or while).

1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0: # if i divides n
12.             print('Composite')
13.             break
14.     else: # else corresponding to for
15.         print('Prime')

Note the indentation of the else clause at Line 14. It corresponds to the for loop at Line 10 (and not to
the if statement at Line 11). The else clause is executed if the loop exhausts all the elements in the
sequence of numbers (or when the while condition becomes False), but does not get executed if the loop
gets terminated by break. To understand this clause better, let’s look at the following simple example.

1. for i in range(1, 10):
2.     print(i)
3. else:
4.     print("End:", i)
5.
6. i = 1
7. while i < 10:
8.     print(i)
9.     i = i + 1
10. else:
11.     print("End:", i)

Can you analyze the program and find out the output of the above code? It is as below.

1
2
3
4
5
6
7
8
9
End: 9
1
2
3
4
5
6
7
8
9
End: 10

Note how the while loop increments the variable to 10, whereas the for loop leaves it at 9, the last
value of the sequence, at the end.
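
The difference between finishing a loop normally and leaving it through break can be seen in a small search sketch. The function name first_negative and the messages are our own choices for illustration.

```python
def first_negative(numbers):
    """Return a message describing the first negative number, if any."""
    for n in numbers:
        if n < 0:
            message = "found " + str(n)
            break              # leaving via break skips the else clause
    else:
        message = "no negative number"   # runs only if the loop was not broken
    return message

print(first_negative([4, 7, -2, 9, -5]))
print(first_negative([4, 7, 9]))
```

Note that the else clause also runs when the sequence is empty, since the loop then finishes without ever executing break.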

To understand the use of break and continue, let’s consider the following example.

Example: Given a list, print all the positive numbers, but stop as soon as 0 is reached.
For instance, if the input list is [-4, 2, 54, 21, -32, 3, 6, 3, 1, 0, -5, 321], the output should print the
values 2, 54, 21, 3, 6, 3, 1. It should not print 321. The program is as below.

1. numbers = [-4, 2, 54, 21, -32, 3, 6, 3, 1, 0, -5, 321]
2.
3. for n in numbers:
4.     if n == 0: # 0 is seen, come out of the loop
5.         break
6.
7.     if n < 0:
8.         continue # do not print negative numbers
9.     print(n)

Python also supports a pass statement, which can be used when no processing is required.
For instance, consider a program where we wish to find the sum of positive integers in a list which may
contain negative numbers also.

1. numbers = [-4, 2, 54, 21, -32, 3, 6, 3, 1, 0, -5, 321]
2. sumn = 0
3.
4. for n in numbers:
5.     if n < 0:
6.         pass
7.     else:
8.         sumn = sumn + n
9.
10. print(sumn)

Of course, the program can be modified to not require pass. But sometimes, it can be useful for
readability purposes.

1. price = int(input('Price: '))
2. deno1, deno2, deno3 = map(int, input('Denominations: ').split())
3.
4. print("Can you form", price, "exactly using", deno1, deno2, deno3, "?")
5.
6. for d1 in range(0, 1 + price // deno1):
7.     for d2 in range(0, 1 + price // deno2):
8.         for d3 in range(0, 1 + price // deno3):
9.             if d1 * deno1 + d2 * deno2 + d3 * deno3 == price:
10.                 print(d1, d2, d3)
11. else:
12.     print('No')

Line 2 uses the split() function to receive three fields (strings). To map those to integers, we can use the
map() function. Lines 6–8 run a triply-nested loop which checks all possible combinations of the three
denominations, upper bounding each count by the price. For instance, if the price is 550, there is no point in using
more than five 100-rupee notes. Every exact combination found is printed at Line 10. Line 11 is an else
corresponding to the outermost for loop. Note that none of the loops executes a break, so this else clause
runs whenever the outermost loop finishes, and No is printed at Line 12 even when combinations were found.
Printing No only when no combination exists needs extra bookkeeping, such as a flag variable like composite
in the earlier example.
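
Since the loops above never execute break, a loop else clause cannot tell us whether a combination was found. One possible alternative, sketched here under the assumption that we collect the matches in a list (the function name exact_combinations is ours), prints No only when the list stays empty.

```python
def exact_combinations(price, deno1, deno2, deno3):
    """Return all (d1, d2, d3) counts whose total value equals price."""
    found = []
    for d1 in range(0, 1 + price // deno1):
        for d2 in range(0, 1 + price // deno2):
            for d3 in range(0, 1 + price // deno3):
                if d1 * deno1 + d2 * deno2 + d3 * deno3 == price:
                    found.append((d1, d2, d3))
    return found

combos = exact_combinations(7, 2, 3, 5)
if combos:                      # a non-empty list counts as True
    for combo in combos:
        print(*combo)
else:
    print('No')
```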

UNIT SUMMARY
We explored various basic control constructs in Python such as conditionals and loops. We solved
various problems using those constructs. We also looked at various loop modifiers which allow special
control flow.

EXERCISES

Multiple Choice Questions


1. What is the output of the following program for the input 100 99?

n1, n2 = input().split()
if n1 < n2:
    print(n1, "<", n2)
elif n1 == n2:
    print(n1, "==", n2)
elif n1 < n2:
    print(n1, "again <", n2)
else:
    print(n1, ">=", n2)

A. 100 < 99
B. 100 >= 99
C. 100 == 99
D. 100 again < 99

2. Suggest a value for the dob variable that illustrates usefulness of short-circuiting for the following
conditional statement.

if len(dob) >= 5 and dob[3:5] == "08":
    print("You are an August person.")

A. '29/10/1999'
B. '29/08/2000'
C. '1/1/2001'
D. '123'

3. What is the output of the following program?

numbers = [65, 32, -3, 32, 43, 32, -22, 2, -5, 64, 0, 8]
i = 0

for n in numbers:
    if 2**i == n:
        print(n)
    i = i + 1

A. 32 32 64 8
B. 32 64 8
C. 32 64
D. 32

4. What should be the value of the expression X in the following program to print all the squares between
3 and 99?

The output of the above code is:

Hello World!
Hello from loop
Hello from loop

Such arguments can help tune the functionality. This is what we do with a print() function.

Recall our program from Unit 2 which found out if a range of numbers is prime or not. We will modify
it to use functions and see how the program looks simplified.

Example: Find out if numbers from 3-to-100 are prime, using functions.

Here is the original program without using functions.

1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0: # if i divides n
12.             print('Composite')
13.             break
14.     else: # else corresponding to for
15.         print('Prime')

Which part can we move to a function? That is up to us. For instance, we can simply move the printing of
Prime or Composite into a function. Alternatively, we can convert the for loop at Line 10 into a
function. From the perspective of this example, we can make primality testing of a given number a
function. This function can then be called from the loop at Line 3.

1. import math
2.
3. def isPrime(n):
4.     if n % 2 == 0:
5.         print('Composite')
6.         continue # there is no enclosing loop
7.
8.     for i in range(3, 1 + int(math.sqrt(n)), 2):
9.         if n % i == 0: # if i divides n
10.             print('Composite')
11.             break
12.     else: # else corresponding to for
13.         print('Prime')
14.
15. for n in range(3, 100+1):
16.     print(n, '', end='')
17.     isPrime(n)

See how we have moved the code to the isPrime() function and the main module is simplified (Lines
15 – 17). Unfortunately, this code reorganization poses an issue. The continue statement at Line 6 is
now no longer inside a loop. Hence, it is a syntax error. From the perspective of a function, we want
that after printing Composite at Line 5, the control should go back to the caller and the next number
should be checked for primality. This can be done by replacing Line 6 with a return statement.

6.         return

Now the program works as expected. It would be nice, however, if the caller could take charge of the
processing after finding out whether a number is prime or not. This requires the function isPrime() to
return True or False.

1. import math
2.
3. def isPrime(n):
4.     if n % 2 == 0:
5.         return False
6.
7.     for i in range(3, 1 + int(math.sqrt(n)), 2):
8.         if n % i == 0:
9.             return False
10.     else:
11.         return True
12.
13. for n in range(3, 100+1):
14.     if isPrime(n): print(n, end=', ')

Lines 5, 9, and 11 return a boolean value depending upon the argument number’s primality. If the
number is composite, the caller does not do anything, otherwise it prints the number. The output of the
above code is:

1. import math
2.
3. def getAllPrimes(maxn): # find primes up to maxn
4.     global allPrimes # I want to access allPrimes list
5.
6.     for n in range(3, maxn+1):
7.         if n % 2 == 0:
8.             continue # we can use this now
9.
10.         for i in range(3, 1 + int(math.sqrt(n)), 2):
11.             if n % i == 0:
12.                 break
13.         else:
14.             allPrimes.append(n) # add to the list
15.
16. # main module begins here
17. allPrimes = []
18. getAllPrimes(100) # populate allPrimes
19. print(allPrimes)
20. getAllPrimes(50) # populate allPrimes
21. print(allPrimes)

The function getAllPrimes() remains almost the same, except for two changes. One: at Line 4, it marks
allPrimes to be a global variable. Two: it does not (need to) return allPrimes. allPrimes is now a regular
variable in the main module initialized to an empty list at Line 17. It is populated by getAllPrimes() at
Lines 18 and 20. Note that after these two function calls, we simply print allPrimes list in Lines 19 and
21. What will be the output of the code?

If you said that the output will remain the same, that is not correct. Here is the output.

[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 3, 5, 7, 11,
13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Note how the second call to getAllPrimes() has appended the prime numbers to the original list. This
happened because we forgot to initialize the list to empty before calling getAllPrimes(50) at Line 20.

Remember : Global variables make the code easy, but can also lead to initialization errors if we are
not careful.
To understand the scope of global variables, let’s look at a simple example. What will be the output of
this code?

1. g = 10 # global variable
2. print("1:", g)
3.
4. def fun():
5.     g = 20 # local variable
6.     print("3:", g)
7.
8. g = 30 # global variable
9. print("2:", g)
10. fun()
11. print("4:", g)

A function can be defined in the middle of the main module (Line 4). The program works as follows.

Line 1: assigns 10 to a global variable g.


Line 2: prints 1: 10
Line 8: assigns 30 to the same global variable g.
Line 9: prints 2: 30
Line 10: makes a function call to fun() at Line 4.
Line 5: assigns 20 to a local variable g. This is a new variable, different from the global g.
Line 6: prints 3: 20
The control now comes back to the caller. The local variable is no longer accessible.
Line 11: prints 4: 30. This is the same earlier global variable g.

To summarize, when fun() accesses a variable, it is the local variable by default. If we want it to access
the global variable g, we will have to add the statement: global g, similar to the last example (global
allPrimes). Then the output will change to 4: 20.
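
The effect of the global statement can be seen side by side in the following sketch. The function names shadow and rebind are our own; the first assigns to a local g, the second declares g global before assigning.

```python
g = 10                      # global variable

def shadow():
    g = 20                  # creates a new local g; the global is untouched
    return g

def rebind():
    global g                # from here on, g refers to the global variable
    g = 20
    return g

local_value = shadow()
after_shadow = g            # the global still holds its old value
rebind()
after_rebind = g            # the global has been changed by rebind()

print(local_value, after_shadow, after_rebind)
```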

Good Programming Practice : Use global variables only if there are central structures holding data
which you need to access from several functions. Otherwise, avoid globals and pass the data as
arguments to functions.

Example: Perfect numbers.


A number is perfect if the sum of its proper divisors equals itself. For instance, 6 is a perfect number as
its proper divisors (1, 2, 3) sum to 6. Another perfect number is 28 = 1 + 2 + 4 + 7 + 14. Let’s write a
program to find perfect numbers using functions.

The main module can be written as below.

10. for n in range(1, 10000):
11.     factors = findFactors(n)
12.
13.     if n == sum(factors):
14.         print(n, ":", factors)

We make use of two functions: findFactors and sum. The latter is a predefined function in Python, while
we will have to write the former. With our experience of primality testing, we know how to write it.

1. def findFactors(n):
2.     factors = []
3.
4.     for f in range(1, n):
5.         if n % f == 0:
6.             factors.append(f)
7.
8.     return factors
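
Putting the two pieces together gives a complete program. The sketch below is one way to assemble it; it renames the function to find_factors, collects the perfect numbers in a list, and uses a smaller upper bound than the main module above only to keep the run short.

```python
def find_factors(n):
    """Return the proper divisors of n (all divisors smaller than n)."""
    factors = []
    for f in range(1, n):
        if n % f == 0:
            factors.append(f)
    return factors

perfect = []
for n in range(1, 1000):       # smaller bound than in the text, for a quick run
    if n == sum(find_factors(n)):
        perfect.append(n)

print(perfect)
```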

Example: Fibonacci strings


Like Fibonacci numbers, we can define Fibonacci strings, which are strings over binary digits 0 and 1.
Fib(0) = “0”, Fib(1) = “01”
Fib(n) = Fib(n-1) + Fib(n-2), where + is Python’s string concatenation
A few initial Fibonacci strings are:
0, 01, 010, 01001, 01001010, 0100101001001, …

Fibonacci strings have a special property. If we remove the last two digits, then the string reads the same
forward and backward. Such strings are called palindromes (e.g., nitin, a, malayalam, amma, tattarrattat,
and if you permit spaces and punctuation, “borrow or rob”, “was it a car or a cat I saw?”, “never odd or
even”, “A man, a plan, a canal – panama”).

Let’s check if Fibonacci strings have this fascinating property.

1. def isPalindrome(bstring):
2.     blen = len(bstring)
3.     first = bstring[:blen // 2] # integer division to get the index
4.     second = bstring[(blen + 1) // 2:]
5.
6.     if first == second[::-1]: # extended range, for reversing
7.         return True
8.
9.     return False
10.
11. # main module begins
12. first = "0"
13. second = "01"
14. i = 0
15.
16. while i < 10:
17.     if i >= 2 and isPalindrome(first[:-2]):
18.         print(first[:-2] + " is a palindrome")
19.
20.     first, second = second, second + first
21.     i = i + 1

We define the first two strings in Lines 12 and 13. We check this property for the first few Fibonacci
strings (Line 16). The strings are generated in Line 20 using the multiple assignment form. Line 17 uses
a conjunct to call function isPalindrome() by removing the last two binary digits of the eligible strings.
If the property is satisfied, all the strings should get printed by Line 18.

The function isPalindrome() splits the given string into two halves. [0:blen // 2] is the first half, while
the second half is [(blen + 1) // 2: blen], which we use in Lines 3 and 4. Interestingly, the expressions
work whether the string is even length or odd. Note that we use the same variable names (first and
second) in both the main module and the function, but both refer to two different entities. Line 6 uses a
peculiar syntax [::-1] which defines a range, but also uses a step (-1 in this case, 1 by default) which
allows us to reverse the string second. If the two strings are the same, we declare it to be a palindrome.

The output of the above code is:

0 is a palindrome
010 is a palindrome
010010 is a palindrome
01001010010 is a palindrome
0100101001001010010 is a palindrome
01001010010010100101001001010010 is a palindrome
01001010010010100101001001010010010100101001001010010 is a palindrome
01001010010010100101001001010010010100101001001010010100100101001001010010100100
1010010 is a palindrome

An interesting aspect of writing functions is that we can reuse it in different contexts. For instance, the
isPalindrome() function works not only for binary strings, but also for arbitrary strings.
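
As an illustration of such reuse, here is a sketch that applies a palindrome check to ordinary words and phrases. It uses a shorter variant, is_palindrome, that compares the whole string with its reverse, and a hypothetical helper letters_only that drops spaces and punctuation so that phrases can be checked too.

```python
def is_palindrome(text):
    """True if text reads the same forward and backward."""
    return text == text[::-1]

def letters_only(text):
    """Lower-case text and drop everything except letters and digits."""
    return ''.join(ch.lower() for ch in text if ch.isalnum())

for phrase in ["malayalam", "never odd or even",
               "A man, a plan, a canal - panama", "python"]:
    print(phrase, "->", is_palindrome(letters_only(phrase)))
```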

1. import primefun
2.
3. for n in range(3, 50+1):
4.     if primefun.isPrime(n): print(n, end=', ')
5. print()

Thus, we can call a function from a module using modulename.functionname syntax (Line 4). The
output of the above code is:

3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,
3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,

Note that the first line is printed by the code in primefun.py, while the second one is by the code in
primefunmodule.py. We can also rename an imported entity and can import everything from a module,
so that we can directly make use of the functions and globals from that module (instead of using the
syntax modulename.functionname).

1. import primefun
2.
3. for n in range(3, 50+1):
4.     if primefun.isPrime(n): print(n, end=', ')
5. print()
6.
7. myprime = primefun.isPrime # shorthand name
8.
9. if myprime(22): print(22) # use the shorthand name
10. else: print("22 is composite")
11.
12. from primefun import * # import everything from a module
13.
14. if isPrime(23): print(23) # and use the function directly
15. else: print("23 is composite")

Line 7 creates a shorthand name (like a pointer) to our isPrime function, which we can use in Line 9.
This avoids using primefun.isPrime() type syntax as in Line 4. Another way to use the isPrime() function
directly is by importing only the function isPrime from the module primefun or importing everything
(indicated by *) as in Line 12, and then using the function in Line 14. The output of the above code is:

Let’s create a matrix module which can have some useful functions such as addition and multiplication.
To initialize the matrix, let’s add another function, and to test it, let’s add a printing function too. The
four function signatures would look like this:

def minit(rows, cols, val = 0, inc = 0):



def mprint(mat):

def madd(mat1, mat2):

def mmult(mat1, mat2):

How is a matrix represented? It can be a list of lists. For instance, [ [1, 2, 3], [4, 5, 6] ]. Before reading
further, it would be helpful to try out writing the above four functions yourself.

Here is how we will be able to use it in a client program, assuming the above module is stored in
matrix.py.

1. from matrix import *
2.
3. mat1 = minit(3, 3, 1, 0)
4. mat2 = minit(3, 3, 10, 2)
5. mprint(mat1)
6. mprint(mat2)
7.
8. mat3 = madd(mat1, mat2)
9. mprint(mat3)
10.
11. mat4 = mmult(mat1, mat2)
12. mprint(mat4)

The output of the above program is:

1 1 1
1 1 1
1 1 1

10 12 14
16 18 20
22 24 26

11 13 15
17 19 21
23 25 27

48 54 60
48 54 60
48 54 60

The four matrices correspond to the four mprint statements in the program (Line 5 for mat1, Line 6 for
mat2, Line 9 for mat3 which is addition of mat1 and mat2, while Line 12 for mat4 which is the
multiplication of mat1 and mat2).

Let’s now look at how to implement the four functions.

1. def minit(rows, cols, val = 0, inc = 0): # using default arguments
2.     mat = []
3.
4.     for i in range(rows):
5.         mat.append([]) # create ith row
6.
7.         for j in range(cols):
8.             mat[i].append(val) # create jth column
9.             val = val + inc
10.
11.     return mat
12.
13. def mprint(mat):
14.     for i in range(len(mat)):
15.         for j in range(len(mat[i])):
16.             print(mat[i][j], '', end='')
17.
18.         print() # newline after every row
19.
20.     print() # newline after the matrix
21.
22. def madd(mat1, mat2):
23.     # TODO: check if their sizes are the same.
24.     mat3 = []
25.
26.     for i in range(0, len(mat1)):
27.         mat3.append([])
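
One possible way to write madd() and mmult() in full is sketched below. This is only an illustration under stated assumptions: both matrices are lists of lists with compatible sizes, no size checking is done, and the test matrices ones and steps are our own names for the two 3 x 3 matrices shown in the output above.

```python
def madd(mat1, mat2):
    """Element-wise sum; assumes both matrices have the same dimensions."""
    mat3 = []
    for i in range(len(mat1)):
        mat3.append([])
        for j in range(len(mat1[i])):
            mat3[i].append(mat1[i][j] + mat2[i][j])
    return mat3

def mmult(mat1, mat2):
    """Matrix product; assumes len(mat1[0]) == len(mat2)."""
    mat3 = []
    for i in range(len(mat1)):
        mat3.append([])
        for j in range(len(mat2[0])):
            total = 0
            for k in range(len(mat2)):   # row i of mat1 times column j of mat2
                total = total + mat1[i][k] * mat2[k][j]
            mat3[i].append(total)
    return mat3

ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
steps = [[10, 12, 14], [16, 18, 20], [22, 24, 26]]
print(madd(ones, steps))
print(mmult(ones, steps))
```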

A. 12
B. 21
C. 11
D. 22

2. What is the output of the following recursive code?

def rprint(l):
    if len(l) > 0:
        rprint(l[1:])
        print(l[0], end='')

rprint([1, 2, 3, 4, 5])

A. 12345
B. 2345
C. 54321
D. 5432

3. Consider the following code to compute the nth Fibonacci number.

def fib(n):
    if n < 2: return 1
    return fib(n - 1) + fib(n - 2)

If the function is invoked as fib(5), how many total invocations of the function fib() happen? That is,
how many times does fib() get called in total?

A. 5
B. 8
C. 9
D. 15

4. Consider a file dateutil.py which has useful functions such as today(), diff(date1, date2), etc. We want
to use this function today() in our program written in file main.py in the same directory. How should
this be done correctly?

Let’s implement this functionality.

Example: Print contents of a file.

1. myfile = open('hello.txt') # open the file for reading.
2.
3. for line in myfile:
4.     print(line)
5.
6. myfile.close() # close the file.

A file is opened for processing in Line 1 (the operating system sets up resources using which the file
can be accessed), processed (Line 3), and closed in Line 6 (allowing the operating system to release the
resources).

Can you guess the output of the above program? Well, it turns out to be more than what one may
anticipate.

This is a text file.

It contains four lines.

The third line is empty.

This is because the file already contains a newline after every line, and print() adds another. We know
how to fix it though (try it out).

It would be nice if we can provide the filename (hello.txt) on the command-line, instead of hardcoding
in the program or prompting the user every time.

$ python3 cat.py hello.txt


This is a text file.
It contains four lines.

The third line is empty.

For accessing command-line arguments, we can use sys.argv. This variable is a list of arguments we
provide while invoking the Python interpreter. Thus, argv[0] is “cat.py” and argv[1] is “hello.txt”.

1. import sys
2.
3. myfile = open(sys.argv[1]) # argv[1] contains the second argument
4.
5. for line in myfile:
6.     print(line, end='')
7.
8. myfile.close()
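
A variation on this program is sketched below. It is not the listing above but an illustration with a few assumptions of our own: the printing is wrapped in a function named cat, the file is opened using a with statement (which closes the file automatically, even if an error occurs), and a temporary file stands in for hello.txt so that the sketch runs on its own.

```python
import io
import os
import sys
import tempfile

def cat(filename, out=sys.stdout):
    """Write the contents of filename to out, line by line."""
    with open(filename) as myfile:      # the file is closed automatically
        for line in myfile:
            print(line, end='', file=out)

# Demonstration on a temporary file (stands in for hello.txt).
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hello.txt')
    with open(path, 'w') as f:
        f.write('This is a text file.\nIt contains four lines.\n\nThe third line is empty.\n')

    cat(path)                           # prints the four lines once each

    buffer = io.StringIO()              # an in-memory "file" to capture the output
    cat(path, out=buffer)
    captured = buffer.getvalue()
```

In a command-line script, the call would simply be cat(sys.argv[1]).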

We can use this new program on other text files too. For instance:

$ python3 cat.py ../unit1/power.py # our program using ** operator


print(5 + 3 / 4 - 2)
print(3.1*3.2**3.3)
print(1234%10)

Note that since power.py is in another directory, we have provided a relative path.

Example: Copy a file.


We would like to make a copy of a file. Thus, the contents of the two files should be the same after
running our Python program. We would like to execute it as follows:

$ ls
cat.py cp.py hello.txt # initially there are three files.

$ python3 cat.py hello.txt


This is a text file.
It contains four lines.

The third line is empty.

$ python3 cp.py hello.txt hellodup.txt # copies file, does not print anything

$ ls
cat.py cp.py hello.txt hellodup.txt # now there are four.

$ python3 cat.py hellodup.txt


This is a text file.
It contains four lines.

The third line is empty.

We know how to read a file and use command-line arguments. But we don’t know how to write to a
file. It needs a second argument to the open() function.

Past Connect : Second argument can be made optional using default arguments.

1. import sys
2.
3. infile = open(sys.argv[1] , 'r') # open for reading
4. outfile = open(sys.argv[2] , 'w') # open for writing
5.
6. for line in infile: # read
7.     print(line, file=outfile, end='') # write
8.
9. infile.close()
10. outfile.close()

You must have guessed why we named our program as cp.py. Linux has a command by that name to
copy files.
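
The copy can also be written with the write() method of a file object instead of print(). The sketch below is one such variant; the function name copy_file, the use of with, and the temporary files used for the demonstration are our own assumptions.

```python
import os
import tempfile

def copy_file(src, dst):
    """Copy the text file src to dst, line by line."""
    with open(src, 'r') as infile, open(dst, 'w') as outfile:
        for line in infile:
            outfile.write(line)     # write() adds no newline of its own

with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, 'hello.txt')
    dst = os.path.join(tmpdir, 'hellodup.txt')
    with open(src, 'w') as f:
        f.write('This is a text file.\nIt contains four lines.\n')

    copy_file(src, dst)

    with open(src) as f1, open(dst) as f2:
        same = f1.read() == f2.read()   # compare the two files' contents

print(same)
```

Since each line read from the file already ends with a newline, write() needs no equivalent of end=''.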

Example: Split text into words.


We are given a text file textin.txt having the following contents.

Lexical analyzer groups a sequence of characters into tokens.


Syntax analyzer checks if these tokens adhere to a grammar syntax.

We would like to split it into words, and store it in another file textout.txt, as follows.

1: Lexical
2: analyzer
3: groups
4: a
5: sequence
6: of
7: characters
8: into
9: tokens.

10: Syntax
11: analyzer
12: checks
13: if
14: these
15: tokens
16: adhere
17: to
18: a
19: grammar
20: syntax.

We would like to invoke our program as follows.

$ python3 file-split.py textin.txt textout.txt

How do we write such a program? First, we need to utilize reading and writing of files and handling
command-line arguments, as we just did. Apart from that, we need string processing to split a line into
words using space as a separator. To print the word number, we need an iterative way to go over the
words of a line.

1. import sys
2.
3. infile = open(sys.argv[1] , 'r') # textin.txt
4. outfile = open(sys.argv[2] , 'w') # textout.txt
5. ii = 1 # word number
6.
7. for line in infile:
8.     words = line.split()
9.
10.     for ww in words:
11.         print(ii, ww, file=outfile, sep=': ')
12.         ii = ii + 1 # word number update
13.
14. infile.close()
15. outfile.close()

Note the similarity in Lines 7 and 10. The former goes over all the lines of a file, while the latter goes
over all the words in a sequence of words. The function split() on Line 8 creates a word-sequence out
of a string.
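
The numbering logic can be tried without any files by working on a list of lines. The following sketch assumes a helper of our own, number_words, and uses the built-in enumerate() to generate the word numbers instead of updating a counter by hand.

```python
def number_words(lines):
    """Return 'k: word' strings for every word in lines, numbered from 1."""
    words = []
    for line in lines:
        words.extend(line.split())      # split() with no argument splits on whitespace
    return [str(k) + ': ' + w for k, w in enumerate(words, start=1)]

sample = ["Lexical analyzer groups a sequence\n", "of characters into tokens.\n"]
for entry in number_words(sample):
    print(entry)
```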

You can see that writing such simple utilities is not difficult in Python. Once you know how to do it, it
is a matter of finding the right functions. Of course, writing programs in the best possible way requires
a significant amount of practice.

Example: Join words to form sentences.


Let’s now do the opposite of what we just did. Given a file textout.txt, containing a word per line
preceded by the word number, can we generate back (almost) the original file textin.txt?

Such processing requires file handling similar to the previous example, except that the roles of the
two files are reversed for reading and writing. Apart from that, we need to separate each word from its
number and concatenate the words, starting a new line whenever a full stop appears. Separating a word
from its number can be done in multiple ways, one of which is using split(), and strings can be
concatenated using the + operator. How do we check if a string contains a full stop?

1. import sys
2.
3. infile = open(sys.argv[1] , 'r')
4. outfile = open(sys.argv[2] , 'w')
5.
6. outline = '' # empty string
7.
8. for line in infile:
9.     words = line.strip().split(': ')
10.     outline = outline + words[1] + " " # one line, operator + joins strings
11.
12.     if "." in words[1]: # found the full-stop
13.         print(outline, file=outfile)
14.         outline = ''
15.
16. infile.close()
17. outfile.close()

The loop at Line 8 goes over all the lines in the input file. Each line (containing the number and the
word) also contains a newline, which we remove using a function strip(). The function also removes
any preceding spaces and tabs (which we do not have in our file). We then separate the number and the
word using split(). Note how the output of strip() acts as an input to split(). After splitting, the word is
stored in words[1], which we use in Line 12. Note the syntax of checking if a substring exists in a larger
string.

Discussion: The reversal of the output is not perfect. If a word contains a dot in the middle, our program
will break the sentence there (e.g., cse.iitm.ac.in will be followed by a newline). We can modify the program
to check if a word ends in a dot. But there will still be words which cause issues (e.g., e.g. this line.).
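
The suggested modification, checking whether a word ends in a dot, can be sketched with the string method endswith(). The function name join_words and the list-based input are our own assumptions, and abbreviations such as e.g. would still end a sentence too early.

```python
def join_words(numbered_lines):
    """Rebuild sentences from 'k: word' lines; a word ending in '.' closes a sentence."""
    sentences = []
    current = []
    for line in numbered_lines:
        word = line.strip().split(': ')[1]
        current.append(word)
        if word.endswith('.'):          # a dot in the middle of a word is ignored
            sentences.append(' '.join(current))
            current = []
    if current:                         # words left over without a final full stop
        sentences.append(' '.join(current))
    return sentences

print(join_words(["1: Visit\n", "2: cse.iitm.ac.in\n", "3: today.\n", "4: Bye.\n"]))
```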

If a word is misspelt it will be corrected.

Without reading further, try to reason out why this can happen. Also, you are highly encouraged to try
out whether you observe a similar behavior on your machine.

To understand this peculiar behavior, we need to first know that file input / output is buffered. This
means, the operating system is not going to read the file on disk byte-by-byte, but as a large chunk. This
chunk is copied to a buffer, which essentially is an array having a fixed size. How much buffer we have
read or written is tracked using a pointer (or an array offset). Therefore, when we read / write a line in
a file, this pointer, which is often called a file pointer, moves ahead. As long as the operation is getting
done inside this buffer, further disk processing is avoided (because disk access is costlier than accessing
this buffer in memory). If we end up reading beyond this buffer, the next part of the data is brought
from the disk into another buffer and this process continues. The operating system also knows the size
of the file. So it does not allow reading beyond the current file size.
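
The movement of the file pointer can be observed with the tell() and seek() methods of a file object. The sketch below is only an illustration; it opens a temporary file in binary mode, which is our own choice so that the offsets are plain byte counts.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.txt')
    with open(path, 'wb') as f:
        f.write(b'first line\nsecond line\n')

    with open(path, 'rb') as f:          # binary mode: offsets are plain byte counts
        start = f.tell()                 # position before anything is read
        line = f.readline()              # reading moves the file pointer ahead
        after_read = f.tell()            # position just past the first line
        f.seek(0)                        # move the pointer back to the start
        again = f.readline()             # the first line is read once more

print(start, after_read, line == again)
```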

Now let’s come back and analyze the output of the above erroneous program. Due to Line 5, the
operating system has already read the three lines into its buffer (including the newline characters). This
buffer is processed line-by-line when we only read from the file. But because we also write in Line 7,
it ends up writing at the end of the file (since the three lines have already been read). This adds the
fourth line in the modified file and the file size increases. But where does the file pointer point due to
this writing? It is at the end of the fourth sentence, which is the end of the file. Therefore, the next
iteration of the for loop does not take place (you can check that by printing a message in the loop) and
the loop gets over. The modified file is closed at Line 9 (which makes the operating system dump the
modified buffer to the file on disk, if not done already).

How do we fix this? Clearly, to modify a word we have already read, the file pointer needs to go back
in the file and rewrite that word. This is done using a function called seek(). So our second attempt is to
use it to move the file pointer.

1. import sys
2.
3. myfile = open(sys.argv[1], 'r+')
4.
5. for line in iter(myfile.readline, ''): # readline() keeps tell() usable in Python 3
6.     modline = line.replace('mispelt', 'misspelt')
7.     curoff = myfile.tell() # current file pointer
8.     myfile.seek(curoff - len(line)) # go back
9.     print(modline, file=myfile, end='')
10.
11. myfile.close()

If a word is misspelt it will be corrected.
his is the second line, which also contains misspelt word.
his is the third line.

While the two words are corrected, it creates a problem at the start of the lines: T goes missing. Why?

This happens because the new word misspelt is longer than the old one by one character. Therefore,
when we write the modified line in Line 9, it overwrites one character of the next line. We show it
pictorially below.

… m i s p e l t i t w … e d . \n T h i

This buffer gets modified to:

… m i s s p e l t i t w … e d . \n h i

A similar issue occurs on the next line too. Thus, unlike our usual variables (such as strings and
sequences), we cannot insert text in the middle of a file. We can only overwrite. In other words, the above
program works only when the old and the modified text are of the same length. What happens when the
new text is shorter than the original one? You can again use the pictorial buffer representation above to
find out the answer.
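
The overwriting behavior can be reproduced in isolation. The sketch below assumes a scratch file containing just two short lines (made-up contents) and writes a replacement that is one character longer than the original first line.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "overwrite.txt")  # hypothetical scratch file
with open(path, "w", newline="") as f:
    f.write("mispelt\nThis\n")

with open(path, "r+", newline="") as f:
    f.seek(0)                # go back to the start of the file
    f.write("misspelt\n")    # 9 characters written over the first 9

with open(path, newline="") as f:
    result = f.read()

print(repr(result))          # the 'T' of the second line has been overwritten
```

The file does not grow: the nine characters written replace the nine characters that were already there, which include the first character of the second line.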

Can we not fix this problem? We can. Instead of reading line-by-line, if we can read the whole file
together, then we can replace all the occurrences of the word and write all the modified contents at once.
This is done using a read() function.

1. import sys
2.
3. myfile = open(sys.argv[1], 'r+')
4. alllines = myfile.read()
5. modlines = alllines.replace('mispelt', 'misspelt')
6. myfile.seek(0) # go to the start of the file
7. myfile.write(modlines)
8. myfile.close()

Line 4 reads the whole file into alllines variable, on which, we perform find-and-replace (Line 5). We
move the file pointer to the start of the file in Line 6, and write the modified lines in Line 7.
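
One caveat is worth sketching. If the replacement makes the text shorter, writing from the start leaves the tail of the old contents in the file. A possible variant, under the assumption that we wrap the logic in a helper (the name replace_in_file and the scratch file are hypothetical), calls truncate() after writing.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Replace every occurrence of old with new in the file at path."""
    with open(path, "r+") as f:      # 'with' closes the file automatically
        contents = f.read()
        f.seek(0)                    # go to the start of the file
        f.write(contents.replace(old, new))
        f.truncate()                 # drop leftover characters if the text shrank

path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical scratch file
with open(path, "w") as f:
    f.write("A misspelt word.\n")

replace_in_file(path, "misspelt", "wrong")  # the new text is shorter
with open(path) as f:
    shrunk = f.read()
print(repr(shrunk))
```

When the new text is at least as long as the old one, truncate() has nothing to remove, so the helper behaves like the program above.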

Example: Retrieve academic record from a file and compute CGPA.
Recall our academics module in which we kept track of our courses, total points, and earned credits.
Our program had hardcoded this academic record. Ideally, this data (academic record) should be kept
separate from the functionality (our Python program). We can store this data into a file (academics.txt),
with semesters separated by an empty line.

CS1100 9 10
CS1200 12 9
AM1100 9 9
# empty line to indicate end of semester
CS2200 12 10
CS2310 6 10
CS2710 6 9
# this empty line is required

Interestingly, our academics.py module can remain the same. We can simply modify academicsuser.py
to read this data file, populate lists in the academics module, and invoke the CGPA calculation. Thus, a
sample run would be as follows.

$ python3 academicsuser.py academics.txt

course credits earned
CS1100 9 10
CS1200 12 9
AM1100 9 9
This Sem CGPA: 9.3

course credits earned
CS1100 9 10
CS1200 12 9
AM1100 9 9
CS2200 12 10
CS2310 6 10
CS2710 6 9
This Sem CGPA: 9.5

Can you write the academicsuser.py program? We present and discuss it below.

1. import sys
2. from academics import *
3.
4. acadfile = open(sys.argv[1])
5.
6. for line in acadfile:
7.     if line == '\n': # end of semester
8.         cprint()
9.         print("This Sem CGPA:", cgpa(), "\n")
10.     else:
11.         cc, cr, ea = line.split()
12.         add(cc, int(cr), int(ea))
13.
14. acadfile.close()

We first import sys for argv, and then all the functions from our module academics (Lines 1 and 2). We
open the file specified on the command-line in Line 4. We process each line in the file in the loop at
Line 6. If the line is empty, it indicates the end of a semester. So, we print the academic record and
CGPA (if block). Otherwise, we split the line into course number, credits, and earned points and add
those to our internal record (else block), to be used for CGPA calculation later. We finally close the file
in Line 14.
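
To experiment with the file format without the academics module, the parsing and the calculation can be combined in one function. This is only a sketch: the function name cumulative_cgpas is hypothetical, and it assumes that the CGPA is the credit-weighted average of the earned points over all the courses read so far, which agrees with the values 9.3 and 9.5 in the sample run.

```python
def cumulative_cgpas(lines):
    """Return the cumulative CGPA after each semester (blank line) in lines."""
    total_credits = 0
    total_points = 0
    results = []
    for line in lines:
        if line.strip() == "":          # blank line: end of a semester
            results.append(round(total_points / total_credits, 2))
        else:
            course, credits, grade = line.split()
            total_credits += int(credits)
            total_points += int(credits) * int(grade)
    return results

sample = [
    "CS1100 9 10\n", "CS1200 12 9\n", "AM1100 9 9\n", "\n",
    "CS2200 12 10\n", "CS2310 6 10\n", "CS2710 6 9\n", "\n",
]
print(cumulative_cgpas(sample))
```

Since an open file can be iterated line by line, the same function can be called as cumulative_cgpas(open(sys.argv[1])). A blank line before any course would cause a division by zero, which the sketch does not guard against.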

Example: Track friends information.


We would like to use files to keep track of friends. A friend is associated with a name and the associated
information. We will keep track of the phone number and the github handle. The friends module can
provide API to support the following:

Function                      Remarks
add(name, phone, github)      Add a new friend
remove(name)                  Remove a friend
updatePhone(name, phone)      Update phone number of a friend
updateGithub(name, github)    Update github handle of a friend
printByName(name)             Print information of a friend
printAll()                    Print information of all the friends
readAll()                     Read information of friends from a datafile

data can be dumped into a datafile using writeAll(). Before a readAll(), we need to ensure that the
dictionaries are empty.

Thus, our friends module can be written as follows.

1. phones = {} # name is the key
2. githubs = {} # name is the key
3. datafile = "friends.txt"
4.
5. def add(name, phone, github):
6.     phones[name] = phone
7.     githubs[name] = github
8.
9. def remove(name):
10.     del(phones[name])
11.     del(githubs[name])
12.
13. def updatePhone(name, phone):
14.     phones[name] = phone
15.
16. def updateGithub(name, github):
17.     githubs[name] = github
18.
19. def get(name):
20.     return [name, phones[name], githubs[name]] # list
21.
22. def printOne(name, phone, github):
23.     print(name, phone, github)
24.
25. def printOneList(npg):
26.     printOne(*npg)
27.
28. def printByName(name):
29.     printOneList(get(name))
30.
31. def printAll():
32.     for name in phones:
33.         printByName(name)
34.
35. def writeAll():
36.     global datafile
37.
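
One way the two dictionaries could be saved to and restored from a data file is sketched below. This is not the module's own writeAll() and readAll(): the names write_all and read_all, the explicit path parameter, and the tab-separated layout are all assumptions made for illustration.

```python
import os
import tempfile

phones = {}    # name is the key
githubs = {}   # name is the key

def write_all(path):
    """Dump both dictionaries, one tab-separated friend per line."""
    with open(path, "w") as f:
        for name in phones:
            print(name, phones[name], githubs[name], sep="\t", file=f)

def read_all(path):
    """Empty the dictionaries, then reload them from the data file."""
    phones.clear()
    githubs.clear()
    with open(path) as f:
        for line in f:
            name, phone, github = line.rstrip("\n").split("\t")
            phones[name] = phone
            githubs[name] = github

path = os.path.join(tempfile.mkdtemp(), "friends.txt")
phones["Asha"], githubs["Asha"] = "9000000001", "asha-dev"   # made-up friend
write_all(path)
read_all(path)
print(phones, githubs)
```

Clearing the dictionaries inside read_all() ensures that stale entries do not survive a reload. A tab is used as the separator so that names containing spaces are read back correctly.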

4/23/2022 16:08:31, [email protected], PRIYA SHARMISHTHA, 7753763928,
Indian institute of technology madras ,Student
4/23/2022 16:13:12,[email protected],Prema S,9783297231,NMAMIT,Student
4/23/2022 16:50:56, [email protected], Nigam Vaishnav, 447439860637,
IITM,Student
4/23/2022 17:06:19,[email protected], Abha Joshi, 9307876327, Indian Institute of Technology Guwahati,
Faculty
4/23/2022 17:08:28,[email protected], Abha Joshi, 9307876327, Indian Institute of Technology Guwahati,
Student

Challenge 1: Comma is present in the data.


While other fields are okay, the affiliation field may contain a comma. Even while typing a name,
someone may inadvertently enter a comma instead of a dot (e.g. P. V, Sindhu). Someone may enter two
phone numbers separated by a comma. All of these are real-world problems. One way to ensure valid
data is by having validation checks at the form-level itself, which can handle these problems, ensuring
that when the data reaches your Python script, it is sanitized. Another way is to not assume that the data
is sanitized, but write our script to handle these situations.

Solution: We can solve this comma problem by saving the spreadsheet in .csv format with tab (\t) as the
separator. Typically (but not always), a tab is not present in the form-input data. Another option is to
encode the data using the base64 module, which allows arbitrary binary data also to be processed. In
our program, we will use tab-separated values.
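
Tab-separated records can be split with the standard csv module by passing delimiter="\t". The sketch below uses two made-up records held in a string instead of a real file; the addresses and phone numbers are placeholders.

```python
import csv
import io

# Two made-up records; the affiliation of the first one contains a comma.
data = ("4/19/2022 9:28:53\tkv@example.com\tK. Velumathi\t9000000001\t"
        "PSG College of Technology, Coimbatore\tFaculty\n"
        "4/23/2022 16:13:12\tps@example.com\tPrema S\t9000000002\tNMAMIT\tStudent\n")

reader = csv.reader(io.StringIO(data), delimiter="\t")
rows = list(reader)

for row in rows:
    print(len(row), row[4])   # six fields per record; the comma stays inside the field
```

For a file on disk, the same reader can be built from open(filename, newline="") instead of io.StringIO.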

Challenge 2: Some fields may contain a newline.


By default, our script may assume that one registration record appears in one line of the .csv file.
However, a few fields such as address may contain a newline, which may result in the malfunctioning
of our script. Often entering a newline is not directly feasible in usual text-boxes on forms, but when
the data is copy-pasted from elsewhere, newlines may appear. Textarea boxes in HTML can also have
newline characters.

1. 4/19/2022 9:28:53, [email protected], K. Velumathi, 8973745575, “PSG College of Technology
2. coimbatore”, Faculty
3. 4/23/2022 11:19:06, [email protected], Addulachari Nikhila, 7832464327, “Indian Institute of technology
4. Palakkad”, Student

Solution: Similar to handling commas, the validation logic (typically, in JavaScript for HTML forms)
can either disallow newlines, or convert those to a different form (such as <br> in HTML). Alternatively,
while converting from spreadsheet to CSV format, double-quotes can be used as the text-delimiter,
which stores multiline field in double-quotes (if double-quotes are present in the field, those can be
encoded, or escaped as \” or “”). In our program, we will assume that this is taken care of in a
preprocessing step, and each line is a new record.
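
If such a preprocessing step is not available, the csv module can itself read fields that are enclosed in double-quotes, including embedded newlines and doubled double-quotes. The following sketch uses made-up records to show this behavior.

```python
import csv
import io

# The first record has a newline inside a quoted field;
# the second has double-quotes escaped by doubling them.
data = ('4/19/2022 9:28:53,kv@example.com,K. Velumathi,9000000001,'
        '"PSG College of Technology\ncoimbatore",Faculty\n'
        '4/23/2022 16:13:12,ps@example.com,Prema S,9000000002,'
        '"The ""Best"" College",Student\n')

records = list(csv.reader(io.StringIO(data, newline="")))
print(len(records))           # number of records, not number of text lines
print(repr(records[0][4]))
print(records[1][4])
```

The three text lines are read as two records, because the newline inside the quoted field is kept as part of that field.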

Challenge 3: Header can be mistakenly recognized as a record.
Our .csv file contains the first line as the header. While processing, if we forget, our functions may treat
it as a valid record, which can change the output. For instance, if we are booking rooms for all the
attendees, we may end up booking (and paying for) one more room than required.

Solution: This is easy to resolve. We can simply remove it as a sanitization prepass. Alternatively, our
Python script can skip the first line.

With these challenges and possible solutions, let’s now find out some interesting statistics from the
registration data. How do we represent the data? We can store each column as a list. Thus, for our
example data, we can have six associative lists for timestamps, emails, names, phones, affiliations, and
designations. These are populated by reading the .csv file.
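
A minimal sketch of how the six lists could be populated is given below. It assumes tab-separated values with a header on the first line; the function name load, the header names, and the sample records are made up for illustration.

```python
import csv
import io

timestamps, emails, names = [], [], []
phones, affiliations, designations = [], [], []

def load(fileobj):
    """Fill the six lists from a tab-separated file whose first line is a header."""
    reader = csv.reader(fileobj, delimiter="\t")
    next(reader, None)                  # skip the header line
    for ts, email, name, phone, affil, desig in reader:
        timestamps.append(ts)
        emails.append(email)
        names.append(name)
        phones.append(phone)
        affiliations.append(affil.strip())
        designations.append(desig.strip())

sample = ("Timestamp\tEmail\tName\tPhone\tAffiliation\tDesignation\n"
          "4/23/2022 16:13:12\tps@example.com\tPrema S\t9000000002\tNMAMIT\tStudent\n"
          "4/23/2022 17:06:19\taj@example.com\tAbha Joshi\t9000000003\tIIT Guwahati\tFaculty\n")
load(io.StringIO(sample))
print(len(names), designations)
```

Index ii then refers to the same registrant in every list, which is what the functions in the rest of this example rely on.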

Functionality 1: Find the number of non-students registered.


This is a simple check on designation (populated from a drop-down menu in the registration form).

1. def nonStudents():
2.     for ii in range(len(designations)):
3.         if not designations[ii] == 'Student':
4.             printRecord(ii)

We make use of the auxiliary function printRecord() to print details of a registrant. The final output of
using the above function is:

K. Velumathi:::PSG College of Technology,
coimbatore:::Faculty:::[email protected]:::8973745575
Abha Joshi:::Indian Institute of Technology
Guwahati:::Faculty:::[email protected]:::9307876327
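
A possible shape for the auxiliary function is sketched below. The field order is inferred from the output above; the helper formatRecord, the counting function countNonStudents, and the sample data are assumptions added for illustration.

```python
# Made-up sample data in the five lists used for printing.
names = ["Prema S", "Abha Joshi"]
affiliations = ["NMAMIT", "Indian Institute of Technology Guwahati"]
designations = ["Student", "Faculty"]
emails = ["ps@example.com", "aj@example.com"]
phones = ["9000000002", "9000000003"]

def formatRecord(ii):
    """Join the fields of record ii in the order seen in the output above."""
    return ":::".join([names[ii], affiliations[ii], designations[ii],
                       emails[ii], phones[ii]])

def printRecord(ii):
    print(formatRecord(ii))

def countNonStudents():
    """Return how many registrants did not choose 'Student'."""
    return sum(1 for d in designations if d != 'Student')

printRecord(1)
print(countNonStudents())
```

Returning the count, instead of only printing the records, gives the number asked for in the heading of this functionality.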

Functionality 2: Find all the people registered from IITs.


This is again a straightforward check on the affiliation, with the caveat that some people may use IIT or
iit or Indian Institute of Technology or INDIAN INSTITUTE OF TECHNOLOGY.

5. def iit():
6.     for ii in range(len(affiliations)):
7.         lowaffil = affiliations[ii].lower() # to lower-case
8.
9.         if ("iit" in lowaffil and not "iiit" in lowaffil) or \
10.            "indian institute of technology" in lowaffil:
11.             printRecord(ii)

Line 7 uses a string function lower() to convert the affiliation to lower-case. This allows us to avoid any
capitalization issues while comparing strings. We check if it is “iit” or its full-form in the condition on
Lines 9 and 10. Since we know that there would be registrants from IIIT also, we filter those out from
the output. The final output of using the above function is:

Addulachari Nikhila:::Indian Institute of technology
Palakkad:::Student:::[email protected]:::7832464327
Sophia Kumari:::IIT (ISM) DHANBAD :::Student:::[email protected]:::8647125622
PRIYA SHARMISHTHA:::Indian institute of technology madras
:::Student:::[email protected]:::7753763928
Priya Sharmishtha:::Indian institute of technology madras
:::Student:::[email protected]:::7753763928
Nigam Vaishnav:::IITM:::Student:::[email protected]:::447439860637
Abha Joshi:::Indian Institute of Technology
Guwahati:::Faculty:::[email protected]:::9307876327
Abha Joshi:::Indian Institute of Technology
Guwahati:::Student:::[email protected]:::9307876327
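
The same test can be written as a function on a single affiliation string, which makes it easy to try on individual inputs. The sketch below is an alternative, not the check used above: it relies on the re module's word boundary \b so that the "iit" inside "iiit" is not matched. The function name is_iit is hypothetical.

```python
import re

def is_iit(affiliation):
    """Return True if the affiliation appears to name an IIT (but not an IIIT)."""
    low = affiliation.lower()
    # \b requires "iit" to start a word, so the "iit" inside "iiit" is not matched.
    return bool(re.search(r'\biit', low)) or "indian institute of technology" in low

for affil in ["IITM", "IIIT Hyderabad", "Indian institute of technology madras", "NMAMIT"]:
    print(affil, is_iit(affil))
```

Unlike the condition on Lines 9 and 10, this variant accepts an affiliation that mentions both an IIT and an IIIT. Which behavior is preferable depends on the data.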

Functionality 3: Find all the duplicate records.


Finding duplicates can be done by checking all pairs of records. We can also limit the number of
fields compared when deciding whether a pair is a duplicate. For instance, in our code, we will say two records
match if the names and the email ids match (even if the remaining information is different). You can
have more stringent checks by adding more conditions.

12. def match(ii, jj):
13.     return (names[ii].upper() == names[jj].upper() and \
14.             emails[ii].upper() == emails[jj].upper())
15.
16. def duplicates():
17.     for ii in range(len(timestamps)):
18.         for jj in range(ii + 1, len(timestamps)):
19.             if match(ii, jj):
20.                 print("Records", ii, "and", jj, "are duplicates")
21.                 printRecord(ii)
22.                 printRecord(jj)

Enumerating all pairs can be done using a nested loop (Lines 17 and 18). If the two records match (Line
19), then we print them. The match() function uses the string function upper() to remove any differences
in the capitalization. Output of using the above function is:

Records 6 and 8 are duplicates
PRIYA SHARMISHTHA:::Indian institute of technology madras
:::Student:::[email protected]:::7753763928
Priya Sharmishtha:::Indian institute of technology madras
:::Student:::[email protected]:::7753763928

Records 10 and 11 are duplicates
Abha Joshi:::Indian Institute of Technology
Guwahati:::Faculty:::[email protected]:::9307876327
Abha Joshi:::Indian Institute of Technology
Guwahati:::Student:::[email protected]:::9307876327
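
Checking all pairs takes time proportional to the square of the number of records. For larger files, an alternative is to group record indices in a dictionary keyed by the upper-cased name and email id. The sketch below assumes the same matching rule; the function name duplicateGroups and the sample data are made up.

```python
names = ["PRIYA SHARMISHTHA", "Prema S", "Priya Sharmishtha"]        # made-up sample
emails = ["priya@example.com", "prema@example.com", "Priya@Example.com"]

def duplicateGroups():
    """Return lists of record indices that share the same (name, email) key."""
    seen = {}                                # key -> list of record indices
    for ii in range(len(names)):
        key = (names[ii].upper(), emails[ii].upper())
        seen.setdefault(key, []).append(ii)
    return [group for group in seen.values() if len(group) > 1]

print(duplicateGroups())
```

Each record is visited once, and three or more copies of the same registrant are reported together as one group.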

Functionality 4: Display a list of participants grouped by their affiliations.


Such a grouping can be done in a manner similar to Functionality 3’s nested loops. But we can as well
sort the lists based on affiliations to have the registrants with the same affiliation in consecutive records.
Unfortunately, this poses another challenge. We have multiple associative lists. So if we want to sort
the affiliations list, we should accordingly also sort other lists. Python provides a zip() function to
achieve this.

23. def groupByAffiliation():
24.     grouped = zip(affiliations, names, emails)
25.     sg = sorted(grouped)
26.
27.     for record in sg:
28.         print(record[0], record[1], record[2], sep=":::")

Line 24 creates a joint aggregate from individual aggregates. This joint aggregate is sorted in Line 25,
which sorts all the associative lists. Note that we have provided affiliations as the first list in zip(). Once
these lists are sorted based on affiliation, we go over this sorted aggregate row-by-row, that is, one index
across all three aggregates (Line 27). Each entry in this row corresponds to one record. Thus, record[0]
corresponds to the first affiliation in the sorted order, record[1] is the name corresponding to it, and
record[2] is the email id of that registrant. We print this record in Line 28. The output using the above
function is:

pattern in Line 8 matches 99405 33241 and does not match the other two. It looks for five digits \d{5},
then a non-digit \D (such as space or hyphen), followed by another set of five digits. The third pattern
on Line 9 for US mobile format consists of an opening parenthesis \( followed by three digits \d{3}
followed by a closing parenthesis, followed by an optional space \s?, followed by three digits \d{3},
followed by a non-digit \D (such as space or hyphen), followed finally by four digits \d{4}. \s indicates
a whitespace character, ? makes the preceding expression optional, and parentheses need to be escaped
because they have a special meaning in re.

These three patterns are OR’ed in the regular expression regex at Line 11 using | operator (understood
by the re module). Line 12 then calls the findall() method which finds all occurrences of the regular
expression in text. These are returned as a list.

Is it possible to have a single regular expression for 99405 33241 and 8932732436? Yes, we can modify
digits55 as '\d{5}\D?\d{5}' and it will match both the phone numbers.

Good Programming Practice: It is easy to make regular expressions complex. Therefore, split
them into multiple parts, test each part separately, and then join them, as we have done in the above
example.
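
The three patterns discussed above can be tried out with the following sketch. Only the name digits55 and the pattern descriptions come from the text; the names digits10 and usformat, the ten-digit pattern, and the sample text are assumptions.

```python
import re

digits10 = r'\d{10}'                      # e.g. 8932732436 (assumed first pattern)
digits55 = r'\d{5}\D\d{5}'                # e.g. 99405 33241
usformat = r'\(\d{3}\)\s?\d{3}\D\d{4}'    # e.g. (512) 555-0123

regex = digits10 + '|' + digits55 + '|' + usformat
text = "Call 8932732436, or 99405 33241, or (512) 555-0123."   # made-up text

found = re.findall(regex, text)
print(found)

merged = r'\d{5}\D?\d{5}'                 # single pattern for the first two formats
mergedfound = re.findall(merged, text)
print(mergedfound)
```

Keeping each pattern in its own variable allows it to be tested with findall() on its own before the patterns are joined with |.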

Example: Find all email addresses out of a long text.


Before reading further, think about how you want to identify an email address. One possibility is to have
alphanumeric characters, followed by symbol @, followed by another set of alphanumeric characters.
An alphanumeric character (including underscore) can be specified as \w. Thus, our regular expression
should be: (one or more \w)@(one or more \w).

1. import re
2.
3. text = "My departmental email id is [email protected]. This is what I use for all the
official purposes. The institute also provides an id [email protected]. During PhD, I used
to have a personal email id [email protected], which is still in use. When
gmail was not around, id [email protected] was the one I used. I also own a
twitter handle @rupeshsomething, which I hardly use. That's it from my side for now. See
you in class @11."
4.
5. regex = r'\w+@\w+'
6. emails = re.findall(regex, text.lower()) # use lower-case string
7. print(emails)
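
The behavior of this simple pattern can be explored on a short text. In the sketch below, the text and the addresses are placeholders, and the refined pattern is only one possible improvement, not necessarily the one a complete solution would use.

```python
import re

# Made-up text; the addresses are placeholders.
text = "Write to alice@example.org or bob_1@dept.example.edu. My handle is @nobody."

simple = r'\w+@\w+'
simplefound = re.findall(simple, text.lower())
print(simplefound)       # each match stops at the first dot after @

# One possible refinement: allow dots before @ and dotted domain parts after it.
refined = r'[\w.]+@\w+(?:\.\w+)+'
refinedfound = re.findall(refined, text.lower())
print(refinedfound)
```

The handle @nobody is not reported by either pattern because no word character appears immediately before the @ symbol.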

"""

We want the output of our program to recognize the highlighted dates (and nothing else, such as only
October). We build the regular expression using twelve months’ names.

1. months = ('jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec')
2. regex12 = ''
3.
4. for mm in months:
5.     regex12 = regex12 + mm + '|' # create regex using 12 months
6.
7. regex12 = '(' + regex12[:-1] + ')' # remove the extra |
8. finalregex = '(' + regex12 + r'[a-z.]*\s+\d\d?)(,?\s?\d{2,4})?'
9.
10. dates = re.findall(finalregex, text.lower())
11.
12. for onedate in dates:
13.     print(onedate)

We explain various parts of the regular expression on Line 8. Variable regex12 contains the three-letter
abbreviations of the twelve months separated by |. [a-z.]* allows the month to contain either only the
abbreviation, or the full name, or anything in between, including a dot. This needs to be followed by at
least one whitespace character. \d\d? matches one or two digits of the day. This is optionally followed by the second
part of the pattern which may contain comma and space, followed by the year in two, three, or four
digits. If you wish to support only two or four digit year, it can be achieved as (\d{4}|\d{2}); the
four-digit alternative is listed first because re tries the alternatives from left to right.

The output of the whole program is:

('september 10', 'sep', ', 2023')
('sep 9', 'sep', '')
('sept 10', 'sep', '')
('oct 10', 'oct', '')
('oct. 11', 'oct', '')

Each tuple in the output contains three entries corresponding to the three parenthesized expressions in
our regular expression: month followed by day, month in regex12, and optional part of the year.

Discussion: The function findall() finds all the occurrences of the given regular expression. If you are
interested only in one, you can use the search() function. For instance:
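
As a self-contained sketch of both functions (the sample text is made up and is not the text used above), the following builds the same regular expression with join() and then calls findall() and search() on it.

```python
import re

months = ('jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec')
regex12 = '(' + '|'.join(months) + ')'         # join() avoids the trailing |
finalregex = '(' + regex12 + r'[a-z.]*\s+\d\d?)(,?\s?\d{2,4})?'

text = "The exam is on September 10, 2023 and the quiz on Oct. 11."   # made-up text

alldates = re.findall(finalregex, text.lower())  # list of tuples, one per match
first = re.search(finalregex, text.lower())      # match object for the first match, or None

print(alldates)
if first:
    print(first.group(0))                        # the whole matched text
```

search() returns None when nothing matches, so its result should be checked before calling group().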

This indicates that the web-server manage.py is unable to find appone. This is because, although the
directory appone exists, it needs to be registered explicitly.

App Registration Step 1


Open mydjango/mydjango/urls.py file. It should originally look as below.


from django.contrib import admin
from django.urls import path

urlpatterns = [
    path('admin/', admin.site.urls),
]

Modify it to the following.


from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('appone/', include('appone.urls')),
]

At this stage, the server may issue certain errors (e.g., ModuleNotFoundError: No module named
'appone.urls'). Ignore the errors for now. If we try to check again in the browser, we may encounter
Unable to connect message. This means the registration of our web-app is not yet complete.

App Registration Step 2


This time, we need to create the appone directory’s urls.py file (note that the file is not already present).
This file permits accessing the contents of appone’s views.py.

1. from django.urls import path
2. from . import views
3.
4. urlpatterns = [
5.     path('', views.index, name='index'),
6. ]

Once this basic setup is running, we can add arbitrary HTML text and can process it from views.py.

Example: Display an HTML file.


Let’s create a simple HTML file in appone directory, named basic.html. Its contents are as follows. We
will assume that you know basic HTML.

basic.html

<h2>This is basic HTML</h2>

It contains <a href='http://www.cse.iitm.ac.in/~rupesh/books/python/'>a link to Python
Programming's webpage</a>, and an image of Python Logo.<p>
<img src='https://www.python.org/static/img/psf-logo.png'><p>

Do you know that we can also create tables in it?<p>
<table border=1>
<tr><td>one one</td><td>one two</td></tr>
<tr><td>two one</td><td>two two</td></tr>
</table>

<p>
That's all for now.

You can open this file in a browser to see how it looks. But we would like our webapp to show this
content. Keeping everything else the same, let’s modify appone/views.py file as follows.

from django.shortcuts import render
from django.http import HttpResponse

def index(request):
    fp = open('/home/rupesh/mydjango/appone/basic.html')
    htmlstr = fp.read()
    fp.close()

    return HttpResponse(htmlstr)

We use our file handling functions to read basic.html as a string, and return it as the HTTP response
(when the browser issues an HTTP request to our server).
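
The absolute path above works only on the machine where the project lives under /home/rupesh. One possible way to avoid hardcoding it is to locate the file relative to the directory of the Python source file, as in the sketch below. The helper name read_html and its base_dir parameter are hypothetical, and the sketch deliberately leaves out the Django imports so that it can be tried on its own.

```python
from pathlib import Path

def read_html(filename, base_dir=None):
    """Return the text of filename, looked up relative to base_dir.

    If base_dir is not given, the directory containing this source file is used,
    so the code does not depend on where the project is checked out.
    """
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    with open(Path(base_dir) / filename) as fp:   # 'with' closes the file for us
        return fp.read()

# Inside appone/views.py, index() could then (hypothetically) end with:
#     return HttpResponse(read_html('basic.html'))
```

With this helper placed in appone/views.py, basic.html is found next to views.py regardless of the directory from which the server is started.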

1. from django.contrib import admin
2. from django.urls import path, include
3.
4. urlpatterns = [
5.     path('admin/', admin.site.urls),
6.     path('appone/', include('appone.urls')),
7.     path('acads/', include('acads.urls')),
8. ]

Step 2: Creating acads/urls.py (you can copy it from appone/urls.py)

1. from django.urls import path
2. from . import views
3.
4. urlpatterns = [
5.     path('', views.index, name='index'),
6. ]

Step 3: Writing code in acads/views.py

1. from django.shortcuts import render
2. from django.http import HttpResponse
3. from .academics import *
4.
5. def processAcads(filename):
6.     htmlstr = '<table border=1>\n'
7.     acadfile = open(filename)
8.
9.     for line in acadfile:
10.         if line == '\n': # new semester
11.             htmlstr = htmlstr + gethtml()
12.             htmlstr = htmlstr + "<tr><td colspan=3>This Sem CGPA:" + \
13.                       str(cgpa()) + "</td></tr>\n"
14.         else:
15.             cc, cr, ea = line.split()
16.             add(cc, int(cr), int(ea))
17.
18.     acadfile.close()
19.     htmlstr = htmlstr + '</table><br>\n'
20.     return htmlstr