04 - Book - Python Programming (3rd SEM) - Watermark
Author
Rupesh Nasre.
Associate Professor, Indian Institute of Technology (IIT)
Madras, Adyar, Chennai,
Tamil Nadu
Reviewed by
Dr. Mahendran Botlagunta
Associate Professor, VIT Bhopal University
Madhya Pradesh
BOOK AUTHOR DETAILS
Dr. Rupesh Nasre, Associate Professor, Indian Institute of Technology (IIT) Madras, Adyar,
Chennai, Tamil Nadu
Email ID: [email protected]
October, 2022
© All India Council for Technical Education (AICTE)
ISBN : 978-81-959863-5-4
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or
any other means, without permission in writing from the All India Council for Technical
Education (AICTE).
Further information about All India Council for Technical Education (AICTE) courses may be
obtained from the Council Office at Nelson Mandela Marg, Vasant Kunj, New Delhi-110070.
Printed and published by All India Council for Technical Education (AICTE), New Delhi.
Laser Typeset by:
Printed at:
Disclaimer: The website links provided by the author in this book are placed for informational,
educational & reference purposes only. The Publisher does not endorse these website links or the
views of the speaker / content of the said weblinks. In case of any dispute, all legal matters are to be
settled under Delhi Jurisdiction only.
ACKNOWLEDGEMENT
I am grateful to the authorities of AICTE, particularly Prof. M. Jagadesh Kumar, Chairman;
Prof. M. P. Poonia, Vice-Chairman; Prof. Rajive Kumar, Member-Secretary and Dr Amit
Kumar Srivastava, Director, Faculty Development Cell for their planning to publish this
book on Python Programming. I am deeply indebted to Prof. Mahendran Botlagunta of
VIT Bhopal University who promptly and thoroughly reviewed this book. His review
significantly improved the presentation. The publication application in Unit 4 as well as
Windows related commands are also due to his suggestions and guidance. I am thankful to
Chinmay Nasre and AICTE Graphics Team for designing the cover page nicely.
I wrote this book from Nagpur, attending to my father’s health, away from my wife and
son. It was tough to meet the deadlines under the tight schedule amid hospital visits. But I
am happy with the outcome. This book is dedicated to those context-switches between
technical and non-technical matters.
This book is an outcome of various suggestions of AICTE members, experts and authors
who shared their opinion and thought to further develop the engineering education in our
country. Acknowledgements are due to the contributors and different workers in this field
whose published books, review articles, papers, photographs, footnotes, references and
other valuable information enriched us at the time of writing the book.
Rupesh Nasre.
Preface
This book is an introduction to Python Programming. Python is a popular programming
language today and this book covers its basics. It is divided into five units, each of which
builds upon the previous ones. The units span basics of variables and assignments all the
way to developing web applications.
The first three units are required by all Python users. The units consist of the basic
constructs and types, control constructs, and modular programming with functions and
modules. Together, the three units equip a reader with the basic knowledge of Python to
write simple programs. The fourth unit delves deeper into string functions, file handling,
and regular expressions. Its programs are a little more complex and need more
practice. I encourage readers to try out the programs in their own way prior to looking at
the solutions. The fifth unit explores the Django framework to build web applications. Its
configuration is also complicated. But I do hope you get the joy and satisfaction after you
see your first web application running.
Each program explained in this book is available online on the book’s webpage. The
readers should (i) write the program themselves, (ii) then consult the readymade program,
(iii) enhance the program with additional functionality.
Python has several libraries and frameworks, and more are developed regularly. The goal
of this book is to make you hungry to learn more. I am sure you will have a good time
learning through various carefully-crafted examples and illustrations. If you have any
feedback to improve the book, feel free to write to me directly.
Rupesh Nasre.
OUTCOME BASED EDUCATION
For the implementation of an outcome based education the first requirement is to develop
an outcome based curriculum and incorporate an outcome based assessment in the
education system. By going through outcome based assessments, evaluators will be able to
evaluate whether the students have achieved the outlined standard, specific and measurable
outcomes. With the proper incorporation of outcome based education there will be a
definite commitment to achieve a minimum standard for all learners without giving up at
any level. At the end of the programme running with the aid of outcome based education,
a student will be able to arrive at the following outcomes:
Programme Outcomes (POs) are statements that describe what students are expected
to know and be able to do upon graduating from the program. These relate to the skills,
knowledge, analytical ability, attitude and behaviour that students acquire through the
program. The POs essentially indicate what the students can do from subject-wise
knowledge acquired by them during the program. As such, POs define the professional
profile of an engineering diploma graduate.
National Board of Accreditation (NBA) has defined the following seven POs for an
Engineering diploma graduate:
PO1. Basic and Discipline specific knowledge: Apply knowledge of basic mathematics,
science and engineering fundamentals and engineering specialization to solve the
engineering problems.
PO2. Problem analysis: Identify and analyse well-defined engineering problems using
codified standard methods.
PO3. Design/ development of solutions: Design solutions for well-defined technical
problems and assist with the design of systems components or processes to meet
specified needs.
PO4. Engineering Tools, Experimentation and Testing: Apply modern engineering
tools and appropriate technique to conduct standard tests and measurements.
PO5. Engineering practices for society, sustainability and environment: Apply
appropriate technology in context of society, sustainability, environment and ethical
practices.
PO6. Project Management: Use engineering management principles individually, as a
team member or a leader to manage projects and effectively communicate about well-
defined engineering activities.
PO7. Life-long learning: Ability to analyse individual needs and engage in updating in
the context of technological changes.
COURSE OUTCOMES
        PO-1  PO-2  PO-3  PO-4  PO-5  PO-6  PO-7
CO-1    3     3     3     3     1     1     3
CO-2    3     2     2     2     1     1     3
CO-3    3     2     2     2     1     1     3
CO-4    3     2     2     3     1     1     3
CO-5    3     3     3     3     1     1     3
GUIDELINES FOR TEACHERS
To implement Outcome Based Education (OBE) knowledge level and skill set of the
students should be enhanced. Teachers should take a major responsibility for the proper
implementation of OBE. Some of the responsibilities (not limited to) for the teachers in
OBE system may be as follows:
● Within reasonable constraint, they should manoeuvre time to the best advantage of
all students.
● They should assess the students only against defined criteria, without letting any
other consideration lead to discrimination among them.
● They should try to grow the learning abilities of the students to a certain level before
they leave the institute.
● They should try to ensure that all the students are equipped with the quality
knowledge as well as competence after they finish their education.
● They should always encourage the students to develop their ultimate performance
capabilities.
● They should facilitate and encourage team work to consolidate newer approaches.
● They should follow Bloom’s taxonomy in every part of the assessment.
Bloom’s Taxonomy
Level        Teacher should check                        Student should be able to   Possible mode of assessment
Create       Students’ ability to create                 Design or Create            Mini project
Evaluate     Students’ ability to justify                Argue or Defend             Assignment
Understand   Students’ ability to explain the ideas      Explain or Classify         Presentation/Seminar
Remember     Students’ ability to recall (or remember)   Define or Recall            Quiz
GUIDELINES FOR STUDENTS
Students should take equal responsibility for implementing the OBE. Some of the
responsibilities (not limited to) for the students in OBE system are as follows:
● Students should be well aware of each UO before the start of a unit in each and
every course.
● Students should be well aware of each CO before the start of the course.
● Students should be well aware of each PO before the start of the programme.
● Students should think critically and reasonably with proper reflection and action.
● Learning of the students should be connected and integrated with practical and real
life consequences.
● Students should be well aware of their competency at every level of OBE.
CONTENTS
UNIT SPECIFICS
Through this unit we discuss the following aspects:
● History of the Python Programming Language
● Overall set of features supported by Python
● Basic setup and installation
● Basic data types, operators, input and output
RATIONALE
This introductory unit gets readers acquainted with the basics of Python programming. It starts
with a brief history of Python’s creation. The unit then lists the language’s overall set of features
at a high level and motivates why it has become so popular. The unit then delves into basic language
syntax, variables, and solves a few known problems in Python. We then introduce various data
types such as numbers and strings and solve problems using those. We end this unit by introducing
aggregate data types such as lists, tuples, and dictionaries. Care has been taken to introduce the
readers to interesting problems despite the limitation of not having a conditional statement or a
loop.
PRE-REQUISITES
None
UNIT OUTCOMES
List of outcomes of this unit is as follows:
U1-O1: Realize the history of Python
U1-O2: Implement simple program and execute it
U1-O3: Use different data types
U1-O4: Perform basic input and output
Unit-1 Outcomes   CO-1   CO-2   CO-3   CO-4   CO-5   CO-6
U1-O1             3      3      3      -      3      1
U1-O2             1      1      2      2      1      -
U1-O3             2      1      3      1      2      1
U1-O4             -      -      3      1      2      2
1.1 History
Python is a widely used programming language today. From first-semester
students to final-year projects to industry personnel, Python finds use in
developing programs for graphics applications, text processing, and data
analysis, among others.
Fig. 1.1: “Two snakes” logo of Python
Python was developed by Guido van Rossum, a Dutch programmer, and
released in 1991. The name is inspired by the BBC comedy show Monty
Python’s Flying Circus. Python is a successor of the ABC programming
language. At the time of this writing, Python 3 is the latest major release
of Python (on your computers, you may notice the program python3).
For the last 20 years, Python has been among the Top 10 most popular
programming languages. At the time of this writing (July 2022), Python
is the most popular language, surpassing C and Java (according to the
TIOBE index).
1.2 Features
Python is a general-purpose programming language and supports multiple paradigms or ways of
programming. For instance, we can write procedural programs (a sequence of steps) as well as
object-oriented programs (entities as objects that communicate using messages) in it. It can also be
used to write functional programs (applying and composing functions), among others. Unlike C and C++,
Python is interpreted. This means a Python program is not compiled ahead of time into a binary code file
(e.g., an executable) that we store and run. Instead, the interpreter translates the source internally
(into an intermediate bytecode, in the standard implementation) and executes it directly, without us
seeing any executable code. Thus, on your computers, python3 is an interpreter
(and gcc is a compiler for C programs).
Python is also dynamically-typed. This means that the type of a variable need not be specified in the
source code, and is identified when the program executes. Python also relies on garbage collection,
which reclaims the allocated memory of the variables which are no longer needed (referenced, to be
precise). This relieves the programmer of the task of memory deallocation, similar to Java. Python also
has a large variety of standard libraries, which allow us to write complex programs quickly, improving our
productivity.
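Dynamic typing can be seen in a minimal sketch such as the one below. The variable names are only illustrative; type() is Python's built-in function that reports the type of the value a variable currently holds.

```python
# A variable is not tied to one type; its type follows the value it holds.
value = 42
firstType = type(value)       # int
value = "forty-two"           # the same variable now holds a string
secondType = type(value)      # str
print(firstType, secondType)
```

No type was declared anywhere; the interpreter identified both types while the program was running.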
For instance, the following screenshot shows a Linux installation and running of Python.
Once installed, you can invoke the Python interpreter (using graphical user interface or via a command
line) to execute a Python program. Python programs can be written in your favorite editor (e.g., VS
Code or Sublime or gedit or even notepad) apart from the Python IDE, called IDLE on Windows.
On the command line, you can use certain basic commands to navigate through the file system. For
instance, if your home directory is /home/user, you can create a directory for Python programs as:
$ cd # go to home directory
$ mkdir python # create directory
$ cd python # go into that directory
$ cat >hello.py # create a new file hello.py (type the next line, then press Ctrl+D)
print("Hello World!")
$ python3 hello.py # run the Python interpreter
$ cd .. # come back to home directory
Python programs are stored in files typically with extension .py (e.g., hello.py). On a command-line, we
can execute the program as below.
$ python3 hello.py
In the above command-line, $ indicates the command-prompt, python3 is the interpreter, and hello.py
is a text-file containing your program. On your computer, the interpreter name may differ depending
upon your installation. For instance, on Windows, the interpreter binary is named python. Further, with
new versions of Python, the binary name may change to python4.
print("Hello World!")
The program uses a function print() to output a message to the screen. The message is given as an
argument to print() and is specified in double-quotes ("). It can also be specified in single-quotes (') or
triple-quotes (''' or """). When executed using the interpreter, it outputs the message.
$ python3 hello.py
Hello World!
$
The last $ indicates that the command-prompt is displayed again for the next command.
$ python3 greet.py
What is your name?
Guido van Rossum
Hello Guido van Rossum
$
One problem here is that we would like to greet exactly the name that was entered. This requires us to
store the name during input and retrieve it during the output. Such an entity is called a variable. A
variable is capable of storing a value which can be retrieved later. In fact, a variable can hold different
values at different times. Our program achieving the above functionality looks like this.
Note that the program is shown with line numbers, which are not part of the program. Line 1 prints our
message asking for the user’s name. Line 2 takes the input name and stores it in a variable called name.
In Line 3, we use the same variable to greet the user.
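A minimal sketch consistent with this line-by-line description might look as follows. To keep the sketch runnable without keyboard input, the name is hard-coded as a stand-in; in the interactive program, Line 2 would instead read name = input().

```python
print("What is your name?")   # Line 1: ask for the user's name
name = "Guido van Rossum"     # Line 2: stand-in for name = input()
print("Hello", name)          # Line 3: greet using the stored value
```

Because the value is stored in name, the greeting always matches whatever was entered.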
You must have noticed that Lines 1 and 3 both use the print() function but take a different number of
arguments (this is called polymorphism). In fact, the print() function can be invoked with an arbitrary
number of arguments.
Another noteworthy point is that the printing on Line 3 automatically separates the two strings “Hello”
and “Guido van Rossum” by a space (see the output above). This is the default behavior of print(). As
another example, consider the following program.
print("Hello", "World!")
print("Bye", "World!")
Hello World!
Bye World!
We specified neither the space nor the newline in our program. It is the default behavior of print(). This
behavior can be changed with additional arguments to print().
Hello World!####Bye$$World!
With sep and end.
Line 1 of the source code above prints the two strings separated by the default separator, a space ' ', but
ends with '####'. Line 2 then prints the two strings separated by '$$' and ends with the default
end-of-line character '\n'. Line 3 prints the two strings separated by a space, which is explicitly specified, and
ends with a full-stop followed by a newline. Thus, individual strings are separated by sep and the line
is ended by end, whose default values are ' ' and '\n' respectively. '\n' is the newline character. Also
note that strings are specified in this program with single as well as double quotes.
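One three-line program that matches this description, and that produces the output shown earlier, is sketched below. The exact strings passed on Line 3 are an assumption; only the sep and end behavior is taken from the text.

```python
print("Hello", "World!", end='####')              # default sep, custom end
print("Bye", "World!", sep='$$')                  # custom sep, default end
print('With sep', "and end", sep=' ', end='.\n')  # both given explicitly
```

Since Line 1 does not end with a newline, the output of Line 2 continues on the same line.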
Examples
Program:
id = "rupesh"
domain = "cse.iitm.ac.in"
print(id, domain, sep='@')
Output:
rupesh@cse.iitm.ac.in
Program:
id = "rupesh"
print(id, end='@')
print("cse", "iitm", "ac", "in", sep='.')
Output:
rupesh@cse.iitm.ac.in
Consider a program wherein we wish to ask for a name and year of birth from a user, and then display
the age of that user. We can write the initial part of such a program as follows.
It will be nice if the input can be provided on the same line. This can be achieved as follows.
print('Enter your name: ', end='')
name = input()
print('Enter your year of birth: ', end='')
yob = input()
Now let’s get back to the task of calculating the age of the user. Thus, the program should be able to
print the following (in the year 2023).
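A minimal sketch of the age calculation is given below. The name, the year of birth, and the wording of the output are hypothetical stand-ins. The key step is int(): input() always returns a string, so the year must be converted to a number before subtracting.

```python
name = "Asha"            # stand-in for name = input()
yob = "2004"             # stand-in for yob = input(); note that it is a string
age = 2023 - int(yob)    # convert the string to an integer before subtracting
print(name, "is", age, "years old in 2023.")
```

Writing 2023 - yob without the conversion would fail, because a string cannot be subtracted from a number.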
Future connect: Instead of hardcoding 2023 or the current year in the program, it would be nice if our
program could find out the current year. This can be done using the datetime module, which we will study
a little later.
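As a brief preview, one possible way to obtain the current year with the standard datetime module is sketched here. The year of birth is a hypothetical value.

```python
import datetime

currentYear = datetime.date.today().year   # the year on the computer's clock
yob = 2004                                 # hypothetical year of birth
print("Age:", currentYear - yob)
```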
Similar to string and int, there are other data types supported by Python. Let’s take a look.
Example: A card is drawn at random from a deck of well-shuffled cards. Find the probability of it being
neither a king nor a spade.
We know that the deck has a total of 52 cards, split into 4 suits, each containing 13 cards. We can then
compute the desired probability as follows.
ncards = int(52)
nkings = int(4)
nspades = int(13)
nspadeking = int(1)
nnonspadenonking = ncards - (nkings + nspades - nspadeking)
probnonspadenonking = nnonspadenonking / ncards
print('Probability of nonking, nonspade is', probnonspadenonking)
We can restrict the output to a few decimal digits using format specification (similar to C).
Let’s understand this formatted printing. Format specifier “%f” instructs printing of the next argument
%probnonspadenonking as a floating-point value, which by default restricts the output to six digits
after the decimal point (the unformatted value is printed with many more digits). Format specifier
“%.2f” restricts it further to 2 decimal digits. Note that there is no comma between the format
specifier and %variable.
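The two format specifiers can be tried out with a short sketch. The shorter variable name prob is used here only for brevity; 36 is the number of cards that are neither a king nor a spade, computed as 52 - (4 + 13 - 1).

```python
prob = 36 / 52             # probability of neither a king nor a spade
print(prob)                # unformatted: many digits after the decimal point
print("%f" % prob)         # 0.692308 (six digits after the decimal point)
print("%.2f" % prob)       # 0.69
```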
Another non-technical point this program reveals is about the variable names. As you may notice, the
variables are difficult to read. One can use camelCase to improve it.
nCards = int(52)
nKings = int(4)
nSpades = int(13)
nSpadeKing = int(1)
nNonSpadeNonKing = nCards - (nKings + nSpades - nSpadeKing)
probNonSpadeNonKing = nNonSpadeNonKing / nCards
print('Probability of nonKing, nonSpade is', probNonSpadeNonKing)
Good Programming Practice : Use variable names that are easier to read and understand.
nstr = input()
n = int(nstr)
sum = n * (n + 1) / 2
print(sum)
We now understand that input() will return a string, which we need to convert to an integer (Line 2).
After this, we apply the formula (Line 3) and print the sum. For input 10, the program should print 55.
10
55.0
From the output, it is clear that the computation is reasonable, but the sum is printed as a real number.
This happens because the division operator (/) uses a floating point division. We can convert it to an
integer in multiple ways.
n = int(input())
sum = n * (n + 1) / 2
print(int(sum)) # using explicit conversion
print("%d" % sum) # using format specifier
10
sum = n * (n + 1) // 2 # using integer division
print(sum)
Line 1 reads the input string, converts it into a number, and stores it in variable n. Line 2 computes the
sum, which is by default a real value. We print it in Line 3 using the int() function. There is also a format
specifier “%d” to print the value as an integer. We use that in Line 4. Alternatively, Python provides an
integer division operator (//). On Line 5, we use it (now the same sum variable stores an integer value).
We print the integer on Line 6.
The program highlights multiple points. First, types can be interconvertible. Second, a variable does not
have a fixed type; it can have values of different types at different points in the program (e.g., variable
sum). Hence, we do not declare them in Python. Third, notice comments on Lines 3 and 4 in gray font.
Comments start with # and last till the end of that line. Comments are for improving the readability of
the program and are not executed. Multiline comments in Python are often written using triple quotes.
Good Programming Practice : Use comments judiciously to improve the code readability.
Examples
Program Output
print(5 + 3 / 4 - 2) 3.75
Python has inbuilt support for complex numbers. The following example conveys its syntax.
Program                                      Output
1. x = 1 + 2j              # create
2. y = complex(1, -2)      # create
3.
4. z = x - y               # arithmetic
5. print(z)                                  4j
6.
7. z = x * y               # arithmetic
8. print(z)                                  (5+0j)
9. print(z.real, '+', z.imag, 'j')           5.0 + 0.0 j
Lines 1 and 2 present two ways to create complex numbers (we can also read a complex
number as a user input). Lines 4 and 7 illustrate simple arithmetic involving complex numbers.
Finally, Line 9 shows how to separate its real and imaginary parts.
1.7 Strings
String is a fundamental data type in Python and the language provides many ways of manipulating
strings. Strings are enclosed in single-, double-, or triple-quotes. Triple-quoted strings can span multiple
lines. Certain characters such as ‘\n’ have special meaning and are not treated as two characters, but
one. These are called escape characters or escape sequences. Strings can be concatenated readily by
using the + operator. Note that concatenation does not add a space between the two strings.
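Concatenation and escape sequences can be checked with a small sketch. The variable names are illustrative.

```python
first = "Python"
second = "Programming"
joined = first + second            # PythonProgramming (no space is added)
spaced = first + " " + second      # Python Programming (space added by us)
twoLines = "Hello\nWorld"          # \n counts as a single character
print(joined)
print(spaced)
print(len(twoLines))               # 11
```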
Examples
Program                                Output
print("My answer is", 3 * "No! ")      My answer is No! No! No!
n = int(input())                       5
print(n * "Python ")                   Python Python Python Python Python
As the final example shows, strings can also be repeated easily with a multiplier.
Python provides a large set of operators and functions to manipulate strings. We illustrate this with a few
examples below.
Examples
For the following set of examples, assume this initialization:
name = "Python Programming"
ID Program Output
6 print(name[1::2]) yhnPormig
The first program with ID 1 indexes into the string name and extracts individual characters
(which are actually one-length strings). Indexes start with 0. Thus, name[0] prints the first P.
To find the length of a string, that is, the number of characters in it, len() function can be used.
The last letter is at index len(name) - 1.
The program with ID 2 tries to modify the string by writing to name[0]. Incidentally, this is
disallowed in Python (such strings are called immutable). If we want to modify a string, we
need to assign the complete string. This is shown in the program with ID 5 where name is
reassigned to a different string.
The program with ID 3 strangely uses negative indices! Python supports negative indexing
whose meaning is counting backwards. Thus, name[-1] indexes the last character. Since this
reverse indexing starts with -1, the first letter will be at index -len(name).
The program with ID 4 extracts a substring. This can be done by specifying the index range [x:y],
where all the characters starting with index x up to, but not including, index y form the
substring. Thus, name[1:5] includes name[1], name[2], name[3], and name[4] (ytho); it does
not include name[5]. We can extract the second word with name[7:len(name)], which prints
Programming. Note again that len(name) is an index one past the last letter’s index.
The program with ID 5 extracts the two words from name. Not specifying the first index of the
range defaults to 0, while not specifying the second index defaults to the length of the string.
The last program with ID 6 prints letters at odd indices. It starts from index 1 (letter y), goes up
to the end of the string, and increments the index by 2.
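The operations discussed above can be collected into one sketch. This is not a reproduction of the table of programs; it simply exercises each operation the text describes on the same string.

```python
name = "Python Programming"
print(name[0])                # P (indexes start with 0)
print(len(name))              # 18
print(name[-1])               # g (last character)
print(name[-len(name)])       # P (first character, counted backwards)
print(name[1:5])              # ytho
print(name[7:len(name)])      # Programming
print(name[:6], name[7:])     # Python Programming
print(name[1::2])             # yhnPormig
# name[0] = 'J' would raise a TypeError, since strings are immutable.
name = "J" + name[1:]         # assigning a complete new string is allowed
print(name)                   # Jython Programming
```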
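original = "eleven plus two" # assumed here; this value is consistent with the slices below
first = original[:6] # eleven
second = original[6:12] # plus (with a space on either side)
third = original[12:] # two
first2 = first[:2] # el
first3 = first[2] # e
first45 = first[3:5] # ve
first6 = first[-1] # n
third2 = third[:2] # tw
third3 = third[-1] # o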
Range / Slice L[1:5] T[1:5] Not supported
Note that in case of a list, L[2] = [] will not remove L[2]. Instead, it will replace L[2] with an
empty list. Also note that one-length tuple is indicated with an extra comma, to distinguish it
from a value, since (1) will be treated as parenthesized value 1.
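Both points can be verified with a short sketch. The del statement shown here is one standard way to actually remove a list element; the sample values are arbitrary.

```python
L = [10, 20, 30, 40]
L[2] = []            # replaces the element; the list still has four items
print(L)             # [10, 20, [], 40]
del L[2]             # this removes the element at index 2
print(L)             # [10, 20, 40]

T = (1,)             # one-length tuple: note the trailing comma
V = (1)              # just the integer 1 inside parentheses
print(type(T), type(V))
```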
A dictionary should be viewed as a storage for key-value pairs. The mapping from keys to
values is captured in a dictionary.
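As an illustration, a dictionary mapping a few states to their capitals could be written as below. The choice of keys and values is only an example.

```python
capital = {"Maharashtra": "Mumbai", "Tamil Nadu": "Chennai"}
capital["Madhya Pradesh"] = "Bhopal"     # add a new key-value pair
print(capital["Tamil Nadu"])             # Chennai (look up a value by its key)
print(len(capital))                      # 3
print("Kerala" in capital)               # False (no such key)
```

Unlike lists and tuples, a dictionary is indexed by its keys rather than by positions.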
Future connect : Following programs can be implemented in a generic way using loops.
Example: Store and print all the vowels from the word Mississippi.
Here, we will use a list to store the vowels.
word = "Mississippi"
vowels = [] # empty list
vowels.append(word[1:2])
vowels.append(word[4:5])
vowels.append(word[7:8])
vowels.append(word[10:11])
print(vowels)
address = '670, New Nandanvan Layout, Near Sham Dham Temple, Nagpur, Maharashtra, 440024'
UNIT SUMMARY
We touched upon the historical aspects of Python’s genesis and delved into writing simple programs
using numbers and strings. We also introduced aggregates such as lists, tuples, and dictionaries.
EXERCISES
A. p=q=r
B. p=q=r=
C. =p=q=r
D. =p=q=r=
A. ‘#’
B. “””#”””
C. sep=’#’
D. end=’#’
A. 75
B. Error
C. 7.57.57.57.57.57.57.57.57.57.5
D. 75.0
A. 111
B. “3”
C. 3
D. 111
A. 5
B. 6
C. 7
D. 12
Answers of Multiple Choice Questions
1. A: p=q=r
2. D: end=’#’
3. C: 7.57.57.57.57.57.57.57.57.57.5
4. D: 111
5. A: 5
PRACTICAL
1. Write a program that reads marks of three quizzes and outputs the total out of 100.
2. Read a string from the user. Assume these to be unique letters in a set. Find out the
size of the powerset of this set. For instance, if the string is “abc”, the set is {a, b, c}
and its powerset is {{}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}} whose size is 8. For
string “python”, the output should be 64.
3. Write a program to find the probability of a card drawn from a standard deck to be
neither a Spade nor a colored card (Jack, Queen, King).
4. Given a tuple of five coefficients e.g. (1, 2, -3, 0, 5), write the corresponding polynomial
in x. For instance, 1x^4 + 2x^3 + -3x^2 + 0x + 5.
5. Find the sum of the geometric series 1 + x + x^2 + x^3 + … + x^n given the values of x and
n.
2 Control Structures
UNIT SPECIFICS
Through this unit we discuss the following aspects:
● Conditional processing using if, if-else, elif
● Looping constructs for, while
● Control-flow alteration using break, continue, pass, and else
RATIONALE
Assignment statements alone cannot enable us to write arbitrary programs. To write general-
purpose codes, we need to alter the straight-line execution path, as well as execute certain
statements repeatedly. This leads to conditional statements and looping constructs, which we study
in this unit. Python also supports specialized variants, which allow us to systematically specify the
desired control pattern.
PRE-REQUISITES
Unit 1
UNIT OUTCOMES
List of outcomes of this unit is as follows:
U2-O1: Use conditional constructs
U2-O2: Use looping constructs
U2-O3: Apply various control structures to solve problems
EXPECTED MAPPING WITH COURSE OUTCOMES
(1 - Weak correlation; 2 - Medium correlation; 3 - Strong correlation)
Unit-2 Outcomes   CO-1   CO-2   CO-3   CO-4   CO-5   CO-6
U2-O1             3      3      3      -      3      1
U2-O2             1      1      2      2      1      -
U2-O3             2      1      3      1      2      1
2.1 Conditionals
Consider two millionaires: you and your friend. You want to identify who is richer. What
do you do? While there are multiple ways to find this out, a direct way is to ask your
friend the amount of property owned. You compare it with your property and find out the
answer.
Note how the outcome can differ depending upon the situation. Thus, the conditional
check on Line 4 decides whether Line 5 is reached or Line 7. Also note that only one of
the two is possible.
The above pseudocode is essentially computing the larger of two numbers, which could
be two property evaluations, two digits, two ages, two times, or even two speeds.
Irrespective of what we compare, the program pattern remains the same. Python allows
us to write such a code using an if construct.
Lines 3–6 present the syntax of an if-else statement. It evaluates the condition
myProperty > friendProperty. Depending upon the value entered, this condition may
evaluate to True or False. If the condition is True, Line 4 gets executed. If the condition
is False, then Line 6 gets executed. Thus, the following two executions of the above
program reveal the conditional execution of statements.
With input 2, the program prints:
I am richer.
With input 15, the program prints:
Nevermind, I am not richer than you.
Thus, the program when executed with input 2, evaluates the condition 10 > 2 to True,
and hence, executes Line 4. In contrast, with input 15, the program evaluates the
condition 10 > 15 to False, and hence, executes Line 6.
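A sketch of such a program, consistent with the description above, is given below. The friend's property is hard-coded as a stand-in for int(input()), and the message is stored in a variable (verdict, a name chosen here) before being printed. The line positions of the if-else statement (Lines 3 to 6) match the description.

```python
myProperty = 10
friendProperty = 15        # stand-in for friendProperty = int(input())
if myProperty > friendProperty:
    verdict = "I am richer."
else:
    verdict = "Nevermind, I am not richer than you."
print(verdict)
```

Changing the stand-in value to 2 makes the condition True, and the other message is printed.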
Enter the two blood groups: A+ A+
The two blood groups match.

Enter the two blood groups: A+ B-
It is a mismatch.
We use the split() function to read two blood groups separated by spaces, and assign
those to two variables in the same assignment statement (Line 1). We then check if the
two blood groups entered are the same. This is done using the == operator, for checking
equality. If the two blood groups are the same, the condition evaluates to True, leading to
execution of Line 3. Otherwise, Line 5 gets executed.
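The description can be turned into a sketch along the following lines. The input line is hard-coded as a stand-in for input("Enter the two blood groups: "), and the variable names are assumptions.

```python
line = "A+ B-"                   # stand-in for input("Enter the two blood groups: ")
group1, group2 = line.split()    # split on spaces; assign both variables at once
if group1 == group2:
    result = "The two blood groups match."
else:
    result = "It is a mismatch."
print(result)
```

Note that == compares two values, whereas a single = assigns a value to a variable.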
The following table shows various comparison operators available in Python.
Op   Meaning                    Usage
<=   Less than or equal to      if a <= b: print('a is less than or equal to b')
>=   Greater than or equal to   if a >= b: print('a is greater than or equal to b')
The above table also illustrates that the else clause is optional.
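For reference, the six comparison operators of Python can be tried out on two sample values. The values 3 and 5 are arbitrary.

```python
a = 3
b = 5
print(a == b)    # False (equal to)
print(a != b)    # True  (not equal to)
print(a < b)     # True  (less than)
print(a > b)     # False (greater than)
print(a <= b)    # True  (less than or equal to)
print(a >= b)    # False (greater than or equal to)
if a <= b:       # an if statement without an else clause
    print('a is less than or equal to b')
```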
Let’s write a program for it. The input is a string holding the roll number. We need to extract the first
two characters (how?) and check if those are ‘C’ and ‘S’. This can be done using nested if-else
statements.
Note the extra indentation for Line 4, due to the nesting. If the first character is ‘C’, it checks the second
character on Line 3, and if it is ‘S’, then it executes Line 4. Otherwise, it executes Line 6. Try running
this program with the four inputs given above. For which inputs does the program give
the expected output?
You would notice that this program gives the correct output for CS23B010, ME23B111, BS23B002,
but does not produce any output for CH23B002. Why? Well, because we did not ask it to. The else
clause on Line 5 corresponds to the if clause on Line 2. As far as the if clause of Line 3 is concerned, it
does not have an associated else clause. Hence, when the condition on Line 3 is False (CH….), the
control comes out of the nested-if statement without any output. How do we fix it?
The code is still wrong. For what input does it produce no output?
Out of the four inputs, the program produces correct output for CS23B010 and CH23B002. But it does
not produce any output for ME… and BS… roll numbers. This is because, due to indentation, the else
clause now corresponds to the if clause on Line 3. Line 2 does not have any corresponding else. So let’s
add it.
Now the program works across all the inputs. But note that there is a duplicate processing on Lines 6
and 8. Can we avoid this duplication?
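A sketch of the nested version with both else clauses is shown below. The roll number is hard-coded as a stand-in for input(), and the reply is stored in a variable (a name chosen here) so that it can be printed once at the end. The duplicated processing sits on Lines 6 and 8, as noted above.

```python
rollNum = "CH23B002"              # stand-in for rollNum = input()
if rollNum[0] == 'C':
    if rollNum[1] == 'S':
        reply = "Hi Bro!"
    else:
        reply = "Excuse me?"      # 'C' followed by something other than 'S'
else:
    reply = "Excuse me?"          # does not start with 'C' at all
print(reply)
```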
Conjuncts
The duplication can be avoided if we can combine the if conditions on Lines 2 and 3. Python allows us
to do that using conjuncts such as and and or.
2. if rollNum[0] == 'C' and rollNum[1] == 'S':
3.     print("Hi Bro!")
4. else:
5.     print("Excuse me?")
The conjunct executes the if-block if both the conditions are true. Otherwise, it executes the else-block.
Line 2 can also be written succinctly as:
2. if rollNum[0:2] == 'CS':
Example: Find the student from your department where the roll number may be in upper-case or lower-case
letters.
To address this, we need to augment our if condition to include lower-case letters too.
Thus, the or conjunct evaluates to True if at least one of the conditions evaluates to True. That is, it
evaluates to False only if all the conditions are False.
Note the use of parentheses to combine the clauses by the and conjunct. This is required because or has
a lower precedence than the and conjunct. Thus, in absence of parentheses, the meaning of Line 2 would
be:
2. if rollNum[0] == 'C' or (rollNum[0] == 'c' and rollNum[1] == 'S') or rollNum[1] == 's':
Can you find out the inputs for which this modified program would produce wrong results?
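One way to explore this question is to run both forms side by side. In the sketch below, the function names and the last sample roll number are made up for illustration, and the parenthesized condition is assumed to be the one described in the text above.

```python
def with_parens(r):
    # intended meaning: first letter is C or c, AND second letter is S or s
    return (r[0] == 'C' or r[0] == 'c') and (r[1] == 'S' or r[1] == 's')

def without_parens(r):
    # and binds tighter than or, so Python groups this as
    # r[0] == 'C' or (r[0] == 'c' and r[1] == 'S') or r[1] == 's'
    return r[0] == 'C' or r[0] == 'c' and r[1] == 'S' or r[1] == 's'

for roll in ['CS23B010', 'cs23b010', 'CH23B002', 'ms23b001']:
    print(roll, with_parens(roll), without_parens(roll))
```

The two functions disagree on the last two roll numbers: any roll number starting with ‘C’, or having ‘s’ as its second character, is accepted by the version without parentheses.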
One may write the above program by reordering the conditions.
Are the last two programs equivalent? The answer is no. The last program allows roll numbers ‘CS…’
and ‘cs…’, but does not allow a mixed-case ‘Cs…’ or ‘cS…’, which is permitted by the earlier program.
Also note how the long statement on Line 2 is split using a backslash at the end of the line. Without the
backslash, the program exhibits a syntax error.
Yet another way in which the above check can be made is by using the string method upper(), which
returns an upper-case copy of the string. The following program illustrates this.
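A minimal sketch of such a check is given below. It assumes that the roll number is held in a string and that any mix of cases is acceptable; the function name is_cs is hypothetical.

```python
def is_cs(rollNum):
    # upper() returns a new string; rollNum itself is not modified
    return rollNum[0:2].upper() == 'CS'

for roll in ['CS23B010', 'cs23b010', 'Cs23b010', 'cS23B010', 'CH23B002']:
    print(roll, is_cs(roll))
```

Converting once and comparing against a single upper-case constant keeps the condition short, and it accepts the mixed-case forms as well.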
Is it possible to print “Excuse me?” in the if-block and “Hi Bro!” in the else-block? This demands
reversing the conditions. Python provides the not keyword to alter the truth-value of a condition. Thus,
the functionally equivalent program would be:
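A sketch of the reversed form is shown below. To make it easy to try out, the greeting is returned from a function (the name greet is an assumption) instead of being printed directly.

```python
def greet(rollNum):
    # not binds tighter than and, so the parentheses are needed
    # to negate the whole condition
    if not (rollNum[0] == 'C' and rollNum[1] == 'S'):
        return "Excuse me?"
    else:
        return "Hi Bro!"

print(greet('CS23B010'))
print(greet('ME23B111'))
```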
When the program execution finds out that rollNum[0] is not ‘C’, does it need to check if rollNum[1]
is ‘S’ or not? It does not need to, since the truth value of the condition is anyway going to be False. This
way of evaluating conditions is called short-circuiting.
Short‐circuiting
In short-circuiting, only as many sub-conditions as required to find the truth value of the whole condition
are evaluated. This can lead to some sub-conditions not getting evaluated. We explain this using an
example: a == 0 and (b < 3 or c <= 5). In the table below, ✔ indicates that the corresponding
sub-condition is evaluated. An empty cell indicates that the sub-condition is short-circuited and, therefore,
not evaluated.

Values                  a == 0   b < 3   c <= 5   Whole condition
a = 0, b = 3, c = 6       ✔        ✔       ✔      False
a = 0, b = 3, c = 5       ✔        ✔       ✔      True
a = 0, b = 2, c = 5       ✔        ✔              True
a = 0, b = 2, c = 6       ✔        ✔              True
a = 1, b = 3, c = 5       ✔                       False

Note that a row such as the following is not possible for this example: whenever exactly two
sub-conditions are evaluated, b < 3 must have been True, which makes the whole condition True.

a = …, b = …, c = …       ✔        ✔              False
One may wonder how such a phenomenon affects you, since this seems to be only an optimization and
not affecting any output. The output gets affected when sub-conditions can have side-effects. For
instance, if a, b, c are replaced with function calls, and each function has a print() statement, then the
output will depend upon whether the short-circuiting happens or not.
Future connect : We will study functions in detail in another unit. But to illustrate short-circuiting, we
present an example using functions.
1. def a():
2.     print("a")
3.     return True
4. def b():
5.     print("b")
6.     return True
7. def c():
8.     print("c")
9.     return True
10.
11. if a() and (b() or c()):
12.     print("Whole condition is true.")
13. else:
14.     print("Whole condition is false.")
The above program defines three functions a(), b() and c(), which print a message and return True. The
main program starts from Line 11 (similar to our programs so far), and evaluates the condition. After
evaluating function a() and function b(), the condition is guaranteed to be True, hence, function c() is
not executed. Therefore, the output of the above code is:
a
b
Whole condition is true.
Does this mean short-circuiting is useful only while using functions? Not at all. Short-circuiting can be
very useful to guard against an invalid access. For instance, consider the condition below:
if index < len(mystring) and mystring[index] == 'x':
With short-circuiting, if the index is beyond the string then the first sub-condition is False. Therefore,
the second sub-condition will not be evaluated, avoiding the out-of-bound access.
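The effect of the guard can be observed by swapping the two sub-conditions. The following sketch wraps both orders in functions (the function names and the sample string are assumptions made for illustration).

```python
def has_x_at(mystring, index):
    # The bounds check comes first; if it is False,
    # mystring[index] is never evaluated.
    return index < len(mystring) and mystring[index] == 'x'

def has_x_at_unsafe(mystring, index):
    # Sub-conditions swapped: indexing happens before the bounds check.
    return mystring[index] == 'x' and index < len(mystring)

print(has_x_at('taxi', 2))     # index within the string
print(has_x_at('taxi', 10))    # index beyond the string: no error

try:
    has_x_at_unsafe('taxi', 10)
except IndexError as err:
    print('IndexError:', err)
```

The order of sub-conditions therefore matters: the cheap validity check should be placed before the access it protects.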
Lucky you!
Line 1 uses the split() method to separate the three words and, accordingly, to find the suit and the card
in it. It then runs a large if condition, separated by or, split on multiple lines connected by backslash (\).
If any one of the conditions is satisfied, it is a lucky card. Note that Line 7 should use string ‘7’ and not
integer 7.
We would like to now enhance this program to check for invalid cards. Thus, inputs such as ‘seven of
Spade’ or ‘7 of spade’ or ‘7 Spade’ or ‘11 of Spade’ should be marked as invalid.
In the above program, Line 1 and Lines 10 – 17 remain the same as before, except for replacing if on Line
10 with elif. In addition, we create another long condition to check if the card is invalid. This is done
with subconditions to check the suit (Lines 3 and 4) and subconditions to check the card (Lines 5 – 7).
The two groups of subconditions are joined with an or (end of Line 4; note the parentheses).
The condition can be simplified using De Morgan’s laws. The laws state:
(not x) and (not y) is equivalent to not (x or y)
(not x) or (not y) is equivalent to not (x and y)
Thus, Lines 3–8 can be rewritten as:
Carefully note the application of the laws. The inner subconditions now contain ==, which are easier to
decipher. The conditions on the suit and the card are joined using an and, while the subconditions of a
suit are joined using or. If the big condition is true, it means that it is a valid card. Therefore, to check
invalidity, a negation using not is required for the whole condition (note the parentheses).
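Since x and y can each take only two truth values, De Morgan’s laws can be checked exhaustively. The sketch below does this for all four combinations; the function name check_de_morgan is an assumption.

```python
from itertools import product

def check_de_morgan():
    # try all four combinations of truth values for x and y
    for x, y in product([False, True], repeat=2):
        law1 = ((not x) and (not y)) == (not (x or y))
        law2 = ((not x) or (not y)) == (not (x and y))
        print(x, y, law1, law2)
        if not (law1 and law2):
            return False
    return True

print(check_de_morgan())
```

Every row reports True for both laws, which is why a condition built from != and or can be rewritten as the negation of one built from == and and.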
2.2 Loops
Some programs cannot be written without the ability to repeat. For instance, consider printing
‘Hello World!’ 100 times. One can write 100 print() statements easily. However, if I now ask
you to write a program to take a number from the user and print ‘Hello World!’ that many
times, you would be stuck!
Loops allow us to repeat an arbitrary piece of code an arbitrary number of times. Python
supports two types of loops: while and for. The while loop iterates through (repeats) a
sequence of code as long as a given condition is True. The for loop iterates over the items of a given
sequence. We will study both in detail now.
While Loop
Let’s first write our program to print a message a certain user-defined number of times.
1. n = int(input())
2. i = 0
3. while i < n:
4.     print('Hello World!')
Note the similarity of the loop’s structure with that of an if statement. The body of the loop (in this case,
Line 4) is repeated as long as the condition on Line 3 is true.
If you enter 10 as input, how many ‘Hello World!’s does our program print? 9 or 10?
Well, it continues to print ‘Hello World!’ an unbounded number of times. Why? Because we asked it
to. The condition i < n continues to remain true, as the value of i never changes in the loop. To get the
expected result, we need to increment i.
1. n = int(input())
2. i = 0
3. while i < n:
4.     print('Hello World!')
5.     i = i + 1   # progress
To find the pass percentage, we need the number of students P passing the course, and the total number
of students N in the class. The output then is P * 100 / N. Finding P needs us to go through all the marks.
How do you find N?
There are two ways. One, we ask the teacher to enter N at the start of our program. Two, we derive it
based on a special end-of-class marker (e.g., -1 for marks). Let’s write the program both ways.
Note how we added blank lines 4, 7, and 10, which makes the code tidy and often easier to understand.
The code can be enhanced to check for the divide-by-zero error at Line 13.
Good Programming Practice : Add blank lines to improve readability of the code. A typical
convention is to have a blank line prior to and after a control construct (and functions), especially for
large programs.
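The first approach, together with the divide-by-zero check mentioned above, can be sketched as follows. To keep the sketch easy to try, the marks are taken from a list instead of being read with input(); the function name, the list of marks, and the pass mark of 40 are assumptions.

```python
def pass_percentage(marks_list, pass_mark=40):
    n = len(marks_list)          # N is known before the loop starts
    if n == 0:                   # guard against division by zero
        return None

    passed = 0
    i = 0

    while i < n:                 # number of iterations based on N
        if marks_list[i] >= pass_mark:
            passed = passed + 1
        i = i + 1

    return passed * 100 / n

print(pass_percentage([5, 10, 50, 32, 23, 40, 89, 100, 99, 89]))
print(pass_percentage([]))
```

Returning None for an empty class is only one possible design; printing a message or asking for the input again would work equally well.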
Let’s now write it with an end-of-class marker. Assuming marks are always non-negative (no negative
marking by the teacher), we can use -1 as a special marker. As soon as that number is entered, our
program can know that marks of all the students are entered and we can now compute the percentage.
Enter marks: 5
Enter marks: 10
Enter marks: 50
Enter marks: 32
Enter marks: 23
Enter marks: 40
Enter marks: 89
Enter marks: 100
Enter marks: 99
Enter marks: 89
Enter marks: -1
60.0 % passed
The output was computed by counting the number of non-negative inputs as N. The corresponding
program is as follows.
1. n = 0
2. passed = 0   # pass is a Python keyword, so it cannot be used as a variable name
3. marks = int(input('Enter marks: '))
4.
5. while marks >= 0:   # number of iterations based on data
6.     if marks >= 40:
7.         passed = passed + 1
8.
9.     n = n + 1
10.     marks = int(input('Enter marks: '))
11.
12. print(passed * 100 / n, '% passed')
Note how the condition changed at Line 5. It is now based on the marks entered. The students are
counted at Line 9. Note also the usage of the same statement at Lines 3 and 10. You will find this to be
a recurrent pattern.
One can use tricks to avoid the repetition. For instance, by initializing n to -1 and marks to 0, and removing
the input() statement that precedes the loop (Line 3). The code will be functionally equivalent to the above
code. But it is important to make sure that the code is readable.
If we expand the Fib function, we get the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, … Since it is unending,
we specify how many terms to print, that is, the value of n.
While Fibonacci numbers have a strong presence in Mathematics, a few natural phenomena also follow
Fibonacci numbers (e.g., how branches emerge in trees or how petals of certain flowers are arranged).
Can we use loops to print this sequence? We can get started like this (as a pseudocode).
n = int(input())
first number = 0
second number = 1
while n > 0:
    print the first number
    Shall I print the second number?
    sum = first number + second number
    What happens in the second iteration?
    n = n - 1
first number    second number    sum
0               1                1
1               1                2
1               2                3
2               3                5
3               5                8
5               8                13
…               …                …
1. n = int(input())
2. first = 0
3. second = 1
4.
5. while n > 0:
6.     print(first)
7.     sum12 = first + second
8.     first = second
9.     second = sum12
10.     n = n - 1
Note the order of statements 7, 8, and 9. Reordering of these statements would result in a wrong output.
By the way, we can use sum instead of sum12 and the code would work. But since sum() is a predefined
function in Python, we would like to avoid using the same name.
The same code can be succinctly written using the multiple-assignment form, which gets rid of the
sum12 variable.
1. n = int(input())
2. first, second = 0, 1
3.
4. while n > 0:
5.     print(first)
6.     first, second = second, first + second
7.     n = n - 1
It is important to understand that for the program to work correctly, all the expressions on the right hand
side of = must be evaluated before any assignment to the left hand side variables happens. This allows
you to swap two variables easily: x, y = y, x
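The difference between the multiple-assignment form and two separate assignments can be seen in the sketch below. The two function names are made up for illustration; each performs one step of the Fibonacci update.

```python
def step_simultaneous(first, second):
    # both right-hand sides are computed from the OLD values
    first, second = second, first + second
    return first, second

def step_sequential(first, second):
    first = second
    second = first + second   # uses the NEW value of first: wrong
    return first, second

print(step_simultaneous(2, 3))
print(step_sequential(2, 3))

x, y = 10, 20
x, y = y, x                   # swap without a temporary variable
print(x, y)
```

Starting from 2 and 3, the simultaneous form yields 3 and 5, whereas the sequential form yields 3 and 6, which is no longer a Fibonacci pair.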
1. n = int(input())
2.
3. while n != 1:
4.     if n % 2 == 1:
5.         n = 3 * n + 1
6.     else:
7.         n = n // 2
8.     print(n)
9.
10. print('How do I always get printed?')
For Loops
For loops work with an ordered sequence, wherein the loop variable assumes the value of each element
in the sequence from left to right in different iterations. For instance, the simple form of a for loop can
be used to print numbers from 0 to 9.
1. for i in range(10):
2. print(i)
In the above program, i is the loop variable, going over the sequence returned by range(). The function
could also be used to step through the sequence in a strided manner. For instance, consider the following
code.
1. for i in range(10, 20, 3):
2.     print(i)
Its output is
10
13
16
19
The same code can be equivalently written using a while loop as follows.
1. i = 10
2. while i < 20:
3.     print(i)
4.     i = i + 3
Unlike the while loop, the for loop iterates through a finite sequence, so we usually do not have to
worry about the termination condition.
1. numbers = [65, 32, -3, 32, 43, 32, -22, 2, -5, -7, 0, -9]
2. for n in numbers:
3.     if n < 0:
4.         print(n)
We can store the continents in a list and iterate through them using a for loop. To find the length of a
string, we can use the len() function. Finding the maximum length requires maintaining a maxlen
variable. However, to print the continent with the maximum length, we also need to store it in the
maxcont variable – whenever we update maxlen. The corresponding code is below.
2. maxlen = 0   # this initialization is important
3.
4. for cont in continents:
5.     if maxlen < len(cont):
6.         maxlen = len(cont)
7.         maxcont = cont
8.
9. print(maxcont)
Since there are two answers to this question, think about what would our code output. If we want the
other answer to be output, what small change would you make to the code?
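The effect of the comparison operator on ties can be explored with the sketch below. The list of continent names and its order are assumptions, as are the function name and the prefer_last parameter; with prefer_last set, the comparison behaves like <= instead of <.

```python
def longest_name(names, prefer_last=False):
    maxlen = 0
    maxname = None

    for name in names:
        # '<' keeps the first longest name; also accepting equality
        # (as '<=' would) moves on to the last longest name
        if maxlen < len(name) or (prefer_last and maxlen == len(name)):
            maxlen = len(name)
            maxname = name

    return maxname

continents = ['Asia', 'Africa', 'North America', 'South America',
              'Antarctica', 'Europe', 'Australia']

print(longest_name(continents))
print(longest_name(continents, prefer_last=True))
```

With this ordering of the list, the first call reports North America and the second reports South America, since both names have 13 characters.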
Future Connect : We will learn about modules in a later unit. Consider a module to be a library of
useful functions related to a specific domain.
In the above program, only 2 is an even divisor which needs to be checked, but we are unnecessarily
checking against 4, 6, 8, … This can be easily avoided as follows.
4.
5. if n % 2 == 0:
6.     print('Composite')
7.     sys.exit()
8.
9. for i in range(3, 1 + int(math.sqrt(n)), 2):
10.     if n % i == 0:   # if i divides n
11.         print('Composite')
12.         sys.exit()
13.
14. print('Prime')
Unfortunately, the output of the above code consists of only two numbers.
3
Prime
4
Composite
This happens because we have sys.exit() after printing Composite. Exiting the program was okay for a
single number, but not for a sequence of numbers. Ideally, we want that after finding out that a number
is composite, we should not print Prime, but still go to the next number for primality checking.
Python provides loop modifiers to enable such a control-flow. The keyword break brings the control
out of the nearest enclosing loop, while the keyword continue skips the rest of the current iteration
and goes to the next one. The whole program would now be as follows.
1. import math
2.
3. for n in range(3, 100+1):
4.     print(n)
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue   # go to the next number
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0:   # if i divides n
12.             print('Composite')
13.             break   # come out of the loop at Line 10
14.
15.     print('Prime')
After Line 6 finds out that a number is even and Line 7 reports it, Line 8 moves on to the next number in
the range specified at Line 3, skipping Lines 9 – 15. Similarly, after finding a number composite at Line 12,
Line 13 brings the control to Line 14, out of the for loop at Line 10. Note that Line 13 is enclosed in two
loops (Line 3 and Line 10), but it comes out of the nearest enclosing loop, that is, Line 10’s loop. The
program now prints all the numbers from 3 to 100, and most of the outputs are correct, but some are
wrong. For instance, the initial few lines are:
3
Prime
4
Composite
5
Prime
6
Composite
7
Prime
8
Composite
9
Composite
Prime
10
Composite
…
Notice the output for number 9. Our code prints both Composite and Prime. Why does this happen?
This is because break at Line 13 brings the control to Line 14. After that Line 15 continues to print
Prime.
How do we correct this? We can keep track of whether Composite was printed or not. If it was, then we
need not print Prime; otherwise we should. The modified code looks like this.
1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue   # go to the next number
9.
10.     composite = False
11.
12.     for i in range(3, 1 + int(math.sqrt(n)), 2):
13.         if n % i == 0:   # if i divides n
14.             print('Composite')
15.             composite = True
16.             break
17.
18.     if not composite:   # number is not composite
19.         print('Prime')
The above code works well and produces the expected output.
Future Connect : Such a reuse of functionality (checking primality of one number) can be nicely
encoded using functions. We can then call such a function in a loop.
Python provides a special construct to support the functionality we implemented using the variable
composite. It is the else clause of a loop (for or while).
1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0:   # if i divides n
12.             print('Composite')
13.             break
14.     else:   # else corresponding to for
15.         print('Prime')
Note the indentation of the else clause at Line 14. It corresponds to the for loop at Line 10 (and not to
the if statement at Line 11). The else clause is executed if the loop exhausts all the elements in the
sequence of numbers (or when the while condition becomes False), but does not get executed if the loop
gets terminated by break. To understand this clause better, let’s look at the following simple example.
Can you analyze the program and find out the output of the above code? It is as below.
1
2
3
4
5
6
7
8
9
End: 9
1
2
3
4
5
6
7
8
9
End: 10
Note how the while loop increments the variable to 10, whereas the for loop leaves its loop variable at 9
at the end.
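The output above is consistent with a program of the following shape. This is a sketch that reproduces the same output, not necessarily the exact listing; the two function names are assumptions.

```python
def for_version():
    for i in range(1, 10):
        print(i)
    else:                      # runs because the loop was not left by break
        print('End:', i)       # i keeps the last value taken from range()
    return i

def while_version():
    i = 1
    while i < 10:
        print(i)
        i = i + 1
    else:                      # runs when the condition becomes False
        print('End:', i)       # i was incremented once more before the test failed
    return i

for_version()
while_version()
```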
To understand the use of break and continue, let’s consider the following example.
Example: Given a list, print all the positive numbers, but stop as soon as 0 is reached.
For instance, if the input list is [-4, 2, 54, 21, -32, 3, 6, 3, 1, 0, -5, 321], the output should print the
values 2, 54, 21, 3, 6, 3, 1. It should not print 321. The program is as below.
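A minimal sketch for this example follows. It collects the values in a list and returns them so that the result is easy to inspect; the function name positives_until_zero is an assumption.

```python
def positives_until_zero(numbers):
    result = []
    for n in numbers:
        if n == 0:
            break        # stop processing the list altogether
        if n < 0:
            continue     # skip this number, move to the next one
        result.append(n)
    return result

print(positives_until_zero([-4, 2, 54, 21, -32, 3, 6, 3, 1, 0, -5, 321]))
```

Here continue skips the negative numbers, while break abandons the loop at the first 0, so 321 is never reached.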
Python also supports a pass statement, which can be used when no processing is required.
For instance, consider a program where we wish to find the sum of positive integers in a list which may
contain negative numbers also.
Of course, the program can be modified to not require pass. But sometimes, it can be useful for
readability purposes.
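A sketch of such a program is given below; the function name and the sample list are made up for illustration.

```python
def sum_positive(numbers):
    total = 0
    for n in numbers:
        if n < 0:
            pass                 # nothing to do for negative numbers
        else:
            total = total + n
    return total

print(sum_positive([3, -1, 4, -1, 5]))
```

The same result can be obtained without pass by testing n >= 0 directly; the version above merely makes the "do nothing" case explicit.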
1. price = int(input('Price: '))
2. deno1, deno2, deno3 = map(int, input('Denominations: ').split())
3.
4. print("Can you form", price, "exactly using", deno1, deno2, deno3, "?")
5. found = False   # set to True once a combination is printed
6.
7. for d1 in range(0, 1 + price // deno1):
8.     for d2 in range(0, 1 + price // deno2):
9.         for d3 in range(0, 1 + price // deno3):
10.             if d1 * deno1 + d2 * deno2 + d3 * deno3 == price:
11.                 print(d1, d2, d3)
12.                 found = True
13.
14. if not found:
15.     print('No')
Line 2 uses the split() function to receive three fields (strings). To map those to integers, we can use the
map() function. Lines 7–9 run a triply-nested loop which checks all possible combinations of the three
denominations, upper bounding each count by the price. For instance, if the price is 550, there is no point
in using more than five 100-rupee notes. If such an exact combination is found, we print it at Line 11 and
record this fact in the variable found at Line 12. If no such combination exists, found remains False and
we print No at Line 15. Note that an else clause attached to the outermost for loop would not serve this
purpose: since none of the loops uses break, that else clause would run every time, and No would be
printed even after valid combinations.
UNIT SUMMARY
We explored various basic control constructs in Python such as conditionals and loops. We solved
various problems using those constructs. We also looked at various loop modifiers which allow special
control flow.
EXERCISES
n1, n2 = input().split()
if n1 < n2:
    print(n1, "<", n2)
elif n1 == n2:
    print(n1, "==", n2)
elif n1 < n2:
    print(n1, "again <", n2)
else:
    print(n1, ">=", n2)
A. 100 < 99
B. 100 >= 99
C. 100 == 99
D. 100 again < 99
2. Suggest a value for the dob variable that illustrates usefulness of short-circuiting for the following
conditional statement.
A. ‘29/10/1999’
B. ‘29/08/2000’
C. ‘1/1/2001’
D. ’123’
numbers = [65, 32, -3, 32, 43, 32, -22, 2, -5, 64, 0, 8]
i = 0
for n in numbers:
    if 2**i == n:
        print(n)
    i = i + 1
A. 32 32 64 8
B. 32 64 8
C. 32 64
D. 32
4. What should be the value of the expression X in the following program to print all the squares between
3 and 99?
The output of the above code is:
Hello World!
Hello from loop
Hello from loop
Such arguments can help tune the functionality. This is what we do with a print() function.
Recall our program from Unit 2 which found out if a range of numbers is prime or not. We will modify
it to use functions and see how the program looks simplified.
Example: Find out if numbers from 3-to-100 are prime, using functions.
1. import math
2.
3. for n in range(3, 100+1):
4.     print(n, '', end='')
5.
6.     if n % 2 == 0:
7.         print('Composite')
8.         continue
9.
10.     for i in range(3, 1 + int(math.sqrt(n)), 2):
11.         if n % i == 0:   # if i divides n
12.             print('Composite')
13.             break
14.     else:   # else corresponding to for
15.         print('Prime')
Which part can we move to a function? That is up to us. For instance, we can simply move the printing of
Prime or Composite into a function. Alternatively, we can convert the for loop at Line 10 into a
function. From the perspective of this example, we can make the primality testing of a given number a
function. This function can then be called from the loop at Line 3.
1. import math
2.
3. def isPrime(n):
4.     if n % 2 == 0:
5.         print('Composite')
6.         continue   # there is no enclosing loop
7.
8.     for i in range(3, 1 + int(math.sqrt(n)), 2):
9.         if n % i == 0:   # if i divides n
10.             print('Composite')
11.             break
12.     else:   # else corresponding to for
13.         print('Prime')
14.
15. for n in range(3, 100+1):
16.     print(n, '', end='')
17.     isPrime(n)
See how we have moved the code to the isPrime() function and the main module is simplified (Lines
15 – 17). Unfortunately, this code reorganization poses an issue. The continue statement at Line 6 is
now no longer inside a loop. Hence, it is a syntax error. From the perspective of a function, we want
that after printing Composite at Line 5, the control should go back to the caller and the next number
should be checked for primality. This can be done using a return statement.
6.         return
Now the program works as expected. It would be nicer, however, if the caller could take charge of the
processing after finding out whether a number is prime or not. This demands that the function isPrime()
return True or False.
1. import math
2.
3. def isPrime(n):
4.     if n % 2 == 0:
5.         return False
6.
7.     for i in range(3, 1 + int(math.sqrt(n)), 2):
8.         if n % i == 0:
9.             return False
10.     else:
11.         return True
12.
13. for n in range(3, 100+1):
14.     if isPrime(n): print(n, end=', ')
Lines 5, 9, and 11 return a boolean value depending upon the primality of the argument. If the
number is composite, the caller does not do anything; otherwise, it prints the number. The output of the
above code is:
3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,
1. import math
2.
3. def getAllPrimes(maxn):   # find primes up to maxn
4.     global allPrimes   # I want to access allPrimes list
5.
6.     for n in range(3, maxn+1):
7.         if n % 2 == 0:
8.             continue   # we can use this now
9.
10.         for i in range(3, 1 + int(math.sqrt(n)), 2):
11.             if n % i == 0:
12.                 break
13.         else:
14.             allPrimes.append(n)   # add to the list
15.
16. # main module begins here
17. allPrimes = []
18. getAllPrimes(100)   # populate allPrimes
19. print(allPrimes)
20. getAllPrimes(50)   # populate allPrimes
21. print(allPrimes)
The function getAllPrimes() remains almost the same, except for two changes. One: at Line 4, it marks
allPrimes to be a global variable. Two: it does not (need to) return allPrimes. allPrimes is now a regular
variable in the main module initialized to an empty list at Line 17. It is populated by getAllPrimes() at
Lines 18 and 20. Note that after these two function calls, we simply print allPrimes list in Lines 19 and
21. What will be the output of the code?
If you said that the output will remain the same, that is not correct. Here is the output.
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 3, 5, 7, 11,
13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
Note how the second call to getAllPrimes() has appended the prime numbers to the original list. This
happened because we forgot to initialize the list to empty before calling getAllPrimes(50) at Line 20.
Remember : Global variables make the code easy, but can also lead to initialization errors if we are
not careful.
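One way to sidestep this pitfall is to let the function build and return a fresh list on every call, so that no state is carried over between calls. The sketch below follows the same logic as the listing above; the snake_case name get_all_primes and the local variable primes are assumptions.

```python
import math

def get_all_primes(maxn):
    primes = []                      # a new list for every call

    for n in range(3, maxn + 1):
        if n % 2 == 0:
            continue

        for i in range(3, 1 + int(math.sqrt(n)), 2):
            if n % i == 0:
                break
        else:
            primes.append(n)

    return primes

print(get_all_primes(100))
print(get_all_primes(50))
```

The second call now reports only the primes from 3 to 50, since nothing is left over from the first call.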
To understand the scope of global variables, let’s look at a simple example. What will be the output of
this code?
1. g = 10   # global variable
2. print("1:", g)
3.
4. def fun():
5.     g = 20   # local variable
6.     print("3:", g)
7.
8. g = 30   # global variable
9. print("2:", g)
10. fun()
11. print("4:", g)
A function can be defined in the middle of the main module (Line 4). The program works as follows.
To summarize, when fun() accesses a variable, it is the local variable by default. If we want it to access
the global variable g, we will have to add the statement: global g, similar to the last example (global
allPrimes). Then the output will change to 4: 20.
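The effect of the global statement can be observed by placing the two variants side by side. In the sketch below, the function names fun_local and fun_global are made up for illustration.

```python
g = 10

def fun_local():
    g = 20          # creates a local g; the global g is untouched
    return g

def fun_global():
    global g        # from here on, g refers to the module-level variable
    g = 20
    return g

g = 30
fun_local()
print("after fun_local: ", g)

fun_global()
print("after fun_global:", g)
```

The first print shows that the global g still holds 30, while the second shows 20, because only fun_global() assigned to the global variable.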
Good Programming Practice : Use global variables only if there are central structures holding data
which you need to access from several functions. Otherwise, avoid globals and pass the data as
arguments to functions.
10. for n in range(1, 10000):
11.     factors = findFactors(n)
12.
13.     if n == sum(factors):
14.         print(n, ":", factors)
We make use of two functions: findFactors and sum. The latter is a predefined function in Python, while
we will have to write the former. With our experience of primality testing, we know how to write it.
1. def findFactors(n):
2.     factors = []
3.
4.     for f in range(1, n):
5.         if n % f == 0:
6.             factors.append(f)
7.
8.     return factors
Fibonacci strings have a special property. If we remove the last two digits, then the string reads the same
forward and backward. Such strings are called palindromes (e.g., nitin, a, malayalam, amma, tattarrattat,
and if you permit spaces and punctuation, “borrow or rob”, “was it a car or a cat I saw?”, “never odd or
even”, “A man, a plan, a canal – panama”).
1. def isPalindrome(bstring):
2.     blen = len(bstring)
3.     first = bstring[:blen // 2]   # integer division to get the index
4.     second = bstring[(blen + 1) // 2:]
5.
6.     if first == second[::-1]:   # extended range, for reversing
7.         return True
8.
9.     return False
10.
11. # main module begins
12. first = "0"
13. second = "01"
14. i = 0
15.
16. while i < 10:
17.     if i >= 2 and isPalindrome(first[:-2]):
18.         print(first[:-2] + " is a palindrome")
19.
20.     first, second = second, second + first
21.     i = i + 1
We define the first two strings in Lines 12 and 13. We check this property for the first few Fibonacci
strings (Line 16). The strings are generated in Line 20 using the multiple assignment form. Line 17 uses
a conjunct to call function isPalindrome() by removing the last two binary digits of the eligible strings.
If the property is satisfied, all the strings should get printed by Line 18.
The function isPalindrome() splits the given string into two halves. [0:blen // 2] is the first half, while
the second half is [(blen + 1) // 2: blen], which we use in Lines 3 and 4. Interestingly, the expressions
work whether the string is even length or odd. Note that we use the same variable names (first and
second) in both the main module and the function, but both refer to two different entities. Line 6 uses a
peculiar syntax [::-1] which defines a range, but also uses a step (-1 in this case, 1 by default) which
allows us to reverse the string second. If the two strings are the same, we declare it to be a palindrome.
0 is a palindrome
010 is a palindrome
010010 is a palindrome
01001010010 is a palindrome
0100101001001010010 is a palindrome
01001010010010100101001001010010 is a palindrome
01001010010010100101001001010010010100101001001010010 is a palindrome
010010100100101001010010010100100101001010010010100101001001010010010100101001001010010 is a palindrome
An interesting aspect of writing functions is that we can reuse them in different contexts. For instance, the
isPalindrome() function works not only for binary strings, but also for arbitrary strings.
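As a quick illustration, the sketch below applies the same function to ordinary words and phrases. The helper normalize(), which drops spaces and punctuation and converts to lower case, is an assumption added here so that phrases can be tested as well.

```python
def isPalindrome(bstring):
    blen = len(bstring)
    first = bstring[:blen // 2]
    second = bstring[(blen + 1) // 2:]
    return first == second[::-1]

def normalize(text):
    # keep only letters and digits, in lower case
    return ''.join(ch.lower() for ch in text if ch.isalnum())

phrases = ['malayalam', 'never odd or even',
           'A man, a plan, a canal – panama', 'python']

for phrase in phrases:
    print(phrase, '->', isPalindrome(normalize(phrase)))
```

The function itself is unchanged in behaviour; only the preparation of its argument differs from the binary-string case.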
1. import primefun
2.
3. for n in range(3, 50+1):
4.     if primefun.isPrime(n): print(n, end=', ')
5. print()
Thus, we can call a function from a module using modulename.functionname syntax (Line 4). The
output of the above code is:
3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,
3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
Note that the first line is printed by the code in primefun.py, while the second one is by the code in
primefunmodule.py. We can also rename an imported entity and can import everything from a module,
so that we can directly make use of the functions and globals from that module (instead of using the
syntax modulename.functionname).
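The different import forms can be tried out with any module. The sketch below uses the standard math module instead of primefun, so that it runs without any extra file; the short names m and root are arbitrary choices.

```python
import math                      # plain import: use math.sqrt
import math as m                 # rename the module
from math import sqrt            # import a single name
from math import sqrt as root    # import a single name and rename it

# All four expressions call the same function.
print(math.sqrt(16), m.sqrt(16), sqrt(16), root(16))
```

Importing only the names that are needed keeps it clear where each function comes from, whereas importing everything with * can silently overwrite names that already exist in the program.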
1. import primefun
2.
3. for n in range(3, 50+1):
4.     if primefun.isPrime(n): print(n, end=', ')
5. print()
6.
7. myprime = primefun.isPrime   # shorthand name
8.
9. if myprime(22): print(22)   # use the shorthand name
10. else: print("22 is composite")
11.
12. from primefun import *   # import everything from a module
13.
14. if isPrime(23): print(23)   # and use the function directly
15. else: print("23 is composite")
Line 7 creates a shorthand name (like a pointer) to our isPrime function, which we can use in Line 9.
This avoids using the primefun.isPrime() type of syntax as in Line 4. Another way to use the isPrime()
function directly is by importing only the function isPrime from the module primefun, or by importing
everything (indicated by *) as in Line 12, and then using the function as in Line 14. The output of the above code is:
Let’s create a matrix module which can have some useful functions such as addition and multiplication.
To initialize the matrix, let’s add another function, and to test it, let’s add a printing function too. The
four function signatures would look like this:
How is a matrix represented? It can be a list of lists. For instance, [ [1, 2, 3], [4, 5, 6] ]. Before reading
further, it would be helpful to try out writing the above four functions yourself.
Here is how we will be able to use it in a client program, assuming the above module is stored in
matrix.py.
1 1 1
1 1 1
1 1 1
10 12 14
16 18 20
22 24 26
11 13 15
17 19 21
23 25 27
48 54 60
48 54 60
48 54 60
The four matrices correspond to the four mprint statements in the program (Line 5 for mat1, Line 6 for
mat2, Line 9 for mat3 which is addition of mat1 and mat2, while Line 12 for mat4 which is the
multiplication of mat1 and mat2).
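A compact sketch of such a module and a client is given below. Only the name mprint comes from the text; the names minit, madd and mmul, as well as the parameter lists, are assumptions. The values of mat1 and mat2 are chosen to match the output above.

```python
# --- contents that could live in matrix.py ---
def minit(nrows, ncols, value=0):
    # build a list of lists; each row is a separate list
    return [[value] * ncols for _ in range(nrows)]

def madd(a, b):
    result = minit(len(a), len(a[0]))
    for i in range(len(a)):
        for j in range(len(a[0])):
            result[i][j] = a[i][j] + b[i][j]
    return result

def mmul(a, b):
    result = minit(len(a), len(b[0]))
    for i in range(len(a)):
        for j in range(len(b[0])):
            for k in range(len(b)):
                result[i][j] += a[i][k] * b[k][j]
    return result

def mprint(mat):
    for row in mat:
        print(*row)          # elements of a row separated by spaces

# --- client code ---
mat1 = minit(3, 3, 1)
mat2 = [[10, 12, 14], [16, 18, 20], [22, 24, 26]]

mprint(mat1)
mprint(mat2)
mprint(madd(mat1, mat2))
mprint(mmul(mat1, mat2))
```

Note that minit() creates every row separately; writing [[value] * ncols] * nrows instead would make all the rows refer to the same list.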
A. 12
B. 21
C. 11
D. 22
def rprint(l):
    if len(l) > 0:
        rprint(l[1:])
        print(l[0], end='')
rprint([1, 2, 3, 4, 5])
A. 12345
B. 2345
C. 54321
D. 5432
def fib(n):
    if n < 2: return 1
    return fib(n - 1) + fib(n - 2)
If the function is invoked as fib(5), how many total invocations of the function fib() happen? That is,
how many times does fib() get called in total?
A. 5
B. 8
C. 9
D. 15
4. Consider a file dateutil.py which has useful functions such as today(), diff(date1, date2), etc. We want
to use the function today() in our program written in file main.py in the same directory. How should
this be done correctly?
Let’s implement this functionality.
A file is opened for processing in Line 1 (the operating system sets up resources using which the file
can be accessed), processed (Line 3), and closed in Line 6 (allowing the operating system to release the
resources).
Can you guess the output of the above program? Well, it turns out to be more than what one may
anticipate.
This is because the file already contains a newline after every line, and print() adds another. We know
how to fix it though (try it out).
It would be nice if we can provide the filename (hello.txt) on the command-line, instead of hardcoding
in the program or prompting the user every time.
For accessing command-line arguments, we can use sys.argv. This variable is a list of arguments we
provide while invoking the Python interpreter. Thus, argv[0] is “cat.py” and argv[1] is “hello.txt”.
1. import sys
2.
3. myfile = open(sys.argv[1])   # argv[1] is the first argument after the script name
4.
5. for line in myfile:
6.     print(line, end='')
7.
8. myfile.close()
We can use this new program on other text files too. For instance:
Note that since power.py is in another directory, we have provided a relative path.
$ ls
cat.py cp.py hello.txt # initially there are three files.
$ python3 cp.py hello.txt hellodup.txt # copies file, does not print anything
$ ls
cat.py cp.py hello.txt hellodup.txt # now there are four.
The third line is empty.
We know how to read a file and use command-line arguments. But we don’t yet know how to write to a
file. Writing needs a second argument to the open() function.
Past Connect : Second argument can be made optional using default arguments.
import sys

infile = open(sys.argv[1], 'r')     # open for reading
outfile = open(sys.argv[2], 'w')    # open for writing

for line in infile:                         # read
    print(line, file=outfile, end='')       # write

infile.close()
outfile.close()
You must have guessed why we named our program as cp.py. Linux has a command by that name to
copy files.
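The copy can also be written with the file method write() instead of print(). The sketch below is one possible variant; the function name copy_text() and the temporary filenames are assumptions made for the illustration.

```python
import os
import tempfile

def copy_text(src, dst):
    """Copy a text file line by line; return the number of lines copied."""
    count = 0
    with open(src, 'r') as infile, open(dst, 'w') as outfile:
        for line in infile:
            outfile.write(line)          # write() adds no newline of its own
            count += 1
    return count

# Demonstration with temporary files standing in for hello.txt and hellodup.txt.
with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, "hello.txt")
    dst = os.path.join(tmpdir, "hellodup.txt")
    with open(src, 'w') as f:
        f.write("first\nsecond\n\n")     # the third line is empty
    copied = copy_text(src, dst)
    with open(dst) as f:
        duplicate = f.read()
```

Since write() does not append a newline, there is no need for end='' here.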
We would like to split it into words, and store it in another file textout.txt, as follows.
1: Lexical
2: analyzer
3: groups
4: a
5: sequence
6: of
7: characters
8: into
9: tokens.
10: Syntax
11: analyzer
12: checks
13: if
14: these
15: tokens
16: adhere
17: to
18: a
19: grammar
20: syntax.
How do we write such a program? First, we need to utilize reading and writing of files and handling
command-line arguments, as we just did. Apart from that, we need string processing to split a line into
words using space as a separator. To print the word number, we need an iterative way to go over the
words of a line.
1. import sys
2.
3. infile = open(sys.argv[1], 'r')     # textin.txt
4. outfile = open(sys.argv[2], 'w')    # textout.txt
5. ii = 1                              # word number
6.
7. for line in infile:
8.     words = line.split()
9.
10.     for ww in words:
11.         print(ii, ww, file=outfile, sep=': ')
12.         ii = ii + 1                 # word number update
13.
14. infile.close()
15. outfile.close()
Note the similarity in Lines 7 and 10. The former goes over all the lines of a file, while the latter goes
over all the words in a sequence of words. The function split() on Line 8 creates a word-sequence out
of a string.
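The word numbering can also be delegated to the built-in enumerate(). The following is a minimal sketch on an in-memory list of lines rather than a file; the function name number_words() and the sample lines are assumptions for the illustration.

```python
def number_words(lines):
    """Return 'n: word' strings for every word in lines, numbered from 1."""
    words = [ww for line in lines for ww in line.split()]   # all words, in order
    return [f"{ii}: {ww}" for ii, ww in enumerate(words, start=1)]

sample = ["Lexical analyzer groups a sequence\n", "of characters into tokens.\n"]
numbered = number_words(sample)
for entry in numbered:
    print(entry)
```

Here enumerate() plays the role of the counter ii, and split() without arguments discards the trailing newline along with the spaces.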
You can see that writing such simple utilities is not difficult in Python. Once you know how to do it, it
is a matter of finding the right functions. Of course, writing programs in the best possible way requires
a significant amount of practice.
Such processing requires file handling similar to the previous example, except that the roles of the
two files are reversed for reading and writing. Apart from that, we need to separate each word from its
number and concatenate the words until a full-stop appears. Separating a word from its number
can be done in multiple ways, one of which is split(); strings can be concatenated using the +
operator. How do we check if a string contains a full-stop?
1. import sys
2.
3. infile = open(sys.argv[1], 'r')
4. outfile = open(sys.argv[2], 'w')
5.
6. outline = ''                        # empty string
7.
8. for line in infile:
9.     words = line.strip().split(': ')
10.     outline = outline + words[1] + " "   # one line, operator + joins strings
11.
12.     if "." in words[1]:             # found the full-stop
13.         print(outline, file=outfile)
14.         outline = ''
15.
16. infile.close()
17. outfile.close()
The loop at Line 8 goes over all the lines in the input file. Each line (containing the number and the
word) also contains a newline, which we remove using the function strip(). The function also removes
any leading and trailing spaces and tabs (which we do not have in our file). We then separate the number and the
word using split(). Note how the output of strip() acts as an input to split(). After splitting, the word is
stored in words[1], which we use in Lines 10 and 12. Note the syntax of checking if a substring exists in a larger
string.
Discussion: The reversal of the output is not perfect. If a word contains a dot in the middle, our program
will split the line into multiple sentences (e.g., cse.iitm.ac.in will be followed by a newline). We can modify the program
to check whether a word ends in a dot. But some words will still cause issues (for instance, the abbreviation e.g. ends in a dot without ending the sentence).
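One way to act on this discussion is sketched below. It is a heuristic, not a complete solution: the set ABBREVIATIONS is a hypothetical and deliberately incomplete list, and the function names are assumptions for the illustration.

```python
ABBREVIATIONS = {"e.g.", "i.e.", "etc."}   # hypothetical, deliberately incomplete list

def ends_sentence(word):
    """Guess whether word ends a sentence: a trailing dot that is not a known abbreviation."""
    return word.endswith('.') and word.lower() not in ABBREVIATIONS

def join_words(words):
    """Group words into sentences, breaking only where ends_sentence() says so."""
    sentences, current = [], []
    for ww in words:
        current.append(ww)
        if ends_sentence(ww):
            sentences.append(' '.join(current))
            current = []
    if current:                            # words left over without a final full-stop
        sentences.append(' '.join(current))
    return sentences

result = join_words(["Visit", "cse.iitm.ac.in", "e.g.", "today.", "Bye."])
print(result)
```

A dot in the middle of a word no longer breaks the sentence, but an abbreviation that genuinely ends a sentence would now be missed, so the reversal remains imperfect.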
If a word is misspelt it will be corrected.
Without reading further, try to reason out why this can happen. Also, you are highly encouraged to try
out whether you observe a similar behavior on your machine.
To understand this peculiar behavior, we need to first know that file input / output is buffered. This
means, the operating system is not going to read the file on disk byte-by-byte, but as a large chunk. This
chunk is copied to a buffer, which essentially is an array having a fixed size. How much buffer we have
read or written is tracked using a pointer (or an array offset). Therefore, when we read / write a line in
a file, this pointer, which is often called a file pointer, moves ahead. As long as the operation is getting
done inside this buffer, further disk processing is avoided (because disk access is costlier than accessing
this buffer in memory). If we end up reading beyond this buffer, the next part of the data is brought
from the disk into another buffer and this process continues. The operating system also knows the size
of the file. So it does not allow reading beyond the current file size.
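As a side illustration of the file pointer (separate from the program being analyzed), the sketch below reports the offset returned by tell() after each readline(). The filename and contents are assumptions; the file is opened in binary mode so that the offsets are plain byte counts on every platform.

```python
import os
import tempfile

def offsets_after_each_line(path):
    """Return the file-pointer offset reported by tell() after each readline()."""
    offsets = []
    with open(path, 'rb') as f:          # binary mode: offsets are plain byte counts
        while f.readline():              # an empty result means end of file
            offsets.append(f.tell())
    return offsets

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")
    with open(path, 'wb') as f:
        f.write(b"one\ntwo\nthree\n")
    offsets = offsets_after_each_line(path)

print(offsets)
```

Each offset counts the bytes consumed so far, including the newline characters, even though the underlying data may have been fetched from the disk in one chunk.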
Now let’s come back and analyze the output of the above erroneous program. Due to Line 5, the
operating system has already read the three lines into its buffer (including the newline characters). This
buffer is processed line-by-line when we only read from the file. But because we also write in Line 7,
it ends up writing at the end of the file (since the three lines have already been read). This adds the
fourth line in the modified file and the file size increases. But where does the file pointer point due to
this writing? It is at the end of the fourth sentence, which is the end of the file. Therefore, the next
iteration of the for loop does not take place (you can check that by printing a message in the loop) and
the loop gets over. The modified file is closed at Line 9 (which makes the operating system dump the
modified buffer to the file on disk, if not done already).
How do we fix this? Clearly, to modify a word we have already read, the file pointer needs to go back
in the file and rewrite that word. This is done using a function called seek(). So our second attempt is to
use it to move the file pointer.
1. import sys
2.
3. myfile = open(sys.argv[1], 'r+')
4.
5. for line in myfile:
6.     modline = line.replace('mispelt', 'misspelt')
7.     curoff = myfile.tell()              # current file pointer
8.     myfile.seek(curoff - len(line))     # go back
9.     print(modline, file=myfile, end='')
10.
11. myfile.close()
If a word is misspelt it will be corrected.
his is the second line, which also contains misspelt word.
his is the third line.
While the two words are corrected, it creates a problem at the start of the following lines: T goes missing. Why?
This happens because the new word misspelt is longer than the old one by one character. Therefore,
when we write the modified line in Line 9, it overwrites one character of the next line. We show it
pictorially below.
… m i s p e l t i t w … e d . \n T h i
… m i s s p e l t i t w … e d . \n h i
A similar issue occurs on the next line too. Thus, unlike our usual variables (such as strings and
sequences), we cannot insert text in the middle of a file; we can only overwrite. In other words, the above
program works only when the old and the modified text are of the same length. What happens when the
new text is shorter than the original one? You can again use the pictorial buffer representation above to
find out the answer.
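The overwriting behavior can be reproduced in isolation. The sketch below is an assumption-laden illustration rather than the program above: it works in binary mode with an explicit offset (the helper overwrite_at() is hypothetical), which avoids calling tell() inside a for loop over a text file; in CPython 3 that combination can raise OSError.

```python
import os
import tempfile

def overwrite_at(path, offset, new_bytes):
    """Write new_bytes at offset; existing bytes are overwritten, not shifted."""
    with open(path, 'r+b') as f:
        f.seek(offset)
        f.write(new_bytes)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")
    with open(path, 'wb') as f:
        f.write(b"mispelt\nThis\n")
    overwrite_at(path, 0, b"misspelt\n")   # one byte longer than b"mispelt\n"
    with open(path, 'rb') as f:
        longer = f.read()

print(longer)
```

The extra byte of the longer line lands on the T of the next line. The same helper can be used to try out the shorter-text case.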
Can we not fix this problem? We can. Instead of reading line-by-line, if we can read the whole file
together, then we can replace all the occurrences of the word and write all the modified contents at once.
This is done using a read() function.
1. import sys
2.
3. myfile = open(sys.argv[1], 'r+')
4. alllines = myfile.read()
5. modlines = alllines.replace('mispelt', 'misspelt')
6. myfile.seek(0) # go to the start of the file
7. myfile.write(modlines)
8. myfile.close()
Line 4 reads the whole file into the variable alllines, on which we perform find-and-replace (Line 5). We
move the file pointer to the start of the file in Line 6, and write the modified contents in Line 7.
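This approach works as shown when the new text is at least as long as the old. If the replacement makes the contents shorter, the tail of the old contents would remain at the end of the file. A possible guard, sketched here under that assumption, is to call truncate() after writing; the function name replace_in_file() and the sample text are illustrative.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Replace every occurrence of old with new, rewriting the file in place."""
    with open(path, 'r+') as myfile:
        contents = myfile.read()
        myfile.seek(0)                   # go back to the start of the file
        myfile.write(contents.replace(old, new))
        myfile.truncate()                # drop leftover text if the file became shorter

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")
    with open(path, 'w') as f:
        f.write("a misspelt word\nThis is it.\n")
    replace_in_file(path, 'misspelt', 'mispelt')   # the new text is shorter
    with open(path) as f:
        after = f.read()

print(after, end='')
```

Note that reading the whole file into memory is practical only for files of moderate size.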
Example: Retrieve academic record from a file and compute CGPA.
Recall our academics module in which we kept track of our courses, total points, and earned credits.
Our program had hardcoded this academic record. Ideally, this data (academic record) should be kept
separate from the functionality (our Python program). We can store this data into a file (academics.txt),
with semesters separated by an empty line.
CS1100 9 10
CS1200 12 9
AM1100 9 9
# empty line to indicate end of semester
CS2200 12 10
CS2310 6 10
CS2710 6 9
# this empty line is required
Interestingly, our academics.py module can remain the same. We can simply modify academicsuser.py
to read this data file, populate lists in the academics module, and invoke the CGPA calculation. Thus, a
sample run would be as follows.
Can you write the academicsuser.py program? We present and discuss it below.
1. import sys
2. from academics import *
3.
4. acadfile = open(sys.argv[1])
5.
6. for line in acadfile:
7.     if line == '\n':                # end of semester
8.         cprint()
9.         print("This Sem CGPA:", cgpa(), "\n")
10.     else:
11.         cc, cr, ea = line.split()
12.         add(cc, int(cr), int(ea))
13.
14. acadfile.close()
We first import sys for argv, and then all the functions from our module academics (Lines 1 and 2). We
open the file specified on the command-line in Line 4. We process each line in the file in the loop at
Line 6. If the line is empty, it indicates the end of a semester. So, we print the academic record and
CGPA (if block). Otherwise, we split the line into course number, credits, and earned points and add
those to our internal record (else block), to be used for CGPA calculation later. We finally close the file
in Line 14.
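To experiment with the file format without the academics module, a self-contained sketch follows. It is not the module's implementation: the function semester_cgpas() is hypothetical, and it assumes that CGPA is the credit-weighted average of the earned points, which may differ from how cgpa() is defined in academics.py.

```python
import io

def semester_cgpas(acadfile):
    """Return the cumulative CGPA after each semester; semesters end with an empty line."""
    points = credits = 0
    cgpas = []
    for line in acadfile:
        if line == '\n':                     # end of semester
            cgpas.append(points / credits)
        else:
            course, cr, grade = line.split()
            credits += int(cr)
            points += int(cr) * int(grade)   # assumed: earned points weighted by credits
    return cgpas

# An in-memory file with the same layout as academics.txt.
record = io.StringIO("CS1100 9 10\nCS1200 12 9\nAM1100 9 9\n\n"
                     "CS2200 12 10\nCS2310 6 10\nCS2710 6 9\n\n")
cgpas = semester_cgpas(record)
print(cgpas)
```

The sketch also shows why the final empty line is required: without it, the last semester would never be reported.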
data can be dumped into a datafile using writeAll(). Before a readAll(), we need to ensure that the
dictionaries are empty.
4/23/2022 16:08:31, [email protected], PRIYA SHARMISHTHA, 7753763928, Indian institute of technology madras ,Student
4/23/2022 16:13:12,[email protected],Prema S,9783297231,NMAMIT,Student
4/23/2022 16:50:56, [email protected], Nigam Vaishnav, 447439860637, IITM,Student
4/23/2022 17:06:19,[email protected], Abha Joshi, 9307876327, Indian Institute of Technology Guwahati, Faculty
4/23/2022 17:08:28,[email protected], Abha Joshi, 9307876327, Indian Institute of Technology Guwahati, Student
Solution: We can solve this comma problem by saving the spreadsheet in .csv format with tab (\t) as the
separator. Typically (but not always), a tab is not present in the form-input data. Another option is to
encode the data using the base64 module, which allows arbitrary binary data also to be processed. In
our program, we will use tab-separated values.
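The effect of the separator can be seen on a single record. In the sketch below, the record is hypothetical (its affiliation deliberately contains a comma); it compares splitting on commas with splitting on tabs.

```python
# Hypothetical record whose affiliation field contains a comma.
fields = ["4/23/2022 17:06:19", "Abha Joshi", "IIT Guwahati, Assam", "Faculty"]

comma_line = ",".join(fields)
tab_line = "\t".join(fields)

from_commas = comma_line.split(",")   # the comma inside the affiliation splits it in two
from_tabs = tab_line.split("\t")      # no tab occurs in the data, so the fields survive

print(len(from_commas), len(from_tabs))
```

With commas we get five fields instead of four, while the tab-separated line is recovered exactly.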
Solution: Similar to handling commas, the validation logic (typically, in JavaScript for HTML forms)
can either disallow newlines, or convert those to a different form (such as <br> in HTML). Alternatively,
while converting from spreadsheet to CSV format, double-quotes can be used as the text-delimiter,
which stores a multiline field in double-quotes (if double-quotes are present in the field, those can be
encoded, or escaped as \" or ""). In our program, we will assume that this is taken care of in a
preprocessing step, and each line is a new record.
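For readers who prefer not to preprocess, Python's standard csv module understands this quoting convention. The sketch below is an alternative to the assumption made in our program, not a part of it; the record is hypothetical.

```python
import csv
import io

# One record: the second field spans two lines, the third contains double-quotes.
raw = 'Abha Joshi,"IIT Guwahati,\nAssam","She said ""hello"""\n'

# newline='' lets the csv module see the embedded newline unchanged.
rows = list(csv.reader(io.StringIO(raw, newline='')))
print(rows)
```

The reader returns a single record with three fields, with the embedded comma, newline, and double-quotes preserved inside the fields.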
Challenge 3: Header can be mistakenly recognized as a record.
Our .csv file contains the header as its first line. If we forget this while processing, our functions may treat
the header as a valid record, which can change the output. For instance, if we are booking rooms for all the
attendees, we may end up booking (and paying for) one more room than required.
Solution: This is easy to resolve. We can simply remove it as a sanitization prepass. Alternatively, our
Python script can skip the first line.
With these challenges and possible solutions, let’s now find out some interesting statistics from the
registration data. How do we represent the data? We can store each column as a list. Thus, for our
example data, we can have six associative lists for timestamps, emails, names, phones, affiliations, and
designations. These are populated by reading the .csv file.
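A minimal sketch of such a loader is given below. The function name load(), the header text, and the example.com addresses are assumptions for the illustration; the sketch reads tab-separated values, skips the header line, and fills the six lists.

```python
import io

timestamps, emails, names = [], [], []
phones, affiliations, designations = [], [], []
columns = (timestamps, emails, names, phones, affiliations, designations)

def load(tsvfile):
    """Fill the six lists from a tab-separated file, skipping the header line."""
    next(tsvfile, None)                  # skip the header (no error on an empty file)
    for line in tsvfile:
        fields = [field.strip() for field in line.rstrip('\n').split('\t')]
        if len(fields) != len(columns):  # ignore malformed lines in this sketch
            continue
        for column, value in zip(columns, fields):
            column.append(value)

# In-memory stand-in for the tab-separated file; the addresses are hypothetical.
sample = io.StringIO(
    "Timestamp\tEmail\tName\tPhone\tAffiliation\tDesignation\n"
    "4/23/2022 16:13:12\tprema@example.com\tPrema S\t9783297231\tNMAMIT\tStudent\n"
    "4/23/2022 17:06:19\tabha@example.com\t Abha Joshi\t9307876327\t"
    "Indian Institute of Technology Guwahati\tFaculty\n"
)
load(sample)
print(names, designations)
```

Since all six lists grow together, index ii refers to the same registrant in every list.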
1. def nonStudents():
2.     for ii in range(len(designations)):
3.         if not designations[ii] == 'Student':
4.             printRecord(ii)
We make use of the auxiliary function printRecord() to print details of a registrant. The final output of
using the above function is:
5. def iit():
6.     for ii in range(len(affiliations)):
7.         lowaffil = affiliations[ii].lower()    # to lower-case
8.
9.         if ("iit" in lowaffil and not "iiit" in lowaffil) or \
10.            "indian institute of technology" in lowaffil:
11.             printRecord(ii)
Line 7 uses the string function lower() to convert the affiliation to lower-case. This allows us to avoid any
capitalization issues while comparing strings. The condition on Lines 9 and 10 checks whether the affiliation
contains “iit” or its full form. Since we know that there would be registrants from IIITs also (and “iiit” contains “iit”), we filter those out from
the output. The final output of using the above function is:
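To check the condition on its own, it can be pulled out into a small predicate. The sketch below is illustrative: the function name is_iit() and the list of sample affiliations are assumptions.

```python
def is_iit(affiliation):
    """Return True if the affiliation mentions an IIT (but not an IIIT)."""
    lowaffil = affiliation.lower()       # avoid capitalization issues
    return ("iit" in lowaffil and "iiit" not in lowaffil) or \
           "indian institute of technology" in lowaffil

samples = ["IITM", "Indian institute of technology madras", "IIIT Hyderabad", "NMAMIT"]
flags = [is_iit(s) for s in samples]
print(flags)
```

Being a substring test, the check would also accept any unrelated affiliation that happens to contain the letters iit.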
Enumerating all pairs can be done using a nested loop (Lines 17 and 18). If the two records match (Line
19), then we print them. The match() function uses the string function upper() to remove any differences
in the capitalization. Output of using the above function is:
Line 24 creates a joint aggregate from individual aggregates. This joint aggregate is sorted in Line 25,
which sorts all the associative lists. Note that we have provided affiliations as the first list in zip(). Once
these lists are sorted based on affiliation, we go over this sorted aggregate row-by-row, that is, one index
across all three aggregates (Line 27). Each entry in this row corresponds to one record. Thus, record[0]
corresponds to the first affiliation in the sorted order, record[1] is the name corresponding to it, and
record[2] is the email id of that registrant. We print this record in Line 28. The output using the above
function is:
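The sort-by-affiliation idea can be tried on a tiny data set. The lists below are assumptions for the illustration (the addresses are hypothetical); the sketch zips three lists with affiliations first, sorts the joint aggregate, and formats each record.

```python
affiliations = ["NMAMIT", "IITM", "IIT Guwahati"]
names = ["Prema S", "Nigam Vaishnav", "Abha Joshi"]
emails = ["prema@example.com", "nigam@example.com", "abha@example.com"]  # hypothetical

# zip() pairs up entries at the same index; sorted() compares the tuples,
# so the first list (affiliations) decides the order.
aggregate = sorted(zip(affiliations, names, emails))

records = [record[0] + ": " + record[1] + " <" + record[2] + ">" for record in aggregate]
for line in records:
    print(line)
```

When two affiliations are equal, the tuples are compared on the next entry, so ties are broken by name.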
pattern in Line 8 matches 99405 33241 and does not match the other two. It looks for five digits \d{5},
then a non-digit \D (such as space or hyphen), followed by another set of five digits. The third pattern
on Line 9 for US mobile format consists of an opening parenthesis \( followed by three digits \d{3}
followed by a closing parenthesis, followed by an optional space \s?, followed by three digits \d{3},
followed by a non-digit \D (such as space or hyphen), followed finally by four digits \d{4}. \s indicates
a whitespace character, ? makes the preceding expression optional, and parentheses need to be escaped
because they have a special meaning in re.
These three patterns are OR’ed in the regular expression regex at Line 11 using | operator (understood
by the re module). Line 12 then calls the findall() method which finds all occurrences of the regular
expression in text. These are returned as a list.
Is it possible to have a single regular expression for 99405 33241 and 8932732436? Yes, we can change
digits55 to '\d{5}\D?\d{5}' and it will match both the phone numbers.
Good Programming Practice : It is easy to make the regular expressions complex. Therefore, split
those into multiple parts, test each of them separately, and then join them, as we have done in the above
example.
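The combined pattern can be tested on its own before it is joined with others. The sketch below is a small check under assumed sample text; only the name digits55 and its pattern come from the discussion above.

```python
import re

digits55 = r'\d{5}\D?\d{5}'     # five digits, an optional non-digit, five digits

text = "Call 99405 33241 or 8932732436."   # sample text assumed for the check
matches = re.findall(digits55, text)
print(matches)

# Testing the part separately on individual strings.
with_hyphen = re.fullmatch(digits55, "99405-33241") is not None
too_short = re.fullmatch(digits55, "99405 3324") is not None
```

Using a raw string (the r prefix) keeps the backslashes intact when the pattern is passed to re.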
1. import re
2.
3. text = "My departmental email id is [email protected]. This is what I use for all the
official purposes. The institute also provides an id [email protected]. During PhD, I used
to have a personal email id [email protected], which is still in use. When
gmail was not around, id [email protected] was the one I used. I also own a
twitter handle @rupeshsomething, which I hardly use. That's it from my side for now. See
you in class @11."
4.
5. regex = r'\w+@\w+'
6. emails = re.findall(regex, text.lower()) # use lower-case string
7. print(emails)
"""
We want the output of our program to recognize the highlighted dates (and nothing else, such as only
October). We build the regular expression using twelve months’ names.
1. months = ('jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec')
2. regex12 = ''
3.
4. for mm in months:
5.     regex12 = regex12 + mm + '|'    # create regex using 12 months
6.
7. regex12 = '(' + regex12[:-1] + ')'  # remove the extra |
8. finalregex = '(' + regex12 + r'[a-z.]*\s+\d\d?)(,?\s?\d{2,4})?'
9.
10. dates = re.findall(finalregex, text.lower())
11.
12. for onedate in dates:
13.     print(onedate)
We explain various parts of the regular expression on Line 8. Variable regex12 contains the three-letter
abbreviations of the twelve months separated by |. [a-z.]* allows the month to contain either only the
abbreviation, or the full name, or anything in between, including a dot. This needs to be followed by at
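least one space. \d\d? matches one or two digits of the day. This is optionally followed by the second
part of the pattern, which may contain a comma and a space, followed by the year in two, three, or four
digits. If you wish to support only two- or four-digit years, it can be achieved using (?:\d{4}|\d{2}) in place of \d{2,4}. The four-digit alternative comes first because re tries alternatives from left to right, and (?: ) groups them without adding another entry to the output tuples.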
Each tuple in the output contains three entries corresponding to the three parenthesized expressions in
our regular expression: month followed by day, month in regex12, and optional part of the year.
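The shape of these tuples can be seen on a short piece of text. The sketch below rebuilds the same regular expression (using '|'.join() as a compact alternative to the loop) and applies it to a sample sentence that is an assumption made for the illustration.

```python
import re

months = ('jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec')
regex12 = '(' + '|'.join(months) + ')'     # join() avoids the extra trailing |
finalregex = '(' + regex12 + r'[a-z.]*\s+\d\d?)(,?\s?\d{2,4})?'

text = "Classes begin on October 3, 2022 and end by Dec. 15."   # assumed sample
dates = re.findall(finalregex, text.lower())

for onedate in dates:
    print(onedate)
```

When the optional year is absent, as in the second date, the third entry of the tuple is an empty string.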
Discussion: The function findall() finds all the occurrences of the given regular expression. If you are
interested only in one, you can use the search() function. For instance:
This indicates that the web-server manage.py is unable to find appone. This is because, although the
directory appone exists, it needs to be registered explicitly.
…
from django.contrib import admin
from django.urls import path
urlpatterns = [
    path('admin/', admin.site.urls),
]
…
from django.contrib import admin
from django.urls import path, include
urlpatterns = [
    path('admin/', admin.site.urls),
    path('appone/', include('appone.urls')),
]
At this stage, the server may issue certain errors (e.g., ModuleNotFoundError: No module named
'appone.urls'). Ignore the errors for now. If we try to check again in the browser, we may encounter
an Unable to connect message. This means the registration of our web-app is not yet complete.
Once this basic setup is running, we can add arbitrary HTML text and can process it from views.py.
basic.html
<p>
That's all for now.
You can open this file in a browser to see how it looks. But we would like our webapp to show this
content. Keeping everything else the same, let’s modify appone/views.py file as follows.
We use our file handling functions to read basic.html as a string, and return it as the HTTP response
(when the browser issues an HTTP request to our server).
1. from django.contrib import admin
2. from django.urls import path, include
3.
4. urlpatterns = [
5.     path('admin/', admin.site.urls),
6.     path('appone/', include('appone.urls')),
7.     path('acads/', include('acads.urls')),
8. ]