Question – 1
Question 1(a)
Write the code to create a DataFrame 'df' and perform the following operations.
       Maths  Science  SST
Amit   100    100.0    60.0
Mohan  95     50.0     57.48
Sudha  85     90.0     53.58
i) Add one column Total=Maths+Science+SST.
ii) Add the marks of Kishor with values 75.6, 88.5, 90.3
iii) Display the marks of Maths and Science
iv) Update marks of Science of Sudha to 85.0
v) Delete the row – Mohan
Solution:
import pandas as pd
s1={'Maths':100,'Science':100.0,'SST':60.0}
s2={'Maths':95,'Science':50.0,'SST':57.48}
s3={'Maths':85,'Science':90.0,'SST':53.58}
lst=[s1,s2,s3]
df=pd.DataFrame(lst,index=['Amit','Mohan','Sudha'])
print("The created dataframe is:")
print(df)
df['Total']=df['Maths']+df['Science']+df['SST']
print("The dataframe after adding a new column Total:")
print(df)
# Kishor's Total is entered as 0 here; it is not computed from the three marks
df.loc['Kishor',:]=[75.6,88.5,90.3,0]
print("The dataframe after adding a new row:")
print(df)
print("Marks of Maths & Science")
print(df.iloc[::,0:2])
# iat uses positions counted from 0: row 2 is Sudha, column 1 is Science
df.iat[2,1]=85.0
print("Dataframe after changing the marks of Science of Sudha")
print(df)
df=df.drop(['Mohan'])
print("Dataframe after deleting the record of Mohan")
print(df)
Sample output
The created dataframe is:
Maths Science SST
Amit 100 100.0 60.00
Mohan 95 50.0 57.48
Sudha 85 90.0 53.58
The dataframe after adding a new column Total:
Maths Science SST Total
Amit 100 100.0 60.00 260.00
Mohan 95 50.0 57.48 202.48
Sudha 85 90.0 53.58 228.58
The dataframe after adding a new row:
Maths Science SST Total
Amit 100.0 100.0 60.00 260.00
Mohan 95.0 50.0 57.48 202.48
Sudha 85.0 90.0 53.58 228.58
Kishor 75.6 88.5 90.30 0.00
Marks of Maths & Science
Maths Science
Amit 100.0 100.0
Mohan 95.0 50.0
Sudha 85.0 90.0
Kishor 75.6 88.5
Dataframe after changing the marks of Science of Sudha
Maths Science SST Total
Amit 100.0 100.0 60.00 260.00
Mohan 95.0 50.0 57.48 202.48
Sudha 85.0 85.0 53.58 228.58
Kishor 75.6 88.5 90.30 0.00
Dataframe after deleting the record of Mohan
Maths Science SST Total
Amit 100.0 100.0 60.00 260.00
Sudha 85.0 85.0 53.58 228.58
Kishor 75.6 88.5 90.30 0.00
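The solution above stores 0 as Kishor's Total and leaves Sudha's Total unchanged after her Science mark is updated. As a minimal sketch (not part of the original answer), the same five operations can be written with label-based access, with the Total computed last so that it reflects every change. The name subject_cols is a hypothetical helper introduced here, and the order of the steps is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {"Maths": [100, 95, 85],
     "Science": [100.0, 50.0, 90.0],
     "SST": [60.0, 57.48, 53.58]},
    index=["Amit", "Mohan", "Sudha"],
)
subject_cols = ["Maths", "Science", "SST"]  # hypothetical helper name

# ii) add Kishor's marks by label (Total does not exist yet)
df.loc["Kishor"] = [75.6, 88.5, 90.3]
# iii) select two columns by name instead of by position
print(df[["Maths", "Science"]])
# iv) update one cell by row label and column label
df.loc["Sudha", "Science"] = 85.0
# v) drop a row by label
df = df.drop("Mohan")
# i) compute Total last so that it reflects every change made above
df["Total"] = df[subject_cols].sum(axis=1)
print(df)
```

Selecting by label (loc) rather than by position (iloc, iat) keeps the code correct even if rows or columns are later reordered.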
--------------------------------------------------------------------------------------------------------------------------------------
Question – 2
Question 1(a)
Consider an array with values 10, 20, 30, 40, 50. Create a series from this array with default indexes
and write Python statements for the following.
i) Set the values of all elements to 100
ii) Add 10 to all elements of the series and display it.
iii) Display the 1st and the 4th elements of the series.
iv) Set the value of 3rd element to 500.
Solution
import numpy as np
import pandas as pd
lst=[10,20,30,40,50]
ar=np.array(lst)
sr=pd.Series(ar)
print("Created series is:")
print(sr)
print("Series after adding 10 with all elements:")
print(sr+10)
print("The first and the fourth elements of the series are:")
print(sr[0],sr[3])
print("Series after setting the third value as 500")
sr[2]=500
print(sr)
Sample output
Created series is:
0 10
1 20
2 30
3 40
4 50
dtype: int32
Series after adding 10 with all elements:
0 20
1 30
2 40
3 50
4 60
dtype: int32
The first and the fourth elements of the series are:
10 40
Series after setting the third value as 500
0 10
1 20
2 500
3 40
4 50
dtype: int32
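Part (i) of the question, setting every element to 100, is not exercised in the solution above. A minimal sketch, assuming the change is made on a copy so that the original values stay available for parts (ii) to (iv); the name all_hundred is hypothetical.

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.array([10, 20, 30, 40, 50]))

# Work on a copy so that the original series is left untouched
all_hundred = sr.copy()   # hypothetical name
all_hundred[:] = 100      # slice assignment overwrites every element
print(all_hundred)

# Scalar arithmetic applies to every element and returns a new series
print(sr + 10)
```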
Question – 3
Question 1(a)
a) Create a series that stores the name (as index) and area in sq. km (as value) of six states (using
dictionaries).
i) Write the code to find out the biggest and the smallest three areas from the given series.
ii) Display the series in the alphabetical order of state names.
iii) Display the details of the states having an area greater than 25000 sq. km.
iv) Change the indices to 'State1', 'State2', 'State3', 'State4', 'State5' and 'State6'.
Solution
import pandas as pd
s1=pd.Series({"Kerala":50000,"TamilNadu":23000,"Karnataka":35000,"UP":75000,"AP":40000,"MP":20000})
print("Created Series is:")
print(s1)
ans='y'
while ans=='y' or ans=='Y':
    print("1. The biggest and the smallest three areas")
    print("2. Display the series in alphabetical order of state names")
    print("3. States having area above 25000 sq. km")
    print("4. Change the indices")
    ch=int(input("Enter your choice"))
    if ch==1:
        # ascending sort: the last three values are the biggest, the first three the smallest
        s1.sort_values(inplace=True)
        print("Biggest three states are:")
        print(s1.tail(3))
        print("Smallest three states are")
        print(s1.head(3))
    elif ch==2:
        print("Given series in the alphabetical order of state names:")
        s1.sort_index(inplace=True)
        print(s1)
    elif ch==3:
        print("Details of the states having the area greater than 25000")
        print(s1[s1>25000])
    elif ch==4:
        print("Series after changing the index values:")
        s1.index=['State1','State2','State3','State4','State5','State6']
        print(s1)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
Created Series is:
Kerala 50000
TamilNadu 23000
Karnataka 35000
UP 75000
AP 40000
MP 20000
dtype: int64
Biggest three states are:
AP 40000
Kerala 50000
UP 75000
dtype: int64
Smallest three states are
MP 20000
TamilNadu 23000
Karnataka 35000
dtype: int64
Given series in the alphabetical order of state names:
AP 40000
Karnataka 35000
Kerala 50000
MP 20000
TamilNadu 23000
UP 75000
dtype: int64
Details of the states having the area greater than 25000
AP 40000
Karnataka 35000
Kerala 50000
UP 75000
dtype: int64
Series after changing the index values:
State1 40000
State2 35000
State3 50000
State4 20000
State5 23000
State6 75000
dtype: int64
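The menu program above sorts the series in place, so each choice changes the order seen by the next one. As an alternative sketch (not the original answer), nlargest(), nsmallest() and sort_index() each return a new series and leave the original order alone; the variable names below are assumptions.

```python
import pandas as pd

areas = pd.Series({"Kerala": 50000, "TamilNadu": 23000, "Karnataka": 35000,
                   "UP": 75000, "AP": 40000, "MP": 20000})

biggest = areas.nlargest(3)     # largest first: UP, Kerala, AP
smallest = areas.nsmallest(3)   # smallest first: MP, TamilNadu, Karnataka
by_name = areas.sort_index()    # new series in alphabetical order of the index
print(biggest, smallest, by_name, sep="\n")

# the original insertion order is preserved
print(list(areas.index))
```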
Question – 4
Question 1(a)
Create a DataFrame as given below and also write the code to add or remove rows from the dataframe
according to the user's choice. Also store the dataframe to a CSV file.
      Yr1    Yr2    Yr3
Qtr1  34500  44900  54500
Qtr2  36000  46100  51000
Qtr3  47000  57000  58500
Solution
import pandas as pd
d1={'Yr1':34500,'Yr2':44900,'Yr3':54500}
d2={'Yr1':36000,'Yr2':46100,'Yr3':51000}
d3={'Yr1':47000,'Yr2':57000,'Yr3':58500}
lst=[d1,d2,d3]
df=pd.DataFrame(lst,index=['Qtr1','Qtr2','Qtr3'])
print("Created dataframe is:")
print(df)
ans='Y'
while ans=='Y' or ans=='y':
    print("1.Add a new row\n2.Remove a row")
    ch=int(input("Enter your choice(1 or 2)"))
    if ch==1:
        # int() is used instead of eval() so that only whole numbers are accepted
        yr1=int(input("Enter the sales amount of year1: "))
        yr2=int(input("Enter the sales amount of year2: "))
        yr3=int(input("Enter the sales amount of year3: "))
        # the new label is "Qtr" followed by the current number of rows plus one
        df.loc["Qtr"+str(len(df)+1)]=[yr1,yr2,yr3]
        print("Dataframe after insertion of new row:")
        print(df)
    elif ch==2:
        print(df.index)
        lab=input("Enter the row index")
        df=df.drop([lab])
        print("Dataframe after deletion of the specified row")
        print(df)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
# the doubled backslashes stand for single backslashes in the Windows path
df.to_csv("D:\\SAPS\\sample.csv",sep=',')
print("A csv file created at the specified location using the dataframe")
Sample output
Created dataframe is:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)1
Enter the sales amount of year1: 25000
Enter the sales amount of year2: 32000
Enter the sales amount of year3: 40000
Dataframe after insertion of new row:
Yr1 Yr2 Yr3
Qtr1 34500 44900 54500
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuey
1.Add a new row
2.Remove a row
Enter your choice(1 or 2)2
Index(['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'], dtype='object')
Enter the row indexQtr1
Dataframe after deletion of the specified row
Yr1 Yr2 Yr3
Qtr2 36000 46100 51000
Qtr3 47000 57000 58500
Qtr4 25000 32000 40000
Do you wish to continuen
A csv file created at the specified location using the dataframe
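To check what was written, the file can be read back. The sketch below is an illustration only: it writes to a temporary folder instead of the D:\SAPS path used above, and the names folder, path and restored are assumptions.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame(
    {"Yr1": [34500, 36000, 47000],
     "Yr2": [44900, 46100, 57000],
     "Yr3": [54500, 51000, 58500]},
    index=["Qtr1", "Qtr2", "Qtr3"],
)

# a temporary folder stands in for the fixed Windows path
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.csv")
    df.to_csv(path)                             # the index is written as the first column
    restored = pd.read_csv(path, index_col=0)   # read that column back as the index

print(restored)
```

Without index_col=0 the quarter labels would come back as an ordinary column and a new default index would be added.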
Question – 5
Question 1(a)
DataFrame - Employee
Name   Position    Age  Projects
Rabia  Manager     30   NaN
Evan   Programmer  NaN  NaN
Jia    Manager     34   16
Lalit  Programmer  40   20
Write a program to create the above given DataFrame employee and perform the following operations on
it.
1. Display the details of employees having age greater than 30
2. Display the details of all programmers
3. Display the details of a particular employee
Solution
import numpy as np
import pandas as pd
# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
e1={'Name':'Rabia','Position':'Manager','Age':30,'Projects':np.nan}
e2={'Name':'Evan','Position':'Programmer','Age':np.nan,'Projects':np.nan}
e3={'Name':'Jia','Position':'Manager','Age':34,'Projects':16}
e4={'Name':'Lalit','Position':'Programmer','Age':40,'Projects':20}
lst=[e1,e2,e3,e4]
employee=pd.DataFrame(lst)
print("Created dataframe is:")
print(employee)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Details of employees having age above 30\n2.Details of all Programmers\n3.Details of a particular employee")
    ch=int(input("Enter your choice (1-3)"))
    if ch==1:
        print("Employees having age greater than 30")
        # rows with a missing (NaN) age never satisfy the comparison
        print(employee[employee['Age']>30])
    elif ch==2:
        print("Details of programmers")
        print(employee[employee['Position']=='Programmer'])
    elif ch==3:
        print("Names of employees")
        print(employee.Name)
        nm=input("Enter the name of the employee whose details you want to display")
        print(employee[employee.Name==nm])
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
Created dataframe is:
Name Position Age Projects
0 Rabia Manager 30.0 NaN
1 Evan Programmer NaN NaN
2 Jia Manager 34.0 16.0
3 Lalit Programmer 40.0 20.0
1.Details of employees having age above 30
2.Details of all Programmers
3.Details of a particular employee
Enter your choice (1-3)1
Employees having age greater than 30
Name Position Age Projects
2 Jia Manager 34.0 16.0
3 Lalit Programmer 40.0 20.0
Do you wish to continuey
1.Details of employees having age above 30
2.Details of all Programmers
3.Details of a particular employee
Enter your choice (1-3)2
Details of programmers
Name Position Age Projects
1 Evan Programmer NaN NaN
3 Lalit Programmer 40.0 20.0
Do you wish to continuey
1.Details of employees having age above 30
2.Details of all Programmers
3.Details of a particular employee
Enter your choice (1-3)3
Names of employees
0 Rabia
1 Evan
2 Jia
3 Lalit
Name: Name, dtype: object
Enter the name of the employee whose details you want to displayRabia
Name Position Age Projects
0 Rabia Manager 30.0 NaN
Do you wish to continuen
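Evan does not appear under choice 1 because his age is missing. The following sketch, which is an illustration rather than part of the original answer, shows how a comparison treats NaN, how to list the rows with a missing age, and how two conditions can be combined; older_programmers is a hypothetical name.

```python
import numpy as np
import pandas as pd

employee = pd.DataFrame({
    "Name": ["Rabia", "Evan", "Jia", "Lalit"],
    "Position": ["Manager", "Programmer", "Manager", "Programmer"],
    "Age": [30, np.nan, 34, 40],
    "Projects": [np.nan, np.nan, 16, 20],
})

# NaN > 30 evaluates to False, so Evan is left out of this result
print(employee[employee["Age"] > 30])

# rows where the age is missing can be listed explicitly
print(employee[employee["Age"].isna()])

# two conditions are combined with &, each inside its own parentheses
older_programmers = employee[(employee["Position"] == "Programmer") & (employee["Age"] > 30)]
print(older_programmers)
```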
Question – 6
Question 1(a)
Write a program to create two series that store the salaries obtained by 3 employees for 2 months (using lists).
Calculate the sum, average and difference of their salaries using Series.
Solution
import pandas as pd
mth1=pd.Series([30000,35000,28000],index=["Ram","Shyam","Mohan"])
mth2=pd.Series([32000,40000,28000],index=["Ram","Shyam","Mohan"])
print("salary of the first month")
print(mth1)
print("Salary of the second month")
print(mth2)
ans='y'
while ans=='y' or ans=='Y':
    print("1.Sum of salaries")
    print("2. Average of salaries")
    print("3. Difference of salaries")
    ch=int(input("enter your choice"))
    if ch==1:
        print("Sum of the salaries")
        print(mth1+mth2)
    elif ch==2:
        print("Average of the salaries")
        print((mth1+mth2)/2)
    elif ch==3:
        print("Difference between the salaries")
        print(mth1-mth2)
    else:
        print("Invalid choice")
    ans=input("Do you wish to continue")
Sample output
salary of the first month
Ram 30000
Shyam 35000
Mohan 28000
dtype: int64
Salary of the second month
Ram 32000
Shyam 40000
Mohan 28000
dtype: int64
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice1
Sum of the salaries
Ram 62000
Shyam 75000
Mohan 56000
dtype: int64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice2
Average of the salaries
Ram 31000.0
Shyam 37500.0
Mohan 28000.0
dtype: float64
Do you wish to continuey
1.Sum of salaries
2. Average of salaries
3. Difference of salaries
enter your choice3
Difference between the salaries
Ram -2000
Shyam -5000
Mohan 0
dtype: int64
Do you wish to continuen
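The arithmetic above works because both series share the same index labels. As a small sketch of what happens when they do not, the second series below lists the employees in a different order and includes one extra name; this extra data is hypothetical and only serves the illustration.

```python
import pandas as pd

mth1 = pd.Series([30000, 35000, 28000], index=["Ram", "Shyam", "Mohan"])
# same employees in a different order, plus one new joiner (hypothetical data)
mth2 = pd.Series([28000, 32000, 40000, 25000], index=["Mohan", "Ram", "Shyam", "Gita"])

# values are matched by index label, not by position;
# Gita has no first-month salary, so her sum is NaN
print(mth1 + mth2)

# fill_value=0 treats a missing salary as 0 instead of producing NaN
total = mth1.add(mth2, fill_value=0)
print(total)
```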
Question – 7
Question 1(a)
Create a dataframe with RollNo, Name, Age and Marks of 5 subjects with default index. Write the
commands to do the following operations on the dataframes.
i) Calculate the total marks and display in the field 'Total'.
ii) Change the index from default to RollNo.
iii) Display the details of 1st and 3rd students.
iv) Add a new row to the dataframe.
Solution
import pandas as pd
s1={'RollNo':1,'Name':'Arun','Age':15,'Mark1':78,'Mark2':85,'Mark3':90,'Mark4':92,'Mark5':88}
s2={'RollNo':2,'Name':'Bibin','Age':16,'Mark1':98,'Mark2':80,'Mark3':92,'Mark4':96,'Mark5':98}
s3={'RollNo':3,'Name':'Dijo','Age':15,'Mark1':70,'Mark2':80,'Mark3':90,'Mark4':95,'Mark5':97}
students=pd.DataFrame([s1,s2,s3])
print(students)
students['Total']=students.Mark1+students.Mark2+students.Mark3+students.Mark4+students.Mark5
print(students)
print("Details of the first and the third students")
print(students.iloc[0::2,::])
print("Dataframe after setting RollNo as index:")
students.set_index('RollNo',inplace=True)
print(students)
print("Details of the first and the third students")
print(students.iloc[0::2])
print("Adding new record")
rno=int(input("Enter roll number"))
nm=input("Enter name")
ag=int(input("Enter age"))
m1=int(input("Enter mark1: "))
m2=int(input("Enter mark2: "))
m3=int(input("Enter mark3: "))
m4=int(input("Enter mark4: "))
m5=int(input("Enter mark5: "))
# int() is used instead of eval() so that only a whole number is accepted
tot=int(input("Enter total: "))
students.loc[rno]=[nm,ag,m1,m2,m3,m4,m5,tot]
print("Dataframe after insertion of a new row")
print(students)
Sample output
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5
0 1 Arun 15 78 85 90 92 88
1 2 Bibin 16 98 80 92 96 98
2 3 Dijo 15 70 80 90 95 97
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433
1 2 Bibin 16 98 80 92 96 98 464
2 3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
RollNo Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
0 1 Arun 15 78 85 90 92 88 433
2 3 Dijo 15 70 80 90 95 97 432
Dataframe after setting RollNo as index:
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
Details of the first and the third students
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
3 Dijo 15 70 80 90 95 97 432
Adding new record
Enter roll number4
Enter nameKevin
Enter age15
Enter mark1: 100
Enter mark2: 100
Enter mark3: 100
Enter mark4: 100
Enter mark5: 100
Enter total: 500
Dataframe after insertion of a new row
Name Age Mark1 Mark2 Mark3 Mark4 Mark5 Total
RollNo
1 Arun 15 78 85 90 92 88 433
2 Bibin 16 98 80 92 96 98 464
3 Dijo 15 70 80 90 95 97 432
4 Kevin 15 100 100 100 100 100 500
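In the program above the total of the new student is typed in by the user. A minimal sketch of an alternative, assuming fixed example values in place of input(): the five mark columns are summed row by row, and the new record's total is computed from its own marks. The names mark_cols and marks are hypothetical.

```python
import pandas as pd

students = pd.DataFrame([
    {"RollNo": 1, "Name": "Arun", "Age": 15, "Mark1": 78, "Mark2": 85, "Mark3": 90, "Mark4": 92, "Mark5": 88},
    {"RollNo": 2, "Name": "Bibin", "Age": 16, "Mark1": 98, "Mark2": 80, "Mark3": 92, "Mark4": 96, "Mark5": 98},
    {"RollNo": 3, "Name": "Dijo", "Age": 15, "Mark1": 70, "Mark2": 80, "Mark3": 90, "Mark4": 95, "Mark5": 97},
]).set_index("RollNo")

mark_cols = ["Mark1", "Mark2", "Mark3", "Mark4", "Mark5"]   # hypothetical helper name
students["Total"] = students[mark_cols].sum(axis=1)         # row-wise sum of the five marks

# new record with fixed example values; its total is computed, not typed in
marks = [100, 100, 100, 100, 100]
students.loc[4] = ["Kevin", 15, *marks, sum(marks)]
print(students)
```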
Question – 8
Question 1(a)
Write a Python program to create 2 DataFrames that store the marks secured by 5 students in 2
examinations and perform the following operations:
i) To create a new data frame containing total marks (adding marks secured in both exams)
ii) To display the details of the top 3 scorers
Solution
import pandas as pd
exam1=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[50,45,24,48]},index=["Rno1","Rno2","Rno3","Rno4"])
exam2=pd.DataFrame({"Name":["Anu","Biju","Cino","Deljith"],"Marks":[45,40,43,40]},index=["Rno1","Rno2","Rno3","Rno4"])
print("Details of Exam1")
print(exam1)
print("Details of Exam2")
print(exam2)
print("Total marks of two exams")
df3=pd.DataFrame({"Name":exam1.Name,"Total":exam1.Marks+exam2.Marks})
print(df3)
print("Details of the top 3 scorers")
df3.sort_values(["Total"],ascending=False,inplace=True)
print(df3.head(3))
Sample output
Details of Exam1
Name Marks
Rno1 Anu 50
Rno2 Biju 45
Rno3 Cino 24
Rno4 Deljith 48
Details of Exam2
Name Marks
Rno1 Anu 45
Rno2 Biju 40
Rno3 Cino 43
Rno4 Deljith 40
Total marks of two exams
Name Total
Rno1 Anu 95
Rno2 Biju 85
Rno3 Cino 67
Rno4 Deljith 88
Details of the top 3 scorers
Name Total
Rno1 Anu 95
Rno4 Deljith 88
Rno2 Biju 85
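The top scorers can also be picked without sorting the whole frame in place. The sketch below is an alternative to the solution above, not the original method; the names totals and top3 are assumptions.

```python
import pandas as pd

names = ["Anu", "Biju", "Cino", "Deljith"]
rolls = ["Rno1", "Rno2", "Rno3", "Rno4"]
exam1 = pd.DataFrame({"Name": names, "Marks": [50, 45, 24, 48]}, index=rolls)
exam2 = pd.DataFrame({"Name": names, "Marks": [45, 40, 43, 40]}, index=rolls)

# assign() returns a new frame; the marks are added row by row using the shared index
totals = exam1[["Name"]].assign(Total=exam1["Marks"] + exam2["Marks"])

# nlargest() returns the rows with the highest totals, highest first
top3 = totals.nlargest(3, "Total")
print(top3)
```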
Question – 9
Question 1(a)
a) Write a Python program to create the given DataFrame and also write the code to perform the
following operations.
         Population  Hospitals  Schools
Delhi    10927986    189        7916
Mumbai   12691836    208        8508
Kolkata  4631392     149        7226
Chennai  4328063     157        7617
i) Display the details of Mumbai
ii) Add one more column Colleges with appropriate data
iii) Change the no. of hospitals of Chennai to 160
iv) Add the details of the city 'Hyderabad'
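No solution code accompanies this question in the text. The following is a minimal sketch of one way to produce output of the kind shown in the sample that follows; the Colleges figures and the Hyderabad row are taken from that sample output, and the variable name city is an assumption.

```python
import pandas as pd

city = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)
print("Created dataframe is")
print(city)

print("Details of the city Mumbai")
print(city.loc["Mumbai"])                     # i) one row, selected by label

city["Colleges"] = [100, 200, 125, 170]       # ii) new column, one value per city
print("Dataframe after adding new field Colleges")
print(city)

city.loc["Chennai", "Hospitals"] = 160        # iii) update a single cell
print("Dataframe after changing no. of hospitals of Chennai")
print(city)

city.loc["Hyderabad"] = [10238596, 200, 5250, 150]   # iv) new row
print("Dataframe after adding details of the city Hyderabad")
print(city)
```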
Sample output
Created dataframe is
Population Hospitals Schools
Delhi 10927986 189 7916
Mumbai 12691836 208 8508
Kolkata 4631392 149 7226
Chennai 4328063 157 7617
Details of the city Mumbai
Population 12691836
Hospitals 208
Schools 8508
Name: Mumbai, dtype: int64
Dataframe after adding new field Colleges
Population Hospitals Schools Colleges
Delhi 10927986 189 7916 100
Mumbai 12691836 208 8508 200
Kolkata 4631392 149 7226 125
Chennai 4328063 157 7617 170
Dataframe after changing no. of hospitals of Chennai
Population Hospitals Schools Colleges
Delhi 10927986 189 7916 100
Mumbai 12691836 208 8508 200
Kolkata 4631392 149 7226 125
Chennai 4328063 160 7617 170
Dataframe after adding details of the city Hyderabad
Population Hospitals Schools Colleges
Delhi 10927986 189 7916 100
Mumbai 12691836 208 8508 200
Kolkata 4631392 149 7226 125
Chennai 4328063 160 7617 170
Hyderabad 10238596 200 5250 150
Question – 10
Question 1(a)
Write a Python program to create a series to store the amount of sales made by a salesman in each of the
twelve months of the last year and perform the following operations.
i) Display the sales amount which is greater than 10000.
ii) Display the sales amount in the first four months.
iii) Display the series in the descending order of sales amount.
Solution
import pandas as pd
import numpy as np
sales=np.array([25000,10000,22500,21750,24000,25000,9500,22500,21750,24000,23000,22800])
ser=pd.Series(sales,index=['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'])
print("Created series is:")
print(ser)
print("Sales amount more than 10000")
print(ser[ser>10000])
print("Sales amount in the first four months")
print(ser.head(4))
print("Series in the descending order of sales amount")
ser.sort_values(ascending=False,inplace=True)
print(ser)
Sample output
Created series is:
jan 25000
feb 10000
mar 22500
apr 21750
may 24000
jun 25000
jul 9500
aug 22500
sep 21750
oct 24000
nov 23000
dec 22800
dtype: int32
Sales amount more than 10000
jan 25000
mar 22500
apr 21750
may 24000
jun 25000
aug 22500
sep 21750
oct 24000
nov 23000
dec 22800
dtype: int32
Sales amount in the first four months
jan 25000
feb 10000
mar 22500
apr 21750
dtype: int32
Series in the descending order of sales amount
jan 25000
jun 25000
may 24000
oct 24000
nov 23000
dec 22800
mar 22500
aug 22500
apr 21750
sep 21750
feb 10000
jul 9500
dtype: int32
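Because the series is indexed by month names, the first four months can also be selected by label, and sorting does not have to modify the series. A short sketch under those assumptions; the names first_four, low_months and ranked are hypothetical.

```python
import pandas as pd

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
sales = pd.Series([25000, 10000, 22500, 21750, 24000, 25000,
                   9500, 22500, 21750, 24000, 23000, 22800], index=months)

# a label slice includes both end points, so this returns jan, feb, mar and apr
first_four = sales.loc["jan":"apr"]
print(first_four)

# months that did not exceed 10000 (the complement of the filter used above)
low_months = sales[~(sales > 10000)]
print(low_months)

# sort_values() without inplace returns a sorted copy and leaves sales unchanged
ranked = sales.sort_values(ascending=False)
print(ranked)
```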