SEC 16: Analytics/Computation using Python
Suggested Practical List For SEC:16 - Analytics/Computation using Python
Note: Any platform for Python can be used for the lab exercises.
1. Write a program to perform the following operations:
a. Take the ages of 10 students from the user and store them in a list. Determine the
average age without using predefined functions.
b. Create two lists as follows:
fruits = ["Apple", "Mango", "Peach", "Banana"]
prices = [100, 80, 150, 70]
Create a single dictionary from the above lists such that it stores the names of fruits
as keys and prices as the corresponding values.
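A minimal sketch of exercise 1 follows. The function names (average_age, read_ages) are illustrative and not part of the exercise. "Without using predefined functions" is read here as avoiding sum() and len(), which is an assumption about the intent. Reading input is kept in its own function so the averaging logic can be checked without typing ten values.

```python
def average_age(ages):
    """Return the mean of ages using a manual loop (no sum() or len())."""
    total = 0
    count = 0
    for age in ages:
        total += age
        count += 1
    if count == 0:
        raise ValueError("ages must not be empty")
    return total / count


def read_ages(n=10):
    """Prompt the user for n ages and return them as a list of integers."""
    ages = []
    for i in range(n):
        ages.append(int(input(f"Enter age of student {i + 1}: ")))
    return ages


# Part b: pair each fruit with the price at the same position.
fruits = ["Apple", "Mango", "Peach", "Banana"]
prices = [100, 80, 150, 70]
fruit_prices = dict(zip(fruits, prices))
```

For part a, calling average_age(read_ages()) prompts for the ten ages and returns their average.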
2. Write a program to create two numpy arrays and two lists, each comprising 10,000
elements. Show that numpy arrays are more storage efficient than Python lists and that the
summation of two numpy arrays is computationally more efficient than the summation of two lists.
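One possible way to approach exercise 2 is sketched below. It assumes that "storage" for a list means the list object plus the integer objects it refers to, measured with sys.getsizeof, and that the array's storage is its raw data buffer (nbytes). The repetition count of 200 is arbitrary. Timings depend on the machine, so they are printed rather than compared against a fixed threshold.

```python
import sys
import timeit

import numpy as np

N = 10_000

list_a = list(range(N))
list_b = list(range(N))
arr_a = np.arange(N)
arr_b = np.arange(N)

# Storage: the list holds references to separate int objects,
# while the array holds the raw values in one contiguous buffer.
list_bytes = sys.getsizeof(list_a) + sum(sys.getsizeof(x) for x in list_a)
array_bytes = arr_a.nbytes

# Element-wise summation, repeated to get a measurable duration.
list_time = timeit.timeit(
    lambda: [x + y for x, y in zip(list_a, list_b)], number=200
)
array_time = timeit.timeit(lambda: arr_a + arr_b, number=200)

print(f"list storage : {list_bytes} bytes")
print(f"array storage: {array_bytes} bytes")
print(f"list sum time : {list_time:.4f} s")
print(f"array sum time: {array_time:.4f} s")
```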
3. Write a program to perform the following operations using numpy library:
a. Create a 5X2 integer array with elements in the range 50 to 100 such that the
difference between consecutive elements is 5. Reshape the resultant array to size 10X1.
b. Create a 1D random array comprising elements in range 1 to 100. Compute the
minimum, maximum, mean, median, standard deviation, unique values, and
count of unique values.
c. Add two 2D arrays and calculate the square root of each element of the resulting
array.
d. Consider the following array:
intArray= numpy.array([[34,43,73],[82,22,12],[53,94,66]])
Perform each of the following operations on the original array as given above:
I. Delete the second column of the array.
II. Sort the array by the second row.
III. Sort the array by second column.
IV. Find row-wise maximum element.
V. Find column-wise maximum element.
VI. Swap the first two columns of the array.
VII. Retrieve all the even numbers from the array.
e. Create a 5x5 diagonal matrix (an identity matrix scaled by 5) in which every diagonal element is 5.
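The operations in exercise 3 could be sketched as follows. Several choices here are assumptions rather than requirements of the exercise: the random array has 20 elements and a fixed seed, the two arrays added in part c are arbitrary, and "sort by the second row" is interpreted as reordering the columns so that the second row is ascending. For part a, ten values starting at 50 with a step of 5 end at 95.

```python
import numpy as np

# (a) 50, 55, ..., 95 gives ten values with a step of 5; shape 5x2, then 10x1
a = np.arange(50, 100, 5).reshape(5, 2)
a_column = a.reshape(10, 1)

# (b) 1D random integer array in [1, 100]; size (20) and seed are assumptions
rng = np.random.default_rng(seed=0)
r = rng.integers(1, 101, size=20)
summary = {
    "min": r.min(),
    "max": r.max(),
    "mean": r.mean(),
    "median": np.median(r),
    "std": r.std(),
}
unique_values = np.unique(r)
unique_count = unique_values.size

# (c) element-wise sum of two 2D arrays, then element-wise square root
x = np.array([[1, 2], [3, 4]])
y = np.array([[8, 7], [6, 5]])
root_of_sum = np.sqrt(x + y)

# (d) every operation starts from the original array, which is never modified
intArray = np.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])
no_second_col = np.delete(intArray, 1, axis=1)       # I
by_second_row = intArray[:, intArray[1].argsort()]   # II: reorders columns
by_second_col = intArray[intArray[:, 1].argsort()]   # III: reorders rows
row_max = intArray.max(axis=1)                       # IV
col_max = intArray.max(axis=0)                       # V
swapped = intArray[:, [1, 0, 2]]                     # VI
evens = intArray[intArray % 2 == 0]                  # VII

# (e) 5x5 matrix with 5 on the diagonal and 0 elsewhere
five_diag = 5 * np.eye(5, dtype=int)
```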
4. Write a Python program using pandas library to perform the following operations:
a. Create a series from a dictionary representing the count of students in each class
X, XI, and XII as given below.
dictionary = { 'XII' : 25, 'XI' : 30, 'X' : 50 }
Display the series
i) sorted on the index
ii) sorted on values
b. Create a series from the numpy array of 10 randomly generated values. Find
maximum and minimum values and their positions.
c. Create a series as shown below with days of the week as index labels and
forecasted maximum temperature as their associated values:
Monday 25
Tuesday 40
Wednesday 18
Thursday 27
Friday 32
Saturday 39
Sunday 28
Retrieve the temperature of the 4th day of the week as well as Tuesday.
d. Create two data frames for storing the results of class X students of sections A
and B, where each dataframe comprises the following details: name, percentage,
and qualifying status in preboard exams.
              Section A                               Section B
   Name    Percentage  Qualify             Name     Percentage  Qualify
0  Aroma   79.5        yes              0  Parveen  89.5        yes
1  Kiran   29.0        no               1  Ahil     92.0        yes
2  Rayan   90.5        yes              2  Shaila   90.5        yes
3  Rohan   NaN         no               3  Shruti   91.5        yes
4  Amit    32.0        no               4  Mark     90.0        yes
5  Yash    65.0        yes
6  Mona    56.0        yes
7  Kartik  NaN         NaN
8  Kavita  29.0        no
9  Pooja   89.0        NaN
e. Merge the dataframes created in the previous part of the question to obtain the
combined result of class X students in preboard exams.
f. Use the resultant dataframe obtained above to filter out students with 80
percentage and above. Also, determine the count of students with missing data.
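A possible sketch of exercise 4 is given below. It makes a few assumptions: the random series uses a fixed seed, "merge" is interpreted as stacking the two section tables with pandas.concat (the sections share no key to join on), and "students with missing data" is taken to mean rows with at least one missing value.

```python
import numpy as np
import pandas as pd

# (a) series from a dictionary, sorted by index and by value
students = pd.Series({"XII": 25, "XI": 30, "X": 50})
by_index = students.sort_index()
by_value = students.sort_values()

# (b) ten random values; idxmax/idxmin return the positions (index labels)
rng = np.random.default_rng(seed=0)
values = pd.Series(rng.random(10))
max_value, max_pos = values.max(), values.idxmax()
min_value, min_pos = values.min(), values.idxmin()

# (c) forecast temperatures indexed by day of the week
temps = pd.Series(
    [25, 40, 18, 27, 32, 39, 28],
    index=["Monday", "Tuesday", "Wednesday", "Thursday",
           "Friday", "Saturday", "Sunday"],
)
fourth_day = temps.iloc[3]    # by position
tuesday = temps["Tuesday"]    # by label

# (d) one dataframe per section
section_a = pd.DataFrame({
    "Name": ["Aroma", "Kiran", "Rayan", "Rohan", "Amit",
             "Yash", "Mona", "Kartik", "Kavita", "Pooja"],
    "Percentage": [79.5, 29.0, 90.5, np.nan, 32.0,
                   65.0, 56.0, np.nan, 29.0, 89.0],
    "Qualify": ["yes", "no", "yes", "no", "no",
                "yes", "yes", np.nan, "no", np.nan],
})
section_b = pd.DataFrame({
    "Name": ["Parveen", "Ahil", "Shaila", "Shruti", "Mark"],
    "Percentage": [89.5, 92.0, 90.5, 91.5, 90.0],
    "Qualify": ["yes"] * 5,
})

# (e) combined result: stack the rows and renumber the index
combined = pd.concat([section_a, section_b], ignore_index=True)

# (f) students with 80 percent and above; rows with any missing value
high_scorers = combined[combined["Percentage"] >= 80]
missing_count = int(combined.isna().any(axis=1).sum())
```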
5. Write a program to perform the following operations on penguins dataset of seaborn
library using pandas:
a. Load the dataset into pandas dataframe.
b. List number of observations/records and number of attributes of the dataframe.
c. Display the name of attributes and row indexes of the dataframe along with the
data type of each attribute.
d. Display the first 5 and last 5 records of the dataframe.
e. Retrieve the value of the second column for the third and fourth records.
f. Display the summary of the data distribution of all attributes.
g. Compute the pairwise correlation between all attributes.
6. Write a program to perform the following operations using pandas library:
a. Create a dataframe with six columns and thirty rows with randomly generated
numeric values.
b. Replace at least 20 values in the dataframe, at randomly generated indexes, with
null values.
c. Determine the number of missing values in each attribute.
d. Remove the rows having more than 70% missing values.
e. Normalize the attributes by mapping them to range 0 to 1.
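Exercise 6 might be approached as in the sketch below. The column names A to F and the seed are assumptions. Exactly 20 distinct cells are blanked by sampling flat positions without replacement, and min-max scaling is used for the normalization step, which is one common way to map values to the range 0 to 1.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# (a) 30 rows x 6 columns of random numbers; column names are assumptions
df = pd.DataFrame(rng.random((30, 6)), columns=list("ABCDEF"))

# (b) choose 20 distinct cells and set them to NaN
flat_positions = rng.choice(df.size, size=20, replace=False)
rows, cols = np.unravel_index(flat_positions, df.shape)
for row, col in zip(rows, cols):
    df.iat[row, col] = np.nan

# (c) number of missing values in each attribute
missing_per_column = df.isna().sum()

# (d) keep only rows whose fraction of missing values is at most 70%
cleaned = df[df.isna().mean(axis=1) <= 0.7]

# (e) min-max normalization per column; NaN cells stay NaN
normalized = (cleaned - cleaned.min()) / (cleaned.max() - cleaned.min())

print(missing_per_column)
print(normalized.describe())
```

With six columns, a row exceeds the 70% threshold only when five or six of its values are missing, so step d often removes no rows for this amount of missing data.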
7. Use the iris dataset of the sklearn library to create the following visual representations of
data using matplotlib and/or seaborn library:
a. Scatter plot showing the relationship between petal length and petal width of
different instances of iris flowers.
b. Histograms showing the data distribution of each of the four attributes.
c. Pie Chart showing the frequency count of each flower type.
d. Pair Plot showing the relationship between every pair of attributes.
8. Consider the mpg dataset of the seaborn library and apply linear regression to estimate
mpg (Miles Per Gallon) for cars using the attributes namely, weight, cylinders, and
displacement. Use the predefined class LinearRegression defined in the sklearn library.
9. Write a program to predict species - Adelie, Gentoo, and Chinstrap of the penguins
dataset of seaborn library using logistic regression method. Make the prediction based
on features, namely, bill_length_mm, bill_depth_mm, flipper_length_mm, and
body_mass_g. Apply regularization to avoid overfitting. Use the predefined class
LogisticRegression defined in the sklearn library.
10. Write a Python program that makes a connection with database College.db and creates
a table STUDENT with the attributes, namely, rollNumber, studName, and class. Use
sqlite3 library for database connectivity.
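A minimal sketch of exercise 10 using Python's built-in sqlite3 module is shown below. The function name create_student_table and the column types are assumptions, since the exercise names the attributes but not their types. The function returns the column names it finds so the result can be inspected.

```python
import sqlite3


def create_student_table(db_path="College.db"):
    """Connect to the database, create STUDENT, and return its column names."""
    conn = sqlite3.connect(db_path)
    try:
        # Column types are assumptions; the exercise only names the attributes.
        conn.execute(
            """
            CREATE TABLE IF NOT EXISTS STUDENT (
                rollNumber INTEGER PRIMARY KEY,
                studName   TEXT NOT NULL,
                class      TEXT
            )
            """
        )
        conn.commit()
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute("PRAGMA table_info(STUDENT)")]
    finally:
        conn.close()
    return columns
```

Calling create_student_table() creates College.db in the current directory if it does not already exist. Passing ":memory:" instead uses a temporary in-memory database, which is convenient for trying the code without leaving a file behind.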
Prepared By: Prof. (Dr.) Hema Banati (Dyal Singh College), Dr. Sheetal Rajpal (Dyal Singh
College), Ms. Neha Gandhi (Shaheed Rajguru College), Dr. Deepali Bajaj (Shaheed Rajguru
College)