03/09/2025, 08:01 ML LAB
In [8]: '''Write a Python program to perform the following tasks on the Titanic Dataset.
link: https://www.kaggle.com/datasets/yasserh/titanic-dataset'''
# A. Load the dataset into a pandas dataframe.
import pandas as pd
df=pd.read_csv('Titanic-Dataset.csv')
# B. Find the total number of passengers that survived and not survived.
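# Note: in this dataset 0 means the passenger did not survive and 1 means survived,
# so the labels below (reconstructed from the printed output) appear to be swapped.
Survived = df['Survived'].value_counts().rename(index={0: 'Not Dead', 1: 'Dead'})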
print(Survived)
# C. Compute the number of passengers belonging to each Passenger Class.
Passenger = df['Pclass'].value_counts().rename(index={1: 'First Class', 2: 'Second Class', 3: 'Third Class'})
print(Passenger)
# D. Find the average age of the passengers.
age=df['Age'].mean()
print(f'The average age of is:: {age}')
# E. Find the number of passengers that survived and not survived from different passenger classes.
pa=df[['Survived','Pclass']].value_counts()
print(pa)
# F.Find the average Fare of each PClass.
Fare = df.groupby('Pclass')['Fare'].mean()
print(Fare)
# G. Sort the Dataframe by Passenger Name in Alphabetical order.
sor = df['Name'].sort_values()
print(sor)
localhost:8888/lab/tree/Data_science/Machine Learning class AD/PRACTICAL LAB/ML LAB.ipynb? 1/10
Survived
Not Dead 549
Dead 342
Name: count, dtype: int64
Pclass
Third Class 491
First Class 216
Second Class 184
Name: count, dtype: int64
The average age of is:: 29.69911764705882
Survived  Pclass
0         3         372
1         1         136
          3         119
0         2          97
1         2          87
0         1          80
Name: count, dtype: int64
Pclass
1 84.154687
2 20.662183
3 13.675550
Name: Fare, dtype: float64
845 Abbing, Mr. Anthony
746 Abbott, Mr. Rossmore Edward
279 Abbott, Mrs. Stanton (Rosa Hunt)
308 Abelson, Mr. Samuel
874 Abelson, Mrs. Samuel (Hannah Wizosky)
...
286 de Mulder, Mr. Theodore
282 de Pelsmaeker, Mr. Alfons
361 del Carlo, Mr. Sebastiano
153 van Billiard, Mr. Austin Blyler
868 van Melkebeke, Mr. Philemon
Name: Name, Length: 891, dtype: object
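Task G asks for the DataFrame to be sorted, while the cell above sorts only the Name column. A minimal sketch of sorting whole rows follows; the four-row `sample` frame and its Pclass and Age values are illustrative stand-ins, not the real file. The default ordering is case-sensitive, which is why lowercase-prefixed surnames land at the end of the printed output above.

```python
import pandas as pd

# Illustrative stand-in for the Titanic data (values are made up for the sketch).
sample = pd.DataFrame({
    "Name": [
        "de Mulder, Mr. Theodore",
        "Zabour, Miss. Hileni",
        "Abbott, Mr. Rossmore Edward",
        "Abbing, Mr. Anthony",
    ],
    "Pclass": [3, 3, 3, 3],
    "Age": [30.0, 15.0, 16.0, 42.0],
})

# Sort entire rows by Name. Uppercase letters sort before lowercase ones,
# so "de Mulder" comes after "Zabour" here.
sorted_df = sample.sort_values("Name")

# Case-insensitive variant using the `key` argument (pandas 1.1 or later).
sorted_ci = sample.sort_values("Name", key=lambda s: s.str.lower())

print(sorted_df)
print(sorted_ci)
```

On the real data the same call would be `df.sort_values('Name')`, which keeps every column attached to its passenger.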
In [7]: # 2. Write a python program to perform the following tasks:
# a.Extract the username (before @) from the 'Email' column
# in a DataFrame and store it in a new 'Username' column.
df=pd.read_csv('customers-100.csv')
username = df['Email'].str.extract(r'([\w\.-]+)@')
username.columns = ['Name']
print(username)
# B.Return rows from DataFrame where A > 5 and B < 10.
cf=pd.DataFrame({'A':[3,8,6,2,9],'B':[5,2,9,3,1],'C':[1,7,4,5,2]})
data = cf[(cf['A'] > 5) & (cf['B'] < 10)]
print(data)
Name
0 zunigavanessa
1 vmata
2 beckycarr
3 stanleyblackwell
4 colinalvarado
.. ...
95 hhart
96 vkemp
97 swagner
98 mccarthystephen
99 colleen91
[100 rows x 1 columns]
A B C
1 8 2 7
2 6 9 4
4 9 1 2
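Task a asks for the username to be stored in a new 'Username' column of the same DataFrame, whereas the cell above builds a separate one-column frame named 'Name'. A small sketch of the in-place variant follows; the `customers` frame and the e-mail domains are assumptions for illustration, since the real 'customers-100.csv' is not reproduced here.

```python
import pandas as pd

# Hypothetical addresses standing in for the 'Email' column of the real file.
customers = pd.DataFrame({
    "Email": ["zunigavanessa@example.org", "vmata@example.com"],
})

# expand=False makes str.extract return a Series, which can be
# assigned directly as a new column of the same DataFrame.
customers["Username"] = customers["Email"].str.extract(r"^([^@]+)@", expand=False)

print(customers)
```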
In [6]: # C.Take two dates (YYYY-MM-DD) as input and calculate the difference in
# days, hours, and minutes using pandas.
# Prompt user for two dates
date1 = input("Enter the first date (YYYY-MM-DD): ")
date2 = input("Enter the second date (YYYY-MM-DD): ")
# Convert to pandas Timestamp
d1 = pd.to_datetime(date1)
d2 = pd.to_datetime(date2)
# Calculate time difference
delta = abs(d2 - d1)
# Display result
print(f"Difference: {delta.days} days")
Difference: 366 days
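Task C also asks for the difference in hours and minutes, which the cell above does not print. One possible way to derive them from the same Timedelta is sketched below. The two fixed dates are assumptions chosen so the sketch runs without `input()`; they are not necessarily the dates that produced the output above.

```python
import pandas as pd

# Fixed example dates (assumed) instead of input(), so the sketch runs unattended.
d1 = pd.to_datetime("2024-01-01")
d2 = pd.to_datetime("2025-01-01")

delta = abs(d2 - d1)

# total_seconds() covers the whole span; integer division gives whole units.
total_minutes = int(delta.total_seconds() // 60)
total_hours = total_minutes // 60

print(f"Difference: {delta.days} days")
print(f"Difference: {total_hours} hours")
print(f"Difference: {total_minutes} minutes")
```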
In [5]: # D. Explain what df.to_csv('student_data.csv', index=False) does in Python.
# 1.Prompt the user to enter the file path of the CSV file containing the
import pandas as pd
from tabulate import tabulate
file_path=(input('Enter the file path'))
# 2. Read the CSV file into a Pandas DataFrame.
df=pd.read_csv(file_path)
# 3.Calculate the mean, median, and mode of the test scores using Pandas
mean_= df['Test Score'].mean()
median_= df['Test Score'].median()
mode_= df['Test Score'].mode().iloc[0]
# 4.Display the mean, median, and mode in a table.
stats = pd.DataFrame({
    'Statistic': ['Mean', 'Median', 'Mode'],
    'Value': [mean_, median_, mode_]
}, index=['A', 'B', 'C'])
print(stats.to_markdown())  # to_markdown() relies on the tabulate package
| | Statistic | Value |
|:---|:------------|--------:|
| A | Mean | 83.5 |
| B | Median | 85 |
| C | Mode | 85 |
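Question D is stated in the cell above but not answered there. In short, `df.to_csv('student_data.csv', index=False)` writes the DataFrame to a CSV file named student_data.csv in the current working directory, and `index=False` leaves the row index out of the file. The sketch below shows the difference on an assumed two-row `students` frame, calling `to_csv` without a path so that it returns the CSV text instead of writing a file.

```python
import io
import pandas as pd

# Hypothetical data; the column name 'Test Score' follows the cell above.
students = pd.DataFrame({"Name": ["A", "B"], "Test Score": [85, 82]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = students.to_csv()                 # default: index=True
without_index = students.to_csv(index=False)   # row labels are omitted

print(with_index)
print(without_index)

# Reading the index-free text back gives the same columns, with no extra
# unnamed index column.
round_trip = pd.read_csv(io.StringIO(without_index))
print(round_trip)
```

With the default `index=True`, the header line starts with an empty field and each row starts with its index label, which shows up as an extra unnamed column when the file is read back.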
In [4]: ''' E.Write a python program to generate 2 random
numpy arrays then print the common values between the 2 arrays.'''
import numpy as np
A=np.random.randint(1,10,size=10)
B=np.random.randint(1,10,size=10)
print(F"Value of A::{A}")
print(F"Value of B::{B}")
common = np.intersect1d(A, B)  # sorted unique values that appear in both arrays
print(f"Comman values::{common}")
Value of A::[3 5 1 2 4 9 7 5 8 5]
Value of B::[3 9 8 8 1 4 5 6 8 3]
Comman values::[1 3 4 5 8 9]
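Because the arrays are random, the output above changes on every run. A reproducible variant can be sketched with NumPy's Generator API; the seed value 0 is an arbitrary assumption. As with `np.random.randint(1, 10, ...)`, the upper bound is exclusive, so the values fall in 1 to 9.

```python
import numpy as np

# Seeded generator so repeated runs give the same arrays (seed chosen arbitrarily).
rng = np.random.default_rng(seed=0)

A = rng.integers(1, 10, size=10)  # integers from 1 to 9 inclusive
B = rng.integers(1, 10, size=10)

# intersect1d returns the sorted, de-duplicated values present in both arrays.
common = np.intersect1d(A, B)

print(f"A: {A}")
print(f"B: {B}")
print(f"Common values: {common}")
```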
In [10]: # F.Consider the 2 matrices
import numpy as np
# Define matrices
A = np.array([[5, 0, 4, 2],
              [3, 9, 7, 6],
              [1, 1, 8, 2]])
B = np.array([[1, 5],
              [3, 6],
              [0, 2],
              [4, 7]])
# Transposes
AT = A.T
BT = B.T
print("Transpose of A:\n", AT)
print("Transpose of B:\n", BT)
# Matrix multiplications
AB = np.dot(A, B)
AAT = np.dot(A, AT)
BBT = np.dot(B, BT)
print("\nA * B:\n", AB)
print("\nA * (Transpose of A):\n", AAT)
print("\nB * (Transpose of B):\n", BBT)
# Inverses
try:
    AAT_inv = np.linalg.inv(AAT)
    print("\nInverse of (A * AT):\n", AAT_inv)
except np.linalg.LinAlgError:
    print("\nA * AT is not invertible.")
try:
    BBT_inv = np.linalg.inv(BBT)
    print("\nInverse of (B * BT):\n", BBT_inv)
except np.linalg.LinAlgError:
    print("\nB * BT is not invertible.")
Transpose of A:
[[5 3 1]
[0 9 1]
[4 7 8]
[2 6 2]]
Transpose of B:
[[1 3 0 4]
[5 6 2 7]]
A * B:
[[ 13 47]
[ 54 125]
[ 12 41]]
A * (Transpose of A):
[[ 45 55 41]
[ 55 175 80]
[ 41 80 70]]
B * (Transpose of B):
[[26 33 10 39]
[33 45 12 54]
[10 12 4 14]
[39 54 14 65]]
Inverse of (A * AT):
[[ 0.04952381 -0.0048254 -0.02349206]
[-0.0048254 0.01243598 -0.01138624]
[-0.02349206 -0.01138624 0.0410582 ]]
Inverse of (B * BT):
[[ 2.88692284e+13 3.75299969e+14 -1.87649984e+14 -2.88692284e+14]
[-4.71530730e+14 -1.25099990e+14 8.13149933e+14 2.11707675e+14]
[ 1.29911528e+14 -5.62949953e+14 -0.00000000e+00 3.89734583e+14]
[ 3.46430741e+14 -0.00000000e+00 -5.62949953e+14 -8.66076851e+13]]
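The entries of order 1e+14 in the last output suggest that B * BT is singular: B has only two columns, so the 4x4 product has rank at most 2, and `np.linalg.inv` can return meaningless numbers instead of raising `LinAlgError` when floating-point round-off hides the singularity. A hedged sketch of checking the rank before inverting is shown below; using the Moore-Penrose pseudo-inverse as a fallback, and the `rcond=1e-10` cutoff, are assumptions for illustration and are not part of the original exercise.

```python
import numpy as np

B = np.array([[1, 5],
              [3, 6],
              [0, 2],
              [4, 7]])
BBT = B @ B.T

# Rank and condition number show whether a true inverse exists.
rank = np.linalg.matrix_rank(BBT)
cond = np.linalg.cond(BBT)
print(f"shape={BBT.shape}, rank={rank}, cond={cond:.3e}")

# Pseudo-inverse as one possible fallback; rcond is an assumed cutoff
# below which singular values are treated as zero.
BBT_pinv = np.linalg.pinv(BBT, rcond=1e-10)

if rank < BBT.shape[0]:
    print("B * BT is singular, so it has no true inverse.")
    print("Pseudo-inverse of (B * BT):\n", BBT_pinv)
```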
In [1]: # G. Consider the set S = {4, 2, 0, 1}, compute the sum of elements in every subset of S.
import itertools
S = {4, 2, 0, 1}
subset_sums = []
# Generate all subsets and print them with their sums
print("Subset and their sums:")
for r in range(len(S) + 1):
    for subset in itertools.combinations(S, r):
        subset_sum = sum(subset)
        subset_sums.append(subset_sum)
        print(f"Subset: {subset}, Sum: {subset_sum}")
# Optionally, compute the total sum of all subset sums
print("\nTotal sum of all subset sums:", sum(subset_sums))
Subset and their sums:
Subset: (), Sum: 0
Subset: (0,), Sum: 0
Subset: (1,), Sum: 1
Subset: (2,), Sum: 2
Subset: (4,), Sum: 4
Subset: (0, 1), Sum: 1
Subset: (0, 2), Sum: 2
Subset: (0, 4), Sum: 4
Subset: (1, 2), Sum: 3
Subset: (1, 4), Sum: 5
Subset: (2, 4), Sum: 6
Subset: (0, 1, 2), Sum: 3
Subset: (0, 1, 4), Sum: 5
Subset: (0, 2, 4), Sum: 6
Subset: (1, 2, 4), Sum: 7
Subset: (0, 1, 2, 4), Sum: 7
Total sum of all subset sums: 56
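The total of 56 can be cross-checked without enumeration. Each element of an n-element set belongs to exactly 2^(n-1) of its subsets, so the sum over all subsets is sum(S) * 2^(n-1), here 7 * 8. A short sketch comparing both approaches:

```python
from itertools import combinations

S = {4, 2, 0, 1}

# Brute force: add up the sums of all subsets of every size.
brute_force = sum(
    sum(subset)
    for r in range(len(S) + 1)
    for subset in combinations(S, r)
)

# Closed form: every element appears in 2**(n - 1) subsets.
closed_form = sum(S) * 2 ** (len(S) - 1)

print(brute_force, closed_form)
```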
In [19]: '''h. Load any RGB image in python using openCV then convert the
image into Grayscale and plot using matplot lib, also plot the histogram
of different channels (R, G, B) of the image.'''
import cv2
import matplotlib.pyplot as plt
import numpy as np
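# Step 1: Load image using OpenCV
# Replace 'Bmwm5.jpg' with the path to your own image
img = cv2.imread('Bmwm5.jpg')
if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read 'Bmwm5.jpg'")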
# Convert BGR (OpenCV default) to RGB
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Step 2: Convert to Grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Step 3: Plot original and grayscale images
plt.figure(figsize=(12, 5))
# Original RGB image
plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original RGB Image')
plt.axis('off')
# Grayscale image
plt.subplot(1, 2, 2)
plt.imshow(gray, cmap='gray')
plt.title('Grayscale Image')
plt.axis('off')
plt.tight_layout()
plt.show()
# Step 4: Plot histogram of each RGB channel
colors = ('r', 'g', 'b')
channel_ids = [0, 1, 2]
localhost:8888/lab/tree/Data_science/Machine Learning class AD/PRACTICAL LAB/ML LAB.ipynb? 6/10
03/09/2025, 08:01 ML LAB
plt.figure(figsize=(8, 5))
for channel_id, color in zip(channel_ids, colors):
# Histogram for each channel (from BGR image)
hist = cv2.calcHist([img], [channel_id], None, [256], [0, 256])
plt.plot(hist, color=color)
plt.title('Histogram for R, G, B Channels')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.xlim([0, 256])
plt.grid(True)
plt.show()
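The channel order matters for the histogram step: `cv2.imread` returns pixels in BGR order, so index 0 is blue unless the image is converted first. The sketch below illustrates this with plain NumPy on a synthetic 2x2 pure-red image (hypothetical data, no file needed), using `np.histogram` in place of `cv2.calcHist`.

```python
import numpy as np

# Synthetic 2x2 image in OpenCV's BGR channel order (hypothetical data).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 2] = 255  # channel index 2 is red in BGR, so this is pure red

# Reversing the last axis converts BGR to RGB.
rgb = bgr[..., ::-1]

# One 256-bin histogram per channel, keyed by the colour it represents.
hists = {}
for idx, name in enumerate(("r", "g", "b")):
    counts, _ = np.histogram(rgb[..., idx], bins=256, range=(0, 256))
    hists[name] = counts

# All four red values sit in the last bin; green and blue sit in the first.
print(hists["r"][255], hists["g"][0], hists["b"][0])
```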
In [23]: '''i. Load the "fmri" dataset using the load_dataset function of seaborn.
Plot a line plot using x="timepoint" and y = "signal" for different events and regions.'''
import seaborn as sns
import matplotlib.pyplot as plt
# Step 1: Load the fmri dataset
fmri = sns.load_dataset("fmri")
# Step 2: Display the first few rows (optional)
print(fmri.head())
# Step 3: Plot the line plot grouped by 'event' and 'region'
plt.figure(figsize=(10, 6))
sns.lineplot(data=fmri, x="timepoint", y="signal", hue="event", style="region")
# Step 4: Customize the plot
plt.title("fMRI Signal over Time by Event and Region")
plt.xlabel("Timepoint")
plt.ylabel("Signal")
plt.legend(title="Event / Region")
plt.grid(True)
plt.tight_layout()
plt.show()
subject timepoint event region signal
0 s13 18 stim parietal -0.017552
1 s5 14 stim parietal -0.080883
2 s12 18 stim parietal -0.081033
3 s11 18 stim parietal -0.046134
4 s10 18 stim parietal -0.037970
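The fmri data contains several subjects per timepoint, and by default `sns.lineplot` draws the mean of the repeated observations at each x value with a confidence band around it. The sketch below reproduces only the mean line with a pandas groupby; the `toy` frame is a made-up stand-in with the same column names, not the real dataset.

```python
import pandas as pd

# Made-up miniature with the same columns as the fmri dataset.
toy = pd.DataFrame({
    "subject": ["s0", "s1", "s0", "s1", "s0", "s1", "s0", "s1"],
    "timepoint": [0, 0, 1, 1, 0, 0, 1, 1],
    "event": ["stim"] * 4 + ["cue"] * 4,
    "region": ["parietal"] * 8,
    "signal": [0.1, 0.3, 0.2, 0.4, 0.0, 0.2, 0.1, 0.1],
})

# Mean signal per (event, region, timepoint): one point on each plotted line.
mean_signal = (
    toy.groupby(["event", "region", "timepoint"])["signal"]
    .mean()
    .reset_index()
)

print(mean_signal)
```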
In [25]: '''j. Load the "titanic" dataset using the load_dataset function of seaborn.
Plot two box plots using x='pclass',y = 'age' and y = 'fare' '''
import seaborn as sns
import matplotlib.pyplot as plt
# Step 1: Load Titanic dataset
titanic = sns.load_dataset("titanic")
# Step 2: Boxplot for Age vs Pclass
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
sns.boxplot(data=titanic, x='pclass', y='age')
plt.title('Age Distribution by Passenger Class')
plt.xlabel('Passenger Class')
plt.ylabel('Age')
# Step 3: Boxplot for Fare vs Pclass
plt.subplot(1, 2, 2)
sns.boxplot(data=titanic, x='pclass', y='fare')
plt.title('Fare Distribution by Passenger Class')
plt.xlabel('Passenger Class')
plt.ylabel('Fare')
plt.tight_layout()
plt.show()
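A box plot summarises each group by its quartiles, and with the default settings points beyond 1.5 times the interquartile range from the box are drawn as outliers. The numbers behind such a plot can be sketched with a pandas groupby; the `toy` fares below are invented for illustration and are not taken from the titanic dataset.

```python
import pandas as pd

# Invented fares for two passenger classes (illustration only).
toy = pd.DataFrame({
    "pclass": [1, 1, 1, 1, 3, 3, 3, 3],
    "fare": [30.0, 50.0, 80.0, 200.0, 7.0, 8.0, 9.0, 40.0],
})

# Quartiles per class; unstack turns the quantile level into columns.
q = toy.groupby("pclass")["fare"].quantile([0.25, 0.5, 0.75]).unstack()

# Interquartile range and the upper limit for the whisker.
q["iqr"] = q[0.75] - q[0.25]
q["upper_fence"] = q[0.75] + 1.5 * q["iqr"]

print(q)
```

In this toy data the third-class fare of 40.0 lies above that class's upper fence, so a box plot would draw it as an individual outlier point.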
In [26]: '''k. Use the "diamonds" dataset from seaborn to plot a histogram for the 'price' column.
Use the hue parameter for the 'cut' column of the diamonds dataset.'''
import seaborn as sns
import matplotlib.pyplot as plt
# Step 1: Load the diamonds dataset
diamonds = sns.load_dataset("diamonds")
# Step 2: Plot histogram for 'price' with hue for 'cut'
plt.figure(figsize=(10, 6))
sns.histplot(data=diamonds, x="price", hue="cut", kde=False, multiple="stack")
# Step 3: Customize the plot
plt.title("Histogram of Diamond Prices by Cut")
plt.xlabel("Price")
plt.ylabel("Count")
plt.grid(True)
plt.tight_layout()
plt.show()
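Assuming the truncated `multiple="st` argument in the cell above is `"stack"`, each bar of the histogram is split into one segment per cut, and the bar height is the total count in that price bin. The counts behind such a plot can be sketched with `pd.cut` and `pd.crosstab`; the `toy` prices and the bin edges are invented for illustration.

```python
import pandas as pd

# Invented prices and cuts (not taken from the diamonds dataset).
toy = pd.DataFrame({
    "price": [300, 450, 900, 1200, 1300, 2500],
    "cut": ["Ideal", "Fair", "Ideal", "Ideal", "Fair", "Fair"],
})

# Assumed bin edges; pd.cut uses right-closed intervals by default.
bins = [0, 1000, 2000, 3000]
price_bin = pd.cut(toy["price"], bins=bins)

# Rows are price bins, columns are cuts: one stacked segment per cell.
counts = pd.crosstab(price_bin, toy["cut"])

# The height of each stacked bar is the row total.
bar_heights = counts.sum(axis=1)

print(counts)
print(bar_heights)
```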