
ML LAB - Ipynb - (4) - JupyterLab


03/09/2025, 08:01 ML LAB

In [8]: '''Write a python program to perform the following tasks on the Titanic D
link: https://www.kaggle.com/datasets/yasserh/titanic-dataset'''
# A. Load the dataset into a pandas dataframe.
import pandas as pd
df=pd.read_csv('Titanic-Dataset.csv')

# B. Find the total number of passengers that survived and not survived.
Survived = df['Survived'].value_counts().rename(index={0: 'Not Dead', 1:
print(Survived)

# C. Compute the number of passengers from belonging to each Passenger Cl


Passenger = df['Pclass'].value_counts().rename(index={1: 'First Class', 2
print(Passenger)

# D. Find the average age of the passengers.


age=df['Age'].mean()
print(f'The average age of is:: {age}')

# E.Find the number of passengers that survived and not survived from dif
pa=df[['Survived','Pclass']].value_counts()
print(pa)

# F.Find the average Fare of each PClass.


Fare = df.groupby('Pclass')['Fare'].mean()
print(Fare)

# G. Sort the Dataframe by Passenger Name in Alphabetical order.


sor = df['Name'].sort_values()
print(sor)

localhost:8888/lab/tree/Data_science/Machine Learning class AD/PRACTICAL LAB/ML LAB.ipynb? 1/10



Survived
Not Dead 549
Dead 342
Name: count, dtype: int64
Pclass
Third Class 491
First Class 216
Second Class 184
Name: count, dtype: int64
The average age of is:: 29.69911764705882
Survived  Pclass
0         3         372
1         1         136
          3         119
0         2          97
1         2          87
0         1          80
Name: count, dtype: int64
Pclass
1 84.154687
2 20.662183
3 13.675550
Name: Fare, dtype: float64
845 Abbing, Mr. Anthony
746 Abbott, Mr. Rossmore Edward
279 Abbott, Mrs. Stanton (Rosa Hunt)
308 Abelson, Mr. Samuel
874 Abelson, Mrs. Samuel (Hannah Wizosky)
...
286 de Mulder, Mr. Theodore
282 de Pelsmaeker, Mr. Alfons
361 del Carlo, Mr. Sebastiano
153 van Billiard, Mr. Austin Blyler
868 van Melkebeke, Mr. Philemon
Name: Name, Length: 891, dtype: object
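Two details of the cell above may deserve a second look. In the Kaggle file, Survived = 1 marks a passenger who survived, so the labels printed for part B ('Not Dead' 549, 'Dead' 342) appear to be swapped. Part G asks for the whole DataFrame sorted by name, whereas df['Name'].sort_values() returns only the sorted Name column. The sketch below shows one possible way to handle both, plus a table layout for part E. It uses a small made-up frame (`toy`) instead of the real CSV, and the label strings are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in for Titanic-Dataset.csv (values invented for illustration).
toy = pd.DataFrame({
    "Name": ["Cumings, Mrs. John", "Allen, Mr. William", "Braund, Mr. Owen"],
    "Survived": [1, 0, 0],
    "Pclass": [1, 3, 3],
})

# Part B: Survived == 1 means the passenger survived, 0 means they did not.
survival_counts = (
    toy["Survived"]
    .map({0: "Not Survived", 1: "Survived"})
    .value_counts()
)
print(survival_counts)

# Part E: survival counts per class as a table (rows = class, columns = outcome).
by_class = pd.crosstab(toy["Pclass"], toy["Survived"])
print(by_class)

# Part G: sort the whole DataFrame (all columns) by passenger name.
sorted_toy = toy.sort_values(by="Name").reset_index(drop=True)
print(sorted_toy)
```

With the real file, the same calls would be applied to `df` after `pd.read_csv('Titanic-Dataset.csv')`.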

In [7]: # 2. Write a python program to perform the following tasks:


# a.Extract the username (before @) from the 'Email' column
# in a DataFrame and store it in a new 'Username' column.
df=pd.read_csv('customers-100.csv')
username = df['Email'].str.extract(r'([\w\.-]+)@')
username.columns = ['Name']
print(username)
# B.Return rows from DataFrame where A > 5 and B < 10.
cf=pd.DataFrame({'A':[3,8,6,2,9],'B':[5,2,9,3,1],'C':[1,7,4,5,2]})
data = cf[(cf['A'] > 5) & (cf['B'] < 10)]
print(data)


Name
0 zunigavanessa
1 vmata
2 beckycarr
3 stanleyblackwell
4 colinalvarado
.. ...
95 hhart
96 vkemp
97 swagner
98 mccarthystephen
99 colleen91

[100 rows x 1 columns]


A B C
1 8 2 7
2 6 9 4
4 9 1 2
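Part a asks for the usernames to be stored in a new 'Username' column of the same DataFrame, while the cell above keeps them in a separate frame with a column called 'Name'. A minimal sketch of the in-place variant follows. The two e-mail addresses are invented, and `customers` is a hypothetical stand-in for the contents of customers-100.csv. Part B is repeated with `DataFrame.query` as an alternative spelling of the same filter.

```python
import pandas as pd

# Hypothetical stand-in for customers-100.csv; only the 'Email' column matters here.
customers = pd.DataFrame({"Email": ["alice.smith@example.com", "bob-99@example.org"]})

# expand=False makes str.extract return a Series, which can be assigned
# directly to a new column of the same DataFrame.
customers["Username"] = customers["Email"].str.extract(r"^([^@]+)@", expand=False)
print(customers)

# Part B with query(): same rows as cf[(cf['A'] > 5) & (cf['B'] < 10)].
cf = pd.DataFrame({"A": [3, 8, 6, 2, 9], "B": [5, 2, 9, 3, 1], "C": [1, 7, 4, 5, 2]})
filtered = cf.query("A > 5 and B < 10")
print(filtered)
```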

In [6]: # C.Take two dates (YYYY-MM-DD) as input and calculate the difference in
# days, hours, and minutes using pandas.

# Prompt user for two dates


date1 = input("Enter the first date (YYYY-MM-DD): ")
date2 = input("Enter the second date (YYYY-MM-DD): ")

# Convert to pandas Timestamp


d1 = pd.to_datetime(date1)
d2 = pd.to_datetime(date2)

# Calculate time difference


delta = abs(d2 - d1)

# Display result
print(f"Difference: {delta.days} days")

Difference: 366 days
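The task also asks for the difference in hours and minutes, which the cell above does not print. One way to obtain them is to convert the Timedelta to seconds first, as sketched here. The two dates are example inputs chosen for illustration, not necessarily the ones typed into the notebook.

```python
import pandas as pd

# Example inputs chosen for illustration.
d1 = pd.to_datetime("2024-01-01")
d2 = pd.to_datetime("2025-01-01")

delta = abs(d2 - d1)                  # a pandas Timedelta
total_seconds = delta.total_seconds()

diff_days = delta.days
diff_hours = total_seconds / 3600     # the same span expressed in hours
diff_minutes = total_seconds / 60     # the same span expressed in minutes

print(f"Difference: {diff_days} days = {diff_hours:.0f} hours = {diff_minutes:.0f} minutes")
```

Because the inputs carry no time of day, the hour and minute figures are simply the day count multiplied by 24 and by 1440.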

In [5]: # D.Explain what df.to_csv('student_data.csv', index=False) does in Pytho


# 1.Prompt the user to enter the file path of the CSV file containing the
import pandas as pd
from tabulate import tabulate
file_path = input('Enter the file path: ')
# 2. Read the CSV file into a Pandas DataFrame.
df=pd.read_csv(file_path)
# 3.Calculate the mean, median, and mode of the test scores using Pandas
mean_= df['Test Score'].mean()
median_= df['Test Score'].median()
mode_= df['Test Score'].mode().iloc[0]
# 4.Display the mean, median, and mode in a table.
stats = pd.DataFrame({
    'Statistic': ['Mean', 'Median', 'Mode'],
    'Value': [mean_, median_, mode_]
}, index=['A', 'B', 'C'])

print(stats.to_markdown())  # to_markdown() relies on the tabulate package


| | Statistic | Value |
|:---|:------------|--------:|
| A | Mean | 83.5 |
| B | Median | 85 |
| C | Mode | 85 |
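The question about df.to_csv('student_data.csv', index=False) is not answered in the cell itself. In short, to_csv writes the DataFrame as comma-separated text, and index=False leaves out the row labels that would otherwise become an unnamed first column. The sketch below illustrates the difference by writing to in-memory buffers instead of a file. The student names and scores are invented.

```python
import io
import pandas as pd

# Hypothetical student data; the values are illustrative only.
students = pd.DataFrame({"Name": ["Asha", "Ravi"], "Test Score": [85, 82]})

with_index = io.StringIO()
without_index = io.StringIO()
students.to_csv(with_index)                  # default: index=True
students.to_csv(without_index, index=False)  # omit the row labels

print(with_index.getvalue())
print(without_index.getvalue())
```

Passing a file name such as 'student_data.csv' instead of a buffer writes the same text to disk.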

In [4]: '''E. Write a python program to generate 2 random
numpy arrays then print the common values between the 2 arrays.'''
import numpy as np

A = np.random.randint(1, 10, size=10)  # 10 random integers from 1 to 9
B = np.random.randint(1, 10, size=10)
print(f"Value of A::{A}")
print(f"Value of B::{B}")
common = np.intersect1d(A, B)  # sorted unique values present in both arrays
print(f"Common values::{common}")

Value of A::[3 5 1 2 4 9 7 5 8 5]
Value of B::[3 9 8 8 1 4 5 6 8 3]
Common values::[1 3 4 5 8 9]
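Since the arrays are random, rerunning the cell gives different values each time. The sketch below types in the two arrays from the output above so the result can be checked by hand, and compares np.intersect1d with a plain set intersection. The seeded generator at the end is one optional way to make random arrays repeatable. The seed value 0 is arbitrary.

```python
import numpy as np

# The two arrays printed in the output above, typed in by hand so the result is repeatable.
A = np.array([3, 5, 1, 2, 4, 9, 7, 5, 8, 5])
B = np.array([3, 9, 8, 8, 1, 4, 5, 6, 8, 3])

common = np.intersect1d(A, B)  # sorted, duplicates removed
same_via_sets = np.array(sorted(set(A.tolist()) & set(B.tolist())))

print(common)
print(same_via_sets)

# A seeded generator yields the same "random" array on every run.
rng = np.random.default_rng(0)
print(rng.integers(1, 10, size=10))  # integers from 1 to 9, like randint(1, 10)
```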

In [10]: # F.Consider the 2 matrices


import numpy as np

# Define matrices
A = np.array([[5, 0, 4, 2],
              [3, 9, 7, 6],
              [1, 1, 8, 2]])

B = np.array([[1, 5],
              [3, 6],
              [0, 2],
              [4, 7]])

# Transposes
AT = A.T
BT = B.T

print("Transpose of A:\n", AT)


print("Transpose of B:\n", BT)

# Matrix multiplications
AB = np.dot(A, B)
AAT = np.dot(A, AT)
BBT = np.dot(B, BT)

print("\nA * B:\n", AB)


print("\nA * (Transpose of A):\n", AAT)
print("\nB * (Transpose of B):\n", BBT)

# Inverses
try:
    AAT_inv = np.linalg.inv(AAT)
    print("\nInverse of (A * AT):\n", AAT_inv)
except np.linalg.LinAlgError:
    print("\nA * AT is not invertible.")

try:
    BBT_inv = np.linalg.inv(BBT)
    print("\nInverse of (B * BT):\n", BBT_inv)
except np.linalg.LinAlgError:
    print("\nB * BT is not invertible.")

Transpose of A:
[[5 3 1]
[0 9 1]
[4 7 8]
[2 6 2]]
Transpose of B:
[[1 3 0 4]
[5 6 2 7]]

A * B:
[[ 13 47]
[ 54 125]
[ 12 41]]

A * (Transpose of A):
[[ 45 55 41]
[ 55 175 80]
[ 41 80 70]]

B * (Transpose of B):
[[26 33 10 39]
[33 45 12 54]
[10 12 4 14]
[39 54 14 65]]

Inverse of (A * AT):
[[ 0.04952381 -0.0048254 -0.02349206]
[-0.0048254 0.01243598 -0.01138624]
[-0.02349206 -0.01138624 0.0410582 ]]

Inverse of (B * BT):
[[ 2.88692284e+13 3.75299969e+14 -1.87649984e+14 -2.88692284e+14]
[-4.71530730e+14 -1.25099990e+14 8.13149933e+14 2.11707675e+14]
[ 1.29911528e+14 -5.62949953e+14 -0.00000000e+00 3.89734583e+14]
[ 3.46430741e+14 -0.00000000e+00 -5.62949953e+14 -8.66076851e+13]]
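The entries of order 1e+14 in the last matrix are a warning sign rather than a real inverse. B has only two columns, so B * BT is a 4x4 matrix of rank at most 2 and is singular. np.linalg.inv did not raise LinAlgError here, presumably because floating-point round-off kept the elimination from hitting an exactly zero pivot. The sketch below shows one way to detect this before inverting. Falling back to the Moore-Penrose pseudo-inverse is an assumption about what might be wanted, not part of the original exercise, and the rcond cut-off of 1e-10 is an arbitrary choice.

```python
import numpy as np

B = np.array([[1, 5],
              [3, 6],
              [0, 2],
              [4, 7]])
BBT = B @ B.T

rank = np.linalg.matrix_rank(BBT)  # at most 2, because B has only 2 columns
cond = np.linalg.cond(BBT)         # a huge condition number signals near-singularity
print("rank:", rank, "of", BBT.shape[0])
print("condition number:", cond)

if rank < BBT.shape[0]:
    # No true inverse exists; the pseudo-inverse is one common substitute.
    BBT_pinv = np.linalg.pinv(BBT, rcond=1e-10)
    print("Pseudo-inverse:\n", BBT_pinv)
```

A * AT, by contrast, is 3x3 and built from a matrix with three independent rows, which is consistent with the well-behaved inverse printed for it.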

In [1]: # G.Consider the set S = {4, 2, 0, 1}, compute the sum of elements in ev
import itertools

S = {4, 2, 0, 1}
subset_sums = []

# Generate all subsets and print them with their sums


print("Subset and their sums:")
for r in range(len(S) + 1):
    for subset in itertools.combinations(S, r):
        subset_sum = sum(subset)
        subset_sums.append(subset_sum)
        print(f"Subset: {subset}, Sum: {subset_sum}")

# Optionally, compute the total sum of all subset sums


print("\nTotal sum of all subset sums:", sum(subset_sums))


Subset and their sums:


Subset: (), Sum: 0
Subset: (0,), Sum: 0
Subset: (1,), Sum: 1
Subset: (2,), Sum: 2
Subset: (4,), Sum: 4
Subset: (0, 1), Sum: 1
Subset: (0, 2), Sum: 2
Subset: (0, 4), Sum: 4
Subset: (1, 2), Sum: 3
Subset: (1, 4), Sum: 5
Subset: (2, 4), Sum: 6
Subset: (0, 1, 2), Sum: 3
Subset: (0, 1, 4), Sum: 5
Subset: (0, 2, 4), Sum: 6
Subset: (1, 2, 4), Sum: 7
Subset: (0, 1, 2, 4), Sum: 7

Total sum of all subset sums: 56
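The total of 56 can be cross-checked without enumerating anything. Each element of an n-element set appears in exactly half of the 2**n subsets, so the grand total is sum(S) * 2**(n - 1), which is 7 * 8 = 56 here. A short sketch of that check follows. Sorting the set first is an optional extra that makes the subset order independent of how Python happens to iterate the set.

```python
from itertools import combinations

S = sorted({4, 2, 0, 1})  # sorted list, so the subset order is deterministic

subset_sums = [sum(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Each element appears in exactly half of the 2**n subsets.
closed_form = sum(S) * 2 ** (len(S) - 1)

print(len(subset_sums), sum(subset_sums), closed_form)
```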

In [19]: '''h. Load any RGB image in python using openCV then convert the
image into Grayscale and plot using matplot lib, also plot the histogram
of different channels (R, G, B) of the image.'''

import cv2
import matplotlib.pyplot as plt
import numpy as np

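# Step 1: Load image using OpenCV
# Replace 'Bmwm5.jpg' with the path to your actual image
img = cv2.imread('Bmwm5.jpg')
if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read 'Bmwm5.jpg'")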

# Convert BGR (OpenCV default) to RGB


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Step 2: Convert to Grayscale


gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Step 3: Plot original and grayscale images


plt.figure(figsize=(12, 5))

# Original RGB image


plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original RGB Image')
plt.axis('off')

# Grayscale image
plt.subplot(1, 2, 2)
plt.imshow(gray, cmap='gray')
plt.title('Grayscale Image')
plt.axis('off')

plt.tight_layout()
plt.show()

# Step 4: Plot histogram of each colour channel
# cv2.imread returns channels in B, G, R order, so index 0 is blue, not red.
colors = ('b', 'g', 'r')
channel_ids = [0, 1, 2]

plt.figure(figsize=(8, 5))
for channel_id, color in zip(channel_ids, colors):
    # Histogram for each channel (from BGR image)
    hist = cv2.calcHist([img], [channel_id], None, [256], [0, 256])
    plt.plot(hist, color=color)

plt.title('Histogram for R, G, B Channels')


plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.xlim([0, 256])
plt.grid(True)
plt.show()
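Because cv2.imread returns channels in B, G, R order, channel indices and colour labels are easy to mismatch when plotting histograms. The sketch below reproduces the per-channel counting with NumPy alone on a tiny synthetic image, so the channel order can be verified by eye. The pixel values are invented, and np.bincount is used here as a stand-in for cv2.calcHist.

```python
import numpy as np

# Synthetic 2x2 image in OpenCV's B, G, R channel order (values invented for illustration).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # blue channel fully on
bgr[..., 2] = 10   # a little red

rgb = bgr[..., ::-1]  # reversing the last axis swaps BGR <-> RGB

hists = {}
for name, idx in (("R", 0), ("G", 1), ("B", 2)):
    # 256-bin count of pixel intensities for this channel of the RGB image.
    hists[name] = np.bincount(rgb[..., idx].ravel(), minlength=256)
    print(name, "most common intensity:", hists[name].argmax())
```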

In [23]: '''i. Load the "fmri" dataset using the load_dataset function of seaborn.
Plot a line plot using x="timepoint" and y = "signal" for different even


import seaborn as sns


import matplotlib.pyplot as plt

# Step 1: Load the fmri dataset


fmri = sns.load_dataset("fmri")

# Step 2: Display the first few rows (optional)


print(fmri.head())

# Step 3: Plot the line plot grouped by 'event' and 'region'


plt.figure(figsize=(10, 6))
sns.lineplot(data=fmri, x="timepoint", y="signal", hue="event", style="re

# Step 4: Customize the plot


plt.title("fMRI Signal over Time by Event and Region")
plt.xlabel("Timepoint")
plt.ylabel("Signal")
plt.legend(title="Event / Region")
plt.grid(True)
plt.tight_layout()
plt.show()

  subject  timepoint event    region    signal
0     s13         18  stim  parietal -0.017552
1      s5         14  stim  parietal -0.080883
2     s12         18  stim  parietal -0.081033
3     s11         18  stim  parietal -0.046134
4     s10         18  stim  parietal -0.037970
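The fmri data holds several subjects per timepoint, and sns.lineplot by default draws the mean of those repeated measurements, with a confidence band around it. The sketch below computes the plotted means by hand with a pandas groupby. The frame `toy` is invented and merely reuses the fmri column names, so no download is needed.

```python
import pandas as pd

# Tiny invented frame with the same column names as the fmri dataset.
toy = pd.DataFrame({
    "subject":   ["s1", "s2", "s1", "s2"],
    "timepoint": [0, 0, 1, 1],
    "event":     ["stim", "stim", "stim", "stim"],
    "region":    ["parietal"] * 4,
    "signal":    [0.10, 0.30, 0.20, 0.60],
})

# One mean per (event, region, timepoint): the value each line passes through.
mean_signal = (
    toy.groupby(["event", "region", "timepoint"])["signal"]
       .mean()
       .reset_index()
)
print(mean_signal)
```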

In [25]: '''j. Load the "titanic" dataset using the load_dataset function of seab
Plot two box plots using x='pclass',y = 'age' and y = 'fare' '''
import seaborn as sns
import matplotlib.pyplot as plt

# Step 1: Load Titanic dataset


titanic = sns.load_dataset("titanic")

# Step 2: Boxplot for Age vs Pclass


plt.figure(figsize=(10, 5))


plt.subplot(1, 2, 1)
sns.boxplot(data=titanic, x='pclass', y='age')
plt.title('Age Distribution by Passenger Class')
plt.xlabel('Passenger Class')
plt.ylabel('Age')

# Step 3: Boxplot for Fare vs Pclass


plt.subplot(1, 2, 2)
sns.boxplot(data=titanic, x='pclass', y='fare')
plt.title('Fare Distribution by Passenger Class')
plt.xlabel('Passenger Class')
plt.ylabel('Fare')

plt.tight_layout()
plt.show()
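A box plot summarises each group by its quartiles, and with the default settings points farther than 1.5 times the interquartile range from the box are drawn as outliers. The sketch below computes those quantities with pandas for a handful of invented fare values. Note that pandas and matplotlib may interpolate quartiles slightly differently on small samples.

```python
import pandas as pd

fares = pd.Series([7.25, 8.05, 13.0, 26.0, 71.28, 512.33])  # invented sample values

q1, median, q3 = fares.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences would be drawn as individual outlier points.
outliers = fares[(fares < lower_fence) | (fares > upper_fence)]

print(f"Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print("Outliers:", outliers.tolist())
```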

In [26]: '''k. Use the "diamonds" dataset from seaborn to plot a histogram for the
Use the hue parameter for the 'cut' column of the diamonds dataset.'''
import seaborn as sns
import matplotlib.pyplot as plt

# Step 1: Load the diamonds dataset


diamonds = sns.load_dataset("diamonds")

# Step 2: Plot histogram for 'price' with hue for 'cut'


plt.figure(figsize=(10, 6))
sns.histplot(data=diamonds, x="price", hue="cut", kde=False, multiple="st

# Step 3: Customize the plot


plt.title("Histogram of Diamond Prices by Cut")
plt.xlabel("Price")
plt.ylabel("Count")
plt.grid(True)
plt.tight_layout()
plt.show()
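When hue is set, sns.histplot counts each category separately within every price bin. The numbers behind such a plot can be tabulated with pd.cut and pd.crosstab, as in the sketch below. The prices, cuts and bin edges are invented for illustration and are not taken from the diamonds dataset.

```python
import pandas as pd

# Invented sample; the real diamonds dataset has tens of thousands of rows.
toy = pd.DataFrame({
    "price": [350, 900, 1200, 4000, 4100, 9000],
    "cut":   ["Ideal", "Ideal", "Good", "Ideal", "Good", "Premium"],
})

bins = [0, 1000, 5000, 10000]  # arbitrary bin edges for the example
counts = pd.crosstab(pd.cut(toy["price"], bins=bins), toy["cut"])
print(counts)
```

Each row of `counts` corresponds to one histogram bin and each column to one hue level.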
