Machine Learning Lab Manual (Use-Case Oriented)
Unit 1: Introduction to ML & Learning Types
Classifying SMS Messages as Spam or Not Spam
Dataset: SMS Spam Collection (Kaggle)
Algorithm: Naïve Bayes / Logistic Regression
Code:
from sklearn.naive_bayes import MultinomialNB
# X_train: non-negative term counts (e.g. from CountVectorizer); y_train: spam/ham labels
model = MultinomialNB().fit(X_train, y_train)
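The fit call above assumes X_train already holds word counts. A minimal end-to-end sketch, assuming a CountVectorizer front end and four toy messages invented here for illustration (they are not rows from the SMS Spam Collection), might look like this:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy messages made up for this sketch; replace with the real dataset.
texts = [
    "win a free prize now",
    "free cash offer claim now",
    "are we still meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

# The pipeline turns raw text into word counts, then fits Naive Bayes on them.
spam_model = make_pipeline(CountVectorizer(), MultinomialNB())
spam_model.fit(texts, labels)

pred = spam_model.predict(["claim your free prize", "lunch at home"])
print(list(pred))
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary learned on the training text and reuses it unchanged at prediction time.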
Customer Segmentation for Marketing
Dataset: Mall Customers (Kaggle)
Algorithm: k-Means, Hierarchical Clustering
Code:
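from sklearn.cluster import KMeans
# X: scaled numeric features; n_clusters=5 is a starting guess to check (e.g. with an elbow plot)
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42).fit(X)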
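As a rough illustration of the clustering step, the sketch below scales the features and reads back the segment labels. It assumes synthetic 2-D points from make_blobs as a stand-in for two customer features; the variable names are hypothetical and nothing here is taken from the Mall Customers file.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic points standing in for two numeric customer features (assumption).
X_demo, _ = make_blobs(n_samples=200, centers=5, cluster_std=0.6, random_state=0)

# k-Means is distance based, so features are put on a common scale first.
X_scaled = StandardScaler().fit_transform(X_demo)

kmeans_demo = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_scaled)
segments = kmeans_demo.labels_          # one cluster id per customer
centers = kmeans_demo.cluster_centers_  # one centroid per segment (scaled units)
print(centers.shape)
```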
Unit 2: Regression & Linear Models
House Price Prediction
Dataset: California Housing (sklearn)
Algorithm: Linear Regression, Ridge, Lasso
Code:
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(X_train, y_train)
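To compare the three listed algorithms side by side, one option is to fit each on the same split and report held-out R^2. The sketch below assumes synthetic data from make_regression so that it runs offline; the alpha values are arbitrary starting points, not tuned settings.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for the housing data (assumption).
X_demo, y_demo = make_regression(
    n_samples=500, n_features=8, n_informative=5, noise=10.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),   # L2 penalty shrinks coefficients
    "lasso": Lasso(alpha=1.0),   # L1 penalty can set coefficients to zero
}

# score() returns the coefficient of determination (R^2) on the test split.
r2_scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
for name, score in r2_scores.items():
    print(name, round(score, 3))
```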
Predicting Student Admission Chances
Dataset: Graduate Admission dataset (Kaggle)
Algorithm: Logistic Regression, LDA
Code:
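from sklearn.linear_model import LogisticRegression
# y_train must hold class labels; if the target is a continuous admission chance, binarize it first (e.g. chance >= 0.5)
clf = LogisticRegression().fit(X_train, y_train)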
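Logistic regression needs class labels, so a continuous admission chance has to be converted to admit / not admit before fitting. The sketch below assumes synthetic features and a 0.5 cut-off; both are illustrative choices, not properties of the Kaggle dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two hypothetical standardized features (e.g. a test score and a GPA).
X_demo = rng.normal(size=(200, 2))

# Synthetic continuous "chance of admit" in (0, 1), made up for this sketch.
chance = 1.0 / (1.0 + np.exp(-(1.5 * X_demo[:, 0] + 1.0 * X_demo[:, 1])))

# Binarize the continuous target so it can be used as a class label.
y_demo = (chance >= 0.5).astype(int)

admit_clf = LogisticRegression().fit(X_demo, y_demo)
proba = admit_clf.predict_proba(X_demo[:5])[:, 1]  # estimated P(admit) per applicant
train_accuracy = admit_clf.score(X_demo, y_demo)
print(proba.round(2), round(train_accuracy, 3))
```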
Unit 3: Classification & Clustering
Fake News Detection
Dataset: Fake News dataset (Kaggle)
Algorithm: Naïve Bayes
Code:
from sklearn.naive_bayes import MultinomialNB
nb = MultinomialNB().fit(X_train, y_train)
Handwritten Digit Recognition
Dataset: MNIST
Algorithm: k-Nearest Neighbors
Code:
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
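A complete train/evaluate loop for k-NN can be sketched with scikit-learn's bundled 8x8 digits set, used here as a small offline stand-in for MNIST (an assumption of this example; the images are lower resolution than MNIST's 28x28).

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # 1,797 images, each flattened to 64 pixel features

X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target,
    test_size=0.25, random_state=0, stratify=digits.target,
)

# Each test image is labelled by majority vote among its 5 nearest training images.
knn_demo = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
accuracy = knn_demo.score(X_te, y_te)
print(round(accuracy, 3))
```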
Unit 4: Neural Networks & Probabilistic Models
Digit Recognition with ANN
Dataset: MNIST (Keras)
Algorithm: ANN
Code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax'),
])
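To make explicit what the two Dense layers compute, the following NumPy sketch runs one forward pass through the same 784 -> 128 -> 10 shape. The weights are random and untrained, and the names W1, b1, W2, b2 and forward are hypothetical; this shows the arithmetic only and is not a replacement for the Keras model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialized parameters with the same shapes as the two Dense layers.
W1, b1 = rng.normal(scale=0.05, size=(784, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.05, size=(128, 10)), np.zeros(10)

def forward(x):
    """Return class probabilities for a batch of flattened 28x28 images."""
    hidden = np.maximum(0.0, x @ W1 + b1)                 # ReLU hidden layer
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax over 10 classes

batch = rng.random((4, 784))  # four fake images with pixel values in [0, 1)
probs = forward(batch)
n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, n_params)
```

The parameter count (101,770) is the number of weights and biases such a network would learn during training.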
Medical Diagnosis with Bayesian Networks
Dataset: Alarm Network (benchmark BN)
Algorithm: Bayesian Networks
Code:
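# BayesianModel was renamed in later pgmpy releases; the newest versions may expect DiscreteBayesianNetwork instead
from pgmpy.models import BayesianNetwork
model = BayesianNetwork([('A', 'B'), ('B', 'C')])  # chain A -> B -> C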
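The edge list only defines the structure A -> B -> C; inference also needs conditional probability tables. The library-free sketch below does inference by enumeration on that chain, assuming binary variables and made-up probabilities chosen purely for illustration.

```python
from itertools import product

# Hypothetical probability tables for the chain A -> B -> C (values are invented).
p_a = {1: 0.3, 0: 0.7}
p_b_given_a = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}  # indexed [a][b]
p_c_given_b = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}  # indexed [b][c]

def joint(a, b, c):
    """P(A=a, B=b, C=c) from the chain factorization P(A) P(B|A) P(C|B)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Marginal P(C=1): sum the joint over every value of A and B.
p_c1 = sum(joint(a, b, 1) for a, b in product((0, 1), repeat=2))

# Posterior P(A=1 | C=1) by Bayes' rule: P(A=1, C=1) / P(C=1).
p_a1_and_c1 = sum(joint(1, b, 1) for b in (0, 1))
p_a1_given_c1 = p_a1_and_c1 / p_c1

print(round(p_c1, 3), round(p_a1_given_c1, 3))
```

Observing C=1 raises the belief in A=1 from the prior 0.3 to roughly 0.547, which is the kind of diagnostic query a medical network answers.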
Unit 5: Advanced Learning
Market Segmentation with GMM
Dataset: Synthetic Gaussian blobs
Algorithm: Gaussian Mixture Model
Code:
from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3).fit(X)
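Unlike k-Means, a Gaussian mixture gives each point a probability of belonging to every component. A small sketch, assuming synthetic blobs from make_blobs with three centers:

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic Gaussian blobs (assumed settings: 300 points, 3 centers).
X_demo, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

gmm_demo = GaussianMixture(n_components=3, random_state=0).fit(X_demo)

hard_labels = gmm_demo.predict(X_demo)        # most likely component per point
soft_labels = gmm_demo.predict_proba(X_demo)  # membership probability per component
print(soft_labels[:3].round(3))
```

The soft memberships are useful for segmentation when a customer plausibly sits between two groups.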
Self-Driving Car Simulation (CartPole)
Dataset: OpenAI Gym environment (CartPole-v1); no static dataset
Algorithm: Reinforcement Learning (DQN)
Code:
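import gym  # the maintained fork is gymnasium (import gymnasium as gym)
env = gym.make('CartPole-v1')
# gym >= 0.26 and gymnasium return (obs, info); older gym versions return obs only
obs, info = env.reset()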
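The environment setup above is only the first step of DQN. The core of the algorithm is the temporal-difference target that the Q-network is trained toward. The sketch below computes that target for a small batch in NumPy, assuming made-up rewards and next-state Q-values; td_targets is a hypothetical helper, and no environment or neural network is involved.

```python
import numpy as np

def td_targets(rewards, next_q, dones, gamma=0.99):
    """Bellman targets: r + gamma * max_a' Q(s', a'), with no bootstrap at episode end."""
    return rewards + gamma * (1.0 - dones) * next_q.max(axis=1)

# Invented batch of two transitions; CartPole has two actions (left, right).
rewards = np.array([1.0, 1.0])
next_q = np.array([[0.5, 2.0],    # Q-values of the next state, one column per action
                   [3.0, 1.0]])
dones = np.array([0.0, 1.0])      # the second transition ended the episode

targets = td_targets(rewards, next_q, dones)
print(targets)
```

For the terminal transition the target is just the reward, because there is no future return to bootstrap from.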
Unit 6: Real-World Applications
Text Sentiment Analysis
Dataset: IMDB (Keras)
Algorithm: Logistic Regression / LSTM
Code:
from tensorflow.keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=10000)
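imdb.load_data returns each review as a list of integer word indices, which cannot be fed to logistic regression directly. One common preparation for that route is multi-hot encoding, sketched here in NumPy; multi_hot is a hypothetical helper and the toy sequences are invented. For the LSTM route, padding the sequences to a fixed length would be used instead.

```python
import numpy as np

def multi_hot(sequences, dimension=10000):
    """Encode lists of word indices as fixed-length 0/1 vectors."""
    out = np.zeros((len(sequences), dimension), dtype=np.float32)
    for i, seq in enumerate(sequences):
        out[i, seq] = 1.0  # mark every word index that occurs in review i
    return out

# Two toy "reviews" over a 5-word vocabulary (made up for this sketch).
toy_reviews = [[1, 3, 3], [0, 2]]
encoded = multi_hot(toy_reviews, dimension=5)
print(encoded)
```

With the real data, dimension should match the num_words value passed to load_data so that every index fits in the vector.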
Phishing Website Detection
Dataset: UCI Phishing Websites Dataset
Algorithm: Random Forest, SVM
Code:
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier().fit(X_train, y_train)
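To compare the two listed algorithms on equal footing, both can be scored with the same cross-validation folds. The sketch below assumes synthetic data from make_classification in place of the UCI file, and default-style hyperparameters that have not been tuned.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary problem standing in for phishing / legitimate labels (assumption).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # SVMs are scale sensitive, so the SVC is wrapped with a scaler.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
cv_accuracy = {
    name: cross_val_score(est, X_demo, y_demo, cv=5).mean()
    for name, est in candidates.items()
}
for name, acc in cv_accuracy.items():
    print(name, round(acc, 3))
```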