Assignment for CA2
Q1. Explain the concept of SVM. Why is maximizing the margin beneficial?
Support Vector Machine (SVM) is a supervised machine learning algorithm used for
classification and regression tasks. The main idea of SVM is to find a hyperplane (decision
boundary) that best separates the data points of different classes. Among all possible
hyperplanes, SVM chooses the one that maximizes the margin — the distance between
the hyperplane and the closest data points (called support vectors).
Why is maximizing the margin beneficial?
• A larger margin reduces the risk of misclassification.
• It improves generalization ability, meaning the model performs better on unseen data.
• It makes the classifier more robust to noise in the training data.
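As a small illustration (not part of the original answer), the sketch below trains a linear SVM with scikit-learn on an invented toy dataset and prints the support vectors that define the maximum-margin hyperplane; the data points and variable names are assumptions made for this example.

```python
# Minimal sketch (illustrative only): a linear SVM on a tiny invented dataset.
# Assumes scikit-learn and NumPy are installed.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 7], [8, 6]])  # toy features
y = np.array([0, 0, 0, 1, 1, 1])                                # toy labels

clf = SVC(kernel="linear", C=1.0)   # linear maximum-margin classifier
clf.fit(X, y)

print("Support vectors:", clf.support_vectors_)   # the points closest to the hyperplane
print("Weights:", clf.coef_, "Bias:", clf.intercept_)
```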
Q2. What is the difference between soft and hard margin? Give an example where
soft margin is necessary.
Hard Margin SVM:
• Assumes data is linearly separable without any errors.
• Finds a hyperplane that perfectly separates the classes.
• Works only when there are no outliers or noise.
Soft Margin SVM:
• Allows some misclassifications by introducing slack variables, with a penalty parameter C controlling how many violations are tolerated.
• Balances between maximizing margin and minimizing classification errors.
• Useful when data is noisy or not perfectly separable.
Example: In spam classification, some emails may be mislabeled or ambiguous. A soft
margin allows the model to tolerate a few misclassifications while still achieving good
accuracy.
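A minimal sketch of the difference, assuming scikit-learn: in SVC the penalty parameter C controls how soft the margin is, so a very large C approximates a hard margin while a small C tolerates more margin violations. The overlapping toy data below is invented for illustration.

```python
# Illustrative sketch: C controls margin softness in scikit-learn's SVC.
# Very large C ~ (nearly) hard margin; small C = soft margin with more slack.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(2, 1, (20, 2))])  # overlapping classes
y = np.array([0] * 20 + [1] * 20)

hardish = SVC(kernel="linear", C=1e6).fit(X, y)   # nearly hard margin
soft = SVC(kernel="linear", C=0.1).fit(X, y)      # soft margin, tolerates errors

print("Support vectors (C=1e6):", len(hardish.support_))
print("Support vectors (C=0.1):", len(soft.support_))
```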
Q3. Run step by step on the dataset (2,3),(3,3),(6,6),(7,7) with k=2.
We apply k-means clustering with k=2.
Step 1: Choose k=2 random centroids (say (2,3) and (7,7)).
Step 2: Assign each point to its nearest centroid (Euclidean distance).
- (2,3): distance 0 to (2,3), 6.40 to (7,7) → Cluster 1
- (3,3): distance 1 to (2,3), 5.66 to (7,7) → Cluster 1
- (6,6): distance 5 to (2,3), 1.41 to (7,7) → Cluster 2
- (7,7): distance 6.40 to (2,3), 0 to (7,7) → Cluster 2
Step 3: Recalculate centroids.
- Cluster 1 mean = (2.5, 3)
- Cluster 2 mean = (6.5, 6.5)
Step 4: Reassign points using the new centroids. The assignments do not change, so the algorithm has converged.
Final Clusters:
Cluster 1: (2,3), (3,3)
Cluster 2: (6,6), (7,7)
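The worked steps can be checked with a short script. This is only an illustrative sketch, assuming NumPy is available; it simply mirrors the assignment and centroid-update steps above.

```python
# Sketch verifying the worked k-means steps above (assumes NumPy).
import numpy as np

points = np.array([[2, 3], [3, 3], [6, 6], [7, 7]], dtype=float)
centroids = np.array([[2, 3], [7, 7]], dtype=float)   # initial centroids from Step 1

for _ in range(10):                                    # iterate until assignments stabilize
    # Step 2: assign each point to its nearest centroid (Euclidean distance)
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Step 3: recompute each centroid as the mean of its assigned points
    new_centroids = np.array([points[labels == k].mean(axis=0) for k in range(2)])
    if np.allclose(new_centroids, centroids):          # Step 4: stop when stable
        break
    centroids = new_centroids

print("Labels:", labels)        # expected: [0 0 1 1]
print("Centroids:", centroids)  # expected: [[2.5 3. ] [6.5 6.5]]
```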
Q4. Explain how Kernel PCA overcomes the limitations of linear PCA.
Linear PCA captures only linear structure in the data, since each principal component is a
linear combination of the original variables. It fails when data lies on a non-linear manifold (e.g., concentric circles).
Kernel PCA uses the kernel trick to implicitly map the input data into a higher-dimensional feature
space. PCA is then performed in that space using the kernel matrix, capturing non-linear relationships. This
allows Kernel PCA to extract meaningful features from complex datasets.
Example: For a dataset shaped like concentric circles, linear PCA fails, but kernel PCA
with an RBF kernel can separate the data effectively.
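As an illustrative sketch (assuming scikit-learn is available), the example below builds a concentric-circles dataset and compares linear PCA with kernel PCA using an RBF kernel; the gamma value is an assumption chosen for this toy data.

```python
# Sketch: kernel PCA with an RBF kernel on concentric circles,
# where linear PCA cannot unfold the non-linear structure.
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear = PCA(n_components=2).fit_transform(X)                       # only rotates the circles
rbf = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# In the kernel PCA projection the two circles become (nearly) linearly separable;
# in the linear PCA projection they remain concentric.
print("Linear PCA shape:", linear.shape)
print("Kernel PCA shape:", rbf.shape)
```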