
MACHINE LEARNING LAB



PRACTICAL FILE
B.Tech. Semester V

CSE(AI&ML)

Session: 2024-25, Odd Semester


Name:

Roll. No.:

Group/Branch:

DRONACHARYA COLLEGE OF ENGINEERING, GURUGRAM


DEPARTMENT OF CSE(AI&ML)
AFFILIATED TO GURUGRAM UNIVERSITY,
GURUGRAM

Varun Sharma (25341)

Department of CSE(AI&ML) 2024-25



LIST OF EXPERIMENTS

1. PML1: Text Pre-processing using NLTK, which focuses on preparing and cleaning text data. (06/08/24)
2. PML2: Bias Variance, examining the trade-off between bias and variance in model performance. (13/08/24)
3. PML3: Validation Test Train, explaining the process of splitting data for model evaluation. (20/08/24)
4. PML4(A): Classification Loss, detailing loss functions used in classification tasks.
   PML4(B): Regression Loss, describing loss functions relevant to regression tasks. (27/08/24)
5. PML5: K-Nearest Neighbors (KNN), an introduction to the KNN algorithm for classification and regression. (03/09/24)
6. PML6: KNN Visualization, providing visual insights into how the KNN algorithm works. (17/09/24)
7. PML7: SVM Binary Classification, covering the basics of Support Vector Machine for binary classification. (24/09/24)
8. PML8: Naive Bayes Classifier, exploring the Naive Bayes algorithm for classification tasks and showing Confusion Matrix parameters. (01/10/24)
9. PML9: Principal Component Analysis (PCA) using Scikit-learn, focusing on dimensionality reduction techniques. (08/10/24)
10. PML10: Decision Trees (DT), explaining how decision tree algorithms function for classification and regression. (15/10/24)
11. PML11: SVM Binary Classification (variant), offering an alternative view of SVM classification with ROC and AUC. (22/10/24)

** Viva voce questions with explanations are included with every program.


PROGRAM 1
PML1: Text Pre-processing using NLTK, which focuses on preparing and cleaning text data.

VIVA VOCE (1)

1. What is the importance of text pre-processing in natural language processing (NLP)?

 Answer: Text pre-processing is crucial in NLP because it transforms raw text into a clean, structured
format that models can understand. This step helps reduce noise, improves data quality, and enhances the
efficiency and accuracy of machine learning algorithms.
 Explanation: Pre-processing techniques like tokenization, stop word removal, and
stemming/lemmatization help standardize text and reduce the size of the feature space.

2. What are some common text pre-processing techniques used in NLP?

 Answer: Common text pre-processing techniques include:


o Tokenization: Splitting text into individual words or tokens.
o Stop Word Removal: Removing common words like "the", "is", and "and" that do not add
value to text analysis.
o Stemming: Reducing words to their base or root form (e.g., "running" to "run").
o Lemmatization: Converting words to their canonical form (e.g., "better" to "good").
o Lowercasing: Converting all characters to lowercase to maintain uniformity.
 Explanation: Each technique helps reduce noise and improve the relevance of the text data for analysis.

3. What is tokenization, and why is it necessary?

 Answer: Tokenization is the process of breaking down text into smaller pieces, such as words, phrases,
or sentences. It is necessary because most NLP algorithms require input in the form of individual tokens
to analyze and extract meaningful information.
 Explanation: For example, in the sentence "Machine learning is powerful," tokenization would split it
into ["Machine", "learning", "is", "powerful"] for further processing.

4. What is the role of the Natural Language Toolkit (NLTK) in text pre-processing?

 Answer: NLTK is a Python library that provides tools for text processing and analysis, including
tokenization, stop word removal, stemming, lemmatization, and more. It simplifies the process of
preparing text data for machine learning models.
 Explanation: NLTK comes with pre-built functions and corpora that can be leveraged to create efficient
text processing pipelines without having to write extensive code from scratch.

5. What is the difference between stemming and lemmatization, and when would you use each?

 Answer: Stemming reduces words to their base or root form, which may not always be a valid word
(e.g., "running" becomes "run"). Lemmatization reduces words to their canonical form, ensuring that the
reduced form is a meaningful word (e.g., "running" becomes "run," "better" becomes "good").
 Explanation: Stemming is faster and less computationally intensive but may lead to less accurate results.
Lemmatization is more accurate as it considers the context of the word but is computationally more
expensive. You would use stemming when you need quick results and accuracy is not the priority, while
lemmatization is used when precision is essential.
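A minimal sketch of these pre-processing steps with NLTK is given below; the sample sentence is only illustrative, and it assumes the required NLTK resources (punkt, stopwords, wordnet) can be downloaded.

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads of the NLTK resources used below (assumption: network access is available)
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

text = "Machine learning models are running faster than ever."

# Tokenization and lowercasing
tokens = [t.lower() for t in word_tokenize(text)]

# Stop word removal (keep only alphabetic, non-stop-word tokens)
stop_words = set(stopwords.words('english'))
filtered = [t for t in tokens if t.isalpha() and t not in stop_words]

# Stemming vs lemmatization on the same tokens
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
print("Stemmed:   ", [stemmer.stem(t) for t in filtered])
print("Lemmatized:", [lemmatizer.lemmatize(t, pos='v') for t in filtered])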

PROGRAM 2
PML2: Bias Variance, examining the trade-off between bias and variance in model performance.

VIVA VOCE (2)
1. What is bias in the context of machine learning?

 Answer: In machine learning, bias refers to the error introduced by approximating a real-world problem,
which may be complex, by a simpler model. High bias leads to underfitting, where the model fails to
capture the underlying patterns in the data.
 Explanation: Models with high bias make strong assumptions about the data and are unable to learn
adequately from it. Examples include linear models applied to non-linear data.

2. What is variance in machine learning, and how does it impact model performance?

 Answer: Variance refers to the model's sensitivity to small changes in the training data. High variance
means that the model fits the training data too closely (overfitting) and may not generalize well to unseen
data.
 Explanation: A model with high variance captures noise in the training data, leading to poor
performance on test data. This happens when the model is too complex and fits every detail of the
training set.

3. Can you explain the bias-variance trade-off?

 Answer: The bias-variance trade-off is the balance between two sources of error that affect model
performance: bias and variance. Low bias usually comes with high variance (overfitting), and low
variance often comes with high bias (underfitting). The goal is to find the optimal balance where both are
minimized to improve model accuracy.
 Explanation: An ideal model achieves a balance that minimizes overall error on new data. Too simple
models underfit, while overly complex models overfit.

4. What are the characteristics of a model with high bias?

 Answer: A model with high bias has the following characteristics:


o Underfits the training data.
o Makes simplistic assumptions, often leading to a poor fit.
o Has lower training and validation accuracy.
 Example: A linear regression model used for highly non-linear data would have high bias and low
complexity.

5. What strategies can be employed to reduce variance in a machine learning model?

 Answer: To reduce variance, you can:


o Simplify the model by reducing the number of features or using regularization techniques (e.g.,
L1 or L2 regularization).
o Increase the amount of training data to help the model generalize better.
o Use ensemble methods like bagging (e.g., Random Forest) to combine predictions from multiple
models.
 Explanation: Reducing variance involves preventing the model from fitting noise in the training data,
thus improving generalization to new data.
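As a hedged illustration of these ideas, the sketch below (scikit-learn on synthetic data chosen purely for demonstration) compares an unregularized high-degree polynomial model with a Ridge-regularized one; exact scores will vary, but regularization typically improves the cross-validated result.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import cross_val_score

# Synthetic noisy data (illustrative only)
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 40)

# High-variance model: degree-15 polynomial with no regularization
overfit = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
# Variance reduced by L2 regularization (Ridge)
regularized = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0))

for name, model in [("no regularization", overfit), ("ridge", regularized)]:
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")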

PROGRAM 3
PML3: Validation Test Train, explaining the process of splitting data for model evaluation.

VIVA VOCE (3)

1. What is the difference between training, validation, and test sets in machine learning?

Solution:
 Training Set: This is the dataset used to train the machine learning model. The model learns patterns and
relationships in the data during this phase.
 Validation Set: This set is used to tune the hyperparameters of the model and evaluate its performance
during training. It helps to avoid overfitting by providing an unbiased evaluation of the model during
training.
 Test Set: This dataset is used to assess the model's final performance after training. It provides an
unbiased estimate of the model's generalization ability on unseen data.

2. Why is it important to separate the data into training, validation, and test sets?

Solution:
Separating the data helps to ensure that the model is able to generalize to unseen data. If the same data is
used for training and testing, the model may memorize the data (overfitting) and fail to perform well on
new, unseen data. The validation set is crucial for fine-tuning the model and ensuring it does not overfit,
while the test set provides a final, unbiased evaluation of model performance.
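A minimal sketch of such a three-way split with scikit-learn's train_test_split is given below; the 60/20/20 proportions and the Iris dataset are only illustrative choices.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First split off the test set (20%), then carve a validation set out of the remainder
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 60% / 20% / 20%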

3. What is cross-validation, and how does it help improve model performance?

Solution:
Cross-validation is a technique where the dataset is split into several subsets (folds). The model is trained
on some folds and validated on the remaining fold. This process is repeated for each fold, and the
performance metrics are averaged to give a more reliable estimate of model performance. Cross-
validation helps reduce variance in the model evaluation and prevents overfitting, especially when the
dataset is small.
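The sketch below shows one way to run k-fold cross-validation in scikit-learn; 5 folds and logistic regression are assumptions for illustration, not requirements of the method.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: train on 4 folds, validate on the 5th, repeat, then average
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:  ", scores.mean())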

4. What is overfitting, and how can you detect it using validation and test sets?

Solution:
Overfitting occurs when a model learns the noise or random fluctuations in the training data rather than
the underlying patterns, leading to poor performance on unseen data. To detect overfitting:
 Validation Set: If the model performs well on the training set but poorly on the validation set, it may be
overfitting.
 Test Set: Similarly, if the model shows good performance on the training set but poor performance on
the test set, it suggests that the model is not generalizing well to new data.

5. How can hyperparameter tuning be done using the validation set?

Solution:
Hyperparameter tuning involves adjusting the model's hyperparameters (e.g., learning rate, regularization
strength) to improve its performance. This can be done using the validation set:
 Grid Search: Test a range of hyperparameter values to find the combination that gives the best
performance on the validation set.
 Random Search: Randomly sample hyperparameters and evaluate the model on the validation set.
 Bayesian Optimization: Use probabilistic models to guide the search for optimal hyperparameters.
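As a sketch of hyperparameter tuning, the example below runs a grid search over an SVM's C and gamma values, using cross-validation as the validation mechanism; the parameter grid and dataset are purely illustrative.

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate hyperparameter values (illustrative grid)
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 0.01]}

# Each combination is scored by 5-fold cross-validation; the best one is kept
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV score:  ", search.best_score_)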


PROGRAM 4
PML4(A): Classification Loss, detailing loss functions used in classification tasks.


PROGRAM 4
PML4(B): Regression Loss, describing loss functions relevant to regression tasks.


VIVA VOCE (4)


CLASSIFICATION LOSS:

1. What is classification loss in machine learning?


Solution:
Classification loss measures the error between the predicted class probabilities and the true labels, guiding the
model to improve its predictions.
Explanation:
It helps the model adjust its parameters by quantifying how far off its predictions are from the actual labels,
using functions like cross-entropy loss for classification tasks.

2. What is the difference between binary cross-entropy and categorical cross-entropy loss?
Solution:
 Binary cross-entropy is used for binary classification tasks, while categorical cross-entropy is used for
multi-class classification.
 Binary cross-entropy compares predicted probabilities for a single class (0 or 1), while categorical cross-
entropy compares predicted probabilities across multiple classes.
Explanation:
Binary cross-entropy is for two classes, while categorical cross-entropy handles multiple classes by
calculating loss for each class and then averaging the result.

3. Why is SoftMax used in classification problems?


Solution:
SoftMax converts raw model outputs (logits) into probabilities, ensuring they sum to 1, which is required
for classification tasks.

Explanation:
It transforms model outputs into interpretable probabilities, which can then be used with cross-entropy
loss to compute classification error.
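A small NumPy sketch of these two steps is given below; the logits and the true label are made-up values used only to show the computation.

import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw model outputs (illustrative)
probs = softmax(logits)              # probabilities that sum to 1
true_class = 0                       # assumed ground-truth label

# Categorical cross-entropy for a single example: -log(probability of the true class)
loss = -np.log(probs[true_class])
print("Probabilities:", probs.round(3))
print("Cross-entropy loss:", round(loss, 4))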

4. Why is mean squared error (MSE) not used in classification tasks?


Solution:
MSE is designed for regression tasks, not classification, because it assumes continuous values rather than
discrete class labels.
Explanation:
For classification, cross-entropy is preferred as it compares predicted class probabilities to true labels,
making it more suitable for categorical outcomes.

5. How can class imbalance affect classification loss, and how can it be addressed?
Solution:
Class imbalance can cause the model to favor the majority class, leading to biased predictions. It can be
addressed using weighted loss, resampling, or focal loss.
Explanation:
Weighted loss assigns higher penalties to misclassifications of the minority class, helping balance the
model's performance across all classes.

REGRESSION LOSS:

1. What is regression loss in machine learning?


Solution:
Regression loss measures the difference between predicted continuous values and the actual target values.
Explanation:
It is used to evaluate how far the model's predictions are from the actual values. Common regression loss
functions include mean squared error (MSE) and mean absolute error (MAE).

2. What is mean squared error (MSE) loss and why is it commonly used in regression?
Solution:
MSE is the average of the squared differences between predicted and actual values. It is commonly used
because it penalizes large errors more heavily.

3. What is the difference between mean squared error (MSE) and mean absolute error (MAE)?
Solution:
 MSE computes the square of the errors, making it sensitive to outliers.
 MAE computes the absolute value of the errors, treating all errors equally.
Explanation:
MSE is more sensitive to large errors, while MAE is more robust and less affected by outliers. MAE
gives a more straightforward interpretation in terms of average error magnitude.

4. What is Huber loss, and when is it used in regression?

Solution:
Huber loss combines MSE and MAE. It uses MSE for small errors and switches to MAE for large errors,
making it less sensitive to outliers than MSE.
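The sketch below computes MSE, MAE, and Huber loss on a few made-up predictions to show how an outlier affects each; the numbers and the delta value are illustrative only.

import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 12.0])   # last prediction is an outlier

mse = np.mean((y_true - y_pred) ** 2)
mae = np.mean(np.abs(y_true - y_pred))

def huber(y_true, y_pred, delta=1.0):
    # Quadratic for small errors, linear for large ones
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * err - 0.5 * delta ** 2
    return np.mean(np.where(err <= delta, quadratic, linear))

print(f"MSE:   {mse:.3f}")   # dominated by the outlier
print(f"MAE:   {mae:.3f}")   # less affected by the outlier
print(f"Huber: {huber(y_true, y_pred):.3f}")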

5. How does regression loss help in model training?


Solution:
Regression loss quantifies how far the model's predictions are from the actual values, guiding the
optimization process to minimize this error.

Explanation:
By minimizing regression loss (e.g., using gradient descent), the model adjusts its parameters to reduce
the error between predicted and actual values, improving prediction accuracy over time.

PROGRAM 5
PML5: K-Nearest Neighbors (KNN), an introduction to the KNN algorithm for classification
and regression.


PROGRAM 6
PML6: KNN Visualization, providing visual insights into how the KNN algorithm works.

VIVA VOCE (5,6)
1. What is the K-Nearest Neighbors (KNN) algorithm?
Solution:
KNN is a supervised machine learning algorithm used for classification and regression. It makes predictions based
on the majority class (for classification) or the average (for regression) of the k-nearest data points to a given
input.
Explanation:
In KNN, the distance between a test point and all training points is calculated (using Euclidean distance or another
metric), and the k-nearest neighbors are selected. For classification, the most frequent class among these
neighbors is assigned to the test point. For regression, the average of the neighbors’ values is used.
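A minimal scikit-learn sketch of KNN classification is shown below; k = 5 and the Iris dataset are assumptions made for illustration.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# k = 5 nearest neighbours, Euclidean distance by default
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

y_pred = knn.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))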

2. How is the value of k (the number of neighbors) chosen in KNN?


Solution:
The value of k is typically chosen based on cross-validation or experimentation. A small k can make the model
sensitive to noise, while a large k may smooth out distinctions between classes or values.
Explanation:
Choosing the right k is crucial for model performance. A small k might lead to overfitting (high variance), while a
large k can result in underfitting (high bias). Cross-validation helps to find the optimal k.

3. What distance metric is commonly used in KNN, and why?

Solution:
Euclidean distance is the most commonly used distance metric in KNN because it is simple to compute and works well for continuous numeric features.
Explanation:
Euclidean distance is straightforward and works well for many problems. It measures the straight-line distance
between two points in a multi-dimensional space. Other distance metrics like Manhattan or Minkowski can also
be used depending on the problem.

4. What are the main advantages and disadvantages of the KNN algorithm?
Solution:
 Advantages:
o Simple to implement and understand.
o No training phase (instance-based learning).
o Effective for non-linear decision boundaries.

 Disadvantages:
o Slow prediction phase for large datasets since it requires computing distances to all points.
o Sensitive to the choice of k and irrelevant features.
o Struggles with high-dimensional data (curse of dimensionality).
Explanation:
KNN is intuitive and does not require a training phase, but its performance can degrade with large datasets and
high-dimensional spaces due to the need to compute distances for each test point.

5. How does KNN handle ties in classification (when two or more classes have the same number of nearest
neighbors)?
Solution:
KNN typically handles ties by:

 Selecting the class of the nearest neighbor (if there's a tie in the number of neighbors).
 Choosing a class based on distance-weighted voting, where closer neighbors have more influence.

Explanation:
In the event of a tie, KNN may resolve it by using additional heuristics, such as assigning the class of the closest
neighbor or giving more weight to nearer neighbors, thereby breaking the tie based on proximity.

PROGRAM 7
PML7: SVM Binary Classification, covering the basics of Support Vector Machine for binary
classification.

VIVA VOCE (7)

1. What is Support Vector Machine (SVM) for binary classification?

Solution:
SVM for binary classification is a supervised machine learning algorithm that finds the optimal hyperplane
that separates two classes in a high-dimensional feature space, maximizing the margin between them.
Explanation:
SVM works by finding the decision boundary (hyperplane) that best separates the data into two classes while
maximizing the margin (distance between the hyperplane and the nearest points of each class, called support
vectors). This results in a classifier that generalizes well to unseen data.

2. How does SVM handle non-linearly separable data in binary classification?

Solution:
SVM handles non-linearly separable data by using the kernel trick. The kernel function maps the input data
into a higher-dimensional space where a linear hyperplane can separate the classes.
Explanation:
For data that isn't linearly separable, SVM uses kernels (e.g., polynomial, radial basis function) to transform
the data into a higher-dimensional space where it becomes easier to find a linear separating hyperplane.
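A short sketch of a binary SVM with an RBF kernel is given below, using scikit-learn's make_moons to generate non-linearly separable data; the dataset and the C value are illustrative assumptions.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a straight line
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# The RBF kernel implicitly maps the data to a higher-dimensional space
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)

print("Test accuracy:", clf.score(X_test, y_test))
print("Support vectors per class:", clf.n_support_)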

3. What is the difference between binary classification and multiclass classification?

Solution:
 Binary Classification: Involves classifying data into two distinct classes (e.g., class 0 vs. class 1).
 Multiclass Classification: Involves classifying data into more than two classes (e.g., class 0, class 1, and
class 2).
Explanation:
Binary classification deals with two classes, and models are optimized to distinguish between them. In
multiclass classification, the model must distinguish between more than two classes, requiring different
strategies such as one-vs-rest (OvR) or one-vs-one (OvO).

4. How does SVM handle multiclass classification?

Solution:
SVM handles multiclass classification using strategies like One-vs-Rest (OvR) or One-vs-One (OvO),
where multiple binary classifiers are trained to handle different class combinations.
Explanation:
 One-vs-Rest (OvR): A binary classifier is trained for each class, distinguishing that class from all others.
 One-vs-One (OvO): A binary classifier is trained for every pair of classes, and the class with the most
votes from these classifiers is chosen.

5. What are the key differences in the approach of binary classification and multiclass classification with
SVM?

Solution:
 In binary classification, SVM aims to find a single hyperplane that separates two classes.
 In multiclass classification, SVM requires combining multiple binary classifiers using strategies like One-vs-
Rest (OvR) or One-vs-One (OvO).
Explanation:
For binary classification, a single optimal hyperplane is sufficient. In multiclass classification, since there are
more than two classes, additional classifiers or methods are needed to handle multiple classes. The SVM
framework needs adaptations to manage the increased complexity of multiclass problems.


PROGRAM 8
PML8: Naive Bayes Classifier, exploring the Naive Bayes algorithm for classification tasks
and showing Confusion Matrix parameters

VIVA VOCE (8)
1. What is the fundamental assumption made by the Naive Bayes classifier?
Solution:
The Naive Bayes classifier assumes that the features are conditionally independent given the class label.
Explanation:
This "naive" assumption simplifies the computation of probabilities, enabling efficient classification even with
high-dimensional data.

2. How does Naive Bayes classify data based on probabilities?


Solution:
Naive Bayes calculates the posterior probability of each class using Bayes' Theorem and selects the class with the
highest probability.
Explanation:
It combines the prior probability of the class with the likelihood of the features to determine the most probable
class for the given data.

3. What is the role of the Gaussian distribution in Naive Bayes?


Solution:
In Naive Bayes, the Gaussian distribution is used to model the likelihood of continuous features in the data.
Explanation:
By assuming that continuous features follow a normal distribution, Naive Bayes can compute the probability of
each feature given the class label using the mean and variance.

4. What is a Confusion Matrix used for in classification?


Solution:
A confusion matrix is used to summarize the performance of a classification algorithm by displaying the counts of
true positives, true negatives, false positives, and false negatives.
Explanation:
It helps in understanding how well the classifier is performing by showing where it makes errors.
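A hedged sketch combining Gaussian Naive Bayes with a confusion matrix in scikit-learn is shown below; the breast cancer dataset is an illustrative choice.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

model = GaussianNB()          # assumes each feature is Gaussian given the class
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)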

5. What does a True Positive (TP) in a Confusion Matrix indicate?


Solution:
A True Positive (TP) indicates the number of instances correctly classified as positive by the model.
Explanation:
It reflects the number of actual positive instances that were accurately predicted as positive by the classifier.

6. How do you calculate precision from a Confusion Matrix?


Solution:
Precision = TP / (TP + FP), where TP is the number of true positives and FP is the number of false positives.
Explanation:
It measures the proportion of predicted positives that are actually correct, indicating the accuracy of positive
predictions.

7. What is the difference between precision and recall in terms of a Confusion Matrix?
Solution:
Precision measures the accuracy of positive predictions, while recall measures the ability to correctly identify all
actual positive instances.
Explanation:
Precision focuses on the correctness of the positive predictions, and recall focuses on capturing all actual positive
cases, even at the expense of false positives.

8. How can Naive Bayes be applied to multiclass classification problems?


Solution:
Naive Bayes can be extended to multiclass classification by using the One-vs-Rest (OvR) or One-vs-One (OvO)
strategy, training separate classifiers for each class.

Explanation:
In multiclass problems, Naive Bayes classifiers are adapted to handle multiple classes by applying binary
classification techniques for each class.
PROGRAM 9
PML9: Principal Component Analysis (PCA) using Scikit-learn, focusing on dimensionality
reduction techniques.

VIVA VOCE (9)
1. What is Principal Component Analysis (PCA) and why is it used for dimensionality reduction?

Solution:
PCA is a linear technique that transforms high-dimensional data into a lower-dimensional form by finding the
principal components that capture the most variance in the data.
Explanation:
By reducing the number of dimensions, PCA helps simplify the dataset, improving computational efficiency and
mitigating the curse of dimensionality while preserving essential patterns in the data.

2. How does PCA reduce the dimensionality of data in Scikit-learn?

Solution:
PCA in Scikit-learn reduces dimensionality by projecting the data onto a set of orthogonal axes (principal
components) that maximize variance, using the PCA class.
Explanation:
The principal components are ordered by the amount of variance they capture, and by selecting a subset of these
components, PCA reduces the number of features while retaining most of the original data's variability.

3. How do you apply PCA in Scikit-learn for dimensionality reduction?

Solution:
To apply PCA in Scikit-learn, you use the PCA class, fit the model to the data using fit() and transform it with
transform() or fit_transform() for dimensionality reduction.
Explanation:
The fit() method computes the principal components, and transform() reduces the data to the selected number of
components, which you can specify by setting the n_components parameter.
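A minimal sketch of this workflow is shown below; reducing the Iris data to 2 components is an illustrative choice, and standardizing the features first is common practice but an assumption here.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize features so no single feature dominates the variance
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print("Reduced shape:", X_reduced.shape)                       # (150, 2)
print("Explained variance ratio:", pca.explained_variance_ratio_)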

4. What parameter in Scikit-learn's PCA class controls the number of principal components?

Solution:
The n_components parameter in the PCA class controls the number of principal components to keep after
dimensionality reduction.
Explanation:
Setting n_components to a number less than the original number of features reduces the data's dimensions, while
setting it to a float between 0 and 1 keeps enough components to preserve that percentage of the variance.

5. How can PCA be useful for visualizing high-dimensional data?

Solution:
PCA can project high-dimensional data into 2 or 3 dimensions, allowing easier visualization of complex datasets
while retaining most of the original variance.
Explanation:
By reducing the dimensionality to 2 or 3 principal components, PCA helps visualize the data in scatter plots,
revealing patterns, clusters, or relationships that are not visible in higher-dimensional spaces.


PROGRAM 10
PML10: Decision Trees (DT), explaining how decision tree algorithms function for
classification and regression.

VIVA VOCE (10)

1. What is a Decision Tree and how does it work for classification?


Solution:
A Decision Tree is a tree-like model that splits data into subsets based on feature values, making decisions at each
node, and is used for classification by assigning the most common class in each leaf.

Explanation:
In classification, the tree is built by recursively splitting the dataset using the feature that best separates the
classes, typically based on metrics like Gini impurity or Information Gain.

2. How does a Decision Tree handle continuous data for classification?


Solution:
For continuous data, a Decision Tree splits the data at a threshold value that optimizes class separation, creating
branches based on whether the feature values are above or below the threshold.

Explanation:
Continuous features are divided into intervals, and the tree determines the best split by minimizing a criterion such
as Gini impurity or entropy, which helps distinguish the classes at each node.

3. What is the role of splitting criteria in a Decision Tree?


Solution:
Splitting criteria, like Gini impurity or Information Gain, are used to evaluate the best feature and threshold to
split the data at each node to improve classification or regression performance.

Explanation:
The criteria measure how well a split divides the data, with the goal of maximizing the homogeneity of the target
variable within each branch after the split.

4. How does a Decision Tree algorithm work for regression?


Solution:
For regression, a Decision Tree splits the data into subsets based on feature values and assigns the mean or median
value of the target variable to each leaf.

Explanation:
Unlike classification, where the most frequent class is assigned to a leaf, in regression, the Decision Tree predicts
a continuous value by averaging the target variable within each subset (leaf).

5. What is overfitting in Decision Trees, and how can it be avoided?


Solution:
Overfitting in Decision Trees occurs when the tree is too complex, capturing noise in the data, and can be avoided
by setting limits on tree depth, requiring a minimum number of samples per leaf, or pruning the tree.

Explanation:
Overfitting happens when a tree perfectly fits the training data, including noise, which reduces its ability to
generalize to new data. Regularization techniques like pruning or limiting depth help prevent this.
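The sketch below limits tree depth and leaf size to curb overfitting; max_depth=3 and min_samples_leaf=5 are illustrative values, not recommendations, and the Iris dataset is only an example.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Unconstrained tree: can grow until it memorizes the training data
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# Regularized tree: depth and leaf-size limits reduce overfitting
pruned_tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5,
                                     random_state=0).fit(X_train, y_train)

for name, tree in [("full", full_tree), ("pruned", pruned_tree)]:
    print(f"{name}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")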

PROGRAM 11
PML11: SVM Binary Classification (variant), offering an alternative view of SVM classification with ROC and AUC.
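A hedged sketch of evaluating a binary SVM with an ROC curve and AUC in scikit-learn is given below; the breast cancer dataset, the RBF kernel, and the matplotlib plot are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# decision_function gives signed distances to the hyperplane, usable as scores
clf = SVC(kernel="rbf").fit(X_train, y_train)
scores = clf.decision_function(X_test)

fpr, tpr, _ = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)

plt.plot(fpr, tpr, label=f"SVM (AUC = {auc:.3f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()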

PEO1 – ANALYTICAL SKILLS

To facilitate the graduates with the ability to visualize, gather information, articulate, analyze, solve complex problems, and make decisions. These are essential to address the challenges of complex and computation-intensive problems, increasing their productivity.

PEO2 – TECHNICAL SKILLS

To facilitate the graduates with the technical skills that prepare them for immediate employment and pursue certification, providing a deeper understanding of the technology in advanced areas of computer science and related fields, thus encouraging them to pursue higher education and research based on their interest.

PEO3 – SOFT SKILLS

To facilitate the graduates with the soft skills that include fulfilling the mission, setting goals, showing self-confidence by communicating effectively, having a positive attitude, getting involved in team-work, being a leader, and managing their career and their life.

PEO4 – PROFESSIONAL ETHICS

To facilitate the graduates with the knowledge of professional and ethical responsibilities by paying attention to grooming, being conservative with style, following dress codes and safety codes, and adapting themselves to technological advancements.