Question Bank - Student

UNIT I - MACHINE LEARNING BASICS

Introduction to Machine Learning (ML) - Essential concepts of ML – Types of learning – Machine learning methods based on Time – Dimensionality – Linearity and Nonlinearity – Early trends in Machine learning – Data Understanding, Representation and Visualization.

PART A (2 Marks)

1. Define Machine Learning. (Dec 2022 / April 2024)


Machine Learning is a part of artificial intelligence where computers learn from data
without being specifically programmed. It involves creating models that can find patterns, make
predictions, or improve decisions based on data.
2. Distinguish Supervised learning from Unsupervised learning. (April/May 2023)

Feature | Supervised Learning | Unsupervised Learning
Data Type | Labeled data (input-output pairs) | Unlabeled data (no specific output)
Objective | Predict or classify output based on input | Find hidden patterns or structures in data
Training Process | Learns from labeled examples | Learns from the data without explicit labels
Applications | Classification, Regression | Clustering, Association, Dimensionality reduction
Examples | Spam detection, Predicting house prices | Customer segmentation, Market basket analysis
Output | Predictive models (specific outcomes) | Descriptive models (patterns or groupings)

3. How does Machine Learning differ from Artificial Intelligence?


Machine Learning (ML): A subset of Artificial Intelligence that focuses on developing algorithms
that enable computers to learn from data and improve from experience without being explicitly
programmed.
Artificial Intelligence (AI): The broader field encompassing all techniques that enable computers
to mimic human intelligence, including reasoning, problem-solving, learning, and natural language
processing.
4. Define Curse of Dimensionality. (April/May 2023)
The curse of dimensionality refers to the difficulties that arise when analyzing data in high-
dimensional spaces. As dimensions increase, data becomes sparse, making it harder to find patterns,
increasing computational cost, and often leading to overfitting in models.
5. Define Reinforcement Learning with an example.(April/May 2023)
Reinforcement Learning (RL) is a type of Machine Learning where an agent learns to make
decisions by performing actions in an environment to maximize cumulative reward. It involves
learning from the consequences of actions through trial and error.
Example: A robot learns to navigate a maze by receiving rewards for reaching the end and
penalties for hitting walls. Over time, the robot optimizes its path to maximize the reward.
6. Define rational agent. (Nov/Dec 2023)
A rational agent in Machine Learning is an entity that acts to achieve the best possible outcome or,
when uncertainty is present, the best expected outcome based on its knowledge and capabilities. It
perceives its environment, makes decisions, and takes actions that maximize its performance
measure.
7. List the steps involved in a simple problem-solving agent. (Nov/Dec 2023)

Steps involved in a Simple Problem-Solving Agent:

1. Problem Definition: Define the problem by specifying the initial state, the goal state, and the set of possible actions.
2. Search: Explore possible actions and states to find a sequence of actions that leads to the goal state.
3. Execution: Carry out the actions in the solution sequence, monitoring progress toward the goal.

8. Identify some early trends observed in the field of machine learning.(April/May 2024)
 Rule-Based Systems: Initial approaches were based on explicitly programmed rules, using if-
then logic to make decisions.
 Statistical Methods: Early trends also focused on statistical models and pattern recognition,
including linear regression and clustering algorithms.
9. Compare linear and nonlinear machine learning algorithms. (April/May 2024)
Linear Algorithms: Assume a linear relationship between input features and output. Examples
include Linear Regression and Linear SVM. These are simpler, easier to interpret, and less prone to
overfitting.
Nonlinear Algorithms: Can model complex, nonlinear relationships. Examples include Decision
Trees and Neural Networks. These are more flexible but often computationally intensive and more
prone to overfitting.
10. What are the different types of techniques available to reduce the dimensionality?
(April/May 2023)
Principal Component Analysis (PCA)
Linear Discriminant Analysis (LDA)
t-Distributed Stochastic Neighbor Embedding (t-SNE)
Autoencoders
Feature Selection
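
As an illustration of one of these techniques, here is a minimal PCA sketch; it assumes scikit-learn is installed and uses the Iris data purely as a stand-in dataset:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                  # 150 samples, 4 features
pca = PCA(n_components=2)             # keep the 2 strongest directions of variance
X_reduced = pca.fit_transform(X)      # shape: (150, 2)

print(X_reduced.shape)
print(pca.explained_variance_ratio_)  # fraction of variance each component captures
```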
11. Define Data preparation and its process.
After collecting the data, we need to prepare it for the next steps. Data preparation is the step where we organize the data and make it suitable for use in machine learning training.

This step can be further divided into two processes:

Data exploration:

It is used to understand the nature of data that we have to work with. We need to understand the
characteristics, format, and quality of data. A better understanding of data leads to an effective
outcome. In this, we find Correlations, general trends, and outliers.

Data pre-processing:

The raw data is then cleaned and converted into a usable form (for example, handling missing values and noise) so that it is ready for analysis.

12. List out the key components of a reinforcement learning problem?


Key Components of a Reinforcement Learning Problem:
1. Agent: The learner or decision-maker that interacts with the environment.
2. Environment: The external system with which the agent interacts and which provides
feedback.
3. State: A representation of the current situation of the environment.
4. Action: The set of all possible moves the agent can make.
5. Reward: The immediate feedback received after performing an action.
6. Policy: A strategy used by the agent to determine its actions based on the current state.
7. Value Function: A prediction of the expected long-term return of states or state-action
pairs.
13. What is recall in Machine Learning?
Recall in Machine Learning: Recall, also known as sensitivity, is a metric used to assess the
effectiveness of a classification model. It is defined as the ratio of correctly predicted positive
instances (true positives) to the total number of actual positive instances (true positives + false
negatives). A higher recall indicates better model performance in identifying positive instances.
Recall = True Positives / (True Positives + False Negatives)
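
A small sketch of this computation, assuming scikit-learn is available (the labels below are made up):

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0, 1]  # actual labels (1 = positive class)
y_pred = [1, 0, 1, 0, 1, 1]  # model predictions

# Manual computation: TP / (TP + FN)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
print("Manual recall :", tp / (tp + fn))                # 3 / (3 + 1) = 0.75

print("sklearn recall:", recall_score(y_true, y_pred))  # same result
```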
14.List out various techniques for data representation in machine learning.
Techniques for Data Representation in Machine Learning:

1. Tabular Representation: Organizing data in rows and columns, suitable for structured data
such as spreadsheets.
2. Vector Representation: Representing data as vectors, commonly used for text data (e.g., TF-
IDF, word embeddings).

15. Why is Dimensionality Reduction important in machine learning?

Dimensionality reduction is crucial as it simplifies models, reduces computational cost, and


mitigates the risk of overfitting by eliminating redundant features. It also enhances model
performance and interpretability by focusing on the most relevant aspects of the data.

16. Explain Machine learning Life cycle

Machine learning life cycle involves seven major steps, which are given below:

o Gathering Data

o Data preparation

o Data Wrangling

o Analyse Data

o Train the model

o Test the model

o Deployment

17. List out the Applications of Machine learning.

i. Image Recognition
ii. Speech Recognition
iii. Traffic Prediction
iv. Product Recommendations
v. Self-Driving Cars
vi. Email Spam and Malware Filtering
vii. Virtual Personal Assistant
viii. Online Fraud Detection
ix. Stock Market Trading
x. Medical Diagnosis
xi. Automatic Language Translation

18. Differences between Artificial Intelligence (AI) and Machine learning (ML):

Artificial Intelligence | Machine Learning
Artificial intelligence is a technology which enables a machine to simulate human behavior. | Machine learning is a subset of AI which allows a machine to automatically learn from past data without being explicitly programmed.
The goal of AI is to make a smart computer system, like humans, to solve complex problems. | The goal of ML is to allow machines to learn from data so that they can give accurate output.
In AI, we make intelligent systems to perform any task like a human. | In ML, we teach machines with data to perform a particular task and give an accurate result.

19.Define classification of Machine Learning.

At a broad level, machine learning can be classified into three types:

1. Supervised learning

2. Unsupervised learning

3. Reinforcement learning

20.What is Time Series Analysis?

"Time series analysis is a statistical technique dealing in time series data, or trend analysis."

A time-series contains sequential data points mapped at a certain successive time duration, it
incorporates the methods that attempt to surmise a time series in terms of understanding either the
underlying concept of the data points in the time series or suggesting or making predictions.
PART B
1. Define machine learning. Discuss in detail about the types of learning. (Nov 2022/Nov 2023)
2. Dissect the challenges and techniques associated with handling high-dimensional data in machine learning. (April/May 2024)
3. Explain the following uninformed search strategies with examples. (Nov/Dec 2023)
4. i. Define Machine Learning. What are the different types of Machine Learning?
   ii. Explain Linearity and Nonlinearity Techniques. (April 2023/Dec 2022)
5. Examine various techniques for data representation in machine learning. (April/May 2024)
6. Explain the fundamental concepts of supervised learning and unsupervised learning. Illustrate the workflow of the Machine Learning process in detail. (April/May 2024)
7. Explain the process of turning data into probabilities in machine learning. (April/May 2024)
8. Explain the Machine learning life cycle techniques.
9. Create an integration of Machine Learning models into real-world applications. Provide examples from various domains such as healthcare, finance, and transportation.
10. Discuss in detail about the different types of data representation and visualizations. (Nov/Dec 2023)
UNIT – II

SUPERVISED LEARNING
Learning a Class from Examples, Linear, Non-linear, Multi-class and Multi-label
classification, Decision Trees: ID3, Classification and Regression Trees,
Regression: Linear Regression, Multiple Linear Regression, Logistic Regression,
Bayesian Network, Bayesian Classifier

1. What is CART? (Nov/Dec 2023)

CART (Classification and Regression Trees) is a predictive algorithm used in machine learning that explains how a target variable's values can be predicted from other variables. It is a decision tree in which each fork is a split on a predictor variable and each terminal node holds a prediction for the target variable.
The CART algorithm works via the following process:

●The best split point of each input is obtained.


●Based on the best split points of each input in Step 1, the new “best” split point is
identified.
●Split the chosen input according to the “best” split point.
●Continue splitting until a stopping rule is satisfied or no further desirable splitting is
available.

2. Compare classification and regression models. (Nov/Dec 2023)

Classification | Regression
The target variables are discrete. | The target variables are continuous.
Problems like spam email classification and disease prediction are solved using classification algorithms. | Problems like house price prediction and rainfall prediction are solved using regression algorithms.
We try to find the best possible decision boundary that separates the classes with the maximum possible separation. | We try to find the best-fit line that represents the overall trend in the data.

3. Write down the Applications of Naïve Bayes Classifiers.


o It is used for Credit Scoring.
o It is used in medical data classification.
o It can be used in real-time predictions because Naïve Bayes Classifier is an eager
learner.
o It is used in Text classification such as Spam filtering and Sentiment analysis.

4. Why is it called Naïve Bayes?

The Naïve Bayes algorithm combines two words, Naïve and Bayes, which can be described as:

o Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple; each feature individually contributes to identifying it as an apple without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.

5. What is the primary assumption of linear regression?

● Linear relationship between the features and target

● Small or no multicollinearity between the features.

● Homoscedasticity Assumption

● Normal distribution of error terms

● No autocorrelations

6. What is Regression and its types? (Nov/Dec 2022)

Regression is a statistical method used to model and analyze the relationship between a
dependent variable (also called the response or outcome) and one or more independent variables
(also called predictors or features). The goal of regression analysis is to understand how the
dependent variable changes when any of the independent variables are varied, and to predict the
dependent variable based on new data.
Types of Regression:
● Linear Regression
● Logistic Regression
● Ridge Regression
● Lasso Regression

7. What do you mean by Information Gain? (April/May 2023)


o Information gain is the measurement of changes in entropy after the segmentation of a
dataset based on an attribute.
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the decision tree.
o A decision tree algorithm always tries to maximize the value of information gain; the node/attribute having the highest information gain is split first. It can be calculated using the formula:

Gain(S, A) = Entropy(S) − Σ (|Sv| / |S|) × Entropy(Sv)

where the sum runs over each value v of attribute A and Sv is the subset of S for which A takes value v.
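
A self-contained sketch of this calculation in plain Python (the tiny Outlook/Play sample is hypothetical):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Reduction in entropy from splitting `labels` on `attribute_values`."""
    total = len(labels)
    weighted = 0.0
    for v in set(attribute_values):
        subset = [l for a, l in zip(attribute_values, labels) if a == v]
        weighted += (len(subset) / total) * entropy(subset)
    return entropy(labels) - weighted

# Toy example: how much does 'Outlook' tell us about 'Play'?
outlook = ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy"]
play    = ["No",    "No",    "Yes",      "Yes",   "No"]
print(round(information_gain(outlook, play), 3))  # about 0.571
```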

8. What are the limitations of the CART model? (April/May 2023)

● Overfitting.

● High variance.

● Low bias.

● The tree structure may be unstable; small changes in the data can produce a very different tree.

9. Define supervised learning in machine learning. (April/May 2024)

Supervised learning is a type of machine learning where the model is trained on a labeled
dataset. This means the algorithm learns from input-output pairs, where the input features are
associated with the correct output (label). The goal of supervised learning is for the model to
learn a mapping from inputs to outputs so that it can make predictions on new, unseen data.

10. Why do we use regression analysis? (April/May 2024)

Regression analysis helps in the prediction of a continuous variable. There are various scenarios
in the real world where we need some future predictions such as weather condition, sales
prediction, marketing trends, etc., for such a case we need some technology which can make
predictions more accurately. So for such a case we need Regression analysis which is a
statistical method and used in machine learning and data science. Below are some other reasons
for using Regression analysis:

1. Regression estimates the relationship between the target and the independent variable.
2. It is used to find the trends in data.
3. It helps to predict real/continuous values.
4. By performing the regression, we can confidently determine the most important factor, the
least important factor, and how each factor is affecting the other factors.

11. Mention the types of learners in classification problems.


● Lazy Learners: Lazy Learner firstly stores the training dataset and wait until it receives
the test dataset. In Lazy learner case, classification is done on the basis of the most
related data stored in the training dataset. It takes less time in training but more time for
predictions.
Example: K-NN algorithm, Case-based reasoning
● Eager Learners:Eager Learners develop a classification model based on a training
dataset before receiving a test dataset. Opposite to Lazy learners, Eager Learner takes
more time in learning, and less time in prediction. Example: Decision Trees, Naïve
Bayes, ANN.
12. Define Bayes theorem.
o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine
the probability of a hypothesis with prior knowledge. It depends on the conditional
probability.
o The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) × P(A) / P(B)

Where,

P(A|B) is Posterior probability: Probability of hypothesis A on the observed event B.

P(B|A) is Likelihood probability: Probability of the evidence given that the probability of a
hypothesis is true.

P(A) is Prior Probability: Probability of hypothesis before observing the evidence.

P(B) is Marginal Probability: Probability of Evidence.
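
A small worked example in Python; the probabilities are assumed for illustration, with A = "email is spam" and B = "email contains the word free":

```python
# Assumed numbers: P(spam) = 0.2, P(free | spam) = 0.6, P(free | not spam) = 0.05
p_spam = 0.2
p_free_given_spam = 0.6
p_free_given_ham = 0.05

# Marginal probability of the evidence: P(free)
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
posterior = p_free_given_spam * p_spam / p_free
print(round(posterior, 3))  # 0.75, i.e. P(spam | free)
```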

13. What are the advantages and disadvantages of Naïve Bayes?

Advantages of Naïve Bayes Classifier:

o Naïve Bayes is one of the fast and easy ML algorithms to predict a class of datasets.
o It can be used for Binary as well as Multi-class Classifications.
o It performs well in Multi-class predictions as compared to the other Algorithms.
o It is the most popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:

o Naive Bayes assumes that all features are independent or unrelated, so it cannot learn the
relationship between features.

14. Why use Decision Trees?

There are various algorithms in Machine learning, so choosing the best algorithm for the given
dataset and problem is the main point to remember while creating a machine learning model.
Below are the two reasons for using the Decision tree:

o Decision Trees usually mimic human thinking ability while making a decision, so it is
easy to understand.
o The logic behind the decision tree can be easily understood because it shows a tree-like
structure.

15. How does the Decision Tree algorithm Work?

The complete process can be better understood using the below algorithm:
o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values of the best attribute.
o Step-4: Generate the decision tree node that contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are called leaf nodes.
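
A short sketch of these steps in practice, using scikit-learn's DecisionTreeClassifier; the entropy criterion, depth limit, and Iris data are illustrative assumptions, not prescribed by the steps above:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()

# criterion="entropy" makes the ASM information gain, as in ID3-style trees
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Print the learned splits; the leaves are the final class predictions
print(export_text(tree, feature_names=list(data.feature_names)))
```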

16. Mention the steps involved in the CART algorithm.

The CART algorithm works via the following process:

● The best split point of each input is obtained.

● Based on the best split points of each input in Step 1, the new “best” split point is
identified.

● Split the chosen input according to the “best” split point.

● Continue splitting until a stopping rule is satisfied or no further desirable splitting is


available.
17. Define Polynomial Regression:

o Polynomial Regression is a type of regression which models the non-linear dataset using
a linear model.
o It is similar to multiple linear regression, but it fits a non-linear curve between the value
of x and corresponding conditional values of y.
o Suppose there is a dataset which consists of datapoints which are present in a non-linear
fashion, so for such case, linear regression will not best fit to those datapoints. To cover
such datapoints, we need Polynomial regression.
o In Polynomial Regression, the original features are transformed into polynomial features of a given degree and then modeled using a linear model, which means the data points are best fitted by a polynomial curve.
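
A minimal sketch of the transform-then-fit idea, assuming scikit-learn and NumPy (the noisy quadratic data is made up):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Non-linear data: y = x^2 plus a little noise
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = X.ravel() ** 2 + rng.normal(0, 0.5, 50)

# Transform features to degree-2 polynomials, then fit an ordinary linear model
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

print(model.predict([[2.0]]))  # should be close to 4
```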

18. Mention the main objectives of Simple Linear Regression.

Simple Linear regression algorithm has mainly two objectives:

o Model the relationship between the two variables. Such as the relationship between
Income and expenditure, experience and Salary, etc.
o Forecasting new observations. Such as Weather forecasting according to temperature,
Revenue of a company according to the investments in a year, etc.

19. What do you mean by Multiple Linear Regression?


In the previous topic, we have learned about Simple Linear Regression, where a single
Independent/Predictor(X) variable is used to model the response variable (Y). But there may
be various cases in which the response variable is affected by more than one predictor
variable; for such cases, the Multiple Linear Regression algorithm is used.
Moreover, Multiple Linear Regression is an extension of Simple Linear regression as it takes
more than one predictor variable to predict the response variable.

20. What is the need of Polynomial Regression?

The need of Polynomial Regression in ML can be understood in the below points:

o If we apply a linear model to a linear dataset, it provides a good result, as we have seen in Simple Linear Regression. But if we apply the same model, without any modification, to a non-linear dataset, the fit is poor: the loss function increases, the error rate is high, and accuracy decreases.
o So for such cases, where data points are arranged in a non-linear fashion, we need the Polynomial Regression model. Comparing a linear dataset with a non-linear one makes this clear: a straight line fitted to non-linearly arranged data hardly covers any data points, whereas a polynomial curve covers most of them.

Hence, if the dataset is arranged in a non-linear fashion, we should use the Polynomial Regression model instead of Simple Linear Regression.

PART – B
1 What are the primary problems with decision trees, especially with regard to overfitting?

2 Explain in detail about Implementation of the Naïve Bayes algorithm with a suitable
example python program.

3 Explain the concepts of Classification and Regression Trees. Identify how they are used in
practice for both classification and regression tasks. (April/May 2024)
4 Develop Logistic Regression in detail with an example program.(April/May 2023)

5 Construct the Decision Tree with an example in detail.(Nov/Dec 2022)

6 Discuss in detail about all the types of Regression in detail.

7 Distinguish simple linear regression and multiple linear regression. How do you handle
multiple predictors in regression analysis? (April/May 2024)

8 Distinguish linear regression and logistic regression with examples.(Nov/Dec 2023)

9 Analyze the Naïve Bayes Classifiers techniques with suitable examples.

Outlook Play
0 Rainy Yes
1 Sunny Yes
2 Overcast Yes
3 Overcast Yes
4 Sunny No
5 Rainy Yes
6 Sunny Yes
7 Overcast Yes
8 Rainy No
9 Sunny No
10 Sunny Yes
11 Rainy No
12 Overcast Yes
13 Overcast Yes
10 Build ID3 and derive the procedure to construct a decision tree using ID3. (Nov/Dec 2023)
UNIT III
ADVANCED SUPERVISED AND ENSEMBLE LEARNING

Neural Networks: Introduction, Perceptron, Multilayer Perceptron, Support vector


machines: Linear and Non-Linear, Kernel Functions, K-Nearest Neighbors, Ensemble
Learning Model Combination Schemes, Voting, Error-Correcting Output Codes,
Bagging: Random Forest Trees, Boosting: Adaboost, Stacking.

1. What is a neural network?


A neural network is a machine learning model inspired by the human brain, consisting
of layers of interconnected nodes (neurons) that process data to recognize patterns.
Neural networks are used for tasks like classification, regression, and clustering. It is a
machine learning technique that uses a layered structure of interconnected nodes to teach
computers to process data in a manner similar to the human brain. It's a type of deep
learning, which is a subset of artificial intelligence.

2. What are the advantages of neural networks?


Ability to model complex non-linear relationships: Neural networks can capture
complex patterns and interactions in data.
Flexibility: They can be applied to various tasks, including classification, regression,
and pattern recognition.
Adaptability: Neural networks can adjust to changing data over time through re-
training.
Robustness: They perform well in high-dimensional spaces and are less sensitive to
noisy data than other models.

3. Define multilayer perceptron. (April/May 2024)


A multilayer perceptron (MLP) is a neural network made up of multiple layers: an input layer, one or more hidden layers, and an output layer.
● Layers: MLPs have at least three layers of nodes, each containing a set of neurons.
● Connections: Each neuron in one layer connects to every neuron in the next layer with a specific weight.
● Activation function: MLPs use a nonlinear activation function to produce outputs.
● Training: MLPs are trained using algorithms like backpropagation and optimization techniques like gradient descent.
● Applications: MLPs are used in many applications, including image recognition, natural language processing, and speech recognition.

4. What is ensemble learning?

Ensemble Learning refers to a machine learning technique where multiple models are
combined to create a stronger predictive model. The goal is to improve the performance
and accuracy of the model by aggregating the predictions of several models. Popular
ensemble methods include:

Bagging: Trains multiple models independently on different subsets of data.


Boosting: Combines models sequentially, each one correcting the errors of the previous
one.
Stacking: Combines multiple models and uses another model to learn from the
predictions of the base models.

5. What are the limitations in perceptron?

Linearity: It can only solve linearly separable problems, meaning it struggles with data
that cannot be separated by a straight line or hyperplane.
Single-layer: The perceptron architecture is limited to a single layer of neurons, which
restricts its ability to model complex relationships.
Convergence issues: If the data is not linearly separable, the perceptron may not
converge to a solution.

6. What is the role of the activation function in a perceptron?

The activation function in a perceptron determines whether a neuron should be activated


or not by applying a transformation to the input. It introduces non-linearity into the
model, enabling the network to learn complex patterns. For a simple perceptron, the
activation function is typically a step function that outputs binary values (0 or 1), based
on whether the weighted sum of inputs exceeds a threshold.

7. List the Kernel Functions. (Nov/Dec 2023)

Kernel functions map non-linearly separable data into a higher-dimensional space to


make it linearly separable. Common kernel functions include:

● Linear Kernel

● Polynomial Kernel

● Radial Basis Function (RBF)

● Sigmoid Kernel
8. Compare the differences between Linear and Non-Linear SVM.

Aspect | Linear SVM | Non-Linear SVM
Separability | Works for linearly separable data. | Handles non-linear data.
Decision Boundary | A straight hyperplane. | A curved hyperplane in the original space.
Kernel Function | No kernel function needed. | Requires a kernel function (e.g., RBF).
Computational Cost | Faster and computationally cheaper. | More computationally intensive.

9. Explain Linear SVM.

A Linear SVM is a type of Support Vector Machine used when the data is linearly
separable. It works by finding a hyperplane that maximizes the margin between two
classes, ensuring that the distance from the hyperplane to the nearest data points (support
vectors) is as large as possible. The optimal hyperplane is determined by solving an
optimization problem that minimizes classification error while maximizing the margin.
Linear SVM is efficient and works well with linearly separable data.

10. What is AdaBoost?

AdaBoost (Adaptive Boosting) is a machine learning technique that combines multiple


weak classifiers to create a strong classifier with improved accuracy. It is an ensemble
learning technique that uses an iterative process to focus on misclassified data points and
adjust the weights of training samples

11. Define K-Nearest Neighbors algorithm.

K-Nearest Neighbors (KNN) is a simple, instance-based learning algorithm used for


classification and regression. It classifies a new data point based on the majority label of
its k nearest neighbors in the feature space.

Steps:
● Choose the number k.

● Calculate the distance between the query point and all training points

(e.g., Euclidean distance).

● Identify the k closest points.

● Assign the label by majority voting for classification or average for

regression.
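
A minimal sketch of these steps with scikit-learn; k = 3 and the Iris data are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # k = 3; Euclidean distance by default
knn.fit(X_train, y_train)                  # 'training' just stores the data (lazy learner)
print(knn.score(X_test, y_test))           # accuracy via majority vote of 3 neighbors
```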

12. What are the different types of voting?

Voting is an ensemble method to aggregate predictions from multiple models.

● Hard Voting: Uses the majority vote to decide the final class.

● Soft Voting: Averages the predicted probabilities and selects the class with the
highest average probability.
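
A short sketch contrasting the two schemes, assuming scikit-learn; the three base models are arbitrary choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
estimators = [
    ("lr", LogisticRegression(max_iter=5000)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("nb", GaussianNB()),
]

hard = VotingClassifier(estimators, voting="hard")  # majority class vote
soft = VotingClassifier(estimators, voting="soft")  # average predicted probabilities
print(hard.fit(X, y).score(X, y))
print(soft.fit(X, y).score(X, y))
```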

13. Why do we use Error-Correcting Output Codes?

Error-Correcting Output Codes (ECOC) are used in ensemble methods to decompose


multi-class problems into several binary classification problems.

Advantages:

o Makes multi-class classification more robust to errors.


o Improves generalization and handles imbalanced datasets effectively.

14. What is bagging? (April/May 2024)

Bagging (Bootstrap Aggregating) is an ensemble technique that combines multiple


models trained on different random subsets of the dataset. Each model makes
predictions, and the final prediction is an average (for regression) or a majority vote (for
classification).

Example: Random Forest.

15. How does the Random forest algorithm avoid overfitting?


Bagging: Each tree in the forest is trained on a random subset of the data, reducing the
risk of overfitting to the entire training set.
Random feature selection: At each decision node, a random subset of features is
considered, which prevents individual trees from becoming too complex.
Averaging: By combining predictions from multiple trees, the variance is reduced,
leading to more generalized predictions.

16. List the advantages of bagging over boosting.(April/May 2023)

Less Sensitive to Noise: Bagging reduces variance without overemphasizing noisy data
points, unlike boosting, which can overfit on noise.

Parallel Computation: Models in bagging are independent and can be trained


simultaneously.

Stability: Bagging works well with high-variance models (e.g., decision trees), making
predictions more stable.

Reduced Overfitting: Averaging predictions reduces the risk of overfitting compared to


boosting.

17. Name common activation functions used in MLPs.

Sigmoid, ReLU (Rectified Linear Unit), Tanh, and Softmax.

18. What is the purpose of Stacking in Ensemble Learning?

Stacking is an ensemble technique that combines predictions from multiple base


models using a meta-model.

Purpose: To improve predictive performance by leveraging the strengths of diverse


models.

Steps:

1. Train base models on training data.


2. Use their predictions as input features for a meta-model.
3. Train the meta-model to produce the final prediction.
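
A minimal sketch of these three steps with scikit-learn; the base models and the Logistic Regression meta-model are illustrative choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
]

# The meta-model learns from the base models' predictions (steps 2 and 3)
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(max_iter=5000))
stack.fit(X, y)
print(stack.score(X, y))
```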

19. What is the voting scheme in Ensemble learning?

The Voting scheme is an ensemble method where multiple models vote on the final
output. For classification, each model predicts a class, and the class with the majority
of votes is selected as the final prediction (majority voting). For regression, the
average of the outputs from all models is taken as the final prediction.

20. How does Adaboost work to improve the performance of weak learners?

AdaBoost works by sequentially training weak learners, where each new learner
focuses on the errors made by the previous ones. It assigns higher weights to
misclassified data points, forcing the model to correct its mistakes. The predictions of all
learners are then combined, with each learner's influence determined by its accuracy.
AdaBoost improves the overall model by emphasizing difficult cases and refining the
weak learners’ predictions.
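
A minimal sketch, assuming scikit-learn, whose AdaBoostClassifier uses depth-1 decision stumps as its default weak learners:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier

X, y = load_breast_cancer(return_X_y=True)

# 50 weak learners trained sequentially; each round reweights misclassified samples
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X, y)

print(ada.score(X, y))  # combined, accuracy-weighted vote of all weak learners
```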
PART – B

1. Describe support vector machine with an example. (April/May 2024)

2. Compare and contrast different types of Ensemble Methods .

3. Analyze the working principle of a Multilayer Perceptron (MLP) and explain


how backpropagation is used for training.

4. Build the Machine Learning model to implement the Loan Status Prediction
using Support Vector Machine (SVM) Algorithm (use dataset name as
“Customer_details. csv”). (April/May 2023)

5. Classify the Ensemble Learning Model Combination Schemes.

6. You are building a voting ensemble with three classifiers: Logistic Regression, SVM, and Decision Tree. Discuss how model diversity impacts the performance of voting-based ensembles. Justify your answer with examples.

7. Explain random forest algorithms in detail.(Nov/Dec 2023, April/May 2024)

8. Explain the concept of Stacking in ensemble learning. How does it differ from
other ensemble methods like bagging and boosting? Explain the process of
building a stacked model and discuss the advantages and challenges of stacking.

9. Compare and contrast AdaBoost and XGBoost. How do they differ in terms of
boosting mechanisms and performance?
10. Given a dataset with features such as age, income, and purchase history, apply
the Random Forest algorithm to predict whether a customer will buy a product
(binary classification: yes, or no). Outline the steps involved in applying the
Random Forest model and explain how you would evaluate its performance.

UNIT - IV
UNSUPERVISED LEARNING

Introduction to clustering, Hierarchical: AGNES, DIANA, Partitional: K-means clustering, K-


Mode Clustering, Self-Organizing Map, Expectation Maximization, Gaussian Mixture Models,
Principal Component Analysis, Locally Linear Embedding, Factor Analysis, Fuzzy Modeling,
Genetic Modeling.

PART A

1. What is Clustering?

Clustering is an unsupervised machine learning technique that groups similar data points into
clusters. Clustering scans unlabeled data and groups data points with similar features together.

Clustering can be used in many real-world applications, such as patient studies, marketing,
biomedical, and geospatial databases.
2. List out the applications of clustering algorithms. (Nov/Dec 2023)

1. Market segmentation.
2. Customer behaviour analysis.
3. Document categorization.
4. Image segmentation.
5. Anomaly detection in cybersecurity.
6. Genomics and bioinformatics.
7. Social network analysis.
8. Recommender systems.

3. What is K-mode clustering?

K-mode Clustering is a variant of K-means clustering that is used for categorical data. In K-
means, centroids are defined as the mean of numerical values, but K-mode clustering uses
modes (most frequent values) to define the centroid for each cluster. The algorithm works
similarly to K-means:

1. Assign a mode (frequent category) for each feature.

2. Assign each data point to the nearest mode.

3. Update the mode based on the most frequent categories in each cluster. K-mode is used
for clustering categorical attributes and is widely used in market segmentation and customer
data analysis.

4. How does the K-means algorithm determine the optimal number of clusters?

Elbow Method: Plot the sum of squared errors (SSE) for different values of k. The point where
the SSE starts to level off (the "elbow") suggests the optimal k.
Silhouette Score: Measures how similar an object is to its own cluster compared to other
clusters. The optimal k maximizes the average silhouette score.
Gap Statistic: Compares the performance of clustering against random data to identify the
best k.
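
A minimal sketch of the Elbow Method, assuming scikit-learn and matplotlib are installed (Iris is a stand-in dataset):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data
sse = []
for k in range(1, 10):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse.append(km.inertia_)       # SSE: sum of squared distances to centroids

plt.plot(range(1, 10), sse, marker="o")
plt.xlabel("k")
plt.ylabel("SSE (inertia)")
plt.show()                        # look for the 'elbow' where SSE levels off
```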
5. What is the Expectation Maximization algorithm used for?

The Expectation Maximization (EM) algorithm is used to estimate the parameters of a statistical model when the observed data contain unobserved latent variables or missing values. It finds maximum likelihood estimates of those parameters by iteratively performing an "expectation" (E) step and a "maximization" (M) step based on the incomplete information available.

6. How does DIANA differ from AGNES in Hierarchical Clustering?

● DIANA (Divisive Analysis):

o It is a top-down approach.
o Starts with a single cluster containing all data points and splits them iteratively.

● AGNES (Agglomerative Nesting):

o It is a bottom-up approach.
o Starts with each data point as an individual cluster and merges them iteratively.

7. Define Gaussian Mixture Models (GMM). (April/May 2024)

Gaussian Mixture Models represent data as a mixture of multiple Gaussian distributions, where
each Gaussian corresponds to a cluster. The model uses probabilistic measures to assign data
points to clusters based on their likelihood.

8. What is a dendrogram in hierarchical clustering?

A dendrogram is a tree-like diagram that shows the hierarchical relationship between objects in
a hierarchical clustering algorithm. It's a network structure that's made up of a root node,
branches, and leaves. The main purpose of a dendrogram is to help determine how to best group
objects into clusters.

9. How does a Self-Organizing Map work for clustering?

Self-Organizing Map (SOM) works by projecting high-dimensional data onto a lower-


dimensional grid of neurons. Each neuron adjusts its weights to represent data points, with
similar data points being grouped together in the grid. SOM is trained using an unsupervised
learning process where the neurons "learn" to cluster similar data points. The final map shows
how data points are clustered based on their proximity in the grid. SOMs are particularly useful
for visualizing and interpreting high-dimensional data.

10. List out the applications of Unsupervised Machine Learning in Modern Business. (April/May 2024)

1. Customer segmentation for personalized marketing.


2. Fraud detection in financial transactions.
3. Product recommendation systems.
4. Sentiment analysis in customer reviews.
5. Inventory management and demand forecasting.
6. Identifying trends in healthcare data.

11. What is k in the K-means Algorithm? How is it selected? (Nov/Dec 2023)

● K: The number of clusters into which data is partitioned.

Selection Methods:
o Elbow Method: Analysing the variance explained as a function of K.
o Silhouette Score: Measuring the quality of clustering.
o Domain knowledge or trial-and-error.

12. What are the uses of Fuzzy Sets in Modeling?

1. Handling uncertainty in data.


2. Building expert systems for decision-making.
3. Image processing and pattern recognition.
4. Control systems, like in washing machines or air conditioners.
5. Linguistic modeling in natural language processing.

13. When Will the Curse of Dimensionality Occur and How to Solve It?

● Occurrence: When data has too many dimensions, leading to sparse data and reduced
algorithm performance.

● Solutions:
o Dimensionality reduction techniques like PCA or LLE.
o Feature selection and engineering.
o Regularization methods.

14. Define LLE.

Locally Linear Embedding (LLE) is a non-linear dimensionality reduction technique in machine learning that projects high-dimensional data onto a lower-dimensional space while preserving the local relationships between data points. It captures the intrinsic geometry of the data by reconstructing each point from its nearest neighbors in the high-dimensional space and maintaining those neighborhood structures in the lower-dimensional embedding.

15. What are the limitations of K-means clustering?

The primary limitation of K-means clustering is its sensitivity to the initial selection of cluster
centroids, which can lead to suboptimal clustering results if not chosen carefully, and the
requirement to pre-define the number of clusters ("k") within the data, which can be challenging
to determine accurately in many cases.

16. What is Fuzzy Modeling?

Fuzzy Clustering is a type of clustering algorithm in machine learning that allows a data point to
belong to more than one cluster with different degrees of membership. Unlike traditional
clustering algorithms, such as k-means or hierarchical clustering, which assign each data point
to a single cluster, fuzzy clustering assigns a membership degree between 0 and 1 for each data
point for each cluster.

17. How Does Fuzzy Clustering Differ from Hard Clustering?

● Fuzzy Clustering:

o A data point can belong to multiple clusters with varying degrees of


membership.
o Used in situations with overlapping clusters.

● Hard Clustering:

o Each data point belongs to exactly one cluster.


o No overlap between clusters.

18. How does GMM differ from k-means clustering?

Both GMM (Gaussian Mixture Model) and K-means clustering are unsupervised learning algorithms used for grouping data. The key difference is that GMM assigns data points to clusters probabilistically, based on a mixture of Gaussian distributions, which allows soft cluster assignments and complex cluster shapes, whereas K-means uses a hard assignment based on the nearest centroid, making it better suited to simple, spherical clusters.
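
A short sketch contrasting the two assignment styles, assuming scikit-learn (Iris is a stand-in dataset):

```python
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

hard = gmm.predict(X)          # most likely component per point (like K-means output)
soft = gmm.predict_proba(X)    # soft assignment: probability of each component
print(hard[:3])
print(soft[:3].round(3))
```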

19.What is Genetic Modeling in clustering?

Genetic modeling in clustering is a technique that uses genetic algorithms to find optimal
solutions for clustering problems. These algorithms are inspired by evolution and use
mathematics to implement the idea of survival of the fittest. They can search for a better
solution from many possible ones, and are less sensitive to the initial cluster centres.

20. What are the applications of LLE?

● Data visualization:

LLE is particularly useful for visualizing complex high-dimensional datasets by


projecting them into 2D or 3D space.

● Feature extraction:

By extracting the lower-dimensional embedding, LLE can help identify important


features in data.

● Classification and clustering:

LLE can be used as a pre-processing step to improve the performance of classification


and clustering algorithms by revealing underlying data structures.
PART B

1. What is K-Mode clustering? Examine how it differs from K-means clustering and
give an example with details. (April/May 2024)
2. Examine about Hierarchical clustering algorithm and its types.

3. Inspect one limitation of LLE compared to other dimensionality reduction


techniques. (April/May 2024)
4. Build the Machine Learning Model to implement K-means algorithm to classify the
iris data set. Print both correct and wrong predictions. (April/May 2023)
5. Explain about EM algorithm in detail with an example. (Nov/Dec 2023)

6. Analyze the steps in k-means algorithm. Cluster the following set of 4 objects into
two clusters using k-means A (3,5), B (4,5), C (1,3), D (2,4). Consider the objects A
and C as the initial cluster centers. (April/May 2023)
7. Illustrate Principal Component Analysis (PCA) method of dimensionality reduction
technique with suitable examples. (Nov/Dec 2023)
8. Explain in detail about the K-nearest neighbor algorithm using a given dataset.

Mathematics Computer Science Result


4 3 Fail

6 7 Pass
7 8 Pass
5 5 Fail
8 8 Pass
9. Evaluate how genetic modeling techniques can be integrated with clustering
algorithms to optimize cluster assignments with examples. (April/May 2024)
10. Solve the multi-dimensional problem for the given network using Self-Organizing
Map.

Input training Samples:

X1:(1,0,1,0)

X2:(1,0,0,0)

X3:(1,1,1,1)

X4:(0,1,1,0)

Initial Weight matrix:

Output Units: Unit 1, Unit 2

UNIT V
APPLICATIONS OF MACHINE LEARNING

Performance Measurement - Azure Machine Learning - Image Recognition – Speech


Recognition – Email spam and Malware Filtering – Online fraud detection – Medical
Diagnosis.
Part A
1. What is Image Recognition? [DEC 2023]
Image recognition is the ability of AI to detect an object, classify it, and recognize it. This last step is close to the human level of image processing. The best example of an image recognition solution is face recognition.

2. What is a Random Forest?


A ‘random forest’ is a supervised machine learning algorithm that is generally used for
classification problems. It operates by constructing multiple decision trees during the
training phase. The random forest chooses the decision of the majority of the trees as the
final decision.

3. How does the Random Forest Algorithm work?


Step 1: Select random samples from a given data or training set.
Step 2: This algorithm will construct a decision tree for every training data.
Step 3: Voting will take place by averaging the decision tree.
Step 4: Finally, select the most voted prediction result as the final prediction result.

4. What is Speech Recognition?[MAY 2022]


Speech recognition is a machine's ability to listen to spoken words and identify them. It
recognizes phenones/phonetics in our speech to get the more significant part of speech, as
words and sentences.

5. How does Speech Recognition work? [MAY 2023]


Speech recognition starts by taking the sound energy produced by the person speaking and
converting it into electrical energy with the help of a microphone. It then converts this
electrical energy from analog to digital, and finally to text.

6. How Do You Design an Email Spam Filter? [DEC 2022]


Building a spam filter involves the following process:
● The email spam filter will be fed with thousands of emails

● Each of these emails already has a label: ‘spam’ or ‘not spam.’

● The supervised machine learning algorithm will then determine which


type of emails are being marked as spam based on spam words like the lottery, free
offer, no money, full refund, etc.
● The next time an email is about to hit your inbox, the spam filter will use statistical
analysis and algorithms like Decision Trees and SVM to determine how likely the
email is spam
● If the likelihood is high, it will label it as spam, and the email won’t hit your inbox

● Based on the accuracy of each model, we will use the algorithm with the highest
accuracy after testing all the models.
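
A minimal sketch of such a filter; it assumes scikit-learn, the four-email corpus and its labels are entirely made up, and Naïve Bayes stands in for the Decision Tree/SVM choices mentioned above:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny hypothetical labeled corpus (1 = spam, 0 = not spam)
emails = ["win a free lottery offer now", "full refund no money needed",
          "meeting agenda for monday", "project report attached"]
labels = [1, 1, 0, 0]

vec = CountVectorizer()                # bag-of-words representation
X = vec.fit_transform(emails)

clf = MultinomialNB().fit(X, labels)   # learns which words mark an email as spam
print(clf.predict(vec.transform(["claim your free offer"])))  # likely [1] = spam
```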
7. What is a Support Vector Machine?
Support Vector Machine (SVM) is a supervised learning algorithm used for classification and
regression problems. The main objective of SVM is to find a hyperplane in an N( total number
of features)-dimensional space that differentiates the data points. So we need to find a plane that
creates the maximum margin between two data point classes.
8. What are Support Vectors in SVM?
Support Vectors are data points that are nearest to the hyperplane. It influences the position and
orientation of the hyperplane. Removing the support vectors will alter the position of the
hyperplane. The support vectors help us build our support vector machine model.

9. What are Hyperplanes in SVM?


Hyperplanes are nothing but a boundary that helps to separate and group the data into particular
classes. A Hyperplane in 2-dimension is just a line. So the dimension of the hyperplane is
decided on the basis of the number of features in the dataset minus 1. So a hyperplane in R2 is a
line and in R3 is a plane.

10. Briefly Explain Logistic Regression.


Logistic regression is a classification algorithm used to predict a binary outcome for a given set
of independent variables. The output of logistic regression is either a 0 or 1 with a threshold
value of generally 0.5. Any value above 0.5 is considered as 1, and any point below 0.5 is
considered as 0.
11. What are false positives and false negatives? [DEC 2023]
False positives are cases in which negatives are wrongly predicted as positives, for example, predicting that a credit card transaction is fraudulent when, in fact, it is not. False negatives are cases in which positives are wrongly predicted as negatives, for example, predicting that a credit card transaction is not fraudulent when, in fact, it is.
12. What is accuracy? [MAY 2023]
It is the number of correct predictions out of all predictions made.
Accuracy = (TP+TN)/(The total number of Predictions)
13. How does logistic regression handle categorical variables?
The inputs to a logistic regression model need to be numeric. The algorithm cannot handle
categorical variables directly. So, they need to be converted into a format that is suitable for the
algorithm to process. The various levels of a categorical variable will be assigned a unique
numeric value known as the dummy variable. These dummy variables are handled by the
logistic regression model as any other numeric value.
14. Define Bagging.
Creating a different training subset from sample training data with replacement is called
Bagging. The final output is based on majority voting.

15. Define Boosting.


Combing weak learners into strong learners by creating sequential models such that the
final model has the highest accuracy is called Boosting. Example: ADA BOOST, XG
BOOST.
16. What are the various applications of Machine Learning [MAY 2023]
Image Recognition, Speech Recognition. Email spam and Malware Filtering, Online fraud
detection, Medical Diagnosis.

17. What is Speech recognition? [MAY 2024]


Speech recognition is a process of converting voice instructions into text, and it is also
known as "Speech to text", or "Computer speech recognition."

18. Mention some spam filters used by Gmail. [DEC 2022]


i. Content Filter
ii. Header filter
iii. General blacklists filter
iv. Rules-based filters
v. Permission filters
19. Name Some machine learning algorithms used for email spam filtering and malware
detection.
Multi-Layer Perceptron, Decision tree, and Naïve Bayes classifier are used for email spam
filtering and malware detection.

20. What are the various ways that a fraudulent transaction can take place? [MAY 2023]
A fraudulent transaction can take place in various ways, such as through fake accounts, fake IDs, or stealing money in the middle of a transaction.

21. Write any five Most popular Machine learning tools [MAY 2023]
PyTorch
TensorFlow
Colab
KNIME
Apache Mahout
22. Write some assumptions about the Genuine Emails.
1. The genuine emails are the ones that are sent to the recipients with conveying useful
information.
2. The recipient expects those emails or reads those emails to get the new information.
23. Colab is supported under which platform?
Cloud services

24. Name the tool used for Data loading & Transformation and Data preprocessing &
visualization.
Rapid Miner

25. Mention the features of Keras.io.


API for neural networks
Written in Python
26. Brief KNIME
Can work with large data volumes.
Supports text mining & image mining through plugins

27. How is diagnostic sensory data used in the medical diagnosis process? [MAY 2024]
This diagnostic sensory data can then be given to a machine learning system which can then
analyze the signals and classify the medical conditions into different predetermined types.

28. Name a few applications that use speech recognition technology to follow voice
instructions.
Google Assistant, Siri, Cortana, and Alexa

PART B

1. Identify the techniques used to improve the accuracy of email spam and malware detection
systems. [MAY 2024]
2. Choose the ethical considerations involved in deploying machine learning for online fraud
detection. [MAY 2024]
3. Explain the role of Machine Learning in Image recognition and Medical diagnosis. Explain
any one application with its implementation. [MAY 2023]
4. Build the Machine Learning model to implement Email Spam classification using Naïve
Bayes or support vector machines. [MAY 2023]
5. Discuss about the application of machine learning in email spam and malware filtering. [DEC 2022]
6. Write a program for online fraud detection. [DEC 2022]
7. Write the applications of machine learning in speech recognition, email Spam Malware
filtering, and online fraud detection. [DEC 2023]

PART C

1. Write short notes on precision and recall and explain the implementation program for Image
Recognition. [DEC 2023]

2. Describe how to evaluate machine learning models built for Speech Recognition.

3. Creating an email spam program for malicious purposes is unethical and illegal. Instead, address email spam from the perspective of detecting and filtering spam emails, which is a common problem in machine learning and cybersecurity. [MAY 2022]

4. Explain Online fraud detection in detail with a suitable program. [MAY 2024]
5. You are tasked with building a medical diagnosis system to assist healthcare providers in
identifying possible diseases based on a patient's symptoms. The system should predict
potential diseases and their likelihood based on a provided dataset containing:
● Symptoms reported by patients
● Patient demographic data (e.g., age, gender, weight, height)
● Laboratory test results (if available)
● Disease diagnoses corresponding to the symptoms and test results

The goal is to create an intelligent system that improves the efficiency of medical diagnoses,
reduces diagnostic errors, and aids medical professionals in decision-making.[MAY 2023]
