1) Which of the following is/are true about bagging trees?
1. In bagging trees, individual trees are independent of each other
2. Bagging is the method for improving the performance by aggregating the results of weak
learners
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
Both statements are true. In bagging, the individual trees are independent of each other because each
tree is built on a different subset of features and samples.
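As a minimal sketch (assuming scikit-learn; the synthetic dataset and parameter values are only illustrative), bagging fits each tree independently on its own random subset of rows and features:

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Each base tree sees its own bootstrap sample of rows and its own random
# subset of features, so the trees are trained independently of each other.
bag = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=0.8,    # fraction of observations drawn for each tree
    max_features=0.8,   # fraction of features drawn for each tree
    bootstrap=True,
    random_state=0,
).fit(X, y)
print(bag.score(X, y))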
2) Which of the following is/are true about boosting trees?
1. In boosting trees, individual weak learners are independent of each other
2. It is the method for improving the performance by aggregating the results of weak learners
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: B
In boosting trees, the individual weak learners are not independent of each other because each tree
corrects the results of the previous trees. Both bagging and boosting can be considered methods for
improving the results of the base learners.
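A minimal sketch of the boosting idea for regression (synthetic data; names are illustrative): each new tree is fit to the residuals of the ensemble built so far, so the learners are sequentially dependent rather than independent:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
pred = np.zeros_like(y)
for _ in range(100):
    residual = y - pred                       # shortcomings of the current model
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred += learning_rate * tree.predict(X)   # each new tree corrects the previous ones
print("training MSE:", np.mean((y - pred) ** 2))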
3) Which of the following is/are true about Random Forest and Gradient Boosting ensemble
methods?
1. Both methods can be used for classification task
2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks
3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks
4. Both methods can be used for regression task
A) 1
B) 2
C) 3
D) 4
E) 1 and 4
Solution: E
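Both ensembles come in classifier and regressor variants; a quick sketch assuming scikit-learn (synthetic data for illustration):

from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import (GradientBoostingClassifier, GradientBoostingRegressor,
                              RandomForestClassifier, RandomForestRegressor)

Xc, yc = make_classification(random_state=0)   # classification task
Xr, yr = make_regression(random_state=0)       # regression task

print(RandomForestClassifier(random_state=0).fit(Xc, yc).score(Xc, yc))
print(GradientBoostingClassifier(random_state=0).fit(Xc, yc).score(Xc, yc))
print(RandomForestRegressor(random_state=0).fit(Xr, yr).score(Xr, yr))
print(GradientBoostingRegressor(random_state=0).fit(Xr, yr).score(Xr, yr))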
4) In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then
aggregate the results of these trees. Which of the following is true about an individual tree (Tk) in
Random Forest?
1. An individual tree is built on a subset of the features
2. An individual tree is built on all the features
3. An individual tree is built on a subset of observations
4. An individual tree is built on the full set of observations
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Random Forest is based on the bagging concept: it considers a fraction of the samples and a fraction
of the features for building each individual tree.
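In scikit-learn (assuming version 0.22 or later for max_samples), both kinds of subsampling are exposed directly; the values below are illustrative:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

rf = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",  # each split considers only a random subset of features
    bootstrap=True,       # each tree is grown on a bootstrap sample of observations
    max_samples=0.8,      # fraction of rows drawn for each tree
    random_state=0,
).fit(X, y)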
5) Which of the following is true about the "max_depth" hyperparameter in Gradient Boosting?
1. Lower is better in the case of the same validation accuracy
2. Higher is better in the case of the same validation accuracy
3. Increasing the value of max_depth may overfit the data
4. Increasing the value of max_depth may underfit the data
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: A
Increasing the depth beyond a certain value may overfit the data, and when two depth values give the
same validation accuracy, we always prefer the smaller depth for the final model.
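A sketch of that selection rule, assuming scikit-learn and a synthetic dataset; on a tie in validation accuracy, the smaller depth wins:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

scores = {}
for depth in (1, 2, 3, 5, 8):
    model = GradientBoostingClassifier(max_depth=depth, random_state=0)
    scores[depth] = model.fit(X_tr, y_tr).score(X_val, y_val)

best = max(scores.values())
best_depth = min(d for d, s in scores.items() if s == best)  # tie -> smaller depth
print(scores, "-> chosen max_depth:", best_depth)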
6) Which of the following algorithms don't use learning rate as one of their
hyperparameters?
1. Gradient Boosting
2. Extra Trees
3. AdaBoost
4. Random Forest
A) 1 and 3
B) 1 and 4
C) 2 and 3
D) 2 and 4
Solution: D
Random Forest and Extra Trees don’t have learning rate as a hyperparameter.
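This is easy to verify against scikit-learn's implementations (a small check using get_params; assuming the scikit-learn classes):

from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)

for Est in (GradientBoostingClassifier, AdaBoostClassifier,
            RandomForestClassifier, ExtraTreesClassifier):
    print(Est.__name__, "learning_rate" in Est().get_params())
# GradientBoosting and AdaBoost print True; RandomForest and ExtraTrees print False.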
7) In random forest or gradient boosting algorithms, features can be of any type. For example,
it can be a continuous feature or a categorical feature. Which of the following options is true
when you consider these types of features?
A) Only Random forest algorithm handles real valued attributes by discretizing them
B) Only Gradient boosting algorithm handles real valued attributes by discretizing them
C) Both algorithms can handle real valued attributes by discretizing them
D) None of these
Solution: C
8) Which of the following algorithms is not an example of an ensemble learning algorithm?
A) Random Forest
B) Adaboost
C) Extra Trees
D) Gradient Boosting
E) Decision Trees
Solution: E
9) Suppose you are using a bagging based algorithm, say Random Forest, in model building.
Which of the following can be true?
1. The number of trees should be as large as possible
2. You will have interpretability after using Random Forest
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Since Random Forest aggregates the results of different weak learners, if possible we would want a
larger number of trees in model building. Random Forest is a black-box model; you will lose
interpretability after using it.
10) Which of the following is true about the Gradient Boosting trees?
1. In each stage, a new regression tree is introduced to compensate for the shortcomings of the
existing model
2. We can use the gradient descent method to minimize the loss function
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
11) True-False: Bagging is suitable for high variance, low bias models?
A) TRUE
B) FALSE
Solution: A
Bagging averages many high variance, low bias learners, which reduces the variance without
increasing the bias much.
12) In gradient boosting it is important to use learning rate to get optimum output. Which of
the following is true about choosing the learning rate?
A) Learning rate should be as high as possible
B) Learning Rate should be as low as possible
C) Learning Rate should be low but it should not be very low
D) Learning rate should be high but it should not be very high
Solution: C
13) [True or False] Cross validation can be used to select the number of iterations in boosting;
this procedure may help reduce overfitting.
A) TRUE
B) FALSE
Solution: A
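A sketch of the idea using a held-out set and scikit-learn's staged_predict (cross-validation over n_estimators would work the same way; the data and names are illustrative):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

gbm = GradientBoostingClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
# Validation accuracy after each boosting iteration; pick the best stage.
val_acc = [np.mean(pred == y_val) for pred in gbm.staged_predict(X_val)]
print("best number of iterations:", int(np.argmax(val_acc)) + 1)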
14) When you use a boosting algorithm you always consider weak learners. Which of the
following is the main reason for having weak learners?
1. To prevent overfitting
2. To prevent underfitting
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: A
Strong (complex) learners would fit the training data too closely at each step; using weak learners
helps prevent overfitting.
15) To apply bagging to regression trees, which of the following is/are true in such a case?
1. We build N regression trees with N bootstrap samples
2. We take the average of the N regression trees
3. Each tree has a high variance with low bias
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3
Solution: D
16) How do you select the best hyperparameters in tree-based models?
A) Measure performance over training data
B) Measure performance over validation data
C) Both of these
D) None of these
Solution: B
17) In which of the following scenarios is gain ratio preferred over information gain?
A) When a categorical variable has a very large number of categories
B) When a categorical variable has a very small number of categories
C) The number of categories is not the reason
D) None of these
Solution: A
For high-cardinality problems, gain ratio is preferred over the information gain technique.
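A small sketch of both quantities (standard C4.5-style definitions; the toy arrays are illustrative). A feature with many categories, like an ID column, gets maximal information gain, but its large split information tempers the gain ratio:

import numpy as np

def entropy(values):
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature):
    values, counts = np.unique(feature, return_counts=True)
    children = sum((c / len(labels)) * entropy(labels[feature == v])
                   for v, c in zip(values, counts))
    return entropy(labels) - children

def gain_ratio(labels, feature):
    split_info = entropy(feature)  # entropy of the partition itself
    return information_gain(labels, feature) / split_info if split_info else 0.0

y = np.array([0, 0, 1, 1, 1, 0])
ids = np.arange(6)  # one category per row, like an ID column
print(information_gain(y, ids), gain_ratio(y, ids))  # IG = 1.0, ratio ~ 0.39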
Skill test Questions and Answers
1) [True or False] k-NN algorithm does more computation at test time than at train time.
A) TRUE
B) FALSE
Solution: A
The training phase of the algorithm consists only of storing the feature vectors and class labels of
the training samples.
In the testing phase, a test point is classified by assigning the label that is most frequent among
the k training samples nearest to that query point; hence the higher computation.
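A minimal sketch makes the asymmetry visible (pure NumPy; the toy data are illustrative): "training" just stores the arrays, while every prediction computes a distance to all stored samples:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    dists = np.linalg.norm(X_train - x_query, axis=1)  # O(n) distances per query
    nearest = np.argsort(dists)[:k]                    # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.5, 5.0])))  # -> 1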
2) Which of the following is true about the k-NN algorithm?
A) It can be used for classification
B) It can be used for regression
C) It can be used in both classification and regression
Solution: C
3) Which of the following statements is/are true about the k-NN algorithm?
1. k-NN performs much better if all of the data have the same scale
2. k-NN works well with a small number of input variables (p), but struggles when the number
of inputs is very large
3. k-NN makes no assumptions about the functional form of the problem being solved
A) 1 and 2
B) 1 and 3
C) Only 1
D) All of the above
Solution: D
4) Which of the following machine learning algorithms can be used for imputing missing values
of both categorical and continuous variables?
A) K-NN
B) Linear Regression
C) Logistic Regression
Solution: A
5) Which of the following is true about Manhattan distance?
A) It can be used for continuous variables
B) It can be used for categorical variables
C) It can be used for categorical as well as continuous
D) None of these
Solution: A
6) Which of the following distance measures do we use in the case of categorical variables in k-NN?
1. Hamming Distance
2. Euclidean Distance
3. Manhattan Distance
A) 1
B) 2
C) 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3
Solution: A
Both Euclidean and Manhattan distances are used in the case of continuous variables, whereas
Hamming distance is used in the case of categorical variables.
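For reference, a tiny sketch of the Hamming distance (the example values are made up): it counts mismatched positions, so it needs no numeric encoding of the categories:

def hamming(a, b):
    # Number of positions at which the two sequences disagree.
    return sum(x != y for x, y in zip(a, b))

print(hamming(["red", "small", "round"], ["red", "large", "round"]))  # -> 1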
7) Which of the following will be the Euclidean distance between the two data points A(1,3) and
B(2,3)?
A) 1
B) 2
C) 4
D) 8
Solution: A
sqrt( (1-2)^2 + (3-3)^2) = sqrt(1^2 + 0^2) = 1
8) Which of the following will be the Manhattan distance between the two data points A(1,3) and
B(2,3)?
A) 1
B) 2
C) 4
D) 8
Solution: A
|1-2| + |3-3| = 1 + 0 = 1 (Manhattan distance is the sum of absolute differences; no square root is involved)
9) Which of the following will be true about k in k-NN in terms of bias?
A) When you increase k, the bias will increase
B) When you decrease k, the bias will increase
C) Can’t say
D) None of these
Solution: A
A large k means a simpler model, and a simpler model is always considered to have high bias.
10) Which of the following will be true about k in k-NN in terms of variance?
A) When you increase k, the variance will increase
B) When you decrease k, the variance will increase
C) Can’t say
D) None of these
Solution: B
A simpler model is considered to have lower variance.
11) When you find noise in the data, which of the following options would you consider in k-NN?
A) I will increase the value of k
B) I will decrease the value of k
C) Noise does not depend on the value of k
D) None of these
Solution: A
To be more sure of which classifications you make, you can try increasing the value of k.
12) In k-NN, it is very likely to overfit due to the curse of dimensionality. Which of the
following options would you consider to handle such a problem?
1. Dimensionality Reduction
2. Feature selection
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
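A sketch of both remedies in front of k-NN, assuming scikit-learn (the synthetic high-dimensional dataset and the choice of 10 components/features are illustrative):

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=100, n_informative=10,
                           random_state=0)

# 1. Dimensionality reduction, and 2. feature selection, before the k-NN step.
pca_knn = make_pipeline(PCA(n_components=10), KNeighborsClassifier())
select_knn = make_pipeline(SelectKBest(f_classif, k=10), KNeighborsClassifier())

print(cross_val_score(pca_knn, X, y).mean())
print(cross_val_score(select_knn, X, y).mean())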
13) Two statements are given below. Which of them is/are true?
1. An advantage of k-NN being a memory-based approach is that the classifier immediately adapts as
we collect new training data.
2. The computational complexity for classifying new samples grows linearly with the number
of samples in the training dataset in the worst-case scenario.
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
14) A company has built a k-NN classifier that gets 100% accuracy on training data. When
they deployed this model on the client side, it was found that the model is not at all accurate.
Which of the following might have gone wrong?
Note: The model was successfully deployed and no technical issues were found at the client side
except for the model performance
A) It is probably an overfitted model
B) It is probably an underfitted model
C) Can’t say
D) None of these
Solution: A
An overfitted model seems to perform well on training data, but it is not generalized enough to
give the same results on new data.
15) You are given the following 2 statements. Which of these is/are true in the case of
k-NN?
1. In the case of a very large value of k, we may include points from other classes in the
neighborhood.
2. In the case of a too small value of k, the algorithm is very sensitive to noise
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
16) Which of the following statements is true for k-NN classifiers?
A) The classification accuracy is better with larger values of k
B) The decision boundary is smoother with smaller values of k
C) The decision boundary is linear
D) k-NN does not require an explicit training step.
Solution: D
Option A: This is not always true. You have to ensure that the value of k is neither too high nor
too low.
Option B: This statement is not true. The decision boundary becomes more jagged with smaller
values of k.
Option C: Same as option B; the boundary is generally not linear.
Option D: This statement is true
17) True-False: It is possible to construct a 2-NN classifier by using the 1-NN classifier?
A) TRUE
B) FALSE
Solution: A
You can implement a 2-NN classifier by ensembling 1-NN classifiers
18) In k-NN, what will happen when you increase/decrease the value of k?
A) The boundary becomes smoother with increasing value of k
B) The boundary becomes smoother with decreasing value of k
C) The smoothness of the boundary doesn't depend on the value of k
D) None of these
Solution: A
19) The following two statements are given for the k-NN algorithm; which is/are true?
1. We can choose the optimal value of k with the help of cross-validation
2. Euclidean distance treats each feature as equally important
A) 1
B) 2
C) 1 and 2
D) None of these
Solution: C
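A sketch combining both points, assuming scikit-learn: scale the features first (so Euclidean distance doesn't let a large-range feature dominate), then cross-validate over k; the grid values are illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
grid = GridSearchCV(pipe, {"knn__n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)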
Question 1
Answer: Choose k to be the smallest value so that at least 99% of the variance is retained.
Explanation: This maintains the structure of the data while maximally reducing its dimension.
Question 2
Answer:
Question 3 (mark each statement True or False)
1. Data visualization: To take 2D data, and find a different way of plotting it in 2D (using k=2).
Answer: False. Explanation: none needed.
2. As a replacement for (or alternative to) linear regression: For most learning applications, PCA
and linear regression give substantially similar results.
Answer: False. Explanation: PCA is not linear regression. They have different goals (and cost
functions), so they give different results.
3. Data compression: Reduce the dimension of your input data x(i), which will be used in a
supervised learning algorithm (i.e., use PCA so that your supervised learning algorithm runs
faster).
Answer: True. Explanation: If your learning algorithm is too slow because the input dimension is
too high, then using PCA to speed it up is a reasonable choice.
4. Data compression: Reduce the dimension of your data, so that it takes up less memory/disk
space.
Answer: True. Explanation: If memory or disk space is limited, PCA allows you to save space in
exchange for losing a little of the data's information. This can be a reasonable tradeoff.