Predictive Modeling MCQs IMT
Chapter 1
1. Which of the following statements is true?
(a) Statistics is often based on nonparametric algorithms; no guaranteed optimum.
(b) In statistics, models are typically nonlinear.
(c) Statistics algorithms are not as efficient or stable for small data.
(d) In statistics, data is typically smaller, the model is important.
2. What are the challenges in using Predictive Analytics?
(a) Predictive models require data in the form of two-dimensional data (rows and columns).
(b) Often, deployment of predictive models requires a shift in resources for an organization.
(c) The models become too complex because of overfitting.
(d) All of the above.
3. What is the format in which data must be available for predictive
modelling?
(a) One-dimension
(b) Two-dimension
(c) Three-dimension
(d) n-Dimension
4. Computational methods to discover and report influential patterns in data
are known as
(a) Data mining
(b) Data discovery
(c) Data analytics
(d) All of the above
5. Predictive analytics is the process of
(a) Just cleaning data
(b) Just compressing data
(c) Guessing about present output without any data
(d) Information retrieval to make useful predictions about future outcomes
6. Discovering interesting and meaningful patterns in data is known as
(a) Data analytics
(b) Predictive analytics
(c) Data discovery
(d) All of the above
7. Analyzing inputs and grouping (clustering) them based on the proximity of
input values to one another is
(a) Supervised learning
(b) Unsupervised learning
(c) Descriptive modeling
(d) Both (b) and (c)
Answer Keys
(a) PCC
(b) ROC
(c) AUC
(d) None of the above
11. ______ determine the magnitude of error.
(a) Average errors
(b) Mean squared error
(c) Median error
(d) Average absolute error
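The error measures named in the options above can be compared on a small example. This is a minimal sketch in plain Python; the function names and the sample values are illustrative assumptions, not taken from the source. It shows that signed errors can cancel in a plain average, while absolute and squared errors keep the magnitude.

```python
def average_error(actual, predicted):
    """Mean of signed errors; positive and negative errors can cancel out."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Mean of absolute errors; reports magnitude in the original units."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Mean of squared errors; penalizes large errors more heavily."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: the signed errors are +2, -2, +3, -3.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(average_error(actual, predicted))        # signed errors cancel
print(mean_absolute_error(actual, predicted))  # magnitude is preserved
print(mean_squared_error(actual, predicted))   # magnitude, squared units
```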
12. Mean, median, and mode give a clear picture about data spread and
variability.
(a) True
(b) False
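Mean, median, and mode are measures of central tendency, so two data sets can agree on all three and still differ widely in spread. A minimal sketch using Python's standard `statistics` module; the two sample lists are made-up values for illustration only.

```python
import statistics

# Two hypothetical samples with identical mean, median, and mode.
narrow = [48, 50, 50, 50, 52]
wide = [0, 50, 50, 50, 100]

for name, data in (("narrow", narrow), ("wide", wide)):
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "mode:", statistics.mode(data),
        "std dev:", round(statistics.pstdev(data), 3),
    )
```

Only the standard deviation (a measure of spread) separates the two samples.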
Answer Keys
(b) 0
(c) 1
(d) Greater than 1
9. What is the phenomenon called when a trend is seen in individual
variables, but is reversed when variables are combined?
(a) Simpson's paradox
(b) Redskin Rule
(c) Anscombe's Quartet
(d) Platykurtic
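A reversal of this kind can be checked numerically. The sketch below uses illustrative counts (successes, trials) for two hypothetical treatments A and B in two subgroups; the names and numbers are assumptions chosen for the example. A has the higher success rate inside each subgroup, yet B has the higher rate once the subgroups are pooled.

```python
# Illustrative counts: (successes, trials) per treatment within each subgroup.
groups = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials


def pooled_rate(treatment):
    """Success rate for one treatment after combining all subgroups."""
    successes = sum(groups[g][treatment][0] for g in groups)
    trials = sum(groups[g][treatment][1] for g in groups)
    return rate(successes, trials)


# Compare A and B inside every subgroup, then on the pooled data.
a_wins_in_every_group = all(
    rate(*groups[g]["A"]) > rate(*groups[g]["B"]) for g in groups
)
b_wins_when_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_in_every_group, b_wins_when_pooled)
```

The reversal arises because the subgroups have very different sizes for each treatment, so pooling weights them unevenly.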
10. Which of the following is not a property of normal distribution?
(a) It is symmetric
(b) Mean, median and mode are all same
(c) It is also called bell curve
(d) 86% of data lies between the mean and the +/-1 standard deviation
11. Generally, one-dimensional data distribution visualization can be done
using
(a) Scatter plot
(b) Histogram
(c) Scatterplot matrices
(d) Anscombe's quartet
12. Which of these is true about Uniform Distribution?
(a) The distribution is infinite
(b) Mean and midpoint are different
(c) Distribution is symmetric about the mean
(d) None of the above
13. Which of these is false about correlations between two variables?
(a) Measures the numerical relationship of one variable to others
(b) One variable's meaning is related to another's
(c) Both of these
(d) None of the above
Answer Keys
8 .What
bility? -
C Validation
r055
:3 Of
uI'Sedimensionality
C
(b)Ruleof11
(c) Temporal
(d) Sequencing
9. MCAR stands for
(a) Missing completely at random
(b) Missing conditional at random
(c) Missing convolute at random
(d) None of these
10. Missing value is
(a) mm
(b) zero
(c) False
(d) null '
11. In general, min-max normalization changes the range of a variable to
(b) -100 to 100
(c) -50 to 50
(d) -1 to 1
(e) 0 to 1
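Min-max normalization rescales a variable linearly so that its smallest value maps to 0 and its largest to 1, using (x - min) / (max - min). A minimal sketch in plain Python; the function name, the handling of a constant column, and the sample values are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw values for one variable.
raw = [20, 35, 50, 80]
scaled = min_max_normalize(raw)
print(scaled)
```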
12
sb;1<(:<:;2lsmatamaccurabe,mthedatausedtotrahmt
(a) Underfitting
(b) Overfitting
(c) Randomness
(d) All of the above
Answer Keys
1. (a) 5. (d) 9. (a)
2. (d) 6. (a) 10. (d)
3. (c) 7. (b) 11. (d)
4. (c) 8. (a) 12. (b)
Chapter 5
Following is an example database/dataset of a supermarket, with transactions
and five items (milk, butter, bread, beer, diapers). The presence of an item in
a transaction is indicated by 1 in the item column, and its absence by 0.
[Table: Transaction ID, Milk, Bread, ...]
Answer Keys
1. (a) 5. (
2. (b) 6. (
3. (a) 7. (d)
4. (b) 8. (d)
Chapter 6
1. Which of the following statements is incorrect?
(a) Descriptive modeling algorithms are also called as unsupervised
learning methods.
(b) Descriptive modeling algorithms try to find relationships between inputs.
(c) Descriptive modeling algorithms discover the best way to segment the
data.
(d) Descriptive modeling algorithms try to find relationships that associate
inputs to one or more target variables.
2. Which of the following statements is incorrect?
(a) Decision Tree is a commonly used unsupervised modeling algorithm.
(b) K-Means clustering is a commonly used unsupervised modeling
algorithm.
(c) Kohonen Self-Organizing Maps (SOM) is a commonly used
unsupervised modeling algorithm.
(d) Principal Component Analysis (PCA) is a commonly used unsupervised
modeling algorithm.
3. Which of the following statements is incorrect?
(a) Inputs must be numeric for K-Means clustering algorithm.
(b) Kohonen Self-Organizing Maps (SOM) needs all data to be populated,
there can be no missing values.
(c) Inputs need not be numeric for Kohonen Self-Organizing Maps (SOM)
algorithm.
(d) When using Principal Component Analysis (PCA), any categorical
variable to be included in the model, must be converted to a number.
4. Which of the following algorithms is best suited for reducing the number
of inputs for predictive models?
(a) K-Means clustering
(b) Kohonen Self-Organizing Maps (SOM)
(c) Principal Component Analysis (PCA)
(d) All of the above
5. Which of the following is NOT one of the distance metrics used in building
the K-Means clustering model?
(a) Mahalanobis distance metric
(b) Milwaukee distance metric
(c) Manhattan distance metric
(d) Minkowski distance metric
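Two of the distance metrics named above are closely related: Minkowski distance with p = 1 is Manhattan distance, and with p = 2 it is Euclidean distance. A minimal sketch in plain Python; the function names and sample points are illustrative assumptions. Mahalanobis distance is left out because it also needs the covariance matrix of the data.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (Minkowski with p = 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical points.
point_a = (0, 0)
point_b = (3, 4)

print(manhattan_distance(point_a, point_b))     # 7
print(minkowski_distance(point_a, point_b, 1))  # same value as Manhattan
print(minkowski_distance(point_a, point_b, 2))  # Euclidean distance
```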
6. Which of the following is widely used as an unsupervised learning neural
network algorithm?
(a) Perceptron
(b) Kohonen Self-Organizing Map (SOM)
(c) Both Perceptron and Kohonen Self-Organizing Map (SOM)
(d) None of the above
7. In K-Means, what is the number of clusters in the data?
(a) Algorithm will determine the same dynamically
(b) It must be pre-specified
(c) It is always 2
(d) It is always 3
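In the standard K-Means procedure the number of clusters k is an input to the algorithm. The sketch below is a deliberately simple plain-Python version, not a production implementation: the function name, the naive initialization from the first k points, the fixed iteration count, and the sample points are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=10):
    """Basic K-Means: k is supplied by the caller, not discovered."""
    # Assumption: naive initialization using the first k points as centers.
    centers = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers, clusters


# Hypothetical two-dimensional points forming two visible groups.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centers, clusters = kmeans(points, k=2)
print(centers)
print(clusters)
```

Each cluster center here is the per-variable mean of the points assigned to that cluster.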
8. Which of these is not an unsupervised modeling algorithm?
(a) K-means clustering
(b) Kohonen maps
(c) Self-organizing maps (SOMs)
(d) Linear regression
9. In K-means, cluster model parameters are defined by
(a) A number of weights
(b) Number of clusters
(c) Value, one per unit
(d) One per unit
10. Generally, in a Kohonen map, the number of nodes is
(a) Post determined after plotting map
(b) Predetermined
(c) Predetermined by length and width of map
(d) Randomly
Answer Keys
#RecordsInCluster   8,538   8,511   30,656   47,705
LASTDATE            0.319   0.304   0.179    0.225
FISTDATE            0.886   0.885   0.908    0.900
                    0.711   0.716   0.074    0.303
321;                0.382   0.390   0.300    0.331
E_RFA_2A            0.499   0.500   0.331    0.391
F_RFA_2A            0.369   0.366   0.568    0.496
DOMAIN3             0.449   0.300   0.368    0.370
DOMAIN2             0.300   0.700   0.489    0.493
DOMAIN1             0.515   0.300   0.427    0.420
NGIFTALL            0.384   0.385   0.233    0.287
LASTGIFT            0.348   0.343   0.430    0.400
(a) True
(b) False
6. If data is applied to a clustering algorithm ______, then it is difficult
to understand the summaries after clustering.
(a) without normalization
(b) with normalization
(c) with compression
(d) none of the above
7. Generally, interval and ratio variables are problematic to interpret.
(a) True
(b) False
8. As a thumb rule or guiding principle, the ANOVA method works ______
when there are ______ clusters.
(a) worst, small no. of
(b) best, small no. of
(c) best, large no. of
(d) worst, large no. of
9. Hierarchical clustering works well with a large number of records.
(a) True
(b) False
10. Decision trees are not distance-based algorithms and therefore are
______ by ______ and skewed distributions.
(a) unaffected, outliers
(b) affected, outliers
(c) affected, normalized
(d) unaffected, normalized
11. In a multivariable problem, ANOVA determines which variables have the most
significant difference in ______ values between the clusters.
(a) mean
(b) variance
(c) error
(d) none
12. Mean value for variables in each cluster is called
(a) Cluster mean
(b) Cluster average
(c) Cluster center
(d) Cluster median
Answer Keys
1. (b) 5. (a) 9. (b)
6. (a) 10. (b)
Answer Keys