
SML BAD702 Qp1 First Ia Scheme

Uploaded by

pushpa

MAHARAJA INSTITUTE OF TECHNOLOGY MYSORE

Dept. of Computer Science and Engineering (Data science)


Statistical Machine Learning For Data Science
(BAD702) CD
1st Internal Scheme
7th Semester. Total Marks: 40
22/09/2025 and 02:15am – 3:45am
Q# Question Description M
Q1 a. Identify the different data types. Explain rectangular data with an example. (10 marks)
Data types (4 marks)
Numeric: Data that are expressed on a numeric scale.
Continuous: Data that can take on any value in an interval. (Synonyms: interval, float, numeric)
Discrete: Data that can take on only integer values, such as counts.
Categorical: Data that can take on only a specific set of values representing a set of possible categories.
Binary: A special case of categorical data with just two categories of values, e.g., 0/1, true/false.
Ordinal: Categorical data that has an explicit ordering. (Synonym: ordered factor)
Data frame (with example, 6 marks)
Rectangular data (like a spreadsheet) is the basic data structure for statistical and machine learning models.
Feature: A column within a table is commonly referred to as a feature. (Synonyms: attribute, input, predictor, variable)
Outcome: Many data science projects involve predicting an outcome, often a yes/no outcome (in Table 1-1, it is "auction was competitive or not"). The features are sometimes used to predict the outcome in an experiment or a study. (Synonyms: dependent variable, response, target, output)
Record: A row within a table is commonly referred to as a record. (Synonyms: case, example, instance, observation, pattern, sample)
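The terms above can be illustrated with a small sketch. It uses only the Python standard library; in practice a pandas DataFrame (Python) or a data.frame (R) would normally hold rectangular data. The class name AuctionRecord, its field names, the grade ordering, and all values are hypothetical and chosen only to show one column per data type.

```python
from dataclasses import dataclass, fields


@dataclass
class AuctionRecord:
    """One record (row) of a rectangular data set. All names are hypothetical."""
    category: str       # categorical feature
    duration_days: int  # discrete numeric feature
    open_price: float   # continuous numeric feature
    seller_grade: str   # ordinal feature: "low" < "medium" < "high"
    competitive: bool   # binary outcome (what we would try to predict)


# Ordinal data needs an explicit ordering (assumed here for illustration).
SELLER_GRADE_ORDER = ["low", "medium", "high"]

# Each list element is a record; each field is a column.
records = [
    AuctionRecord("Music", 5, 0.01, "high", False),
    AuctionRecord("Automotive", 7, 0.01, "medium", True),
    AuctionRecord("Books", 3, 4.99, "low", True),
]

# Features are every column except the outcome column.
feature_names = [f.name for f in fields(AuctionRecord) if f.name != "competitive"]
outcome = [r.competitive for r in records]
n_records, n_features = len(records), len(feature_names)

# Sorting by an ordinal column uses the explicit ordering, not alphabetical order.
sorted_by_grade = sorted(
    records, key=lambda r: SELLER_GRADE_ORDER.index(r.seller_grade)
)

print(n_records, "records,", n_features, "features:", feature_names)
```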

Q1 b. (10 marks: 5+5)
Q2 a. Identify correlation and its key terms, with an example. (10 marks: 3+3+4)
Correlation coefficient: A metric that measures the extent to which numeric variables are associated with one another (ranges from –1 to +1).
Correlation matrix: A table where the variables are shown on both rows and columns, and the cell values are the correlations between the variables.
Scatterplot: A plot in which the x-axis is the value of one variable, and the y-axis the value of another.
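As a minimal sketch of the first two terms, the Pearson correlation coefficient and a correlation matrix can be computed with the standard library alone. The helper names pearson and correlation_matrix and the three toy columns are assumptions made for illustration; they are not part of the scheme.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict: variables on both rows and columns, correlations in the cells."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Toy data (hypothetical): y rises with x, z falls as x rises.
data = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "z": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(data)

for row_name, row in matrix.items():
    print(row_name, {col: round(value, 2) for col, value in row.items()})
```

A value near +1 indicates a strong positive linear association, a value near -1 a strong negative one, and the diagonal is always 1 because each variable is perfectly correlated with itself.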

Q2 b. Illustrate the implementation of a scatterplot of the correlation between returns for ATT and Verizon, using R and Python. (10 marks: 5+5)
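One possible Python sketch is shown below. The scheme does not include the returns data, so the sketch generates synthetic, positively correlated returns; the function names, the seed, the noise levels, and the output file name are all assumptions. The plotting function needs matplotlib, which is imported lazily so the data step runs without it.

```python
import random


def simulate_returns(n_days=250, seed=1):
    """Synthetic daily returns for two stocks (NOT real market data).

    A shared 'market' component makes the two series positively correlated.
    """
    rng = random.Random(seed)
    att, vz = [], []
    for _ in range(n_days):
        market = rng.gauss(0, 1.0)
        att.append(market + rng.gauss(0, 0.8))
        vz.append(market + rng.gauss(0, 0.8))
    return att, vz


def plot_scatter(x, y, path="att_vz_scatter.png"):
    """Save a scatterplot of y against x; requires matplotlib to be installed."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 4))
    ax.scatter(x, y, s=10, alpha=0.6)
    ax.set_xlabel("ATT")
    ax.set_ylabel("Verizon")
    ax.axhline(0, color="grey", lw=1)  # reference lines through zero return
    ax.axvline(0, color="grey", lw=1)
    fig.savefig(path, dpi=100)
    plt.close(fig)
    return path


att, vz = simulate_returns()
print(len(att), "simulated daily returns per stock")
```

Calling plot_scatter(att, vz) writes the image file. With real data, the two lists would be replaced by the daily return columns for the two stocks.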

Q3 a. Explain random sampling and sample bias, with an example. (10 marks)
Sample: A subset from a larger data set.
Population: The larger data set or idea of a data set.
N (n): The size of the population (sample).
Random sampling: Drawing elements into a sample at random.
Stratified sampling: Dividing the population into strata and randomly sampling from each stratum.
Stratum (pl., strata): A homogeneous subgroup of a population with common characteristics.
Simple random sample: The sample that results from random sampling without stratifying the population.
Bias: Systematic error.
Sample bias: A sample that misrepresents the population.
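The sampling terms above can be sketched with the standard random module. The population of 1,000 members, the urban/rural split, the sample size, and the seed are hypothetical, picked only so that the effect of each sampling method is easy to see.

```python
import random
from collections import defaultdict

rng = random.Random(42)

# Hypothetical population: 900 urban members followed by 100 rural members.
population = [
    {"id": i, "region": "urban" if i < 900 else "rural"} for i in range(1000)
]
N = len(population)  # population size
n = 100              # sample size

# Simple random sample: every member has the same chance of selection.
simple = rng.sample(population, n)

# Stratified sample: split into strata, then sample randomly within each
# stratum in proportion to its share of the population.
strata = defaultdict(list)
for member in population:
    strata[member["region"]].append(member)

stratified = []
for members in strata.values():
    k = round(n * len(members) / N)
    stratified.extend(rng.sample(members, k))

# Sample bias: a convenience sample (the first n members) contains no rural
# members at all, so it misrepresents the population.
biased = population[:n]


def rural_share(sample):
    """Fraction of the sample that is rural."""
    return sum(m["region"] == "rural" for m in sample) / len(sample)


print("population:", rural_share(population))
print("simple random:", rural_share(simple))
print("stratified:", rural_share(stratified))
print("convenience (biased):", rural_share(biased))
```

The stratified sample reproduces the population's rural share by construction, the simple random sample comes close on average, and the convenience sample is systematically off.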

Q3 b. Choose selection bias concepts and their key terms, with an example. (10 marks)
Selection bias: Bias resulting from the way in which observations are selected.
Data snooping: Extensive hunting through data in search of something interesting.
Vast search effect: Bias or nonreproducibility resulting from repeated data modeling, or modeling data with large numbers of predictor variables.
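A coin-flip simulation is one common way to illustrate the vast search effect: if enough people each flip a fair coin 10 times, someone will very likely get 10 heads purely by chance, and picking that person out afterwards is a form of selection bias. The sketch below is an illustration under assumed numbers (20,000 people, 10 flips, an arbitrary seed); the function names are hypothetical.

```python
import random


def prob_at_least_one(n_people, n_flips):
    """Probability that at least one person gets all heads with a fair coin."""
    p_single = 0.5 ** n_flips
    return 1 - (1 - p_single) ** n_people


def best_result_found(n_people, n_flips, seed=0):
    """Simulate the search: return the highest head count among all people."""
    rng = random.Random(seed)
    best = 0
    for _ in range(n_people):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        best = max(best, heads)
    return best


p_one_person = prob_at_least_one(1, 10)
p_crowd = prob_at_least_one(20_000, 10)
best = best_result_found(20_000, 10)

print("one person, 10 heads in 10 flips:", p_one_person)
print("at least one of 20,000 people:", p_crowd)
print("best head count found in the simulated search:", best)
```

For a single person the result would be remarkable; across a vast search it is almost guaranteed, which is why a finding selected after extensive hunting needs confirmation on fresh data.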
Q4 a. Explain regression to the mean, with an example. (10 marks)
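Regression to the mean can be sketched with a small simulation: when an observed score is part stable ability and part luck, the top performers on a first measurement tend to score closer to the average on a second one. The model (score = skill + noise), the sample size, the 10% cut-off, and the seed below are assumptions chosen for illustration.

```python
import random


def simulate_scores(n=10_000, seed=7):
    """Return (first, second) score pairs; each score is skill plus fresh luck."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        skill = rng.gauss(0, 1)           # stable component
        first = skill + rng.gauss(0, 1)   # luck on the first measurement
        second = skill + rng.gauss(0, 1)  # independent luck on the second
        pairs.append((first, second))
    return pairs


pairs = simulate_scores()

# Select the top 10% on the FIRST measurement only.
pairs.sort(key=lambda p: p[0], reverse=True)
top = pairs[: len(pairs) // 10]

top_first_mean = sum(p[0] for p in top) / len(top)
top_second_mean = sum(p[1] for p in top) / len(top)

print("top group, first measurement mean:", round(top_first_mean, 2))
print("same group, second measurement mean:", round(top_second_mean, 2))
```

The selected group stays above the overall mean of zero on the second measurement, because skill persists, but it falls back toward that mean because the luck that helped it into the top group does not repeat.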

Q4 b. Illustrate the implementation of the bootstrap, with code. (10 marks)
Bootstrap steps: 3 marks
R code: 3 marks
Python code: 6 marks
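The usual bootstrap steps are: draw n values from the sample with replacement, compute the statistic on that resample, repeat this many times, and then use the collected estimates to gauge bias, standard error, and a confidence interval. The sketch below follows those steps using only the standard library rather than a resampling helper from a third-party package; the function name bootstrap, the income values, the number of resamples, and the seed are hypothetical.

```python
import random
import statistics


def bootstrap(data, stat=statistics.mean, n_resamples=1000, seed=0):
    """Return the statistic computed on n_resamples bootstrap resamples."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)  # n draws WITH replacement
        estimates.append(stat(resample))
    return estimates


# Hypothetical sample of incomes (illustrative values only).
incomes = [32_000, 41_500, 38_000, 120_000, 56_000, 47_250,
           61_000, 29_900, 75_000, 52_300, 44_000, 68_400]

estimates = bootstrap(incomes, stat=statistics.median)

original = statistics.median(incomes)
bias = statistics.mean(estimates) - original
std_error = statistics.stdev(estimates)

# Approximate 90% percentile interval from the sorted bootstrap estimates.
ordered = sorted(estimates)
ci_90 = (ordered[int(0.05 * len(ordered))], ordered[int(0.95 * len(ordered)) - 1])

print("original median:", original)
print("bias:", round(bias, 1))
print("standard error:", round(std_error, 1))
print("approx. 90% interval:", ci_90)
```

Passing a different function as stat (for example statistics.mean) bootstraps that statistic instead, without changing the resampling loop.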
Approval

Dr Pushpa D
Dr Pushpa D
Faculty Course Coordinator

Dr Pushpa D
DEC Facilitator Academic Coordinator
HOD
