Data Prep Quiz for Analysts

This document is a quiz on data preparation techniques. It covers topics like data quality issues, imputing missing data, outliers, feature selection, and principal component analysis (PCA). The questions ask about identifying data quality issues, replacing missing values, defining outliers, using domain knowledge to address issues, feature selection methods, the goal of feature sets, properties of zero-normalized data, and statements that are not true about PCA.

Uploaded by Mr.Padmanaban V

Quiz 4 - Data Preparation

1. Which of the following is NOT a data quality issue?

☐ Inconsistent data
☐ Scaled data
☐ Missing values
☐ Duplicate data

2. Imputing missing data means to

☐ replace missing values with something reasonable.
☐ drop samples with missing values.
☐ replace missing values with outliers.
☐ merge samples with missing values.
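
As a brief illustration of the terminology in question 2, the sketch below fills in missing entries with the mean of the observed values, using only the Python standard library. The helper name `impute_with_mean` and the sample values are hypothetical, and mean imputation is only one possible strategy (the median or a domain-specific default are others).

```python
from statistics import mean

def impute_with_mean(values):
    """Return a copy of values with each None replaced by the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample data; None marks a missing value.
ages = [20, None, 40, 30, None]
print(impute_with_mean(ages))
```

The original list is left unchanged, so the raw data stays available for comparison.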

3. A data sample with values that are considerably different from the rest of the data samples in the dataset is called a/an _____________.

☐ Outlier
☐ Invalid data
☐ Noise
☐ Inconsistent data
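
One common way to flag values that differ considerably from the rest is the interquartile-range (IQR) rule. The sketch below is a minimal version of that rule, not a definition taken from the quiz: the function name `find_values_outside_iqr_fences`, the multiplier `k = 1.5`, and the sample readings are assumptions chosen for illustration.

```python
from statistics import quantiles

def find_values_outside_iqr_fences(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one value far from the rest.
readings = [10, 12, 11, 13, 12, 11, 95]
print(find_values_outside_iqr_fences(readings))
```

Whether a flagged value should be dropped, corrected, or kept depends on the application, which is where domain knowledge comes in.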

4. Which one of the following examples illustrates the use of domain knowledge to address a data quality issue?

☐ Simply discard the samples that lie significantly outside the distribution of your data
☐ Drop samples with missing values
☐ Merge duplicate records while retaining relevant data
☐ None of these

5. Which of the following is NOT an example of feature selection?

☐ Adding an in-state feature based on an applicant's home state.
☐ Re-formatting an address field into separate street address, city, state, and zip code fields.
☐ Removing a feature with a lot of missing values.
☐ Replacing a missing value with the variable mean.

6. Which one of the following is the best feature set for your analysis?

☐ Feature set with the smallest set of features that best capture the characteristics of the data for the intended application
☐ Feature set with the smallest number of features
☐ Feature set with the largest number of features
☐ Feature set that contains exclusively re-coded features

7. The mean value and the standard deviation of a zero-normalized feature are

☐ mean = 0 and standard deviation = 0
☐ mean = 1 and standard deviation = 0
☐ mean = 0 and standard deviation = 1
☐ mean = 1 and standard deviation = 1
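
Readers who want to check their answer to question 7 empirically can run the sketch below. It assumes that "zero-normalized" means z-score standardization (subtract the mean, divide by the standard deviation); the function name `zero_normalize` and the sample heights are hypothetical.

```python
from statistics import mean, pstdev

def zero_normalize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot normalize a constant feature")
    return [(v - mu) / sigma for v in values]

# Hypothetical feature values (heights in cm).
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
z = zero_normalize(heights)

# Inspect the mean and standard deviation of the normalized feature.
print(round(mean(z), 6), round(pstdev(z), 6))
```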

8. Which of the following is NOT true about PCA?

☐ PCA stands for principal component analysis.
☐ PC1 and PC2, the first and second principal components, respectively, are always orthogonal to each other.
☐ PC1, the first principal component, captures the largest amount of variance in the data along a single dimension.
☐ PCA is a dimensionality reduction technique that removes a feature that is highly correlated with another feature.
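
To make the vocabulary of question 8 concrete, the following sketch computes the two principal components of a small 2-D dataset using only the Python standard library. It relies on the closed-form eigen-decomposition of a symmetric 2x2 covariance matrix, so it does not generalize to more dimensions. The function name `pca_2d` and the sample points are hypothetical.

```python
import math
from statistics import mean

def pca_2d(points):
    """Return [(variance, unit_direction), ...] for 2-D points, largest variance first."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centered data.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    var1, var2 = half_trace + radius, half_trace - radius
    # Direction of maximum variance, and the direction at 90 degrees to it.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))
    return [(var1, pc1), (var2, pc2)]

# Hypothetical data: two strongly correlated features.
data = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]
for variance, direction in pca_2d(data):
    print(round(variance, 4), tuple(round(c, 4) for c in direction))
```

Each returned direction is a weighted combination of both original features; projecting the centered data onto the first direction alone gives a one-dimensional representation.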
