Datanest - Data Science Interview

This document provides information and questions for an interview for a data science internship at Datanest. It includes 10 problems covering topics like data cleansing, analysis, storytelling, structured thinking, solution implementation, computer vision, NLP, Bayesian statistics, frequentist statistics, and feature engineering. Candidates are asked to provide concise answers to the problems in a slide format by the July 1st deadline. The interview is assessing problem solving skills and various data science research capacities.

Uploaded by

Juan Guillermo Ferrer Velando


Data Science Internship

Interview Preparation
Deadline: 1 July 2019
(Please answer in slide format and give your full name on the first slide of your answer deck.)
Notes
The sooner you apply, the sooner you can be interviewed, but please prepare your answers carefully.
Table of Contents

1. Data Cleansing
2. Data Analysis
3. Data Storytelling
4. Structured Thinking
5. Data Solution Implementation
6. Optional Research Capacity : Computer Vision
7. Optional Research Capacity : NLP
8. Optional Research Capacity : Bayesian Statistics
9. Optional Research Capacity : Frequentist Statistics
10. Optional Research Capacity : Feature Engineering
Clue
At Datanest, we believe problem solving is the key to any data science activity.

Please solve the problems effectively and minimize work on creating synthetic data, code, visualizations, etc. Bring the simplest answer that you can defend effectively to both technical and non-technical people.

The problems are not hard, but they require you to be resourceful and to have a strong understanding of the problem.

The kit starts from “Dataset 2”; “Dataset 1” does not exist in this kit.

Optional
There are 5 research capacities that we are assessing:

1. Research Capacity : Computer Vision
2. Research Capacity : NLP
3. Research Capacity : Bayesian Statistics
4. Research Capacity : Frequentist Statistics
5. Research Capacity : Feature Engineering

Your score is determined by the 3 topics with the highest scores; you may work on all 5 of them.
Dataset 2

Here’s a sample of the dataset:
[Sample table for Dataset 2]

Dataset 3
[Figure: two charts, left and right; axis label "Number of Buyers"]
Dataset 4

Phone Number    Status
085674872274    Real
085612341234    Unreal
081243579357    Real
081328648738    Real
081122334455    Unreal
081234567890    Unreal
081726842689    Real
Problem 1: Data Cleansing

1. In Dataset 2, how would you transform the description column to make it easier to analyze?
2. If the `label` column is empty in 10 million rows, what would you do to fill in the missing data?
3. What would you do to deal with abbreviations and misspelled words?
4. How would you deal with imbalanced classes, outliers, and rare data?
Problem 2: Data Analysis

1. What is the difference between bias and variance?
2. How do you know whether one machine learning algorithm is better than another in terms of accuracy, reliability, and scalability?
3. What is the difference between closed-form and non-closed-form solutions?
4. What is the difference between a feature, a parameter, and a variable?
5. What is the difference between survival analysis, time series analysis, classification, recommendation engines, and clustering (in terms of input and output)?
6. What are the differences between hold-out validation, cross-validation, and bootstrapping?
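
As a reference point for question 6, the three validation schemes can be sketched purely in terms of which row indices end up in the training and test sets. This is a minimal illustration, not part of the original kit: the function names, the 25% test fraction, the choice of 5 folds, and the use of out-of-bag rows as the bootstrap test set are all assumptions made for the example.

```python
import random


def hold_out(n, test_fraction=0.25, seed=0):
    """Hold-out: one shuffled split, each row lands in train or test once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * (1 - test_fraction)))
    return idx[:cut], idx[cut:]


def k_fold(n, k=5, seed=0):
    """Cross-validation: k splits, every row is a test row exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


def bootstrap(n, seed=0):
    """Bootstrap: draw n rows with replacement; undrawn rows are out-of-bag."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


if __name__ == "__main__":
    train, test = hold_out(20)
    print("hold-out sizes:", len(train), len(test))
    print("k-fold test sizes:", [len(t) for _, t in k_fold(20)])
    train, test = bootstrap(20)
    print("bootstrap unique train rows:", len(set(train)), "out-of-bag:", len(test))
```

The contrast the sketch highlights is how often each row is reused: once in hold-out, once as a test row per row in k-fold, and possibly several times (or never) in a bootstrap sample.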
Problem 3: Data Storytelling

1. Based on the left chart of Dataset 3 (Slide 7), how many of the people who came in May 2018 were still coming in July 2018?
2. What data is needed to make the charts in Dataset 3?
3. How would you create the left chart in Dataset 3?
4. How would you create the right chart in Dataset 3?
5. If we make a chart based on the data in the left chart, what chart would you need to make?
Problem 4: Structured Thinking

1. Based on Dataset 4, what pattern determines whether a number is real or unreal?
2. Write pseudocode to determine whether a number is real or unreal.

(Clue: you can write more than one pseudocode solution.)
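
One possible reading of Dataset 4, offered only as a sketch: the "Unreal" numbers look hand-typed, containing long ascending digit runs (`1234`) or several doubled digits in a row (`11 22 33`). The kit does not state the intended pattern, so the rule below, the helper names, and both thresholds (`min_run=4`, `min_pairs=3`) are assumptions chosen to fit the seven sample rows.

```python
def longest_ascending_run(digits):
    """Length of the longest run where each digit is the previous one plus 1."""
    best = current = 1
    for prev, cur in zip(digits, digits[1:]):
        current = current + 1 if int(cur) == int(prev) + 1 else 1
        best = max(best, current)
    return best


def doubled_pairs(digits):
    """Count adjacent positions that hold the same digit (e.g. '11', '22')."""
    return sum(a == b for a, b in zip(digits, digits[1:]))


def classify(number, min_run=4, min_pairs=3):
    # Thresholds are assumptions fitted by eye to the seven sample rows.
    if longest_ascending_run(number) >= min_run:
        return "Unreal"
    if doubled_pairs(number) >= min_pairs:
        return "Unreal"
    return "Real"


# The seven rows of Dataset 4, copied from the table.
SAMPLE = {
    "085674872274": "Real",
    "085612341234": "Unreal",
    "081243579357": "Real",
    "081328648738": "Real",
    "081122334455": "Unreal",
    "081234567890": "Unreal",
    "081726842689": "Real",
}

if __name__ == "__main__":
    for number, status in SAMPLE.items():
        print(number, "predicted:", classify(number), "labeled:", status)
```

With only seven labeled rows, a rule like this is easy to overfit; other readings (for example, detecting a repeated 4-digit block) would fit the sample equally well.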

Problem 5: Data Solution Implementation
1. What are the differences between Business Intelligence and Data Science in terms of (a) business questions, (b) analytic characteristics, (c) analytic engagement processes, (d) data models, and (e) business view?
2. List the benefits that a data lake could bring to an organization's existing data warehousing environment, business analysts, and data scientists.
3. What issues prevent companies from migrating to cloud solutions?
4. List the cultural changes that organizations must address if they want to become data-driven and leverage big data to its maximum business potential. What does the organization need in order to address those challenges?
5. Select two outward-facing BI dashboards that can be checked daily or weekly (one example for the retail industry, one for the financial industry). What is the most important insight to display on each?
Problem 6: Computer Vision
1. Describe the steps required to build a proper object detection engine.
2. What is the difference between semantic segmentation, object detection, image generation, and pose estimation in terms of input, output, and label?
3. In YOLO (https://pjreddie.com/darknet/yolo/), there are 5 types of loss function. Can you please explain them?
Problem 7: NLP
1. Explain the differences (pros and cons) between building a chatbot with NLTK, Seq2seq, and the Rasa framework.
2. What are the differences between TF-IDF, cosine similarity, and FastText in terms of text-based feature engineering?
Problem 8: Bayesian Statistics
1. What are the differences between Bayesian and frequentist statistics?
2. What types of Bayesian statistics are available in Ludwig (https://uber.github.io/ludwig/)? Describe their inputs and outputs.
Problem 9: Frequentist Statistics
1. You have multiple ads to experiment with in a campaign. Explain the steps of the experiment using (a) A/B testing and (b) a multi-armed bandit.
2. What is the difference between panel data analysis, longitudinal data analysis, multilevel statistical models, and structural equation modeling in terms of dataset requirements?
Problem 10: Feature Engineering
1. What are the differences between LabelCount, Target, NaN, Polynomial, Consolidation, and Expansion encoding?
2. What are the differences between standard (Z), MinMax, root, and log scaling?
3. Please list 5 feature engineering techniques for address data.
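
For question 2, the four scalings can be compared side by side on a small skewed sample. This is a minimal sketch rather than a reference answer: it assumes the population standard deviation for Z scaling, `log(1 + x)` for log scaling so that zero is allowed, non-negative inputs for the root and log variants, and a non-constant input for the Z and MinMax variants. The function names and sample values are made up for the example.

```python
import math
import statistics


def standard_scale(xs):
    """Z scaling: subtract the mean, divide by the (population) std deviation."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mean) / sd for x in xs]


def min_max_scale(xs):
    """MinMax scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def root_scale(xs):
    """Root scaling: square root, which compresses large values mildly."""
    return [math.sqrt(x) for x in xs]


def log_scale(xs):
    """Log scaling: log(1 + x), which compresses large values strongly."""
    return [math.log1p(x) for x in xs]


if __name__ == "__main__":
    values = [1, 10, 100, 1000]  # made-up, heavily right-skewed sample
    for name, fn in [
        ("standard", standard_scale),
        ("min-max", min_max_scale),
        ("root", root_scale),
        ("log", log_scale),
    ]:
        print(name, [round(v, 3) for v in fn(values)])
```

Z and MinMax are linear, so they change location and spread but not the shape of the distribution; root and log are nonlinear and reduce right skew.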
Closing
Great things happen to those who don't stop believing, trying, learning, and being grateful.
