Examples

Examples for Machine Learning

Uploaded by

harsh.pandeycs.da23

Scenario-Based Questions

Scenario 1: Real Estate Market Analysis

Scenario: A real estate analyst is tasked with predicting house prices in a metropolitan
area based on various features such as square footage, number of bedrooms, age of the
house, and proximity to schools. The analyst collects a dataset containing these features
along with the corresponding sale prices of homes.

Questions:

1. Model Development: Describe the steps you would take to develop a multiple
linear regression model for predicting house prices. What preprocessing steps
would be necessary before fitting the model?
2. Interpretation and Impact: After fitting the model, you find that the coefficient for
square footage is $200. What does this mean in terms of pricing, and how would
you communicate this finding to potential home buyers and sellers? What other
factors could influence the accuracy of your model, and how might you address
them?

Scenario 2: Customer Satisfaction and Retention

Scenario: A telecommunications company is investigating the factors that influence
customer satisfaction and retention. They collect data on customer demographics, service
usage, customer service interactions, and satisfaction ratings on a scale from 1 to 10. The
company wants to use logistic regression to predict whether a customer will remain with
the service provider or switch to a competitor.

Questions:

3. Model Selection: Explain why logistic regression is a suitable choice for this
scenario. What are the dependent and independent variables in your model, and
how would you handle categorical variables in your dataset?
4. Evaluation and Strategy: After building the model, you find that the model predicts
a 70% probability of retention for customers with a high satisfaction rating. How
would you assess the model’s performance? What strategies could the company
implement to improve customer satisfaction based on the findings from your
model?

Scenario 1: Email Spam Detection

Scenario: A tech company is developing a spam detection system for its email service.
They have a dataset containing features extracted from emails, such as the presence of
certain keywords, the length of the email, and the sender's reputation. The goal is to
classify emails as either "spam" or "not spam."

Questions:

5. Model Development: Describe how you would approach building a classification
model for this spam detection system. Which algorithm would you choose (e.g.,
Logistic Regression, Decision Trees, SVM) and why? What steps would you take for
data preprocessing, including feature selection or engineering?
6. Model Evaluation: After training the model, you achieve an accuracy of 85% on the
test set. However, upon further inspection, you notice a high false positive rate.
How would you evaluate the model’s performance more thoroughly? What metrics
(e.g., precision, recall, F1 score) would you consider, and how would you address
the issue of false positives in your system?

Scenario 2: Disease Diagnosis

Scenario: A healthcare organization is developing a predictive model to diagnose a
particular disease based on patient symptoms and medical history. The dataset includes
features such as age, gender, specific symptoms, and test results. The target variable is
binary, indicating whether a patient has the disease (1) or not (0).

Questions:

7. Feature Importance: Which features do you think would be most important for the
model, and how would you determine their significance? What classification
algorithm would you use, and why?
8. Real-World Implications: After deploying the model, you find that the model's
predictions are leading to a high rate of false negatives, meaning some patients with
the disease are being incorrectly diagnosed as healthy. What steps would you take
to improve the model? How would you communicate the importance of accurate
diagnosis to healthcare providers and patients?

Scenario 1: Customer Segmentation

Scenario: A retail company wants to improve its marketing strategies by segmenting its
customer base. They have collected data on customer purchases, demographics, and
online behavior. The goal is to use clustering techniques to identify distinct customer
segments.

Questions:

Clustering Approach: Describe how you would approach the customer segmentation
problem. Which clustering algorithm would you choose (e.g., K-Means, Hierarchical
Clustering, DBSCAN) and why? What steps would you take for data preprocessing,
including feature selection and scaling?

Interpretation and Action: After applying the clustering algorithm, you identify four
distinct customer segments. How would you interpret the characteristics of each
segment? What marketing strategies would you recommend for each group to enhance
customer engagement and sales?

Scenario 2: Image Compression

Scenario: A tech company is developing an image compression algorithm to reduce the
file size of images while maintaining quality. They decide to use clustering techniques to
group similar pixel colors together in the images.

Questions:

9. Algorithm Selection: Which clustering algorithm would you recommend for this
image compression task, and why? How would you determine the optimal number
of clusters to use for effective compression?
10. Evaluation of Results: After implementing the algorithm, you notice that the
compressed images lose some quality. What metrics would you use to evaluate the
quality of the compressed images compared to the original? How would you adjust
your clustering approach to improve the balance between compression and image
quality?

Numerical-Based Questions

Linear Regression:

• Calculation of coefficients using the least squares method.
• Prediction of values based on a linear model.
• Residuals and their analysis.
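
As an illustration of the three linear regression topics above, here is a minimal sketch of a one-feature least squares fit, followed by predictions and residuals. The data values and the helper name `fit_line` are made up for the example and do not come from the question sheet.

```python
def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_line(xs, ys)                      # b0 = 2.2, b1 = 0.6
predictions = [b0 + b1 * x for x in xs]
residuals = [y - p for y, p in zip(ys, predictions)]

print(f"intercept={b0:.2f}, slope={b1:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

When the model includes an intercept, the residuals sum to zero (up to rounding), which is a quick check on a hand calculation.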

Logistic Regression:

• Calculation of odds ratios and probabilities.
• Interpretation of coefficients.
• Confusion matrix metrics (accuracy, precision, recall, F1 score).
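
The odds and probability calculations can be sketched as follows. The intercept and coefficient are hypothetical values chosen for illustration (they are not fitted to any data in this document), and `satisfaction` stands for a rating on the 1 to 10 scale.

```python
import math


def sigmoid(z):
    """Map log-odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted model: log-odds(retained) = intercept + coef * satisfaction
intercept = -4.0
coef_satisfaction = 0.8

satisfaction = 7
log_odds = intercept + coef_satisfaction * satisfaction
odds = math.exp(log_odds)            # odds of retention versus churn
probability = sigmoid(log_odds)      # equals odds / (1 + odds)

# Interpretation of the coefficient: each extra satisfaction point
# multiplies the odds of retention by exp(coef).
odds_ratio = math.exp(coef_satisfaction)

print(f"log-odds={log_odds:.2f}, odds={odds:.2f}, p={probability:.3f}")
print(f"odds ratio per satisfaction point={odds_ratio:.3f}")
```

The odds ratio describes a multiplicative change in odds, not an additive change in probability; the effect on probability depends on the starting point.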

K-Nearest Neighbors (KNN):

• Distance calculations (Euclidean, Manhattan).
• Prediction based on majority voting.
• Effect of different values of k on classification results.
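
A small sketch of the KNN calculations listed above: both distance measures, majority voting, and how the prediction can change with k. The training points, labels, and function names are invented for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, distance=euclidean):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labelled points: (features, label).
train = [
    ((1.0, 1.0), "A"),
    ((2.0, 1.0), "A"),
    ((3.0, 3.0), "B"),
    ((4.0, 3.0), "B"),
    ((4.0, 4.0), "B"),
]
query = (2.0, 2.0)

for k in (1, 3, 5):
    print(k, knn_predict(train, query, k))
```

With this data the query is labelled "A" for k = 1 and k = 3 but "B" for k = 5, because a large k lets the more numerous class outvote the nearest points.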

Data Preprocessing:

• Imputation of missing values (mean, median, mode).
• Calculation of z-scores for outlier detection.
• One-hot encoding calculations.
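
The preprocessing calculations above might look like this in code. All values are hypothetical, the z-score cutoff of 2.0 is an assumption (3.0 is also common but rarely triggers on very small samples), and the population standard deviation is used for simplicity.

```python
import statistics

# --- Imputation: fill a missing age with the mean of the observed values ---
ages = [22, 30, None, 30, 38, 60]
observed = [a for a in ages if a is not None]
mean_age = statistics.mean(observed)      # 36
median_age = statistics.median(observed)  # 30
mode_age = statistics.mode(observed)      # 30
imputed = [mean_age if a is None else a for a in ages]


# --- z-scores: flag values far from the mean ---
def z_scores(values):
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


usage = [10, 12, 11, 13, 12, 50]
outliers = [v for v, z in zip(usage, z_scores(usage)) if abs(z) > 2.0]


# --- One-hot encoding: one 0/1 column per category ---
def one_hot(values):
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


categories, rows = one_hot(["red", "green", "blue", "green"])

print(imputed, outliers, categories, rows)
```

Here the mean (36) and median (30) differ because 60 pulls the mean upward, which is the usual argument for median imputation on skewed features.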

Classification Metrics:

• Calculating confusion matrix elements.
• ROC curve and AUC calculations.
• Evaluation metrics like accuracy, precision, recall, F1 score.
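
To make the metric calculations above concrete, the sketch below counts confusion matrix cells from hypothetical labels and scores, derives the four metrics, and computes AUC as the fraction of positive and negative pairs that the scores rank correctly. The 0.5 threshold and all data are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.55, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)   # 3, 2, 1, 4
accuracy = (tp + tn) / (tp + fp + fn + tn)          # 0.70
precision = tp / (tp + fp)                          # 0.60
recall = tp / (tp + fn)                             # 0.75
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, round(f1, 3), round(auc(y_true, scores), 3))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds.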

Bayesian Methods:

• Application of Bayes' theorem for probability updates.
• Prior and posterior probability calculations.
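
A worked sketch of a Bayesian update, using a diagnostic test as the example. The prevalence, sensitivity, and false positive rate are hypothetical numbers, and the second update assumes the two test results are conditionally independent.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical numbers for a diagnostic test.
prior = 0.01                 # P(disease)
sensitivity = 0.95           # P(positive | disease)
false_positive_rate = 0.05   # P(positive | no disease)

# Update after one positive result.
after_one = posterior(prior, sensitivity, false_positive_rate)

# The posterior becomes the prior for a second positive result
# (assumes the two results are conditionally independent).
after_two = posterior(after_one, sensitivity, false_positive_rate)

print(f"after one positive: {after_one:.3f}")   # about 0.161
print(f"after two positives: {after_two:.3f}")  # about 0.785
```

Even with a fairly accurate test, one positive result leaves the posterior near 16% because the disease is rare in this example.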

Clustering:

• K-means clustering calculations (centroid updates, inertia).
• Silhouette score calculations.
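
The clustering calculations above can be traced by hand on a few points. The sketch below runs one K-means assignment and centroid update, then computes inertia and the mean silhouette score. The points and starting centroids are invented, and the helpers assume every cluster ends up with at least two members.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def update(points, labels, k):
    """New centroid of each cluster = coordinate-wise mean of its members."""
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centroids


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical data and starting centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids = [(1, 1), (8, 8)]

labels = assign(points, centroids)        # [0, 0, 0, 1, 1, 1]
centroids = update(points, labels, k=2)   # (4/3, 4/3) and (25/3, 25/3)

print(labels, centroids)
print(round(inertia(points, labels, centroids), 4), round(silhouette(points, labels), 3))
```

A full K-means run repeats the assign and update steps until the labels stop changing; on this data one pass is already stable.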

Polynomial Regression:

• Fitting polynomial models and calculating coefficients.
• Predictions using polynomial equations.
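
One way to work a polynomial regression problem is to build the normal equations and solve them directly. The sketch below does this in plain Python for a quadratic; the helper names and the data (generated from y = 1 + 2x + 3x^2) are assumptions made for the example, and a numerical library would normally be used instead.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(xs, ys, degree):
    """Least squares coefficients [b0, b1, ..., bd] via (X^T X) b = X^T y."""
    size = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(xtx, xty)


def predict(coeffs, x):
    return sum(c * x ** j for j, c in enumerate(coeffs))


# Hypothetical data generated exactly from y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 6.0, 17.0, 34.0, 57.0]

coeffs = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
print(round(predict(coeffs, 5.0), 6))
```

Because the data lie exactly on a quadratic, the fit recovers the generating coefficients up to rounding; with noisy data the same equations give the least squares compromise.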
