DTC Assignment Unit 3

The document outlines an assignment focused on digital technologies for commerce, covering various machine learning techniques including linear regression for sales prediction, logistic regression for customer churn prediction, K-nearest neighbors for diabetes prediction, and decision trees for calculating information gain. It also includes an evaluation of a binary classification model using a confusion matrix and discusses K-means clustering for patient segmentation. The assignment is authored by Ms. Radhika Mahajan and Ms. Ashima Bhatnagar Bhatia.

Uploaded by Srijan

ASSIGNMENT-1

DIGITAL TECHNOLOGIES FOR COMMERCE


I. Regression
a) Linear Regression
Predict sales from the amount spent on advertising, and plot the fitted regression line together with the data points. The training data is as follows:
Week ID   Advertising Spend ($1000s)   Sales ($1000s)
A         15                           120
B         20                           150
C         18                           140
D         25                           180
E         30                           210
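
One way to approach this task is an ordinary least-squares fit of a straight line to the five weeks of data. The sketch below is illustrative only: it assumes simple one-variable least squares is the intended method, the helper names `fit_line` and `predict_sales` are hypothetical, and the query value of 22 is an arbitrary example that does not come from the assignment.

```python
# Ordinary least squares with one predictor, standard library only.
spend = [15, 20, 18, 25, 30]        # advertising spend ($1000s), weeks A-E
sales = [120, 150, 140, 180, 210]   # sales ($1000s), weeks A-E


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(spend, sales)


def predict_sales(x):
    """Predicted sales ($1000s) for an advertising spend of x ($1000s)."""
    return slope * x + intercept


print(f"sales = {slope:.3f} * spend + {intercept:.3f}")
# 22 is a hypothetical query value, chosen only to show how a prediction is made.
print(f"predicted sales at spend 22: {predict_sales(22):.1f}")
```

For the plot, the points `(spend, sales)` and the line through `predict_sales(min(spend))` and `predict_sales(max(spend))` can be drawn with any plotting tool.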

II. Classification
a) Logistic Regression
A telecom company wants to predict the likelihood that a customer will churn (leave the company) based on
their monthly spending.
Given: P(Churn = 1 | Spending) = 1 / (1 + e^(-(-3 + 0.1 * Spending)))
Give the predicted churn probabilities for monthly spending of 50, 100 and 20. Also, plot the resulting probability curve against spending.
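
The given formula can be evaluated directly. The sketch below plugs the three spending values into it; the function name `churn_probability` is hypothetical, and the 0.5 cut-off used to turn a probability into a churn/stay label is an assumption, since the assignment does not state a threshold.

```python
import math


def churn_probability(spending):
    """P(Churn = 1 | Spending) for the logistic model given in the assignment."""
    z = -3 + 0.1 * spending          # linear score (log-odds)
    return 1 / (1 + math.exp(-z))    # sigmoid maps the score into (0, 1)


for s in (50, 100, 20):
    p = churn_probability(s)
    # Assumed decision rule: predict churn when the probability is at least 0.5.
    label = "churn" if p >= 0.5 else "stay"
    print(f"spending={s:>3}  P(churn)={p:.4f}  ->  {label}")
```

The probabilities come out at roughly 0.88, 0.999 and 0.27 for spending of 50, 100 and 20 respectively. The curve crosses 0.5 at a spending of 30, where the linear score is zero.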

b) K-Nearest Neighbors (KNN)


A doctor wants to predict whether a patient with a glucose level of 160 and a BMI of 25 will have diabetes, based on the past medical data given below:
Patient ID   Glucose Level   BMI   Outcome (Diabetes: Yes/No)
A            180             28    Yes
B            150             24    No
C            200             30    Yes
D            120             22    No
E            170             26    Yes
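
A possible KNN walkthrough for this data is sketched below. The assignment does not state the value of k or the distance measure, so k = 3 and Euclidean distance on the raw (unscaled) features are assumptions made here, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter

# (patient ID, glucose level, BMI, diabetes outcome) from the table above.
patients = [
    ("A", 180, 28, "Yes"),
    ("B", 150, 24, "No"),
    ("C", 200, 30, "Yes"),
    ("D", 120, 22, "No"),
    ("E", 170, 26, "Yes"),
]


def knn_predict(query, data, k=3):
    """Majority vote among the k rows closest to query (Euclidean distance)."""
    ranked = sorted(data, key=lambda row: math.dist(query, (row[1], row[2])))
    nearest = ranked[:k]
    votes = Counter(row[3] for row in nearest)
    return votes.most_common(1)[0][0], [row[0] for row in nearest]


# New patient: glucose 160, BMI 25. k=3 is an assumed choice.
prediction, neighbours = knn_predict((160, 25), patients, k=3)
print("nearest patients:", neighbours)
print("predicted diabetes:", prediction)
```

With these assumptions the three nearest patients are B, E and A, giving two "Yes" votes against one "No". Patients B and E are exactly the same distance from the query, so k = 1 would produce a tie. Because glucose values are much larger than BMI values, glucose dominates the unscaled distance.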

c) Decision Trees
Calculate the information gain for Pitch with respect to the target variable, Winner of the cricket match, for the dataset given below:
Pitch   Format   Winner
Spin    T20      Pakistan
Spin    T20      India
Fast    ODI      India
Spin    ODI      India
Fast    T20      Pakistan
Fast    ODI      India
Spin    ODI      Pakistan
Fast    T20      India
Fast    ODI      India
Spin    ODI      Pakistan
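
Information gain is the entropy of the target before the split minus the weighted average entropy after splitting on the attribute. The sketch below computes it for Pitch using base-2 entropy, which is the usual convention but is an assumption here; the helper names `entropy` and `information_gain` are hypothetical.

```python
import math
from collections import Counter

# (pitch, format, winner) rows from the table above.
matches = [
    ("Spin", "T20", "Pakistan"),
    ("Spin", "T20", "India"),
    ("Fast", "ODI", "India"),
    ("Spin", "ODI", "India"),
    ("Fast", "T20", "Pakistan"),
    ("Fast", "ODI", "India"),
    ("Spin", "ODI", "Pakistan"),
    ("Fast", "T20", "India"),
    ("Fast", "ODI", "India"),
    ("Spin", "ODI", "Pakistan"),
]


def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index, target_index=2):
    """Parent entropy minus the size-weighted entropy of each attribute value."""
    parent = entropy([r[target_index] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[target_index])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


parent_entropy = entropy([r[2] for r in matches])
gain_pitch = information_gain(matches, 0)
print(f"H(Winner) = {parent_entropy:.4f} bits")
print(f"Gain(Winner, Pitch) = {gain_pitch:.4f} bits")
```

The overall split is 6 India to 4 Pakistan (about 0.971 bits). Spin pitches split 2 to 3 and Fast pitches split 4 to 1, which gives an information gain of about 0.125 bits for Pitch.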

Authors: Ms. Radhika Mahajan, Ms. Ashima Bhatnagar Bhatia

III. Confusion Matrix
A binary classification model predicts whether a loan application is fraudulent (Positive) or not (Negative).
We tested it on 100 cases, and the confusion matrix is:
Actual / Predicted     Fraudulent (P)   Not Fraudulent (N)
Fraudulent (P)         30 (TP)          10 (FN)
Not Fraudulent (N)     5 (FP)           55 (TN)

Evaluate the performance of the model. Also, plot the ROC curve and report the AUC for the same.
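
The usual evaluation metrics follow directly from the four cell counts. The sketch below computes them. Note that a single confusion matrix gives only one operating point on the ROC curve, so the AUC here is an approximation obtained by joining (0, 0), (FPR, TPR) and (1, 1) with straight lines; that construction is an assumption, not something the assignment specifies.

```python
# Cell counts from the confusion matrix above.
tp, fn, fp, tn = 30, 10, 5, 55
total = tp + fn + fp + tn

accuracy = (tp + tn) / total          # share of all cases classified correctly
precision = tp / (tp + fp)            # share of predicted frauds that are real
recall = tp / (tp + fn)               # true positive rate (sensitivity)
specificity = tn / (tn + fp)          # true negative rate
fpr = fp / (fp + tn)                  # false positive rate = 1 - specificity
f1 = 2 * precision * recall / (precision + recall)

# Assumed single-point ROC: (0,0) -> (FPR, TPR) -> (1,1), area by trapezoids.
roc_points = [(0.0, 0.0), (fpr, recall), (1.0, 1.0)]
auc = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(roc_points, roc_points[1:])
)

for name, value in [
    ("accuracy", accuracy), ("precision", precision), ("recall", recall),
    ("specificity", specificity), ("F1", f1), ("approx. AUC", auc),
]:
    print(f"{name:<12} {value:.4f}")
```

This gives an accuracy of 0.85, precision of about 0.857, recall of 0.75, specificity of about 0.917, F1 of 0.80 and an approximate AUC of about 0.833. A full ROC curve would need the model's predicted scores at several thresholds, which the assignment does not provide.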

IV. Unsupervised Learning
a) K-Means Clustering
Segment patients into Low-Risk & High-Risk Groups based on the data below.
Patient   Age   BMI
A         25    22
B         45    30
C         55    35
D         30    24
E         50    32
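
A minimal k-means sketch for this data, with k = 2, is shown below. Several choices are assumptions not stated in the assignment: the initial centroids are taken to be patients A and C, the features are used unscaled with Euclidean distance, and the cluster with the lower mean BMI is the one labelled Low-Risk. The function name `kmeans` is hypothetical.

```python
import math

# Patient -> (age, BMI) from the table above.
patients = {"A": (25, 22), "B": (45, 30), "C": (55, 35), "D": (30, 24), "E": (50, 32)}


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = list(centroids)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = {
            name: min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


# Assumed starting centroids: patients A and C.
assignment, centroids = kmeans(patients, [patients["A"], patients["C"]])

# Assumed labelling rule: the cluster with the lower mean BMI is "Low-Risk".
low_cluster = min(range(len(centroids)), key=lambda i: centroids[i][1])
low_risk = sorted(n for n, c in assignment.items() if c == low_cluster)
high_risk = sorted(n for n, c in assignment.items() if c != low_cluster)
print("Low-Risk:", low_risk)
print("High-Risk:", high_risk)
```

Under these assumptions the algorithm converges after one update, placing patients A and D in the Low-Risk group and patients B, C and E in the High-Risk group. A different initialisation could in principle lead to a different result.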
