Section 1 - Introduction To Regression

The document outlines exercises related to regression analysis using the Advertisement dataset, including simple data plotting, kNN regression, and finding the best k value. It provides step-by-step instructions for creating scatter plots, applying the kNN algorithm, and evaluating model performance using mean squared error. Additionally, it includes hints and references to relevant documentation for various functions used in the exercises.

Uploaded by

oracledba1963

Section 1 - Introduction to Regression

Exercise: Simple Data Plotting

The aim of this exercise is to plot TV advertising budget against sales using the Advertisement
dataset. The plot should look similar to the graph given below.

Instructions:
Read the Advertisement data and view the top rows of the dataframe to get an understanding of the data and the columns.
Select the first 7 observations and the columns TV and sales to make a new data frame.
Create a scatter plot of TV budget vs sales from the new data frame.
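
The three steps above can be sketched as follows. This is only an illustration, not the graded solution: the CSV text is a made-up stand-in for the Advertisement file (the real file name and values are not given here), and the column names TV and sales are taken from the instructions.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the Advertisement file (NOT the real data).
# With the real file, pass its path to pd.read_csv instead.
csv_text = """TV,radio,sales
10,5,5.0
50,12,8.0
90,30,9.5
130,8,12.0
170,22,14.5
210,40,17.0
250,15,19.0
290,33,21.0
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.head())  # first 5 rows by default

# First 7 observations, columns TV and sales only.
df_new = df.iloc[:7][["TV", "sales"]]

# Scatter plot of TV budget vs sales with axis labels and a title.
plt.scatter(df_new["TV"], df_new["sales"])
plt.xlabel("TV budget")
plt.ylabel("Sales")
plt.title("TV budget vs sales")
# plt.show()  # uncomment to display the figure interactively
```

Selecting rows with df.iloc[:7] and then the two columns by name keeps the row selection positional, as the df.iloc[] hint suggests, while the column selection stays readable.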

Hints:
The following are direct links to documentation for some of the functions used in this
exercise. As always, if you are unsure how to use a function, refer to its documentation.

pd.read_csv(filename)

Returns a pandas dataframe containing the data and column labels read from the file.

df.iloc[]

Returns the subset of the dataframe selected by integer position, here the row range passed as the argument.

df.head()

Returns the first 5 rows of the dataframe (by default) along with the column names.

plt.scatter()

A scatter plot of y vs. x with varying marker size and/or color

plt.xlabel()

This is used to specify the text to be displayed as the label for the x-axis

plt.ylabel()

This is used to specify the text to be displayed as the label for the y-axis

plt.title()

This is used to specify the title to be displayed for the plot

Note: This exercise is auto-graded and you can try multiple attempts. See the Programming
Assignments tab in the New to edX? section of the Introduction for more information about
Automated edTests.

Exercise: Simple kNN Regression

The goal of this exercise is to re-create the plots given below. You will have come across these
graphs in the lecture as well.

Instructions:
Part 1: kNN by hand for k=1

Read the Advertisement data.
Get a subset of the data from row 5 to row 13.
Apply the kNN algorithm by hand and plot the first graph as given above.
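
One possible way to do kNN "by hand" is sketched below, using np.argsort from the hints to find the closest training points. The arrays are made-up stand-ins for the TV and sales values in rows 5 to 13, and the function name knn_predict_by_hand is hypothetical, not part of the exercise scaffold.

```python
import numpy as np

# Made-up stand-ins for the nine (TV, sales) pairs in the subset.
x_true = np.array([130.0, 10.0, 250.0, 70.0, 190.0, 40.0, 220.0, 100.0, 160.0])
y_true = np.array([11.0, 6.0, 20.0, 8.0, 17.5, 9.5, 16.0, 12.5, 15.0])


def knn_predict_by_hand(x_train, y_train, x_query, k=1):
    """Predict each query point as the mean target of its k nearest training points."""
    y_pred = np.zeros(len(x_query))
    for i, xq in enumerate(x_query):
        distances = np.abs(x_train - xq)     # 1-D distance to every training point
        nearest = np.argsort(distances)[:k]  # indices of the k closest points
        y_pred[i] = y_train[nearest].mean()
    return y_pred


# Evaluate on a dense grid between the smallest and largest x value;
# plotting x_grid against y_grid with plt.plot gives a step-like curve for k=1.
x_grid = np.linspace(np.min(x_true), np.max(x_true), 100)
y_grid = knn_predict_by_hand(x_true, y_true, x_grid, k=1)
```

For k=1 the mean is taken over a single neighbor, so the prediction is simply the sales value of the closest TV budget.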

Part 2: Using sklearn package

Read the Advertisement dataset.
Split the data into train and test sets using the train_test_split() function.
Set k_list as the possible k values ranging from 1 to 70.
For each value of k in k_list:
    Use sklearn KNeighborsRegressor() to fit the train data.
    Predict on the test data.
Use the helper code to get the second plot above for k=1,10,70.
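
A minimal sketch of the Part 2 loop is shown below. It runs on synthetic data rather than the Advertisement file, and the split ratio and random seeds are assumptions chosen for illustration; the exercise scaffold may prescribe different values.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the TV (predictor) and sales (response) columns.
rng = np.random.default_rng(0)
x = rng.uniform(0, 300, size=200).reshape(-1, 1)  # sklearn expects a 2-D X
y = 7 + 0.05 * x.ravel() + rng.normal(0, 2, size=200)

# Assumed split: 60% train, 40% test, fixed seed for repeatability.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.6, random_state=42
)

# k values from 1 to 70; the train set must hold at least 70 points.
k_list = range(1, 71)

predictions = {}
for k in k_list:
    model = KNeighborsRegressor(n_neighbors=k)
    model.fit(x_train, y_train)
    predictions[k] = model.predict(x_test)

# predictions[1], predictions[10] and predictions[70] are the curves
# that the helper code would plot.
```

Keeping the predictions in a dictionary keyed by k makes it easy to pull out only the three k values needed for the plot.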

Hints:
The following are direct links to documentation for some of the functions used in this
exercise. As always, if you are unsure how to use a function, refer to its documentation.

np.argsort()

Returns the indices that would sort an array.

df.iloc[]

Returns a subset of the dataframe that is contained in the column range passed as the argument.

plt.plot( )

Plot y versus x as lines and/or markers.

df.values

Returns a Numpy representation of the DataFrame.

df.idxmin()

Returns index of the ﬁrst occurrence of minimum over requested axis.

np.min()

Returns the minimum along a given axis.

np.max()

Returns the maximum along a given axis.

model.fit( )

Fit the k-nearest neighbors regressor from the training dataset.

model.predict( )

Predict the target for the provided data.

np.zeros()

Returns a new array of given shape and type, filled with zeros.

train_test_split(X,y)

Split arrays or matrices into random train and test subsets.

np.linspace()

Returns evenly spaced numbers over a specified interval.

KNeighborsRegressor(n_neighbors=k_value)

Regression based on k-nearest neighbors.
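
Two of the helpers listed above are easy to misuse, so a short sketch may help. np.linspace returns floats unless a dtype is given, while n_neighbors is expected to be an integer; and idxmin returns an index label, not a position. The values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Integer k values 1, 2, ..., 70 (70 evenly spaced points between 1 and 70).
k_list = np.linspace(1, 70, 70, dtype=int)

# idxmin as an alternative to np.argsort for locating the single nearest
# neighbor of a query point. The index labels 5..8 mimic a dataframe subset.
tv = pd.Series([130.0, 10.0, 250.0, 70.0], index=[5, 6, 7, 8])
query = 65.0
nearest_label = (tv - query).abs().idxmin()  # label of the closest TV value
nearest_tv = tv.loc[nearest_label]           # use .loc, since this is a label
```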

Note: This exercise is auto-graded, so please remember to set all the parameters to the values
mentioned in the scaffold before marking. See the Programming Assignments tab in the New to
edX? section of the Introduction for more information about Automated edTests.

Exercise: Finding the Best k in kNN Regression

The goal here is to find the value of k of the best performing model based on the test MSE.

Instructions:
Read the data into a Pandas dataframe object.
Select the sales column as the response variable and the TV budget column as the predictor variable.
Make a train-test split using sklearn.model_selection.train_test_split.
Create a list of integer k values using numpy.linspace.
For each value of k:
    Fit a kNN regression on the train set.
    Calculate the MSE on the test set and store it.
Plot the test MSE values for each k.
Find the k value associated with the lowest test MSE.
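
The search described above could look roughly like this. The data is synthetic, the split ratio and seeds are assumptions, and the dictionary name knn_dict is hypothetical; treat it as a sketch of the procedure rather than the expected submission.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the TV (predictor) and sales (response) columns.
rng = np.random.default_rng(1)
x = rng.uniform(0, 300, size=200).reshape(-1, 1)
y = 7 + 0.05 * x.ravel() + rng.normal(0, 2, size=200)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.6, random_state=0
)

# Integer k values from 1 to 70.
k_list = np.linspace(1, 70, 70, dtype=int)

# Map each k to the test MSE of the model fitted with that k.
knn_dict = {}
for k in k_list:
    model = KNeighborsRegressor(n_neighbors=int(k))
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    knn_dict[int(k)] = mean_squared_error(y_test, y_pred)

# The best k is the key whose stored MSE is smallest.
best_k = min(knn_dict, key=knn_dict.get)
print(f"Best k = {best_k}, test MSE = {knn_dict[best_k]:.3f}")
```

Note that the best k found this way depends on the particular train-test split, so a different random_state may give a different answer.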

Hints:
The following are direct links to documentation for some of the functions used in this
exercise. As always, if you are unsure how to use a function, refer to its documentation.

train_test_split(X,y)

Split arrays or matrices into random train and test subsets.

np.linspace()

Returns evenly spaced numbers over a specified interval.

KNeighborsRegressor(n_neighbors=k_value)

Regression based on k-nearest neighbors.

model.predict()

Predict the target for the provided data.

mean_squared_error()

Computes the mean squared error regression loss.

dict.keys()

Returns a view object that displays a list of all the keys in the dictionary.

dict.values()

Returns a view object that displays all the values in the dictionary.

plt.plot()

Plot y versus x as lines and/or markers.

dict.items()

Returns a view object that displays the dictionary's (key, value) tuple pairs.
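
As a small illustration of the dictionary helpers listed above, the sketch below plots stored MSE values against k and picks the best pair. The numbers in mse_by_k are made up purely to show the calls and are not results from the dataset.

```python
import matplotlib.pyplot as plt

# Made-up test MSE values keyed by k (illustrative only).
mse_by_k = {1: 9.8, 5: 5.1, 10: 4.6, 30: 5.4, 70: 8.9}

# keys() and values() return views; wrap them in list() for plotting.
plt.plot(list(mse_by_k.keys()), list(mse_by_k.values()), "o-")
plt.xlabel("k")
plt.ylabel("Test MSE")

# items() yields (key, value) pairs; take the pair with the smallest value.
best_k, best_mse = min(mse_by_k.items(), key=lambda kv: kv[1])
```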

Note: This exercise is auto-graded and you can try multiple attempts. See the Programming
Assignments tab in the New to edX? section of the Introduction for more information about
Automated edTests.
