Part I: Written Exercises: Homework 3 Submit On NYU Classes by Fri. Oct. 20 at Noon

This document outlines Homework 3 for the CS6923 Machine Learning course at NYU, detailing submission guidelines and collaboration rules. It consists of two parts: written exercises focusing on regression and classification problems, and programming exercises that involve implementing models in Python or Matlab. The homework requires students to analyze datasets, compute coefficients, and evaluate model performance while adhering to specific formatting and submission instructions.

CS6923 Machine Learning, Fall 2017

Prof. Hellerstein, NYU School of Engineering

Homework 3

Submit on NYU Classes by Fri. Oct. 20 at noon. You may work together
with one other person on this homework. If you do that, hand in JUST ONE
homework for the two of you, with both of your names on it. You may *discuss*
this homework with other students but YOU MAY NOT SHARE WRITTEN
ANSWERS OR CODE WITH ANYONE BUT YOUR PARTNER.
IMPORTANT SUBMISSION INSTRUCTIONS: Please submit your
solutions in 3 separate files: one file for your written answers to Part I, one file
for your written answers/output for the questions in Part II, and one file with
your code (a zip file if your code requires more than one file).

Part I: Written Exercises

1. A movie company wants to predict which of its new movies will be
successful. It has data about the previous films it has released.

(a) Suppose we want to formulate this as a regression problem. What
value should be predicted? It should measure success in a meaningful
way, and be something that could be computed automatically
from data (rather than by asking a person). There is more than one
possible correct answer to this question.
(b) Suppose we want to predict based on only one input attribute, where
the attribute is real-valued. Describe an attribute (in English) that
might be helpful in making our predictions.
(c) Now consider the problem of training a predictor that uses only the
one attribute you have described. Do you think that constructing a
linear model (i.e., fitting a line) would be a reasonable choice? Why
or why not? If so, do you think the slope of the learned line would
be positive or negative? Explain your answer.

2. Consider the following training set

x r
3 5
2 9
7 10
1 4

(a) Using the closed-form formula we covered in class, compute the
coefficients of the line $y = w_1 x + w_0$ that minimizes squared error.
(b) Using the following formula for squared error, compute the squared
error of your line on the above training set.

$$E = \sum_{t=1}^{N} (y^t - r^t)^2$$

where $y^t$ is the predicted value for example $x^t$.
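
One common closed form for fitting a line to a single attribute is w1 = (sum of (x - mean x)(r - mean r)) / (sum of (x - mean x)^2), with w0 = mean r - w1 * mean x. It may be written differently from the version covered in class. The following is a minimal Python sketch of that form together with the squared error E. It runs on illustrative data, not the training set above, and the names fit_line and squared_error are hypothetical.

```python
def fit_line(xs, rs):
    """Return (w1, w0) minimizing sum_t (w1*x_t + w0 - r_t)^2."""
    n = len(xs)
    x_mean = sum(xs) / n
    r_mean = sum(rs) / n
    # Slope: co-variation of x and r divided by the variation of x.
    s_xr = sum((x - x_mean) * (r - r_mean) for x, r in zip(xs, rs))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    w1 = s_xr / s_xx
    # Intercept: the fitted line passes through the point of means.
    w0 = r_mean - w1 * x_mean
    return w1, w0


def squared_error(xs, rs, w1, w0):
    """E = sum_t (y_t - r_t)^2 with y_t = w1*x_t + w0."""
    return sum((w1 * x + w0 - r) ** 2 for x, r in zip(xs, rs))


# Illustrative data (an assumption, not the homework set): points on r = 2x + 1.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_r = [1.0, 3.0, 5.0, 7.0]
w1, w0 = fit_line(toy_x, toy_r)
toy_error = squared_error(toy_x, toy_r, w1, w0)
print(w1, w0, toy_error)
```

Because the toy points lie exactly on a line, the fitted line recovers it and the squared error is zero. On real data the error is generally positive.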

3. A researcher studying a certain radioactive isotope believes that its mass
decays exponentially with time, according to the following model (formula):

$$m(x) \approx \alpha e^{-\beta x}$$

where x is the time variable, α and β are constants, and m(x) is the mass
at time x.
The researcher wants to learn the constants α and β from training data.
Because the above model is non-linear, the researcher cannot use linear
regression directly on the training data.

(a) Taking logs, show that you can transform the above model into one
that is linear in x.
(b) Give a formula for computing the values of α and β from a training
set $\{x^t, r^t\}_{t=1}^{N}$ so as to minimize the squared error between
the logs of the predicted values $y^t$ and the given values $r^t$, as given
in the following expression: $\sum_t (\log y^t - \log r^t)^2$.
(c) Write a few lines of Python or Matlab code that would compute the
values of α and β, according to the formula you just specified. (You
will use this in Part 2.)
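
One possible shape for such code is sketched below, under two assumptions: every target value is positive, so its log is defined, and an ordinary least-squares line fit on (x, log r) is used, with the slope read as -β and the intercept as log α. The function name fit_exponential and the synthetic constants are hypothetical and serve only to check the sketch.

```python
import math


def fit_exponential(xs, rs):
    """Fit r ~ alpha * exp(-beta * x) by least squares on log r."""
    # Taking logs gives log r ~ log(alpha) - beta * x, which is a line in x.
    logs = [math.log(r) for r in rs]  # assumes every r > 0
    n = len(xs)
    x_mean = sum(xs) / n
    l_mean = sum(logs) / n
    slope = (sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, logs))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = l_mean - slope * x_mean
    # Undo the transformation: intercept = log(alpha), slope = -beta.
    return math.exp(intercept), -slope


# Synthetic check with made-up constants alpha = 2 and beta = 0.5.
demo_x = [0.0, 1.0, 2.0, 3.0]
demo_r = [2.0 * math.exp(-0.5 * x) for x in demo_x]
alpha, beta = fit_exponential(demo_x, demo_r)
print(alpha, beta)
```

On noise-free synthetic data the sketch recovers the constants up to floating-point error. On real data it minimizes the error between logs, which is not the same as the squared error on the original scale.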

4. Consider a classification problem with real valued attributes, and two
classes, $C_1$ and $C_2$.
If we apply logistic regression to this problem, we learn an expression for
$P[C_1 | x]$ of the form $f(x) = \frac{1}{1 + e^{-(w^T x + w_0)}}$, where x
and w are d-dimensional vectors. Usually, we predict class $C_1$ for x if
$f(x) \ge 0.5$. This is true iff $w^T x + w_0 \ge 0$. This is a linear
function of x. Therefore, our prediction uses the linear discriminant
function $g(x) = w^T x + w_0 \ge 0$.
If we want to avoid false positives, we might instead predict class $C_1$ if
$f(x) \ge .75$. Which linear discriminant function would be used for
prediction in this case?
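
The equivalence stated above for the usual 0.5 threshold can be checked in a few steps. The sketch below covers only that case and writes z as shorthand for w^T x + w0 (the symbol z is introduced here and is not part of the assignment).

```latex
\begin{align*}
f(x) \ge \tfrac{1}{2}
  &\iff \frac{1}{1 + e^{-z}} \ge \frac{1}{2}, \qquad z = w^T x + w_0 \\
  &\iff 1 + e^{-z} \le 2 \qquad \text{(both denominators are positive)} \\
  &\iff e^{-z} \le 1 \\
  &\iff -z \le 0 \qquad \text{(log is increasing)} \\
  &\iff w^T x + w_0 \ge 0 .
\end{align*}
```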

Part 2: Programming Exercises

5. Write a Python or Matlab program that reads in a csv file called
censusdata.csv, which is a regression dataset. The file has two columns. The first
column corresponds to the first attribute, year. The second corresponds
to population. Each row is an example, giving the population (in millions)
in the stated year (starting from year 1790).
The program should fit a model of the form $y = \alpha e^{-\beta x}$ where x
is the year, and y is the predicted population, by finding the α and β that
minimize $\sum_t (\log y^t - \log r^t)^2$. To do this, use the Python or
Matlab lines of code you wrote for this purpose in Part I.
Your program should output the values of α and β, the actual sum squared
error $\sum_t (y^t - r^t)^2$, the value of $\sum_t (\log y^t - \log r^t)^2$,
and a plot of the data with the learned curve.
Notes: The α and β that you computed were chosen to minimize
$\sum_t (\log y^t - \log r^t)^2$, and may not minimize
$\sum_t (y^t - r^t)^2$. (In fact, the value you obtain for
$\sum_t (y^t - r^t)^2$ may be very large.) Nevertheless, we hope the computed
α and β will produce a curve that fits the original data well.
In MATLAB, if the training data is in column vectors x and r, and the
predicted values for x are in column vector y, the plot can be formed using
the commands:

scatter(x,r)
hold on
plot(x,y)
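
For Python, a rough equivalent of the MATLAB commands above can be sketched with matplotlib. This assumes matplotlib is installed; the function name plot_fit, the output file name, and the axis labels are hypothetical choices, not part of the assignment.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def plot_fit(x, r, y, out_path="fit.png"):
    """Scatter the training data (x, r) and overlay the predictions (x, y)."""
    fig, ax = plt.subplots()
    ax.scatter(x, r, label="training data")  # like scatter(x,r)
    ax.plot(x, y, label="learned curve")     # like plot(x,y) after hold on
    ax.set_xlabel("x")
    ax.set_ylabel("r")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

Calling plot_fit(x, r, y) with three equal-length sequences writes the figure to a file. To show it in a window instead, drop the matplotlib.use("Agg") line and call plt.show() in place of fig.savefig.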

6. Repeat the previous exercise, but this time, subtract 1790 from each entry
in the year column (the first column) so that the years begin with 0. Again,
output the values of α and β, the actual sum squared error
$\sum_t (y^t - r^t)^2$, the value of $\sum_t (\log y^t - \log r^t)^2$, and a
plot of the data with the learned curve.
Did this improve the fit? Explain your answer.
7. In Matlab and Python, there are tools that allow you to fit polynomials to
data, to minimize squared error. For example, in Matlab there is polyfit,
and in numpy there is numpy.polyfit. Use one of these two and try and
fit polynomials of degree 1 and 2 to the data in censusdata.csv. Show the
plots of the two resulting curves on the same graph.
(For Matlab, see https://www.mathworks.com/help/matlab/ref/polyfit.html)
Then try to fit a curve of degree 3. Report what happens.
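
As a usage sketch for numpy.polyfit, the snippet below fits degree 1 and degree 2 polynomials and evaluates them with numpy.polyval. The data is illustrative (exact samples of a made-up quadratic), not censusdata.csv, so the numbers say nothing about the census fit.

```python
import numpy as np

# Illustrative data (an assumption): exact samples of r = x^2 + 2x + 3.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
r = x**2 + 2.0 * x + 3.0

# polyfit returns coefficients with the highest power first.
coef1 = np.polyfit(x, r, 1)  # [slope, intercept]
coef2 = np.polyfit(x, r, 2)  # [a2, a1, a0]

# polyval evaluates a coefficient vector at the given points.
y1 = np.polyval(coef1, x)
y2 = np.polyval(coef2, x)

# Sum squared error of each fit on the same data.
sse1 = float(np.sum((y1 - r) ** 2))
sse2 = float(np.sum((y2 - r) ** 2))
print(coef1, sse1)
print(coef2, sse2)
```

The predicted arrays y1 and y2 can both be passed to plotting calls on the same axes to show the two curves on one graph.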
8. Suppose you are given a choice of predicting future population growth by
using one of the curves that you have learned above. Which one would you
choose and why? Do you think you have enough information to make a
good choice? You can answer this question either by giving an explanation
in English, or by giving some experimental evidence, or both.
