Assignment 2

The document outlines Assignment 2 for the EE708 course, focusing on regression analysis and modeling. It includes various questions related to bias and variance, linear regression calculations, and programming tasks involving linear and logistic regression. Students are required to perform calculations, derive models, and implement regression techniques using provided datasets.

Uploaded by

Pankaj SIngh

EE708: Fundamentals of Data Science and Machine Intelligence

Assignment 2
Based on Module 3: Regression analysis and modeling

1. When the training set is small, the contribution of variance to error may be more than that of bias,
and in such a case, we may prefer a simple model even though we know it is too simple for the task.
Can you give an example?
2. What is the effect of changing 𝜆 on bias and variance?
$E = \sum_t \left[ r^t - g(x^t \mid w) \right]^2 + \lambda \sum_i w_i^2$

3. On average, do people gain weight as they age? Based on a dataset of 250 samples, some summary
statistics for both age (𝑥) and weight (𝑦) are:

$\sum x_i = 11211.00$, $\sum y_i = 44520.80$, $\sum x_i y_i = 1996904.15$

$\sum x_i^2 = 543503.00$, $\sum y_i^2 = 8110405.02$

Assume that the two variables are related according to the simple linear regression model.
a. Calculate the least squares estimates of the slope and intercept.
b. Use the equation of the fitted line to predict the weight that would be observed, on average, for
a man who is 25 years old.
c. Suppose that the observed weight of a 25-year-old man is 170 lbs. Find the residual for that
observation.
d. Was the prediction for the 25-year-old in part (c) an overestimate or underestimate? Explain
briefly.
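
As a minimal sketch of how the summary statistics in question 3 feed the least squares formulas, the snippet below computes the slope and intercept from the five sums and then the fitted value and residual at age 25. The function name fit_from_sums and the variable names are illustrative assumptions, not part of the assignment.

```python
def fit_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Least squares slope and intercept from summary sums (hypothetical helper)."""
    s_xx = sum_x2 - sum_x ** 2 / n        # corrected sum of squares of x
    s_xy = sum_xy - sum_x * sum_y / n     # corrected sum of cross-products
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


# Summary statistics quoted in question 3 (n = 250).
slope, intercept = fit_from_sums(250, 11211.00, 44520.80, 1996904.15, 543503.00)

predicted = intercept + slope * 25    # fitted mean weight at age 25
residual = 170 - predicted            # observed minus fitted

print(f"slope={slope:.6f} intercept={intercept:.3f}")
print(f"predicted={predicted:.2f} residual={residual:.2f}")
```

A negative residual means the fitted line lies above the observed value, and a positive residual means it lies below.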
4. An article in Concrete Research presented 14 data samples on compressive strength 𝑥 and intrinsic
permeability 𝑦 of various concrete mixes and cures. Summary quantities are:
$\sum y_i = 572$, $\sum x_i = 43$, $\sum x_i y_i = 1697.8$
$\sum y_i^2 = 23{,}530$
$\sum x_i^2 = 157.42$
Assume that the two variables are related according to the simple linear regression model.
a. Calculate the least squares estimates of the slope and intercept. Estimate $\sigma^2$.
b. Use the equation of the fitted line to predict what permeability would be observed when the
compressive strength is 𝑥 = 4.3.
c. Give a point estimate of the mean permeability when compressive strength is 𝑥 = 3.7.
d. Suppose that the observed value of permeability at 𝑥 = 3.7 is 𝑦 = 46.1. Calculate the value of
the corresponding residual.
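
For reference, the quantities asked for in question 4 can be written in terms of the summary sums as follows. This is the standard simple linear regression result; the shorthand $S_{xx}$, $S_{xy}$, $S_{yy}$ and $SS_E$ is notation assumed here and does not appear in the assignment.

```latex
\begin{align*}
S_{xx} &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}, \qquad
S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}, \qquad
S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n} \\
\hat{\beta}_1 &= \frac{S_{xy}}{S_{xx}}, \qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \\
SS_E &= S_{yy} - \hat{\beta}_1 S_{xy}, \qquad
\hat{\sigma}^2 = \frac{SS_E}{n - 2} \\
e &= y - \hat{y} = y - \left(\hat{\beta}_0 + \hat{\beta}_1 x\right)
\end{align*}
```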
5. A study was performed to investigate the shear strength of soil $y$ as it relates to depth in feet $x_1$ and
% moisture content $x_2$. Ten observations were collected, and the following summary quantities
were obtained:
$\sum x_{i1} = 223$, $\sum x_{i1}^2 = 5,200.9$, $\sum x_{i1} y_i = 43,550.8$

$\sum x_{i2} = 553$, $\sum x_{i2}^2 = 31,729$, $\sum x_{i2} y_i = 104,736.8$

$\sum y_i = 1,916$, $\sum y_i^2 = 371,595.6$, $\sum x_{i1} x_{i2} = 12,352$

a. Set up the least squares normal equations for the model
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$
b. Estimate the parameters in the model in part (a).
c. What is the predicted strength when $x_1 = 18$ feet and $x_2 = 43\%$?
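
One possible way to check the hand calculation for question 5 is to assemble the normal equations directly from the summary sums and solve the 3x3 system numerically. The sketch below assumes NumPy is available; the variable names (XtX, Xty, beta, y_hat) are illustrative only.

```python
import numpy as np

# Summary quantities quoted in question 5 (n = 10 observations).
n = 10
sum_x1, sum_x2, sum_y = 223.0, 553.0, 1916.0
sum_x1_sq, sum_x2_sq, sum_x1x2 = 5200.9, 31729.0, 12352.0
sum_x1y, sum_x2y = 43550.8, 104736.8

# Normal equations (X^T X) beta = X^T y for Y = b0 + b1*x1 + b2*x2 + eps.
XtX = np.array([
    [n,      sum_x1,    sum_x2],
    [sum_x1, sum_x1_sq, sum_x1x2],
    [sum_x2, sum_x1x2,  sum_x2_sq],
])
Xty = np.array([sum_y, sum_x1y, sum_x2y])

beta = np.linalg.solve(XtX, Xty)              # [b0, b1, b2]
y_hat = beta @ np.array([1.0, 18.0, 43.0])    # prediction at x1 = 18, x2 = 43

print("beta =", beta)
print("predicted strength =", y_hat)
```

Solving the system with np.linalg.solve avoids forming the inverse explicitly, which is usually the more stable choice for small systems like this one.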

6. A regression model is described between the percent body fat (%BF) measured by immersion and
BMI from a study on 250 male subjects. The researchers also measured 13 physical characteristics
of each man, including his age (yrs), height (in), and waist size (in). Write out the regression model
of the percent of body fat with both height and waist as predictors with the given information:
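$(X^\top X)^{-1} = \begin{bmatrix}
2.9705 & -4.0042\mathrm{E}{-2} & -4.1679\mathrm{E}{-2} \\
-0.4004 & 6.0774\mathrm{E}{-4} & -7.3875\mathrm{E}{-5} \\
-0.00417 & -7.3875\mathrm{E}{-5} & 2.5766\mathrm{E}{-4}
\end{bmatrix}$ and $X^\top y = \begin{bmatrix} 4757.9 \\ 334335.8 \\ 179706.7 \end{bmatrix}$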
7. Let us say we have two variables $x_1$ and $x_2$ and we want to make a quadratic fit using them, namely
$f(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1 x_2 + w_4 x_1^2 + w_5 x_2^2$
Derive the least squares estimates of $w_i$, $i = 0, 1, \ldots, 5$, given $N$ data samples.
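
A useful starting point for question 7 is that the quadratic fit is still linear in the weights, so it can be set up as an ordinary linear least squares problem on six basis functions. The outline below assumes the targets are denoted $r^t$ and the samples are indexed by $t = 1, \ldots, N$; the symbols $D$ and $r$ are notation introduced here.

```latex
\begin{align*}
D &= \begin{bmatrix}
1 & x_1^1 & x_2^1 & x_1^1 x_2^1 & (x_1^1)^2 & (x_2^1)^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & x_1^N & x_2^N & x_1^N x_2^N & (x_1^N)^2 & (x_2^N)^2
\end{bmatrix}, \qquad
r = \begin{bmatrix} r^1 \\ \vdots \\ r^N \end{bmatrix}, \qquad
w = \begin{bmatrix} w_0 \\ \vdots \\ w_5 \end{bmatrix} \\
E(w) &= \sum_{t=1}^{N} \left[ r^t - f(x_1^t, x_2^t) \right]^2 = \lVert r - D w \rVert^2 \\
\nabla_w E &= -2 D^\top (r - D w) = 0
\;\Longrightarrow\; D^\top D \, w = D^\top r
\;\Longrightarrow\; \hat{w} = (D^\top D)^{-1} D^\top r
\end{align*}
```

The last step assumes $D^\top D$ is invertible, which requires at least six samples whose basis vectors are linearly independent.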

Programming Questions:
8. Assume a linear model and add 0-mean Gaussian noise to generate 100 samples.
a. Divide your sample into training and testing sets (80:20).
b. Fit a linear regression model on the training set. Compute the mean squared error (MSE) on the
testing set.
c. Plot the fitted model along with the data.
d. Repeat the same for polynomials of degrees 2 and 3 as well.
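
A minimal sketch of the workflow in question 8 follows, assuming NumPy. The true slope, intercept, noise level, input range and random seed are all assumptions chosen for illustration, since the assignment leaves them open. Plotting is left out; the fitted curves could be drawn with any plotting library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed ground-truth line and noise level (not specified in the assignment).
TRUE_SLOPE, TRUE_INTERCEPT, NOISE_STD = 2.0, 1.0, 0.5

x = rng.uniform(0.0, 10.0, size=100)
y = TRUE_SLOPE * x + TRUE_INTERCEPT + rng.normal(0.0, NOISE_STD, size=100)

# 80:20 split after shuffling the sample indices.
idx = rng.permutation(100)
train, test = idx[:80], idx[80:]

test_mse = {}
for degree in (1, 2, 3):
    # Least squares polynomial fit on the training set only.
    coeffs = np.polyfit(x[train], y[train], deg=degree)
    pred = np.polyval(coeffs, x[test])
    test_mse[degree] = float(np.mean((y[test] - pred) ** 2))

for degree, mse in test_mse.items():
    print(f"degree {degree}: test MSE = {mse:.4f}")
```

Because the data come from a straight line, the higher-degree fits can only match the noise better on the training set, so their test MSE is not expected to improve much over degree 1.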
9. Implement logistic regression using dataset A2_P2.csv. Write code for gradient descent with
learning rates of 0.01 and 0.05. For each learning rate:
a. Plot the variation of the mean squared error over 20 iterations.
b. Specify the final weight values.
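
The gradient descent loop for question 9 might be organized as in the sketch below. The layout of A2_P2.csv is not described in the assignment, so the example runs on synthetic two-feature data as a stand-in. The helper names (sigmoid, fit_logistic_gd), the zero initial weights, and the use of the cross-entropy gradient for the update while recording MSE as the monitored quantity are all assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_gd(X, y, lr, n_iter=20):
    """Batch gradient descent for logistic regression (hypothetical helper).

    Returns the final weights (bias first) and the MSE between predicted
    probabilities and labels, recorded before each of the n_iter updates.
    """
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend a bias column
    w = np.zeros(Xb.shape[1])
    mse_history = []
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        mse_history.append(float(np.mean((y - p) ** 2)))
        grad = Xb.T @ (p - y) / len(y)           # gradient of mean cross-entropy
        w = w - lr * grad
    return w, mse_history


# Synthetic stand-in data (assumption): two features, noisy linear boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=200) > 0).astype(float)

results = {lr: fit_logistic_gd(X, y, lr) for lr in (0.01, 0.05)}
for lr, (w, history) in results.items():
    print(f"lr={lr}: final weights={w}, first MSE={history[0]:.4f}, last MSE={history[-1]:.4f}")
```

Each mse_history list has one entry per iteration and can be passed directly to a line plot to compare the two learning rates.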
10. Write code to implement regression models using dataset A2_P3.csv. Divide the dataset into
training and testing sets (80:20). Implement the following models using the training dataset and
compute MSE on the test dataset:
a. Linear regression.
b. Linear regression with LASSO regularization (regularization strength = 1).
c. Linear regression with ridge regularization (regularization strength = 0.1).
Use bar plots to compare MSE and feature coefficients (weights) for the three methods.
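
The comparison in question 10 could be prototyped as below before switching to the real data. The columns of A2_P3.csv are not described, so synthetic data is used as a stand-in. The helpers fit_ridge and fit_lasso are hypothetical NumPy implementations written for this sketch: ridge uses the closed form for the objective ||y - Xw||^2 + lam * ||w||^2, and LASSO uses coordinate descent on (1/2n) * ||y - Xw||^2 + lam * ||w||_1. A library implementation may scale its penalty differently, so the strengths 1 and 0.1 are not guaranteed to mean the same thing there.

```python
import numpy as np


def fit_ridge(Xc, yc, lam):
    # Closed form w = (X^T X + lam*I)^{-1} X^T y; lam = 0 gives ordinary
    # least squares provided X^T X is invertible.
    d = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ yc)


def fit_lasso(Xc, yc, lam, n_sweeps=200):
    # Cyclic coordinate descent with soft-thresholding.
    n, d = Xc.shape
    w = np.zeros(d)
    col_sq = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(d):
            # Residual with feature j's current contribution added back.
            partial_resid = yc - Xc @ w + Xc[:, j] * w[j]
            rho = Xc[:, j] @ partial_resid / n
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w


# Synthetic stand-in data (assumption): 5 features, two of them irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([3.0, 0.0, -2.0, 0.0, 0.5])
y = 4.0 + X @ true_w + rng.normal(0.0, 0.5, size=200)

idx = rng.permutation(200)
train, test = idx[:160], idx[160:]

# Center with training means so the intercept is handled without a penalty.
x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
Xc, yc = X[train] - x_mean, y[train] - y_mean

weights = {
    "linear": fit_ridge(Xc, yc, 0.0),
    "lasso": fit_lasso(Xc, yc, 1.0),
    "ridge": fit_ridge(Xc, yc, 0.1),
}
test_mse = {}
for name, w in weights.items():
    pred = y_mean + (X[test] - x_mean) @ w
    test_mse[name] = float(np.mean((y[test] - pred) ** 2))
    print(f"{name}: test MSE = {test_mse[name]:.4f}, weights = {np.round(w, 3)}")
```

The test_mse dictionary and the per-model weight vectors are in a shape that maps directly onto grouped bar plots, one bar group per feature and one bar per method.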
