MGCR 271: Assignment #4
Fall 2019
Due date: Monday November 25th at the beginning of class.
Please print, do not email. Late submissions will not be accepted.
Instructions:
• Please print this booklet and answer all questions within the space provided. You must
submit the assignment as a hard copy at the beginning of the class.
• Please staple your booklet.
• You are encouraged to work in pairs (but it is not mandatory).
Student 1
Last Name First Name Student ID Section (check one)
Section 001: 11:35am–12:55pm
Section 002: 1:05pm–2:25pm
Student 2
Last Name First Name Student ID Section (check one)
Section 001: 11:35am–12:55pm
Section 002: 1:05pm–2:25pm
Instructor: Angelos Georghiou
Question 1 (Multiple Regression): We conduct a study to identify the determinants of house
prices (in thousands of dollars) in Montreal. We run the following multiple regression model:
$$\text{House price}_i = \beta_0 + \beta_1\,\text{Area}_i + \beta_2\,\text{Number of Bedrooms}_i + \beta_3\,\text{Year of Construction}_i + \xi_i$$
The explanatory variables are: (i) Area, a dummy variable (Downtown = 1, Other = 0);
(ii) Number of Bedrooms; and (iii) Year of Construction.
We obtain a sample of 100 observations and the following results:
ANOVA
             df    SS          MS
Regression   3     5523.709    1841.57
Residual     A     335.5119    3.505333
Total        99    5861.221

                       Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept              39.75415       31.3013          1.270048   0.207137   −22.3784    101.8867
Area                   B              0.374814         13.72036   0.000000   4.398583    5.886582
Year of Construction   0.03474        0.015665         2.2176     C          0.003644    0.065835
Number of Bedrooms     5.000415       0.133277         37.51899   0.000000   4.735863    D
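The columns of the coefficient table are tied together by two standard identities: the t Stat is the coefficient divided by its standard error, and the 95% bounds are the coefficient plus or minus a critical t value times the standard error. The Python sketch below checks both on the Intercept row only, since that row is fully reported. The variable names are illustrative, and the sketch assumes the interval is the usual symmetric t-based one.

```python
# Values copied from the fully reported Intercept row of the coefficient table.
coef = 39.75415
std_err = 31.3013
lower_95 = -22.3784
upper_95 = 101.8867

# t Stat = coefficient / standard error.
t_stat = coef / std_err

# Assuming a symmetric interval coef +/- t_crit * std_err,
# the half-width divided by the standard error recovers t_crit.
half_width = (upper_95 - lower_95) / 2
t_crit = half_width / std_err

print("t Stat:", round(t_stat, 4))
print("critical t implied by the interval:", round(t_crit, 3))
```

The same critical value applies to every row of the table, because all coefficients share the residual degrees of freedom.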
1. (2 points) Find the value of A in the above table.
Answer:
2. (2 points) Find the value of B in the above table.
Answer:
3. (2 points) Find the value of C in the above table (a reasonably small range is OK if the
exact value is not available from tables):
Answer:
4. (2 points) Find the value of D in the above table:
Your Work:
5. (2 points) What is the F-statistic for the above model?
Answer:
6. (2 points) Approximately, what is the price of a 3-bedroom house downtown if the
construction year is 2015?
Answer:
7. (2 points) Now suppose you don’t care about the year of construction. Instead you want to
test the effect of each specific period {1980s, 1990s, 2000s, 2010s} in your model. How would
you write the regression equation?
Answer:
Question 2 (Time-series forecasting):
1. (3 points) Dummy variables corresponding to first, second, and third quarters were added to
the linear trend model. These indicator variables are X1, X2, and X3 (the excluded category
is quarter 4). The estimated trend-and-season model is
$$\widehat{\text{Supply}} = 100 + 20t - 15X_1 - 20X_2 - 10X_3$$
where t is the number of periods (quarters) elapsed since the beginning of the series. Based
on this regression equation, draw (by hand or attaching a plot from software) the time-series
plot for expected sales, starting at t = 0 and ending at t = 10 (note: the relationship should
be a time-series of dots connected by lines). Assume that at t = 0, we are in quarter 1.
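One way to produce the points for the plot is to evaluate the estimated equation at each period. The Python sketch below is a minimal illustration: the function name `fitted_value` is hypothetical, and it assumes quarters cycle 1, 2, 3, 4 with t = 0 falling in quarter 1, as the question states.

```python
def fitted_value(t):
    """Evaluate 100 + 20t - 15*X1 - 20*X2 - 10*X3 at period t."""
    quarter = t % 4 + 1  # assumption: t = 0 is quarter 1, then 2, 3, 4, 1, ...
    x1 = 1 if quarter == 1 else 0
    x2 = 1 if quarter == 2 else 0
    x3 = 1 if quarter == 3 else 0
    return 100 + 20 * t - 15 * x1 - 20 * x2 - 10 * x3


# (t, fitted value) pairs for t = 0, ..., 10; plot these as dots joined by lines.
points = [(t, fitted_value(t)) for t in range(11)]
for t, value in points:
    print(t, value)
```

Quarter 4 periods lie on the trend line 100 + 20t itself, because quarter 4 is the excluded category.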
Draw your time-series plot here:
2. The spreadsheet forecasting.xls has data for the past 30 weeks of sales for product X. Based
on the data, answer the following questions. Note: Different people could reach different
conclusions, but you need to give good reasons for your conclusions.
(a) (3 points) Plot the data. Just by looking at the plot, do you think a 2-week simple
moving average would be a better forecast, or a 4-week simple moving average?
Draw your time-series plot here:
(b) (4 points) Try four different exponential smoothing forecasts, using α = 0.2, 0.4, 0.6,
and 0.8. Which is the best forecast? Justify your answer by calculating the mean absolute
deviation (MAD) of each forecast (for the exponential smoothing method, assume the forecast
for the first week is 600).
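The comparison can be organized as in the Python sketch below, which uses the common update rule F(t+1) = α·A(t) + (1 − α)·F(t) and takes MAD as the average of |A(t) − F(t)|. The function names and the `sales` list are hypothetical: the list is made-up illustration data, not the contents of forecasting.xls. The sketch also assumes that the first week's error counts toward MAD, which is a convention that may differ from the one used in class.

```python
def exponential_smoothing(actuals, alpha, first_forecast):
    """Return one-step-ahead forecasts aligned with `actuals`."""
    forecasts = [first_forecast]
    for actual in actuals[:-1]:
        # F(t+1) = alpha * A(t) + (1 - alpha) * F(t)
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def mean_absolute_deviation(actuals, forecasts):
    """Average absolute forecast error over all periods."""
    errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


# Hypothetical weekly sales for illustration only (NOT the assignment data).
sales = [600, 620, 580, 640, 610]

for alpha in (0.2, 0.4, 0.6, 0.8):
    forecasts = exponential_smoothing(sales, alpha, first_forecast=600)
    print(alpha, round(mean_absolute_deviation(sales, forecasts), 2))
```

Under this criterion, the α with the smallest MAD is the preferred forecast.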
Answer:
Question 3 (Logistic Regression):
Researchers believe that whether someone is hired or not is related to their age (in years). To test
this theory, we collected data on 100 people, and ran the following logistic regression model:
$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1\,\text{Age}$$
where p is the binomial probability that someone is hired.
Parameter   Estimate   Std. Error
Intercept    0.80      0.65
Age         -0.02      0.015
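The fitted model can be read in three ways: as a probability, as a multiplicative effect on the odds, and through an approximate confidence interval for a parameter. The Python sketch below illustrates each with the estimates from the table. The function names are hypothetical, and the inputs (age 40, a 10-year change, the intercept's interval) are chosen for illustration rather than taken from the questions. The interval assumes the large-sample normal approximation with z = 1.96.

```python
import math

B0 = 0.80   # intercept estimate from the table
B1 = -0.02  # Age estimate from the table


def hire_probability(age):
    """Invert the logit: p = 1 / (1 + exp(-(B0 + B1 * age)))."""
    log_odds = B0 + B1 * age
    return 1 / (1 + math.exp(-log_odds))


def odds_multiplier(age_change):
    """Factor by which the odds change when age changes by `age_change` years."""
    return math.exp(B1 * age_change)


def wald_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval: estimate +/- z * std_error (normal approximation)."""
    return estimate - z * std_error, estimate + z * std_error


print(hire_probability(40))       # illustrative age
print(odds_multiplier(10))        # illustrative 10-year increase
print(wald_interval(0.80, 0.65))  # interval for the intercept
```

A multiplier below 1 means the odds shrink, and a multiplier above 1 means they grow.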
1. (2 points) If someone’s age decreases by 5 years, by how much will their odds of being
hired increase or decrease?
Answer:
2. (2 points) If someone is 28 years old, what is the probability of being hired?
Answer:
3. (2 points) Calculate a 95% confidence interval for the slope.
Answer:
4. (2 points) Based on the confidence interval calculated above, what can you say about the
P-value for the χ² test for the slope? Explain your answer.
(a) P-value < 0.01
(b) P-value is between 0.01 and 0.05
(c) P-value is between 0.05 and 0.1
(d) P-value > 0.1
Answer:
For Instructor use only
Question   Points Available   Points Scored
Q1         14
Q2         10
Q3         8
Total      32