Chapter 3 Multiple Regression

Chapter 3 covers the Multiple Linear Regression Model, detailing its components such as the meaning of partial regression coefficients, assumptions, parameter estimation, and hypothesis testing. It explains the relationship between F and R², the adjusted R², and restricted least squares, while providing practical examples and exercises for better understanding. The chapter also includes theoretical and numerical questions to reinforce learning.

Uploaded by

Khushi Garg
Copyright
© All Rights Reserved
Chapter 3 Multiple Linear Regression Model

After learning this chapter you will understand :
+ Multiple Variable Linear Regression Model.
+ Meaning of Partial Regression Coefficient.
+ Assumptions of Multiple Linear Regression Model.
+ Estimation of Parameters of Multiple Regression.
+ Multiple Coefficient of Determination $R^2$.
+ Hypothesis Testing of Parameters.
+ Relationship Between F and $R^2$.
+ The Adjusted $R^2$.
+ Restricted Least Squares.

For Full Course Video Lectures of All Subjects of Eco. (Hons), B Com (H), BBE, MA Economics, NTA UGC NET Economics, Indian Economic Service (IES), register yourself at www.primeacademy.in
Dheeraj Suri Classes, Prime Academy, 9899192027, www.primeacademy.in

Basic Concepts

1. Multiple Linear Regression :
Multiple linear regression (MLR) is a method used to model the linear relationship between a dependent variable and one or more independent variables. The dependent variable is sometimes also called the predictand, and the independent variables the predictors. MLR is based on least squares : the model is fit such that the sum of squares of the differences between observed and predicted values is minimized. A multiple linear regression analysis is carried out to predict the values of a dependent variable, Y, given a set of p explanatory variables ($X_1, X_2, \dots, X_p$). The relationship between the dependent variable and the p explanatory variables is represented by the following equation :

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$

Where : $\beta_0$ is the intercept term and $\beta_1$ to $\beta_p$ are the partial slope coefficients relating the p explanatory variables to the variable of interest. So, multiple linear regression can be thought of as an extension of simple linear regression, where there are p explanatory variables.

Examples where multiple linear regression may be used :
+ Trying to predict an individual's income given several socio-economic characteristics.
+ Trying to predict the overall examination performance of pupils in 'A' levels, given the values of a set of exam scores at age 16.
+ Trying to estimate systolic or diastolic blood pressure, given a variety of socioeconomic and behavioral characteristics (occupation, drinking, smoking, age etc.).

2. Population Regression Function for Multiple Regression :
Generalizing the two-variable population regression function (PRF), we may write the three-variable PRF as

$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$

where Y is the dependent variable, $X_2$ and $X_3$ the explanatory variables (or independent variables), and u the stochastic disturbance term. Given the assumptions of the classical regression model, it follows that, on taking the conditional expectation of Y on both sides of the PRF, we obtain

$E(Y_i \mid X_{2i}, X_{3i}) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i}$

In words, it gives the conditional mean or expected value of Y conditional upon the given or fixed values of $X_2$ and $X_3$. Therefore, as in the two-variable case, multiple regression analysis is regression analysis conditional upon the fixed values of the regressors, and what we obtain is the average or mean value of Y or the mean response of Y for the given values of the regressors.

3. The Meaning of Partial Regression Coefficients :
As mentioned earlier, the regression coefficients $\beta_2$ and $\beta_3$ are known as partial regression or partial slope coefficients. The meaning of a partial regression coefficient is as follows : $\beta_2$ measures the change in the mean value of Y, E(Y), per unit change in $X_2$, holding the value of $X_3$ constant. Put differently, it gives the "direct" or the "net" effect of a unit change in $X_2$ on the mean value of Y, net of any effect that $X_3$ may have on the mean of Y. Likewise, $\beta_3$ measures the change in the mean value of Y per unit change in $X_3$, holding the value of $X_2$ constant.
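The "holding the other regressor constant" idea can be illustrated numerically. The following is only a minimal sketch on simulated data: the PRF values (1, 2, 3), the 0.8 link between the two regressors, the sample size and the helper name `ols` are all assumptions made for illustration and are not taken from the chapter. It compares the slope on $X_2$ from the three-variable regression (the net effect) with the slope from a regression of Y on $X_2$ alone (which also absorbs part of the effect of $X_3$).

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X (X already contains a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000                                  # hypothetical sample size
x2 = rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)        # X3 is deliberately correlated with X2
u = rng.normal(size=n)                    # stochastic disturbance term
y = 1.0 + 2.0 * x2 + 3.0 * x3 + u         # assumed PRF: beta1 = 1, beta2 = 2, beta3 = 3

const = np.ones(n)
b_multiple = ols(np.column_stack([const, x2, x3]), y)   # Y on X2 and X3
b_simple = ols(np.column_stack([const, x2]), y)         # Y on X2 only

print("three-variable regression (b1, b2, b3):", b_multiple.round(2))
print("two-variable regression   (b1, b2):    ", b_simple.round(2))
```

Under these assumptions the three-variable slope on $X_2$ should come out close to 2 (the partial effect), while the two-variable slope should be close to 2 + 3(0.8) = 4.4, because it also picks up the indirect effect that runs through $X_3$.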
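That is, $\beta_3$ gives the "direct" or "net" effect of a unit change in $X_3$ on the mean value of Y, net of any effect that $X_2$ may have on the mean of Y. In short, a partial regression coefficient reflects the partial effect of one explanatory variable on the mean value of the dependent variable when the values of the other explanatory variables included in the model are held constant.

4. Sample Regression Function for Multiple Regression :
The sample regression function for the three-variable regression may be written as :

$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + e_i$

Where $e_i$ is the residual term, the sample counterpart of the stochastic disturbance term $u_i$. As noted in two-variable regression, the OLS procedure consists in so choosing the values of the unknown parameters that the residual sum of squares (RSS) $\sum e_i^2$ is as small as possible. Symbolically,

$\min \sum e_i^2 = \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_{2i} - \hat\beta_3 X_{3i})^2$

To obtain the values of the OLS estimators $\hat\beta_1$, $\hat\beta_2$ and $\hat\beta_3$, we obtain the following three normal equations by the method of least squares :

$\bar Y = \hat\beta_1 + \hat\beta_2 \bar X_2 + \hat\beta_3 \bar X_3$
$\sum Y_i X_{2i} = \hat\beta_1 \sum X_{2i} + \hat\beta_2 \sum X_{2i}^2 + \hat\beta_3 \sum X_{2i} X_{3i}$
$\sum Y_i X_{3i} = \hat\beta_1 \sum X_{3i} + \hat\beta_2 \sum X_{2i} X_{3i} + \hat\beta_3 \sum X_{3i}^2$

Using these three normal equations we obtain the values of the OLS estimators $\hat\beta_1$, $\hat\beta_2$ and $\hat\beta_3$ as under :

$\hat\beta_1 = \bar Y - \hat\beta_2 \bar X_2 - \hat\beta_3 \bar X_3$
$\hat\beta_2 = \frac{(\sum y_i x_{2i})(\sum x_{3i}^2) - (\sum y_i x_{3i})(\sum x_{2i} x_{3i})}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2}$
$\hat\beta_3 = \frac{(\sum y_i x_{3i})(\sum x_{2i}^2) - (\sum y_i x_{2i})(\sum x_{2i} x_{3i})}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2}$

Here lowercase letters denote deviations from sample means : $y_i = Y_i - \bar Y$, $x_{2i} = X_{2i} - \bar X_2$, $x_{3i} = X_{3i} - \bar X_3$.

5. Variances and Standard Errors of OLS Estimators :
Having obtained the OLS estimators of the partial regression coefficients, we can derive the variances and standard errors of these estimators as well. As in the two-variable case, we need the standard errors for two main purposes : to establish confidence intervals and to test statistical hypotheses. The relevant formulas are as follows :

(i) $Var(\hat\beta_1) = \left[\frac{1}{n} + \frac{\bar X_2^2 \sum x_{3i}^2 + \bar X_3^2 \sum x_{2i}^2 - 2 \bar X_2 \bar X_3 \sum x_{2i} x_{3i}}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2}\right] \sigma^2$, $se(\hat\beta_1) = \sqrt{Var(\hat\beta_1)}$
(ii) $Var(\hat\beta_2) = \frac{\sum x_{3i}^2}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2} \sigma^2$, $se(\hat\beta_2) = \sqrt{Var(\hat\beta_2)}$
(iii) $Var(\hat\beta_3) = \frac{\sum x_{2i}^2}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2} \sigma^2$, $se(\hat\beta_3) = \sqrt{Var(\hat\beta_3)}$

Where the unknown $\sigma^2$ is estimated by $\hat\sigma^2 = \frac{\sum e_i^2}{n - 3}$.
And $\sum e_i^2$ can be obtained using the relation :

$\sum e_i^2 = \sum y_i^2 - \hat\beta_2 \sum y_i x_{2i} - \hat\beta_3 \sum y_i x_{3i}$

6.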
Coefficient of Determination ($R^2$) :
In the two-variable case we saw that $r^2$ measures the goodness of fit of the regression equation; that is, it gives the proportion or percentage of the total variation in the dependent variable Y explained by the (single) explanatory variable X. This notion of $r^2$ can be easily extended to regression models containing more than two variables. Thus, in the three-variable model we would like to know the proportion of the variation in Y explained by the variables $X_2$ and $X_3$ jointly. The quantity that gives this information is known as the multiple coefficient of determination and is denoted by $R^2$; conceptually it is similar to $r^2$. Also, as in the two-variable case, $R^2$ is defined as

$R^2 = \frac{ESS}{TSS}$

Where ESS is the explained sum of squares (i.e., explained variation) and TSS is the total sum of squares (i.e., total variation).

Exercise 1

Theory Questions
Q1. The basic framework of multiple regression analysis, the classical linear regression model, is based on a set of assumptions. What are these assumptions? Present a brief description of each one of them. [Eco. (H) III Sem. 2012]
Q2. What is perfect multicollinearity?
Q3. In a multiple regression model, if two explanatory variables are perfectly collinear, how would this affect the estimation of the partial regression coefficients? [BBE 2008]
Q4. State whether the following statement is True or False. Give reasons for your answer : [Eco. (H) III Sem. 2013]
In the regression model $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$, if all values of $X_3$ are identical, then the variance of the ordinary least squares estimators of the slope coefficients is not defined.
Q5. The basic framework of multiple regression analysis, the classical linear regression model, is based on a set of assumptions. What are these assumptions? Present a brief description of each one of them. [Eco. (H) III Sem. 2012]
Q6. Are the following statements correct?
Justify your answers carefully and provide proofs wherever necessary : [Eco. (H) IV Sem. 2015]
(a) In the regression of Y on $X_2$ and $X_3$, if all $X_3$ are identical, the variance of the partial regression coefficient of $X_3$ is zero.
(b) The value of $R^2$ is always greater than $\bar R^2$.
(c) $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$ is estimated as $\hat Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i}$ using OLS. Here X2 and #; are random variables and 2, is unknown.
Q7. State whether the following statement is true or false. Give reasons for your answer : [Eco. (H) IV Sem 2018]
If the regression model $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$ is estimated using the method of ordinary least squares, the sum of the estimated residuals ($e_i$) is zero.

Proofs
Q1. Consider the following three-variable regression model : [Eco. (H) IV Sem 2018]
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
If the method of ordinary least squares is used to estimate the parameters, prove that
$\sum e_i^2 = \sum y_i^2 - b_2 \sum y_i x_{2i} - b_3 \sum y_i x_{3i}$
where $y_i = (Y_i - \bar Y)$, $x_{2i} = (X_{2i} - \bar X_2)$, $x_{3i} = (X_{3i} - \bar X_3)$.
Q2. If the regression model [Eco. (H) IV Sem 2017]
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
is estimated using the method of least squares, prove that the OLS residuals, $e_i$, would be uncorrelated with the estimated Y values.

Numerical Questions
Q1. You are given the following data :
Y     1    3    8
X2    1    2    3
X3    2    1   -3
Obtain the estimated regression equation using ordinary least squares if Y is regressed on $X_2$ and $X_3$ with an intercept term. [Eco. (H) 2009]
[Ans. : $\hat Y_i = 2 + X_{2i} - X_{3i}$]
Q2. An econometric analyst is estimating the following production function from annual data on a firm in India :
O= y+ BL+ BK
Where L = Rupees of Labour, K = Rupees of Capital.
The analyst knows that the firm always budgets Rs. 12 lakhs a year for labour and capital together. The other relevant data are provided :
YX; = 14588, Px, = 2725, Fy? =47921, Ex, = 7454, Dx,y, $4554, Px, x, = 4796, Fe ¥=67, N=14 [Eco. (H) 2010]
Can you estimate the regression coefficients in this model? Explain your answer.
Q3.
The following results were obtained from a sample of 12 firms on their output (Y), labour input ($X_2$) and capital input ($X_3$), measured in arbitrary units :
iYs Ts LY? = 48,139 LYX2 = 40,830 EX_= 643 5 34,843 zY: 6,796 =X; = 106 zr 976 =X1X2 = 5,779
Find the regression equation : P= BBX. +BX,
Q4. The quantity supplied of a commodity X is assumed to be a linear function of the price of X and the wage rate of labour used in the production of X. The population supply equation is given as
+ BW, +6,
where Q = quantity supplied of X, $P_x$ = price of X, W = wage rate.
Using the sample data :
$\sum Q = 1{,}281$, $\sum P_x = 544$, $\sum W = 85$, $\sum Q P_x = 53{,}665$, $\sum P_x^2 = 22{,}922$, $\sum P_x W = 2{,}568$, $\sum Q W = 5{,}706$, $\sum Q^2 = 132{,}609$, $\sum W^2 = 617$, n = 15
(i) Estimate the parameters by OLS,
(ii) Interpret the meaning of the parameters obtained in (i),
(iii) Test the statistical significance of the individual coefficients at the 5% level,
(iv) What % of the total variation in the quantity supplied is explained by both $P_x$ and W?
Q5. The following table contains the sales prices of 5 holiday cottages in Odsherred, Denmark, together with the age and the livable area of each cottage.
Price (in $)   Age (in Years)   Area (in m²)
$Y_i$          $X_{2i}$         $X_{3i}$
745            36               66
895            37               8
442            47               64
440            32               53
1598           1                101
Suppose it is thought that the price obtained for a cottage depends primarily on the age and livable area. A possible model for the data might be the linear regression model :
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
where the random errors $u_i$ are independent, normally distributed random variables with zero mean and constant variance. Fit the model and obtain the parameters and their respective standard errors.
[Ans. : $\hat Y_i = -281.43 - 7.611X_{2i} + 19.01X_{3i}$]
Q6. You are given the following data based on a regression estimated for the relationship between price ($X_2$) and quantity of oranges sold (Y) in a supermarket, and also on the amount spent on advertising the product ($X_3$), for 12 consecutive days.
$\bar Y = 100$, xX.
$\sum y_i x_{3i} = 125.25$, $\sum x_{2i} x_{3i} = -54$, $\sum x_{2i}^2 = 2250$, $\sum y_i x_{2i} = -3550$, $\sum y_i^2 = 6300$, $\sum x_{3i}^2 = 4.857$
(i) Estimate the three multiple regression coefficients and $R^2$.
(ii) Test the statistical significance of each estimated regression coefficient using α = 5%. [BBE 2009]
Q7. You are given the following data based on 15 observations :
$\bar Y = 367.693$, $\bar X_2 = 402.760$, $\bar X_3 = 8.0$, $\sum y_i^2 = 66042.269$, $\sum y_i x_{2i} = 74778.346$, $\sum x_{2i}^2 = 84855.096$, $\sum x_{3i}^2 = 280$, $\sum y_i x_{3i} = 4250.9$, $\sum x_{2i} x_{3i} = 4796.0$
(i) Estimate the three multiple regression coefficients and their standard errors.
(ii) Obtain $R^2$ and $\bar R^2$.
(iii) Test the statistical significance of each estimated regression coefficient using α = 5%. [BBE 2008]
[Ans. : $\hat\beta_1 = 53.1572$, $\hat\beta_2 = 0.7266$, $\hat\beta_3 = 2.7363$, $R^2 = 0.9988$]
Q8. Let $X_2$ be the hours spent on Mathematics coaching during a week, $X_3$ be the time spent on other subjects and Y be the scores obtained in the Mathematics final exam. The following summations for 23 students were obtained : [Eco. (H) IV Sem 2022]
$\bar X_2 = 10$, $\bar X_3 = 5$, $\bar Y = 12$, n = 23
$\sum x_{2i}^2 = 12$, $\sum x_{2i} x_{3i} = 8$, $\sum x_{3i}^2 = 12$, $\sum x_{2i} y_i = 10$, $\sum x_{3i} y_i = 8$, $\sum y_i^2 = 10$
$x_2$, $x_3$ and y are variables measured in deviation form.
(i) Estimate the following regression : $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
(ii) Estimate the standard errors of the slope coefficients.
(iii) Obtain $R^2$ of the regression.
(iv) Interpret the slope coefficients and comment on their statistical significance.

Basic Concepts

1. Hypothesis Testing about Individual Partial Regression Coefficients :
If we invoke the assumption that $u_i \sim N(0, \sigma^2)$, then we can use the t test to test a hypothesis about any individual partial regression coefficient. To illustrate the mechanics, consider the following regression model :

$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + e_i$

The following steps are taken to test the significance of the partial slope term ($\hat\beta_2$) of the above regression equation :
(i) Define the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
$H_0$ : $\beta_2 = 0$, i.e., the partial slope term is statistically insignificant.
$H_1$ : The partial slope term is statistically significant, i.e.,
$\beta_2 \neq 0$ (two-tailed test), or
$\beta_2 > 0$ (upper-tailed test), or
$\beta_2 < 0$ (lower-tailed test).
(ii) Find out the tail of the test, i.e., determine whether it is a single-tail or a two-tail test.
(iii) Calculate the standard error of $\hat\beta_2$.
(iv) Calculate the test statistic t as under : $t = \frac{\hat\beta_2}{se(\hat\beta_2)}$
(v) Set the level of significance α.
(vi) Find $t_\alpha$ (for a single-tail test) or $t_{\alpha/2}$ (for a two-tail test) for n − 3 degrees of freedom from the table.
(vii) Compare |t| and $t_\alpha$ (or $t_{\alpha/2}$) :
(a) If |t| < $t_\alpha$ (or $t_{\alpha/2}$), then do not reject the null hypothesis.
(b) If |t| > $t_\alpha$ (or $t_{\alpha/2}$), then reject the null hypothesis.
Similarly, we can test the statistical significance of the other partial slope term $\beta_3$ and the intercept term $\beta_1$.

2. Testing the Joint Hypothesis :
In this case we want to test whether all the explanatory variables jointly have an influence on the dependent variable or not. It is also called the test of overall significance of the estimated multiple regression. To illustrate the mechanics, consider the following regression model :

$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + e_i$

The following steps are used in this case :
(i) Set the hypotheses.
$H_0$ : $\beta_2 = \beta_3 = 0$, i.e., the two explanatory variables together have no influence on Y. This is the same as saying $H_0$ : $R^2 = 0$.
$H_1$ : at least one of the slope coefficients $\beta_2$ or $\beta_3$ is different from zero.
(ii) Compute the test statistic as under :
$F = \frac{ESS/(k-1)}{RSS/(n-k)}$
The F statistic may also be expressed in terms of $R^2$ by dividing both the numerator and the denominator of the above expression by TSS and noting that ESS/TSS = $R^2$ and RSS/TSS = $1 - R^2$, so the test statistic becomes :
$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$
(iii) Set the level of significance α.
(iv) Find $F_\alpha$ for (k − 1) degrees of freedom of the numerator and (n − k) degrees of freedom of the denominator.
(v) Compare F and $F_\alpha$.
(a) If F < $F_\alpha$, then do not reject the null hypothesis.
(b) If F > $F_\alpha$, then reject the null hypothesis.

Exercise 2

Theory Questions
Q1. Explain the concept of partial regression coefficients.
In a multiple regression, why is the testing of significance of individual coefficients not the same as testing the overall significance of the regression? [BBE 2011]
Q2. Explain step by step the procedure involved in testing the statistical significance of a partial regression coefficient.
Q3. Explain step by step the procedure involved in testing the statistical significance of a partial slope coefficient.

Numerical and Conceptual Problems
Q1. Consider the following estimated regression equation :
$\hat Y_i = -1336.049 + 12.7413X_{2i} + 85.7640X_{3i}$
se = (175.2725) (0.9123) (8.8019)
t = (-7.6226) (13.9653) (9.7437)
$R^2$ = 0.8906, F = 118.0585, n = 32
Where, Y = Auction price of antique clock, $X_2$ = Age of clock, $X_3$ = Number of bidders.
(i) Interpret all the three coefficients of the equation.
(ii) What do you understand by the concept of standard error of an estimate? How would you calculate it?
(iii) Test whether the age of the clock has any significant contribution in explaining the variation in the auction price of the antique clock.
(iv) Would you say that this regression equation is a good fit on the data? Explain the basis for your answer.
(v) Test the overall significance of this equation, i.e., test the joint hypothesis that $X_2$ and $X_3$ are insignificant in explaining the variation in Y.
(vi) What is the relationship between F and $R^2$? Establish this for the regression results presented above.
Q2. Consider the following regression for an imaginary country, say Utopia, for a period of 15 years. Variables are : IMP = imports, GNP = Gross National Product and CPI = Consumer Price Index. [Eco. (H) 2010]
$\widehat{IMP} = -108.20 + 0.045\,GNP + 0.931\,CPI$
(3.45) (1.232) (1.844)
$R^2$ = 0.9894
(i) Test whether, individually, the partial slope coefficients for GNP and CPI are statistically significant at the 5% level of significance.
(ii) Test whether GNP and CPI jointly have any statistical significance in explaining variations in imports. Carry out this test at the 5% level of significance.
Ans. : (i) Both GNP and CPI are insignificant individually, (ii) F = 562.16
Q3. Consider the following model relating the gain in salary due to an MBA degree to a number of its determinants. [Eco. (H) 2012]
$SLRYGAIN_i = \beta_1 + \beta_2 TUITION_i + \beta_3 Z_{1i} + \beta_4 Z_{2i} + \beta_5 Z_{3i} + u_i$
Where,
SLRYGAIN = Post-MBA salary minus pre-MBA salary, in thousands of dollars
TUITION = annual tuition costs, in thousands of dollars
$Z_1$ = MBA skills in being analysts, graded by recruiters
$Z_2$ = MBA skills in being team players, graded by recruiters
$Z_3$ = Curriculum evaluation by MBAs
Using data for the top 25 business schools, the coefficients were estimated as follows, with standard errors in parentheses :
$\hat\beta_1$ = 60.899 (2.513)
$\hat\beta_2$ = 0.314 (0.750)
$\hat\beta_3$ = 0.3948 (2.756)
$\hat\beta_4$ = 2.016 (2.165)
$\hat\beta_5$ = -5.325 (3.773)
(i) Carry out individual two-tail tests at the 10% level of significance for the slope coefficients.
(ii) Test the model for overall significance at the 10% level if $R^2$ = 0.461 was obtained for the model.
Q4. A field researcher, while trying to evolve a theory on human capital, held that a person's income (I) could be determined on the basis of his or her education level (E), training (T) and general level of health (H). Using a sample of 25 employees, the researcher regressed income on the other three variables and got the following results : [BBE III Sem. 2012]
I = 27.2 + 37E + 1.7T + 3.054H
SE (3.70) (6.21) (4.32) (6.79)
$R^2$ = 0.67
Where I is measured in Rs. '000, E and T are measured in years and H in terms of a scaled index of one's health; the higher the index, the better the health.
(i) Interpret the model. Do the coefficients have the right sign?
(ii) Test the significance of the coefficients of training and education at the 1% level of significance.
(iii) Test the overall significance of the model at the 1% level of significance.
Q5. For the multiple regression model for Y = mental impairment, $X_1$ = life events, and $X_2$ = SES :
$E(Y) = \alpha + \beta_1 X_1 + \beta_2 X_2$
The following table contains the required results :
              Coeff.    Std. Error    t
(Constant)    28.230    2.174         12.984
LIFE           0.103    0.032          3.177
SES           -0.097    0.029         -3.351
n = 40, $R^2$ = 0.9542
(i) Interpret the regression model.
(ii) Test the significance of the partial slope coefficients.
(iii) Construct the 95% confidence interval for the partial slope coefficients.
(iv) Construct the ANOVA table and test whether the model is significant.
Q6. The following model represents the demand for roses in Delhi for the period 1971-I to 1975-I :
$Y_t = \alpha_1 + \alpha_2 X_{2t} + \alpha_3 X_{3t} + u_t$
Where, Y = quantity of roses in dozens, $X_2$ = average wholesale price of roses (Rs./dozen), $X_3$ = average wholesale price of lilies (Rs./dozen).
The following results were obtained :
$\hat Y_t = 9734.2176 - 3782.19X_{2t} + 2815.25X_{3t}$
t = (3.3705) (-6.6069) (2.9712)
(i) Do the coefficients have the expected sign? Interpret them.
(ii) Comment upon the significance of all the 3 parameters.
(iii) Test the overall significance of the model.
(iv) What do you understand by p-value? [BBE 2011]
Q7. The Child Mortality Rate (CM), depicting the number of deaths of children under 5 years per thousand live births, was regressed on the per capita GNP (PGNP) expressed in rupees, the Female Literacy Rate (FLR) expressed in percentage and the Total Fertility Rate (TFR). The resultant regression was the following :
$\widehat{CM} = 168.3067 - 0.005511\,PGNP - 1.768029\,FLR + 12.86867\,TFR$
The standard errors for the OLS estimates of the coefficients were : Constant (32.89), PGNP (0.00187), FLR (0.24801) and TFR (4.1905)
$R^2$ = 0.7473, $\bar R^2$ = 0.7347, F = 59.167, No. of observations = 64
(i) Interpret the results of the regression equation.
Are the signs of the explanatory variables theoretically justified?
(ii) Test the significance of PGNP, FLR and TFR at the 5% level.
(iii) Comment on the value of $R^2$. [BBE 2011]
Q8. Using quarterly data for 1965 Q1 to 1983 Q4 (76 observations) for an economy, the following model of the consumption function was estimated : [Eco (H) III Sem 2017(ER)]
$PCE_t = \beta_1 + \beta_2 PDI_t + \beta_3 INTRATE_t + u_t$
Where
PCE : personal consumption expenditure in billions of dollars
PDI : personal disposable income in billions of dollars
INTRATE : prime interest rate charged by banks in percent
The table below has estimates of the coefficients and their t ratios :
Variables    Estimates of coefficients    t ratios
Constant     -10.96                       .
PDI           0.93                        249.06
INTRATE      -2.09                        -3.09
(a) Interpret the slope coefficients.
(b) Perform an appropriate test at the 5% level of significance to check if the marginal propensity to consume is statistically significantly different from 1. State the null and alternative hypotheses clearly.
(c) If personal disposable income and personal consumption expenditure are measured in millions of dollars instead of billions of dollars, what will be the new numerical value of the coefficient of PDI and its t-ratio? What will be the impact on $R^2$?
Q9. The grade point averages (GPA) of a random sample of 427 students in a college were regressed on verbal SAT scores (VSAT) and mathematics SAT scores (MSAT) and the following regression model was estimated (standard errors are reported in parentheses) :
$\widehat{GPA}_i = 0.423 + 0.398\,VSAT_i + 0.001\,MSAT_i$
se = (0.220) (0.061) (0.00029)
(i) The analyst found the unadjusted $R^2$ = 0.22 and concluded that the VSAT and MSAT scores are not good predictors of GPA. Do you agree with him? Write down all the steps to test his claim and check it at the 5% level of significance.
(ii) Suppose a student's VSAT and MSAT scores increased by 100 points each. How much increase in GPA can he expect?
(iii) If, as a result of college policy, all the GPA scores were increased by 10%, what impact would it have on the regression coefficients and the coefficient of determination $R^2$? [Eco. (H) 2013]
Q10. A relationship was established between demand for housing (H), Gross National Product (GNP) and the Interest Rate (INT) prevailing in the economy. The following results were obtained :
H = 678.89 + 0.905GNP - 169.65INT
t = (180) (3.64) (3.87)
$R^2$ = 0.432, $\bar R^2$ = 0.375, df = 20
The statistician however forgot to state the F value.
(i) Calculate the F value from the data.
(ii) What conclusion do you draw from the F value? [BBE 2011]
Q11. To explain what determines the price of air conditioners, the following results were obtained based on a sample of 19 air conditioners :
$\hat Y_i = 68.236 + 0.023X_{2i} + 19.729X_{3i} + 7.653X_{4i}$
se = (0.005) (8.992) (3.082)
where, Y = the price in rupees, $X_2$ = the rating of the air conditioner, $X_3$ = the energy efficiency ratio, $X_4$ = the number of settings.
(a) Interpret the regression results.
(b) Do the results make economic sense?
(c) At α = 5%, test the hypothesis that rating has no effect on the price of air conditioners versus that it has a positive effect.
(d) Would you accept the hypothesis that the three explanatory variables explain a substantial variation in the prices of air conditioners?
Q12. Based on the data for 1965-IQ to 1983-IVQ (n = 76), the following results were obtained in the regression model to explain personal consumption expenditure :
$\hat Y_t = -10.96 + 0.93X_{2t} - 2.09X_{3t}$
t = (3.33) (249.06) (-3.09)
where, Y = PCE in billion rupees, $X_2$ = the disposable income in billion rupees, $X_3$ = the prime rate (%) charged by banks.
(a) What is the marginal propensity to consume (MPC), the amount of additional consumption expenditure?
(b) Is the MPC statistically different from 1? Show the appropriate testing procedure.
(c) What is the rationale for the inclusion of the prime rate variable in the model? A priori, would you expect a negative sign for this variable?
(d) Is $\beta_3$ statistically different from zero?
(e) Test the hypothesis that $R^2$ = 0.
(f) Compute the standard error for each coefficient.
Q13. Using time series data for 1979 to 2009 for a certain economy, the following model of demand for money was estimated : [Eco. (H) IV Sem. 2016]
$MD_t = \beta_1 + \beta_2 Y_t + \beta_3 INTRATE_t + u_t$
Where
MD = Quantity of money demanded, measured in billions of rupees
Y = National income, measured in billions of rupees
INTRATE = Interest rate in percent on 3-month treasury bills
The table below has estimates of the coefficients and their standard errors :
Variables    Estimates of coefficients    Standard errors
CONSTANT      0.003                       0.009
Y             0.530                       0.112
INTRATE      -0.0261                      0.101
(a) Interpret the slope coefficients.
(b) Test the overall significance of the model, at the 5% level of significance, if the coefficient of determination reported for the model is 0.519.
Q14. The following regression model was estimated using data collected from 34 stores : [Eco. (H) IV Sem. 2018]
$\hat Y_i = 5837.53 - 53.217X_{2i} + 3.613X_{3i}$
se = (628.151) (6.853) (0.6852)
$Y_i$ = Monthly sales of 'Milky' chocolate bars for store i (number of bars)
$X_{2i}$ = Price of 'Milky' chocolate bars for store i (in rupees)
$X_{3i}$ = Monthly in-store promotional expenditure for store i (in thousand rupees)
(i) Interpret the estimated partial slope coefficients of $X_2$ and $X_3$.
(ii) Test the model for overall goodness of fit using the 5% level of significance.
Q15. Consider the following simple regression model : [Eco. (H) IV Sem. 2019]
$Price = \beta_0 + \beta_1 Assess + u$
Where, Price is the housing price and Assess is the assessment of housing price.
The estimated equation is
$\widehat{Price} = -14.47 + 0.976\,Assess$
se = (16.27) (0.049)
n = 88, SSR = 165644.51, $R^2$ = 0.820
(i) How will you test the constraints $\beta_1 = 1$ and $\beta_0 = 0$ in the above regression if you are given the SSR in the restricted model as 209448.99? Conduct the necessary test(s) at the 1% level of significance and give your conclusion.
(ii) Suppose now that the estimated model is
$Price = \beta_0 + \beta_1 Assess + \beta_2 Lotsize + \beta_3 Sqrft + \beta_4 Bdrms + u$
Where Lotsize = the size of the lot, Sqrft = the square footage, Bdrms = the number of bedrooms.
The $R^2$ from estimating this model using the same 88 houses is 0.829. Test at the 1% level of significance that all partial slope coefficients are equal to zero.
Q16. Demographic data from 126 countries is obtained for the year 2017. It is hypothesized that life expectancy (Y) is dependent on the number of under-five deaths ($X_2$), polio immunization coverage (D), Per Capita Govt. Exp. on Health Care ($X_3$) (in Rs crores), Per Capita GNI ($X_4$) (in Rs crores) and Average number of years of Schooling ($X_5$). Polio immunization coverage = 1 if yes and 0 otherwise. The following regressions were estimated : [Eco. (H) IV Sem. 2021]
MODEL 1 : $\hat Y_i = 0.903 - 0.561X_{2i} + 2.008X_{3i} + 0.553X_{4i} + 0.778X_{5i} + 3.638D$
se = (1.280) (0.405) (0.765) (0.712) (0.491)
$R^2$ = 0.787, RSS = 1339.8
MODEL 2 : $\hat Y_i = 1.379 + 0.594X_{3i} + 2.139D$
se = (0.406) (0.465)
.677 RSS = 1567.28
(i) Is it time series or cross-sectional data?
(ii) Show that Model 2 is a restricted version of Model 1. What is the restriction?
(iii) Test for the statistical significance of the restriction at the 5% level.
(iv) Construct a 95% confidence interval for the true coefficient of per capita government health expenditure in Model 2 and check whether it is statistically significant.
Q17. The estimated equation for sales of TV is given as below :
$\widehat{Sales} = 118.91 - 7.908\,Price + 1.863\,Advert$
(se) (6.35) (1.096) (0.683)
Where Price is the price of TV measured in Rs.
Sales is sales revenue and Advert is advertising expenditure. Both Sales and Advert are measured in thousands of rupees.
(i) Is the slope coefficient of price statistically different from 1? Test at α = 2%.
(ii) Calculate the elasticity of sales revenue with respect to price if average sales revenue is 300 and average price is 100.
(iii) How would you test that an increase in advertising expenditure will bring an increase in sales revenue that is sufficient to cover the increased advertising expenditure? Clearly state the null and alternative hypotheses. Test at α = 5%. (1.448, n = 30)
(iv) Estimate the sales revenue for a price of Rs. 6 and an advertising expenditure of Rs. 1200. [Eco. (H) IV Sem. 2022]

Basic Concepts

1. $R^2$ and Adjusted $R^2$ :
An important property of $R^2$ is that it is a non-decreasing function of the number of explanatory variables or regressors present in the model; as the number of regressors increases, $R^2$ almost invariably increases and never decreases. Stated differently, an additional X variable will not decrease $R^2$. To see this, recall the definition of the coefficient of determination :

$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum e_i^2}{\sum y_i^2}$

Now, $\sum y_i^2$ is independent of the number of X variables in the model because it is simply $\sum (Y_i - \bar Y)^2$. The RSS, $\sum e_i^2$, however, depends on the number of regressors present in the model. Intuitively, it is clear that as the number of X variables increases, $\sum e_i^2$ is likely to decrease (at least it will not increase); hence $R^2$ as defined above will increase (or at least not decrease). In view of this, in comparing two regression models with the same dependent variable but differing numbers of X variables, one should be very wary of choosing the model with the highest $R^2$. To compare two $R^2$ terms, one must take into account the number of X variables present in the model.
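The non-decreasing property of $R^2$ described above can be checked numerically. The sketch below is illustrative only: the data are simulated, and the sample size, the coefficients 5.0 and 1.5, and the helper name `r_squared` are assumptions rather than anything taken from the chapter. It fits Y on $X_2$, then adds a regressor that is pure noise and unrelated to Y, and compares the two $R^2$ values computed as 1 − RSS/TSS.

```python
import numpy as np

def r_squared(X, y):
    """R^2 = 1 - RSS/TSS for an OLS fit of y on X (X already contains a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = resid @ resid                       # residual sum of squares
    tss = ((y - y.mean()) ** 2).sum()         # total sum of squares, fixed for a given y
    return 1.0 - rss / tss

rng = np.random.default_rng(1)
n = 30                                        # hypothetical sample size
x2 = rng.normal(size=n)
y = 5.0 + 1.5 * x2 + rng.normal(size=n)       # assumed data-generating process
noise = rng.normal(size=n)                    # irrelevant regressor, unrelated to y

const = np.ones(n)
r2_small = r_squared(np.column_stack([const, x2]), y)
r2_big = r_squared(np.column_stack([const, x2, noise]), y)

print("R^2 with X2 only:          ", round(r2_small, 4))
print("R^2 with X2 and pure noise:", round(r2_big, 4))
```

Because TSS is the same in both fits and least squares can always set the extra coefficient to zero, the RSS of the larger model cannot exceed that of the smaller one, so the second $R^2$ is at least as large as the first even though the added variable carries no information about Y.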
This can be done readily if we consider an alternative coefficient of determination, which is as follows :

$\bar{R}^2 = 1 - \frac{\sum e_i^2 / (n - k)}{\sum y_i^2 / (n - 1)}$

where k = the number of parameters in the model including the intercept term. (In the three-variable regression, k = 3.) The $R^2$ thus defined is known as the adjusted $R^2$, denoted by $\bar{R}^2$. The term adjusted means adjusted for the df associated with the sums of squares entering into $R^2$ : $\sum e_i^2$ has n - k df in a model involving k parameters, which include the intercept term, and $\sum y_i^2$ has n - 1 df. The adjusted $R^2$ can be related to $R^2$ as under :

$\bar{R}^2 = 1 - (1 - R^2) \frac{n - 1}{n - k}$

Adjusted $R^2$ is a modification of $R^2$ that adjusts for the number of terms in a model. $R^2$ never decreases when a new term is added to a model, even when the additional variable does little to help explain the dependent variable. Adjusted $R^2$ compensates for this by correcting for the number of parameters in the model: it increases only if the new term improves the fit by more than the loss of a degree of freedom costs (for a single added variable, when the absolute t value of its coefficient exceeds 1). The result is an adjusted $R^2$ that can go up or down depending on whether the addition of another variable adds or does not add to the explanatory power of the model. For k > 1, adjusted $R^2$ is lower than unadjusted $R^2$; the two coincide only when $R^2$ = 1.

Properties of Adjusted $R^2$ : Adjusted $R^2$ has the following properties :
(i) Adjusted $R^2$ is always less than or equal to $R^2$.
(ii) $R^2$ can never be negative, but adjusted $R^2$ may take negative values for some values of $R^2$.

2. The “Game” of Maximizing $R^2$ : Sometimes researchers play the game of maximizing $R^2$, that is, choosing the model that gives the highest $R^2$.
But this may be dangerous, for in regression analysis our objective is not to obtain a high $R^2$ per se but rather to obtain dependable estimates of the true population regression coefficients and draw statistical inferences about them. In empirical analysis it is not unusual to obtain a very high $R^2$ but find that some of the regression coefficients either are statistically insignificant or have signs that are contrary to a priori expectations. Therefore, the researcher should be more concerned about the logical or theoretical relevance of the explanatory variables to the dependent variable and their statistical significance. If in this process we obtain a high $R^2$, well and good; on the other hand, if $R^2$ is low, it does not mean the model is necessarily bad.

Exercise 3
The Adjusted $R^2$

Q1. What are the properties of adjusted $R^2$?
Q2. Write short notes on $R^2$ vs $\bar{R}^2$. [BBE 2011]
Q3. What are degrees of freedom and $R^2$? How is adjusted $R^2$ an improvement over $R^2$? [BBE III Sem. 2012]
Q4. What is the adjusted coefficient of multiple determination? When would you prefer this measure over the coefficient of multiple determination? [Eco. (H) 2010]
Q5. Can we compare the $R^2$ of two models with the same dependent variable and different numbers of parameters? If not, what alternative to $R^2$ can be used?
Q6. For a regression of a variable Y on two explanatory variables, illustrate the ANOVA (analysis of variance) table. [Eco. (H) 2010]
Q7. Write a short note on ANOVA and its application. [BBE III Sem. 2012]
Q8. Is the following statement correct? Justify your answer carefully and provide proofs wherever necessary : [Eco. (H) III Sem. 2012]
An increase in the number of explanatory variables in a multiple regression model will necessarily increase adjusted R squared.
Q9. Comment on the following. Give reasons in support of your comment. [BBE 2014]
(a) The value of adjusted $R^2$ is always less than $R^2$.
(b) $R^2$
and adjusted $R^2$ are always positive.
Q10. State whether the following statement is true or false. Give reasons for your answer : [Eco. (H) IV Sem. 2017]
The adjusted $R^2$ is always less than the unadjusted $R^2$.
Q11. State whether the following statements are true or false. Justify your answer. [Eco. (H) IV Sem. 2022]
(a) The addition of a variable to a regression model with 30 observations and 4 variables would always lead to a rise in $R^2$ and adjusted $R^2$, given that the coefficient of the additional variable is statistically significantly different from zero at $\alpha$ = 20%.
(b) In a multiple regression model $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$, testing the joint restriction $H_0 : \beta_2 = \beta_3 = 0$ is the same as testing $H_0 : \beta_2 = 0$ and $H_0 : \beta_3 = 0$ separately.

Numerical and Conceptual Problems

Q1. Compute adjusted $R^2$ from the following data :
$\hat{Y}_i = -1336.049 + 12.7413 X_{2i} + 85.7640 X_{3i}$, $R^2$ = 0.8906, n = 32
[Ans. 0.8831]

Q2. The monthly salary (Wage, in hundreds of rupees), age (AGE, in years), number of years of experience (EXP, in years) and number of years of education (EDU) were obtained for 49 persons in a certain office. The estimated regression of Wage on the characteristics of a person was obtained as follows (with t statistics in parentheses) :

Wage = 632.244 + 142.510 EDU + 43.225 EXP - 1.913 AGE
t = (1.493) (4.088) (3.022) (0.22)

(i) The value of adjusted $R^2$ is $\bar{R}^2$ = 0.277. Using this information, test the model for overall significance.
(ii) Test the coefficients of EDU and EXP for statistical significance at the 1% level and the coefficient of AGE at the 10% level. [Eco. (H) 2012]

Q3. Using quarterly data for 10 years (n = 40) for the U.S.
economy, the following model of demand for new cars was estimated :

$NUMCARS_t = \beta_1 + \beta_2 PRICE_t + \beta_3 INCOME_t + \beta_4 INTRATE_t + u_t$

where
NUMCARS : Number of new car sales per thousand people
PRICE : New car price index
INCOME : Per capita real disposable income (in dollars)
INTRATE : Interest rate (in percent)

The table below gives estimates of the coefficients and their standard errors :

            Estimates of Coefficients    Standard errors
CONSTANT    -7.4534                      13.5782
PRICE       -0.0714                      0.0032
INCOME      0.0032                       0.0017
INTRATE     -0.1537                      0.0491

(i) A priori, what are the expected signs of the partial slope coefficients? Are the results in accordance with these expectations?
(ii) Interpret the various slope coefficients and test whether they are individually statistically different from zero. Use the 10% level of significance.
(iii) The adjusted R squared reported for this model is 0.758. Test the model for overall goodness of fit at the 5% level of significance. [Eco. (H) III Sem. 2012]

Q4. Consider the following data on hourly wage rates (Y), labour productivity ($X_1$) and literacy rate ($X_2$) in a country ABV :

Y     90   2    54   42   30   12
X1    3    a    6    8    12   14
X2    16   10   ?    4    3    2
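
The adjusted $R^2$ relation and the F statistics called for in the exercises above can be checked numerically. What follows is a minimal Python sketch and not part of the original text: the function names are illustrative assumptions, and k counts all parameters including the intercept, as in the definition of adjusted $R^2$. It assumes the standard results F = [$R^2$/(k - 1)] / [(1 - $R^2$)/(n - k)] for overall significance and F = [(RSS_R - RSS_UR)/m] / [RSS_UR/(n - k)] for m linear restrictions.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k); k includes the intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


def r2_from_adjusted(adj_r2, n, k):
    """Invert the relation above to recover R^2 from a reported adjusted R^2."""
    return 1.0 - (1.0 - adj_r2) * (n - k) / (n - 1)


def overall_f(r2, n, k):
    """F statistic for H0: all partial slope coefficients are zero.

    Degrees of freedom: (k - 1) in the numerator, (n - k) in the denominator.
    """
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


def restricted_f(rss_r, rss_ur, m, n, k):
    """F statistic for m linear restrictions (restricted least squares).

    rss_r and rss_ur are the residual sums of squares of the restricted and
    unrestricted models; k is the parameter count of the unrestricted model.
    """
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k))


# Numerical Q1 above: R^2 = 0.8906, n = 32, k = 3 (intercept plus two slopes)
print(round(adjusted_r2(0.8906, 32, 3), 4))  # 0.8831, the answer given in the text
```

When only the adjusted $R^2$ is reported, as in the wage and new-car questions, `r2_from_adjusted` can be applied first and its result passed to `overall_f`. The sketch computes test statistics only; critical values still have to be read from the F tables at the stated level of significance.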
