Example: Multiple Linear Regression
Suppose we have the following dataset with one response variable y and two predictor variables
X1 and X2:
Use the following steps to fit a multiple linear regression model to this dataset.
Step 1: Calculate X1², X2², X1y, X2y and X1X2 for each observation, then total each column.
Step 2: Calculate Regression Sums.
Next, make the following regression sum calculations:
Σx1² = ΣX1² – (ΣX1)² / n
Σx2² = ΣX2² – (ΣX2)² / n
Σx1y = ΣX1y – (ΣX1Σy) / n
Σx2y = ΣX2y – (ΣX2Σy) / n
Σx1x2 = ΣX1X2 – (ΣX1ΣX2) / n
Step 3: Calculate b0, b1, and b2.
The formula to calculate b1 is: [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
The formula to calculate b2 is: [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
The formula to calculate b0 is: ȳ – b1X̄1 – b2X̄2, where ȳ, X̄1 and X̄2 are the sample means of y, X1 and X2.
Step 4: Place b0, b1, and b2 in the estimated linear regression equation.
The estimated linear regression equation is: ŷ = b0 + b1*x1 + b2*x2
Note: the above equation can also be written in the form
ŷ = ȳ + b1(X1 − X̄1) + b2(X2 − X̄2)
(Reference: B.L. Agarwal)
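The four steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name fit_two_predictors and the toy data are hypothetical, and the toy y values are generated from y = 1 + 2*x1 + 3*x2 so that the fitted coefficients are easy to check.

```python
def fit_two_predictors(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 with the deviation-sum formulas (Steps 1-4)."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need three or more observations of equal length")

    # Step 1: column totals of X1, X2, y, X1^2, X2^2, X1*y, X2*y, X1*X2
    sum_x1, sum_x2, sum_y = sum(x1), sum(x2), sum(y)
    sum_x1sq = sum(a * a for a in x1)
    sum_x2sq = sum(b * b for b in x2)
    sum_x1y = sum(a * c for a, c in zip(x1, y))
    sum_x2y = sum(b * c for b, c in zip(x2, y))
    sum_x1x2 = sum(a * b for a, b in zip(x1, x2))

    # Step 2: regression sums (sums of squares and products about the means)
    s11 = sum_x1sq - sum_x1 ** 2 / n
    s22 = sum_x2sq - sum_x2 ** 2 / n
    s1y = sum_x1y - sum_x1 * sum_y / n
    s2y = sum_x2y - sum_x2 * sum_y / n
    s12 = sum_x1x2 - sum_x1 * sum_x2 / n

    # Step 3: coefficients
    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        raise ValueError("X1 and X2 are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    b0 = sum_y / n - b1 * sum_x1 / n - b2 * sum_x2 / n

    # Step 4: the caller places these in y_hat = b0 + b1*x1 + b2*x2
    return b0, b1, b2


# Hypothetical toy data (not the dataset of this example)
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy y values lie exactly on a plane, the recovered coefficients should match 1, 2 and 3 up to floating-point error.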
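Step 1: Calculate the column totals. For this dataset, n = 8 and:
ΣX1 = 555, ΣX2 = 145, Σy = 1,452
ΣX1² = 38,767, ΣX2² = 2,823
ΣX1y = 101,895, ΣX2y = 25,364, ΣX1X2 = 9,859
Step 2: Calculate the regression sums.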
Σx1² = ΣX1² – (ΣX1)² / n = 38,767 – (555)² / 8 = 263.875
Σx2² = ΣX2² – (ΣX2)² / n = 2,823 – (145)² / 8 = 194.875
Σx1y = ΣX1y – (ΣX1Σy) / n = 101,895 – (555*1,452) / 8 = 1,162.5
Σx2y = ΣX2y – (ΣX2Σy) / n = 25,364 – (145*1,452) / 8 = -953.5
Σx1x2 = ΣX1X2 – (ΣX1ΣX2) / n = 9,859 – (555*145) / 8 = -200.375
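The five regression sums can be checked with a few lines of Python. This is a small sketch that starts from the column totals quoted in the calculations above; the helper name corrected_sum and the dictionary keys are hypothetical.

```python
n = 8
# Column totals taken from the worked calculations above
totals = {
    "X1": 555, "X2": 145, "y": 1452,
    "X1X1": 38767, "X2X2": 2823,
    "X1y": 101895, "X2y": 25364, "X1X2": 9859,
}


def corrected_sum(sum_ab, sum_a, sum_b, n):
    """Sum of products about the means: sum(ab) - sum(a)*sum(b)/n."""
    return sum_ab - sum_a * sum_b / n


s11 = corrected_sum(totals["X1X1"], totals["X1"], totals["X1"], n)
s22 = corrected_sum(totals["X2X2"], totals["X2"], totals["X2"], n)
s1y = corrected_sum(totals["X1y"], totals["X1"], totals["y"], n)
s2y = corrected_sum(totals["X2y"], totals["X2"], totals["y"], n)
s12 = corrected_sum(totals["X1X2"], totals["X1"], totals["X2"], n)

print(s11, s22, s1y, s2y, s12)
# prints: 263.875 194.875 1162.5 -953.5 -200.375
```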
Step 3: Calculate b0, b1, and b2.
The formula to calculate b1 is: [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
Thus, b1 = [(194.875)(1162.5) – (-200.375)(-953.5)] / [(263.875)(194.875) – (-200.375)²] = 3.148
The formula to calculate b2 is: [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
Thus, b2 = [(263.875)(-953.5) – (-200.375)(1162.5)] / [(263.875)(194.875) – (-200.375)²] = -1.656
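The formula to calculate b0 is: ȳ – b1X̄1 – b2X̄2
Thus, b0 = 181.5 – 3.148(69.375) – (-1.656)(18.125) = -6.867
(This value uses the unrounded b1 and b2; substituting the rounded values shown gives about -6.878.)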
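The Step 3 arithmetic can be reproduced as below. This is a sketch that plugs in the regression sums and totals from the calculations above; the variable names are hypothetical. The coefficients are kept unrounded until the final print, which is why b0 comes out as -6.867.

```python
# Regression sums from Step 2
s11, s22 = 263.875, 194.875
s1y, s2y, s12 = 1162.5, -953.5, -200.375

# Sample means from the column totals (n = 8)
y_bar, x1_bar, x2_bar = 1452 / 8, 555 / 8, 145 / 8

denom = s11 * s22 - s12 ** 2          # shared denominator of b1 and b2
b1 = (s22 * s1y - s12 * s2y) / denom
b2 = (s11 * s2y - s12 * s1y) / denom
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

print(round(b0, 3), round(b1, 3), round(b2, 3))
# prints: -6.867 3.148 -1.656
```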
Step 4: Place b0, b1, and b2 in the estimated linear regression equation.
The estimated linear regression equation is: ŷ = b0 + b1*x1 + b2*x2
In our example, it is ŷ = -6.867 + 3.148x1 – 1.656x2
Here is how to interpret this estimated linear regression equation: ŷ = -6.867 + 3.148x1 – 1.656x2
b0 = -6.867. When both predictor variables are equal to zero, the mean value for y is -6.867.
b1 = 3.148. A one unit increase in x1 is associated with a 3.148 unit increase in y, on average,
assuming x2 is held constant.
b2 = -1.656. A one unit increase in x2 is associated with a 1.656 unit decrease in y, on average,
assuming x1 is held constant.
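The interpretation of b1 and b2 can be illustrated numerically. The sketch below uses the rounded coefficients from the fitted equation; the function name predict and the input values x1 = 70, x2 = 20 are hypothetical and chosen only for illustration.

```python
B0, B1, B2 = -6.867, 3.148, -1.656  # rounded coefficients of the fitted equation


def predict(x1, x2):
    """Estimated mean of y for given x1 and x2."""
    return B0 + B1 * x1 + B2 * x2


base = predict(70, 20)                  # hypothetical inputs
effect_x1 = predict(71, 20) - base      # x1 up by one unit, x2 held constant
effect_x2 = predict(70, 21) - base      # x2 up by one unit, x1 held constant

print(round(base, 3), round(effect_x1, 3), round(effect_x2, 3))
```

The two differences equal b1 and b2 respectively (up to floating-point error), which is what "held constant" means in the interpretation above.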
Dry Weight of plants (mg) Y    Root Length (cm) X1    Shoot Length (cm) X2
412                            28.7                   21.5
226                            13.4                   11.7
292                            14.6                   12.9
323                            18.0                   14.8
233                            12.1                   11.0
368                            23.4                   19.2
239                            12.6                   11.4
382                            30.2                   22.6
218                            11.6                   10.8
222                            12.0                   10.2
214                            12.4                   10.1
Total 3129                     189.0                  156.2
Considering that the dry weight of plants depends on root length and shoot length, we would
(i) fit the linear regression equation of Y on X1 and X2,
(ii) estimate Y for given X1 = 12 and X2 = 10,
(iii) find the value of R and interpret it,
(iv) test the overall significance using the F-test,
(v) test the significance of each partial regression coefficient.
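Tasks (i) to (v) can be sketched in plain Python with the same deviation-sum formulas used in the first example. This is an illustrative sketch, not the handout's own solution: it assumes the eleven observations exactly as printed in the table, the helper name dev_sum is hypothetical, and the standard-error expressions are the usual ones for a model with two predictors. No numerical results are quoted here; run the code to obtain them.

```python
import math

# Observations as printed in the table above
y = [412, 226, 292, 323, 233, 368, 239, 382, 218, 222, 214]
x1 = [28.7, 13.4, 14.6, 18.0, 12.1, 23.4, 12.6, 30.2, 11.6, 12.0, 12.4]
x2 = [21.5, 11.7, 12.9, 14.8, 11.0, 19.2, 11.4, 22.6, 10.8, 10.2, 10.1]
n, k = len(y), 2  # k = number of predictors


def dev_sum(a, b):
    """Sum of products about the means: sum(ab) - sum(a)*sum(b)/n."""
    return sum(p * q for p, q in zip(a, b)) - sum(a) * sum(b) / len(a)


s11, s22, s12 = dev_sum(x1, x1), dev_sum(x2, x2), dev_sum(x1, x2)
s1y, s2y, syy = dev_sum(x1, y), dev_sum(x2, y), dev_sum(y, y)

# (i) fit Y on X1 and X2
denom = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / denom
b2 = (s11 * s2y - s12 * s1y) / denom
b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n

# (ii) estimate Y at X1 = 12, X2 = 10
y_hat = b0 + b1 * 12 + b2 * 10

# (iii) multiple correlation coefficient R
ss_reg = b1 * s1y + b2 * s2y          # regression sum of squares
r_squared = ss_reg / syy
R = math.sqrt(r_squared)

# (iv) overall F statistic with (k, n - k - 1) degrees of freedom
ss_res = syy - ss_reg                 # residual sum of squares
mse = ss_res / (n - k - 1)
F = (ss_reg / k) / mse

# (v) t statistics for b1 and b2, each with n - k - 1 degrees of freedom
se_b1 = math.sqrt(mse * s22 / denom)
se_b2 = math.sqrt(mse * s11 / denom)
t1, t2 = b1 / se_b1, b2 / se_b2

print(f"b0={b0:.3f} b1={b1:.3f} b2={b2:.3f} y_hat={y_hat:.2f}")
print(f"R={R:.4f} R^2={r_squared:.4f} F={F:.2f} t1={t1:.3f} t2={t2:.3f}")
```

With n = 11 and two predictors, the F statistic is compared with the F(2, 8) critical value (4.46 at the 5% level) and each t statistic with the two-sided t critical value on 8 degrees of freedom (2.306 at the 5% level).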