07 Multiple Regression

multiple regression

Uploaded by

Daniel Rotari
EM Multiple Regression 2025

Introduction Multiple regression Collinearity Interaction Different responses

Multiple Regression
Day 7
Multiple explanatory variables · Regression coefficients · Linear regression ·
Normality · Transformation · Additive effect · Intercept · Interaction · Parallel lines
· Response curves · Lognormal · Exponentiation · Ecological optimum · Gaussian
curve · Parabola · Back-transformation · Width of response · Hump-shaped
response · Multiple logistic regression · Adjusted R square · Explained variation ·
Collinearity · Variance inflation factor · Correlation matrix · Loglinear regression

Ecological Methods 2025

Multiple regression: more than 1 independent variable

Regression
1. Correlation and introduction to regression
   • General model
2. Different distributions
   • Different response curves
3. Multiple regression
   • More than 1 independent variable
4. Zero-inflated models
   • Lots of zeros

Important!
Interaction ≠ Collinearity

Today's topics
1. Multiple regression
2. Collinearity
3. Interaction
4. Different response curves
Multiple regression
Regression with more than 1 independent variable

Linear regression:
y = b0 + b1 x1 + b2 x2 + b3 x3 + ...

• 1 dependent variable y
• 2 or more independent variables xi
• Possible interactions between the independent variables

Regression coefficients estimated with the least-squares method (OLS)

Linear regression
Normality test on residuals
If not: transform dependent variable (ln, sqrt, etc.)
Then: calculate residuals again
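The workflow on these two slides (fit by least squares, test the residuals for normality, transform and refit if needed) can be sketched in R as follows. This is a minimal sketch on simulated data: the data frame dat and its values are hypothetical and not the course data set, and the Shapiro-Wilk test is an assumption, since the slides do not name a specific normality test.

```r
# Simulated data (hypothetical; not the course data set)
set.seed(1)
n <- 100
water <- runif(n, 1, 10)
nutrients <- runif(n, 1, 10)
biomass <- 0.1 + 0.5 * water + 0.6 * nutrients + rnorm(n, sd = 0.5)
dat <- data.frame(biomass, water, nutrients)

# Ordinary least squares fit with two independent variables
fit <- lm(biomass ~ water + nutrients, data = dat)
coef(fit)  # estimates of b0, b1 and b2

# Normality test on the residuals (Shapiro-Wilk is one common choice)
res.test <- shapiro.test(residuals(fit))
res.test$p.value

# If normality is rejected: transform the dependent variable, refit,
# and calculate the residuals again
fit.ln <- lm(log(biomass + 1) ~ water + nutrients, data = dat)
shapiro.test(residuals(fit.ln))$p.value
```

Here the simulated errors are normal by construction, so the transformation step is shown only to illustrate the order of operations.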

1 independent variable: linear regression

[Figure: plant biomass (y) against water (x1). Fitted line: y = -0.48 + 0.5836 water, R2 = 0.9299]

1 independent variable: linear regression

[Figure: plant biomass (y) against nutrients (x2). Fitted line: y = 2.4733 + 0.5867 nutrients, R2 = 0.9219]

More than 1 independent variable: multiple regression

[Figure: 3D plot of biomass against water and nutrients]

y = 0.1 + 0.5 water + 0.6 nutrients (additive effect)

What do we test in multiple regression?

Hypothesis for intercept:
• Intercept does not deviate from zero

Hypothesis for each independent variable:
• Regression coefficient for independent variable does not deviate from zero
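These hypotheses are what the coefficient table of summary() reports: one t test per row, for the intercept and for each independent variable. A minimal sketch, assuming simulated data (all values are hypothetical), that rebuilds the t and p values by hand:

```r
# Simulated data (hypothetical; not the course data set)
set.seed(2)
n <- 80
water <- runif(n, 1, 10)
nutrients <- runif(n, 1, 10)
biomass <- 0.1 + 0.5 * water + 0.6 * nutrients + rnorm(n, sd = 0.5)
fit <- lm(biomass ~ water + nutrients)

# One row per coefficient: estimate, standard error, t value, p value
tab <- summary(fit)$coefficients
tab

# The t value is the estimate divided by its standard error
t.manual <- tab[, "Estimate"] / tab[, "Std. Error"]

# Two-sided p value from the t distribution with the residual degrees of freedom
p.manual <- 2 * pt(abs(t.manual), df = df.residual(fit), lower.tail = FALSE)
```

A small p value in a row means the null hypothesis "this coefficient does not deviate from zero" is rejected for that row.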
[Figure: biomass against water for fixed nutrient levels. Red line: nutrients = 2. Green line: nutrients = 4]

y = 0.1 + 0.5 water + 0.6 nutrients
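The additive model above implies that the lines for different nutrient levels are parallel. A short sketch using the coefficients from the slide equation (the helper name biomass.hat is hypothetical):

```r
# Coefficients taken from the slide equation
b0 <- 0.1
b.water <- 0.5
b.nutrients <- 0.6

# Predicted biomass under the additive model (hypothetical helper)
biomass.hat <- function(water, nutrients) {
  b0 + b.water * water + b.nutrients * nutrients
}

water <- 0:10
red <- biomass.hat(water, nutrients = 2)
green <- biomass.hat(water, nutrients = 4)

# The vertical distance between the lines is constant: 0.6 * (4 - 2) = 1.2
green - red

# Both lines have the same slope for water: 0.5
diff(red)
diff(green)
```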

[Figure: biomass against nutrients for fixed water levels. Red line: water = 1. Green line: water = 3]

y = 0.1 + 0.5 water + 0.6 nutrients

Multiple Regression Example in R (part 1)
2 independent variables: soil water and soil nutrients
library(readxl)
library(tidyverse)

# Model effect water
model.water <- lm(biomass ~ water, data = biomass)
summary(model.water)

# Model effect nutrients
model.nutrients <- lm(biomass ~ nutrients, data = biomass)
summary(model.nutrients)

# Model effect water and nutrients
model.water.nutrients <- lm(biomass ~ water + nutrients, data = biomass)
summary(model.water.nutrients)

[Figures: soil water; soil nutrients]

Overlap in explained variation

[Figure: total variation, with overlapping parts explained by water and by nutrients]
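Overlap in explained variation shows up when the R2 values of the single-variable models add up to more than the R2 of the combined model. A minimal sketch, assuming simulated data in which nutrients is correlated with water (all values are hypothetical):

```r
# Simulated data with correlated independent variables (hypothetical)
set.seed(3)
n <- 100
water <- runif(n, 1, 10)
nutrients <- 0.7 * water + rnorm(n, sd = 1)
biomass <- 0.1 + 0.5 * water + 0.6 * nutrients + rnorm(n, sd = 0.5)

# Helper: fraction of the total variation explained by a model
r2 <- function(model) summary(model)$r.squared

r2.water <- r2(lm(biomass ~ water))
r2.nutrients <- r2(lm(biomass ~ nutrients))
r2.both <- r2(lm(biomass ~ water + nutrients))

c(water = r2.water, nutrients = r2.nutrients, both = r2.both)

# Overlap: variation that either variable can explain on its own
r2.water + r2.nutrients - r2.both
```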
Problem with multiple regression

• Do different independent variables have an additive effect?
• Or is there overlap in explained variation?

However:
• 2 independent variables might be highly correlated
• High correlation: they have the same effect on Y
• Substitution between the independent variables
• No need for both variables in the model

Problem of collinearity

Plant growth   Soil moisture   Clay content
13.6           35              11
11.7           30              9.8
12.4           25              10.2
10.8           20              6.4
10.4           15              4.4

Regression
growth = b0 + b1 soil moisture

Regression
growth = b0 + b1 clay content

Regression
growth = b0 + ?

Collinearity
• High collinearity when two independent variables are strongly correlated
• High level of collinearity: type II error (not rejecting H0, so not including the coefficient in the model)
• Affects adjusted R2
• Low levels of collinearity: not a big problem
Collinearity

Correlation matrix:
• Correlation coefficient r

Variance Inflation Factor:
• VIF = 1/(1 - r^2)

To avoid collinearity: VIF < 5

Linear regression (normal distribution)
Loglinear regression (Poisson distribution)
Logistic regression (binomial distribution)

Try to avoid multicollinearity when using more than 1 independent variable
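As a worked example, the correlation matrix and the VIF can be computed by hand for the plant growth table shown earlier (soil moisture and clay content). The sketch below uses base R only. With exactly two independent variables, the formula VIF = 1/(1 - r^2) applies directly to their correlation coefficient r. The variable names are chosen here for illustration.

```r
# Data from the plant growth table (plant growth, soil moisture, clay content)
growth   <- c(13.6, 11.7, 12.4, 10.8, 10.4)
moisture <- c(35, 30, 25, 20, 15)
clay     <- c(11, 9.8, 10.2, 6.4, 4.4)

# Correlation matrix of the independent variables
cor(cbind(moisture, clay))

# With two independent variables, the VIF follows from their correlation r
r <- cor(moisture, clay)        # about 0.93
vif.manual <- 1 / (1 - r^2)     # about 7.3
vif.manual
```

The result is above the threshold of 5, so soil moisture and clay content should not both be included in the model for these data.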

Multiple Regression Example in R (part 2)
Variance Inflation Factor (VIF)

library(readxl)
library(tidyverse)
library(car)  # provides vif()

# Model with both independent variables (slide annotation: "Model effect moisture and clay")
model.water.nutrients <- lm(biomass ~ water + nutrients, data = biomass)
summary(model.water.nutrients)

# Variance Inflation Factor (VIF)
vif(model.water.nutrients)

Interaction:
effect of one independent variable (x1) on the dependent variable (y) depends on the other independent variable (x2)
Parallel lines = no interaction

[Figure: biomass against nutrients for fixed water levels. Red line: water = 1. Green line: water = 3]

y = 0.1 + 0.5 water + 0.6 nutrients

Multiple regression with interaction

y = 0.1 + 0.5 water + 0.6 nutrients + 0.7 water × nutrients
y = b0 + b1 x1 + b2 x2 + b3 x1 x2
Interaction term is the product of x1 and x2

[Figure: 3D plot of biomass against water and nutrients, with the lines for nutrients = 2 and nutrients = 4 marked]

Lines do not run parallel = interaction

[Figure, two panels: biomass against water (red: nutrients = 2, green: nutrients = 4) and biomass against nutrients (red: water = 1, green: water = 3)]
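With the interaction term, the slope for water is no longer a single number: rearranging y = b0 + (b1 + b3 x2) x1 + b2 x2 shows that it equals b1 + b3 × nutrients. A short sketch with the coefficients from the slide equation (the helper names are hypothetical):

```r
# Coefficients taken from the slide equation
b0 <- 0.1
b1 <- 0.5   # water
b2 <- 0.6   # nutrients
b3 <- 0.7   # water x nutrients

# Predicted biomass under the interaction model (hypothetical helper)
biomass.hat <- function(water, nutrients) {
  b0 + b1 * water + b2 * nutrients + b3 * water * nutrients
}

# Slope of biomass against water at a given nutrient level
slope.water <- function(nutrients) b1 + b3 * nutrients

slope.water(2)  # 0.5 + 0.7 * 2 = 1.9
slope.water(4)  # 0.5 + 0.7 * 4 = 3.3

# Same slope obtained from the predictions: change in biomass per unit of water
diff(biomass.hat(c(0, 1), nutrients = 2))
```

The two slopes differ, which is why the lines for nutrients = 2 and nutrients = 4 do not run parallel.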

What do we test in multiple regression?

Hypothesis for intercept
• Intercept does not deviate from zero

Hypothesis for each independent variable
• Regression coefficient for independent variable does not deviate from zero

Hypothesis for the interaction between the independent variables
• Regression coefficient for interaction does not deviate from zero

Multiple Regression Example in R (part 3)
2 independent variables and interaction
library(readxl)
library(tidyverse)
library(car)

# Model with interaction
model <- lm(biomass ~ water + nutrients + water:nutrients, data = biomass)
summary(model)

Soil water and soil nutrients both have a positive effect on the biomass of
trees, and they show an interactive effect: the effect of soil water
increases with increasing nutrients and vice versa (linear regression, n=131,
water: t=36.557, p<0.001; nutrients: t=5.923, p<0.001; interaction: t=2.826,
p=0.005).

Effect size

[Figure, two panels: abundance of species against water (large effect) and abundance of species against nutrients (small effect)]

Which variable has the strongest effect?

• We cannot use P-value
• Regression coefficient has units
• To compare effect size: scale()
• To standardize regression coefficients

To compare effect size: scale(independent variable)
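The bullets above can be illustrated with a minimal sketch on simulated data (the units and values are hypothetical). The raw coefficients depend on the units of each variable, while the coefficients on scale()d variables are expressed per standard deviation and can be compared directly.

```r
# Simulated data with very different ranges (hypothetical units)
set.seed(4)
n <- 120
water <- runif(n, 0, 100)     # wide range
nutrients <- runif(n, 0, 5)   # narrow range
biomass <- 2 + 0.05 * water + 0.6 * nutrients + rnorm(n, sd = 0.5)

raw <- lm(biomass ~ water + nutrients)
std <- lm(biomass ~ scale(water) + scale(nutrients))

coef(raw)   # per unit of each variable: nutrients looks stronger
coef(std)   # per standard deviation: water is stronger

# A standardized coefficient equals the raw coefficient times sd(x)
std.water <- unname(coef(raw)["water"] * sd(water))
std.nutrients <- unname(coef(raw)["nutrients"] * sd(nutrients))
```

In this simulated example the ranking of the two variables reverses after scaling, which is the reason raw coefficients should not be used to compare effect sizes.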
Interaction → independent variables and interaction variable are correlated

[Figure: Real Madrid, FC Barcelona, and the National team]
Correlations between Real Madrid and National team & Barcelona and National team

It is good to use scale() in multiple regression

Multiple Regression Example in R (part 4)
Use scale()
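One reason scale() helps is that the interaction term is the product of the independent variables, so with positive raw values it is correlated with each of them. Centering and scaling first removes most of that correlation. A minimal sketch, assuming simulated and independent water and nutrients values (hypothetical data):

```r
# Simulated independent variables with positive values (hypothetical)
set.seed(5)
n <- 200
water <- runif(n, 0, 10)
nutrients <- runif(n, 0, 10)

# Correlation between water and the raw interaction term
cor.raw <- cor(water, water * nutrients)

# The same after centering and scaling both variables
water.s <- as.numeric(scale(water))
nutrients.s <- as.numeric(scale(nutrients))
cor.scaled <- cor(water.s, water.s * nutrients.s)

c(raw = cor.raw, scaled = cor.scaled)
```

A lower correlation between a variable and its interaction term translates into a lower VIF for the model with interaction.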

library(readxl)
library(tidyverse)
library(car)

# Model interaction using scale()
model.interaction.scale <- lm(biomass ~ scale(water) + scale(nutrients) + scale(water):scale(nutrients), data = biomass)
summary(model.interaction.scale)

# Calculate VIF value
vif(model.interaction.scale)
Response curve       Distribution of y   Transformation
Exponential curve    Lognormal           Ln(y) or Ln(y+1)
Curve with optimum   Lognormal           Ln(y) or Ln(y+1)

[Figures: biomass (g/m2) against pH]

Response with optimum: Gaussian curve

Fitting a parabola is a special case of multiple linear regression
where x1 = x and x2 = x^2

y = b0 + b1 x + b2 x^2 (b2 x^2 is the additional term)
This case: ln(y+1) = b0 + b1 pH + b2 pH^2
ln or log10 to avoid predicting values < 0

[Figure: Ln(y+1) against pH]

Response with optimum: Gaussian curve

Excel: explanation of x2
R: Gaussian curve
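A parabola on the log scale corresponds to a Gaussian curve on the original scale. Matching ln(y) = b0 + b1 x + b2 x^2 term by term with ln(y) = ln(c) - (x - u)^2 / (2 t^2) gives the optimum u = -b1 / (2 b2), the width of the response (tolerance) t = sqrt(-1 / (2 b2)), and the maximum c = exp(b0 - b1^2 / (4 b2)), provided b2 is negative. The sketch below checks this on simulated data. The true values (maximum 80 g/m2, optimum pH 6, tolerance 1) are hypothetical, and raw pH is used instead of scale(pH) so that the optimum comes out in pH units.

```r
# Simulated Gaussian response with lognormal noise (hypothetical values)
set.seed(6)
pH <- seq(3, 9, length.out = 50)
biomass <- 80 * exp(-(pH - 6)^2 / (2 * 1^2)) * exp(rnorm(50, sd = 0.05))

# Parabola on the log scale: multiple regression with pH and pH^2
fit <- lm(log(biomass) ~ pH + I(pH^2))
b <- unname(coef(fit))  # b0, b1, b2 (b2 must be negative for a hump)

# Back-calculate the parameters of the Gaussian curve
optimum <- -b[2] / (2 * b[3])
tolerance <- sqrt(-1 / (2 * b[3]))
maximum <- exp(b[1] - b[2]^2 / (4 * b[3]))

c(optimum = optimum, tolerance = tolerance, maximum = maximum)
```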

library(readxl)
library(tidyverse)
library(car)

# Fit model, including ^2 of scale(pH)
Gaussian.curve <- lm(log(biomass) ~ scale(pH) + I(scale(pH)^2), data = gaussian)
summary(Gaussian.curve)

# Add backtransformed predictions
gaussian$predicted <- exp(predict(Gaussian.curve))

# Check multicollinearity
vif(Gaussian.curve)

Multiple regression
• More than 1 independent variable in linear regression
• Can be used to test a hump-shaped response:

[Figure: biomass (g/m2) against pH]

Ln(y+1) = b0 + b1 x + b2 x^2
Multiple regression
• Also for other distributions of dependent variable: multiple logistic regression

[Figure: presence/absence against pH]

Pr = e^(b0 + b1 x + b2 x^2) / (1 + e^(b0 + b1 x + b2 x^2))
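The logistic formula above can be fitted with glm() and a binomial family. A minimal sketch, assuming simulated presence/absence data with the highest probability of presence at pH 6 (all values are hypothetical), which also checks that the formula reproduces the predicted probabilities:

```r
# Simulated presence/absence data (hypothetical)
set.seed(7)
n <- 200
pH <- runif(n, 3, 9)
p.true <- plogis(2 - (pH - 6)^2)   # hump-shaped probability of presence
presence <- rbinom(n, size = 1, prob = p.true)

# Multiple logistic regression with pH and pH^2
fit <- glm(presence ~ pH + I(pH^2), family = binomial)
b <- unname(coef(fit))  # b0, b1, b2

# Probability of presence from the formula on the slide
eta <- b[1] + b[2] * pH + b[3] * pH^2
pr.manual <- exp(eta) / (1 + exp(eta))

# The same probabilities as returned by predict()
pr.predict <- unname(predict(fit, type = "response"))
```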

Model selection
