5 - 2022 - Advanced Bio2 - Linear Mixed Models - Intro

This document discusses linear mixed models, also known as random effects models, multilevel models, or mixed models. These models contain both fixed effects, like covariates in a standard regression, as well as random effects, which account for variability beyond the error term. Mixed models decompose the total variance in a dataset into components that can be interpreted, such as variability between individuals. Key aspects of mixed models include writing the model equations to include random intercepts and/or random slopes, interpreting the variance components that result from fitting the random effects, and visualizing the model predictions which utilize both the fixed and random effect estimates.

Linear Mixed Models

Biostats 3
aka
Random effects models

Multilevel models

Mixed models

Variance components models


Learning objectives
Differentiate between fixed and random effects, both in general terms (as variables) and in specific terms (in the model equation)

Write down (subscripts and all): the random intercepts model, and the random intercepts/random slopes model

Draw line diagrams showing the essential differences between random slopes, random intercepts, GLMs, etc.
We use the term “random effects” in a general way to talk about ‘what should you
fit as a random effect’ in your model, and also in very specific (mathematical) ways
as specific elements of statistical equations.

Random effects models, or mixed effects models, typically denote models with
both types of effects (fixed effects and random effects).
Mental model
Think: continuous Y vs a (special) covariate, time

Repeated measures on individuals
"I don't expect a model to be correct, I am only interested in
whether the terms in the model are useful for explaining the
observed data."

This is especially important with respect to mixed effects models: we are
looking for relevance and utility. Remember, *all models are wrong*.
1. Fixed effect - a covariate or explanatory variable (e.g. age, gender); these
behave like ordinary regression coefficients.

2. Random effect - a variable whose levels are considered stochastic (randomly
sampled), adding variability beyond the usual error term.

The main idea, which we will see again, is that the variance (variability) in a data
set can be decomposed into a sum of several components, each of which can be
given a useful interpretation.
Equations
GLM: y_i = 𝛼 + 𝛽x_i,  i = 1, …, N (number of individuals)

GLM: y_ij = 𝛼 + 𝛽x_ij,  i = 1, …, N (number of individuals); j = 1, …, T (number of time points*)

Why is this still a GLM?

(*We pretend, for now, that there is no missing data and everyone has exactly the same number of time points)
GLM: y_ij = 𝛼 + 𝛽x_ij,  i = 1, …, N; j = 1, …, T

Not a GLM: y_ij = (𝛼 + a_i) + 𝛽x_ij,  i = 1, …, N (number of individuals); j = 1, …, T (number of time points*)

Person 1: y_1j = (𝛼 + a_1) + 𝛽x_1j    Just an intercept to estimate

Person 2: y_2j = (𝛼 + a_2) + 𝛽x_2j    Just a (slightly different) intercept to estimate

Person 3: y_3j = (𝛼 + a_3) + 𝛽x_3j    Just a (slightly different) intercept to estimate


Instead of estimating each a_i, we will assume a distribution: a_i ~ N(0, σ²_A)
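To make the distributional assumption concrete, here is a minimal simulation sketch. It assumes the covariate is simply the time point, and it adds a residual error term e_ij that the equations on these slides leave implicit; the function name and all parameter values are hypothetical, chosen only for illustration.

```python
import random
import statistics

def simulate_random_intercepts(n_people, n_times, alpha, beta, sd_a, sd_resid, seed=1):
    """Simulate y_ij = (alpha + a_i) + beta * x_ij + e_ij with a_i ~ N(0, sd_a^2)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_people):
        a_i = rng.gauss(0.0, sd_a)           # one draw per person, shared by all their rows
        for j in range(n_times):
            x_ij = float(j)                  # assumption: the time point is the covariate
            e_ij = rng.gauss(0.0, sd_resid)  # assumed residual error, new for every row
            rows.append({"person": i, "x": x_ij, "a": a_i,
                         "y": (alpha + a_i) + beta * x_ij + e_ij})
    return rows

# Hypothetical values, not estimates from any real data set.
rows = simulate_random_intercepts(n_people=200, n_times=5, alpha=250.0, beta=10.0,
                                  sd_a=30.0, sd_resid=25.0)
intercept_draws = [r["a"] for r in rows if r["x"] == 0.0]
print(len(rows), round(statistics.stdev(intercept_draws), 1))
```

Only one number (sd_a) describes all 200 person-level intercepts, which is the point of assuming a distribution rather than estimating each a_i separately.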

Not a GLM: y_ij = (𝛼 + a_i) + (𝛽 + b_i)x_ij,  i = 1, …, N; j = 1, …, T

a_i ~ N(0, σ²_A)

Instead of estimating each b_i, we will assume a distribution: b_i ~ N(0, σ²_B)


When we assume distributions for objects...
...we call them random variables.

Random intercepts model: y_ij = (𝛼 + a_i) + 𝛽x_ij

Random slopes and random intercepts model: y_ij = (𝛼 + a_i) + (𝛽 + b_i)x_ij

Write down a random slopes model.

Random slopes model: y_ij = 𝛼 + (𝛽 + b_i)x_ij
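For reference, the three models can be written out with subscripts and distributions as below. The residual term ε_ij and its variance σ² do not appear in the slide equations; they are added here as an assumption, on the basis that the model output reports a residual variance.

```latex
\begin{align*}
\text{Random intercepts:} \quad & y_{ij} = (\alpha + a_i) + \beta x_{ij} + \varepsilon_{ij} \\
\text{Random slopes:} \quad & y_{ij} = \alpha + (\beta + b_i) x_{ij} + \varepsilon_{ij} \\
\text{Random intercepts and slopes:} \quad & y_{ij} = (\alpha + a_i) + (\beta + b_i) x_{ij} + \varepsilon_{ij} \\
& a_i \sim N(0, \sigma^2_A), \quad b_i \sim N(0, \sigma^2_B), \quad \varepsilon_{ij} \sim N(0, \sigma^2) \\
& i = 1, \dots, N; \quad j = 1, \dots, T
\end{align*}
```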
Outcome: response time (ms)

Exposure: sleep deprivation

Design: repeated measurements over time

Components of model output
a) Model summary info (AIC, BIC etc)
b) Random effects
c) Fixed effects estimates
d) Correlation estimates

Will look at these in a different order.


c) fixed effects estimates
Easy. Just like GLMs.
b) random effects

The random effects themselves

The variance components


The random effects themselves...
Are just numbers (predictions): the pieces needed to build each individual's
version of the fitted regression equation we started with.

Use them along with the fixed effects estimates and plug in:

y_ij = (𝛼 + a_i) + (𝛽 + b_i)x_ij
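The plug-in step can be sketched as a small function. All numbers below are hypothetical and are not taken from the sleep model; the function name `fitted_value` is likewise just an illustration. Setting a_i = b_i = 0 gives the population-level line from the fixed effects alone.

```python
def fitted_value(alpha, beta, x, a_i=0.0, b_i=0.0):
    """Return (alpha + a_i) + (beta + b_i) * x for one individual at covariate value x."""
    return (alpha + a_i) + (beta + b_i) * x

# Hypothetical estimates, chosen only for illustration.
alpha_hat, beta_hat = 250.0, 10.0
random_effects = {"person_1": (20.0, -2.0), "person_2": (-15.0, 3.0)}  # (a_i, b_i)

# Fixed effects only: the average line.
population_line = [fitted_value(alpha_hat, beta_hat, x) for x in range(3)]

# Fixed plus random effects: one line per individual.
person_lines = {
    person: [fitted_value(alpha_hat, beta_hat, x, a_i, b_i) for x in range(3)]
    for person, (a_i, b_i) in random_effects.items()
}
print(population_line)  # [250.0, 260.0, 270.0]
print(person_lines)
```

Each person's line differs from the population line in both its starting level (a_i) and its steepness (b_i).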


The variance components
Mixed models work by splitting the total variance in the data among the different
random effects and the residual error.

This is helpful because we can then ascribe blame to different components.

The sleep model was fitted with random intercepts and random slopes:

Variance (intercepts): 640.92

Variance (slopes): 35.92

Variance (residual): 654.94


Usually report the proportion of variance attributable to each component
Total of all variance: 640.92 + 35.92 + 654.94 = 1331.78

Proportion of variance (intercepts): 640.92 / 1331.78 = 0.48

Proportion of variance (slopes): 35.92 / 1331.78 = 0.03

Proportion of variance (residual): 654.94 / 1331.78 = 0.49

Conclusion: little gain from fitting varying slopes, as the varying intercepts pick
up most of the variability attributed to the random effects.
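The arithmetic above can be reproduced in a few lines. This is a sketch that follows the slides' simple sum of the three reported components; the function name is hypothetical.

```python
def variance_proportions(components):
    """Return each variance component as a proportion of their sum."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}

# Variance components as reported for the sleep model.
sleep_components = {"intercepts": 640.92, "slopes": 35.92, "residual": 654.94}
sleep_shares = variance_proportions(sleep_components)
for name, share in sleep_shares.items():
    print(f"{name}: {share:.2f}")
```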
New model...
Random intercepts only model.

Fixed effects mostly the same.

Proportion of variance (intercepts): 1419 / (1419 + 960) = 0.60
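In a random intercepts only model, this proportion is commonly called the intraclass correlation coefficient (ICC). Reading 1419 as the intercept variance and 960 as the residual variance, and writing the residual variance as σ² (a symbol assumed here, not used on the slides):

```latex
\[
\rho = \frac{\sigma^2_A}{\sigma^2_A + \sigma^2} = \frac{1419}{1419 + 960} \approx 0.60
\]
```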


Quick summary so far
Longitudinal data / repeated measures on a person

Single level of clustering

Can fit random: intercepts, slopes, or both

Fixed effects: like GLM components

Random effects: of interest for their variance components and for building fitted equations

Other stuff: (next class)


Visualising mixed effects models
The only way to handle large amounts of data, observations or complex models.
Predicted values
Utilise the random effects estimates

Bonus question:

What kind of model was this?


The random effects estimates themselves

Plots of these are sometimes called “caterpillar plots”

The estimates themselves are also known as BLUPs (Best Linear Unbiased Predictors)
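As a rough illustration of what a BLUP does, the sketch below uses the standard shrinkage form for a random intercepts only model, assuming the variance components and fixed effects are known. The variance values echo the random intercepts only model above; the number of observations per person and the mean residuals are hypothetical, as is the function name.

```python
def blup_random_intercept(mean_residual, var_intercept, var_resid, n_obs):
    """BLUP of a_i in a random intercepts model with known variance components.

    mean_residual is the person's average of (y_ij - fixed-effects prediction).
    """
    # Shrinkage factor lies between 0 and 1 and grows with n_obs.
    shrinkage = var_intercept / (var_intercept + var_resid / n_obs)
    return shrinkage * mean_residual

# n_obs and the mean residuals below are hypothetical.
for mean_residual in (-60.0, 10.0, 45.0):
    blup = blup_random_intercept(mean_residual, 1419.0, 960.0, n_obs=10)
    print(mean_residual, round(blup, 1))
```

Each predicted intercept is pulled towards zero relative to the person's raw average deviation, and the pull is weaker when a person contributes more observations.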


There are a lot of moving parts now...
Data

- Outcome: what type of model
- Covariates: (we haven’t talked about yet at all)
- Structure: (design, clustering, repeated measures)

Estimates

- Fixed effects (like GLMs)
- Variance components (for random effects)

Predictions

- Use BOTH fixed and random effects


Sometimes clusters are important!
