Kaplan-Meier and Log-Rank Analysis

This document discusses Kaplan-Meier survival curves and the log-rank test. It provides an example of survival time data for two groups of leukemia patients, one receiving treatment and one receiving a placebo. It then demonstrates how to compute the Kaplan-Meier curve for each group, taking into account censored observations. The curves allow estimating the survival probability over time for each group. The log-rank test can then be used to statistically compare the survival distributions between the two groups.

Seminar in Statistics: Survival Analysis

Chapter 2

Kaplan-Meier Survival Curves and the Log-Rank Test
Linda Staub & Alexandros Gekenidis
March 7th, 2011

1 Review

Outcome variable of interest: time until an event occurs
Time = survival time
Event = failure
Censoring: we don't know the survival time exactly
True survival time ≥ observed survival time
→ right-censored

$T$ = failure time with distribution function $F$, density $f$
$C$ = censoring time with distribution function $G$, density $g$
Assume that the censoring time $C$ is independent of the variable of interest $T$
$Y = \min(T, C)$, $\delta = 1\{T \le C\}$

We observe $n$ i.i.d. copies of $(Y, \delta)$

Survivor function

$S(t) = \Pr(T > t)$
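The observation scheme above can be sketched in a few lines of R. This is only an illustration: the exponential distributions, the rates, the seed and the sample size are assumptions made here, not part of the slides.

```r
# Simulate right-censored data: we only see Y = min(T, C) and delta = 1{T <= C}.
set.seed(1)                              # hypothetical seed, for reproducibility only
n      <- 1000                           # hypothetical sample size
T_fail <- rexp(n, rate = 0.10)           # failure times T (assumed exponential)
C_cens <- rexp(n, rate = 0.05)           # censoring times C, drawn independently of T
Y      <- pmin(T_fail, C_cens)           # observed time
delta  <- as.integer(T_fail <= C_cens)   # 1 = failure observed, 0 = right-censored

# The observed time never exceeds the true survival time.
stopifnot(all(Y <= T_fail))
mean(delta)                              # fraction of uncensored observations
```

For censored subjects (delta = 0) the observed time is strictly smaller than the true failure time, which is exactly what "right-censored" means.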

Alternative (Ordered) Data Layout

Risk set: collection of individuals who have survived at least to time $t_{(j)}$

2 Kaplan-Meier Curves

Example
The data: remission times (weeks) for two groups of leukemia patients

Group 1 (n=21), treatment:
6, 6, 6, 7, 10, 13, 16, 22, 23,
6+, 9+, 10+, 11+, 17+, 19+, 20+, 25+, 32+, 32+, 34+, 35+

Group 2 (n=21), placebo:
1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
11, 11, 12, 12, 15, 17, 22, 23

+ denotes censored

           # failed   # censored   Total
Group 1        9          12         21
Group 2       21           0         21

Descriptive statistic: average remission time, ignoring the + signs:
$\bar{T}_1 = 17.1$, $\bar{T}_2 = 8.7$
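A short R sketch of this descriptive statistic, using the remission times as entered in the R code later in the slides. Treating the censored times in group 1 as if they were failures understates the true mean for that group, which is one motivation for the Kaplan-Meier approach.

```r
# Remission times (weeks); censoring indicators are deliberately ignored here.
time1 <- c(6,6,6,7,10,13,16,22,23, 6,9,10,11,17,19,20,25,32,32,34,35)  # treatment
time2 <- c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)          # placebo

mean1 <- mean(time1)   # 359 / 21
mean2 <- mean(time2)   # 182 / 21
round(c(group1 = mean1, group2 = mean2), 1)
```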

Table of ordered failure times

Group 1 (treatment)

t(j)   n_j   m_j   q_j
  0     21    0     0
  6     21    3     1
  7     17    1     1
 10     15    1     2
 13     12    1     0
 16     11    1     3
 22      7    1     0
 23      6    1     5
>23      -    -     -

n_j = number in the risk set at t(j), m_j = number of failures at t(j),
q_j = number censored in [t(j), t(j+1))

Group 2 (placebo)

t(j)   n_j   m_j   q_j
  0     21    0     0
  1     21    2     0
  2     19    2     0
  3     17    1     0
  4     16    2     0
  5     14    2     0
  8     12    4     0
 11      8    2     0
 12      6    2     0
 15      4    1     0
 17      3    1     0
 22      2    1     0
 23      1    1     0

Remark: no censoring in group 2

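Computation of KM-curve for group 2 (no censoring)

t(j)   estimated survival probability
  1    19/21 = .90
  2    17/21 = .81
  3    16/21 = .76
  4    14/21 = .67
  5    12/21 = .57
  8     8/21 = .38
 11     6/21 = .29
 12     4/21 = .19
 15     3/21 = .14
 17     2/21 = .10
 22     1/21 = .05
 23     0/21 = .00

$\hat{S}(t_{(j)}) = \frac{\#\{\text{surviving past } t_{(j)}\}}{21}$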
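Without censoring, the KM estimate is simply the empirical survival function. A minimal base-R sketch that recomputes the fractions for group 2 (the variable names are chosen here for illustration):

```r
# Group 2 (placebo): all 21 remission times are observed failures.
time2 <- c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)

t_j   <- sort(unique(time2))                        # ordered distinct failure times
S_hat <- sapply(t_j, function(t) mean(time2 > t))   # (# surviving past t) / 21

data.frame(t_j, S_hat = round(S_hat, 2))
```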


KM Curve for Group 2 (Placebo)

> time2 <- c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)

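Computation of KM-curve for group 1 (treatment)

$\hat{S}(13)$ = .7529 × 11/12 = .6902

$\hat{S}(16)$ = .6902 × 10/11 = .6275

$\hat{S}(22)$ = .6275 × 6/7 = .5378

$\hat{S}(23)$ = .5378 × 5/6 = .4482

Fraction at $t_{(j)}$:
$\Pr(T > t_{(j)} \mid T \ge t_{(j)})$

Not available at $t_{(j)}$: failed prior to $t_{(j)}$, or censored prior to $t_{(j)}$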
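The product-limit computation for group 1 can be reproduced by hand in base R. This is a sketch that recomputes the risk-set sizes and failure counts from the raw data; the object names are chosen here for illustration.

```r
# Group 1 (treatment): 9 observed failures followed by 12 censored times.
time1   <- c(6,6,6,7,10,13,16,22,23, 6,9,10,11,17,19,20,25,32,32,34,35)
status1 <- c(rep(1, 9), rep(0, 12))      # 1 = failure, 0 = censored

t_j <- sort(unique(time1[status1 == 1]))                        # distinct failure times
n_j <- sapply(t_j, function(t) sum(time1 >= t))                 # size of the risk set
m_j <- sapply(t_j, function(t) sum(time1 == t & status1 == 1))  # failures at t_j

# Each factor (1 - m_j / n_j) estimates Pr(T > t_j | T >= t_j).
S_hat <- cumprod(1 - m_j / n_j)

data.frame(t_j, n_j, m_j, S_hat = round(S_hat, 4))
```

Censored subjects leave the risk set without contributing a failure, so they shrink n_j at later failure times but never produce a downward step of their own.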

KM-curve for group 1 (treatment)

> library(survival)
> time1 <- c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35)
> status1 <- c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
> fit1 <- survfit(Surv(time1, status1) ~ 1)
> plot(fit1, conf.int = FALSE, col = 'red', xlab = 'Time (weeks)', ylab = 'Survival Probability')
> title(main = 'KM Curve for Group 1 (treatment)')

KM-estimator = Nonparametric MLE

Model
$T$ = failure time, distribution function $F$, density $f$

$C$ = censoring time, distribution function $G$, density $g$

Assume that $C$ is independent of $T$

$Y = \min(T, C)$

$\delta = 1\{T \le C\}$

We observe $n$ i.i.d. copies of $(Y, \delta)$

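Derivation of the likelihood for $F$

Claim
The density of observing $(y, 1)$ is:

$f(y)\,(1 - G(y))$

The density of observing $(y, 0)$ is:

$g(y)\,(1 - F(y))$

Proof of the Claim: Blackboard

Density of observing $(y, \delta)$ is:

$[f(y)(1 - G(y))]^{\delta}\,[g(y)(1 - F(y))]^{1-\delta}$

The likelihood for $F$ and $G$ of $n$ i.i.d. observations $(y_1, \delta_1), \dots, (y_n, \delta_n)$ is:

$L_n(F, G) = \prod_{i=1}^{n} [f(y_i)(1 - G(y_i))]^{\delta_i}\,[g(y_i)(1 - F(y_i))]^{1-\delta_i}$

$T$ and $C$ independent ⇒ ignore the part that involves $G$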

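In order to find the nonparametric maximum likelihood estimator $\hat{F}_n$, we
need to maximize this expression over all possible distribution
functions $F$ (with corresponding density $f$).
Optimization problem
$\sup_{F \in \mathcal{F}} L_n(F)$

where $\mathcal{F}$ is the class of all distribution functions and
$L_n(F) = \prod_{i=1}^{n} f(y_i)^{\delta_i}\,(1 - F(y_i))^{1-\delta_i}$

But: Problem is not well-defined! The supremum is infinite, because the density $f$ can be made arbitrarily large at the observed failure times.

Solution: Let $f$ be a density w.r.t. counting measure on the observed
failure times (instead of a density w.r.t. Lebesgue measure)
Replace $f(y_i)$ by $F(y_i) - F(y_i-)$, the jump of the distribution function $F$ at $y_i$
($1 - F(y_i)$ is the survival function at $y_i$)

Parametrizing everything in terms of the survival function $S = 1 - F$:
$L_n(S) = \prod_{i=1}^{n} [S(y_i-) - S(y_i)]^{\delta_i}\,S(y_i)^{1-\delta_i}$

And $\hat{S}_n$ satisfies
$L_n(\hat{S}_n) = \max_{S \in \mathcal{S}} L_n(S)$, where $\mathcal{S}$ is the space of all survival functions

One can show that the Kaplan-Meier estimator maximizes the likelihood
⇒ KM-estimator is the NPMLE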
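A sketch of why the maximizer has the product-limit form, under the assumption that $S$ is a step function that jumps only at the ordered observed failure times $t_{(1)} < \dots < t_{(k)}$. The symbol $h_j$ (conditional failure probability at $t_{(j)}$) is notation introduced here and does not appear on the slides; $n_j$ and $m_j$ are the risk-set sizes and failure counts from the tables of ordered failure times.

```latex
% Write S(t) = \prod_{t_{(j)} \le t} (1 - h_j). A failure at t_{(j)} contributes
% S(t_{(j)}-) - S(t_{(j)}) = h_j \prod_{l<j} (1 - h_l), and a subject censored at c
% contributes S(c) = \prod_{t_{(l)} \le c} (1 - h_l). Collecting factors:
\[
  L_n(h_1,\dots,h_k) \;=\; \prod_{j=1}^{k} h_j^{\,m_j}\,(1-h_j)^{\,n_j-m_j}.
\]
% Each factor can be maximized separately:
\[
  \frac{\partial}{\partial h_j}\log L_n
  \;=\; \frac{m_j}{h_j}-\frac{n_j-m_j}{1-h_j} \;=\; 0
  \quad\Longrightarrow\quad
  \hat h_j=\frac{m_j}{n_j},
\]
% which gives the Kaplan-Meier (product-limit) estimator:
\[
  \hat S(t)\;=\;\prod_{j:\,t_{(j)}\le t}\Bigl(1-\frac{m_j}{n_j}\Bigr).
\]
```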

Comparison of KM Plots for Remission Data

> library(survival)
> time1 <- c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35)
> status1 <- c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
> time2 <- c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)
> status2 <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> fit1 <- survfit(Surv(time1, status1) ~ 1)
> fit2 <- survfit(Surv(time2, status2) ~ 1)
> plot(fit1, conf.int = FALSE, col = 'blue', xlab = 'Time (weeks)', ylab = 'Survival Probability')
> lines(fit2, col = 'red')
> legend(21, 1, c('Group 1 (treatment)', 'Group 2 (placebo)'), col = c('blue','red'), lty = 1)
> title(main = 'KM-Curves for Remission Data')

Question: Do we have any reason to claim that group 1 (treatment) has a better survival prognosis than group 2?

3 The Log-Rank Test

We look at 2 groups; extensions to several groups are possible
When are two KM curves statistically equivalent?
→ the testing procedure compares the two curves
→ we don't have evidence to indicate that the true survival curves are different
Null hypothesis
$H_0$: no difference between (true) survival curves

Goal: To find an expression (depending on the data) whose distribution we know (at least approximately) under the null hypothesis

Derivation of test statistic

Remission data: n=42

         # failures     # in risk set
t(j)    m_1j   m_2j     n_1j   n_2j
  1       0      2       21     21
  2       0      2       21     19
  3       0      1       21     17
  4       0      2       21     16
  5       0      2       21     14
  6       3      0       21     12
  7       1      0       17     12
  8       0      4       16     12
 10       1      0       15      8
 11       0      2       13      8
 12       0      2       12      6
 13       1      0       12      4
 15       0      1       11      4
 16       1      0       11      3
 17       0      1       10      3
 22       1      1        7      2
 23       1      1        6      1

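Expected cell counts:

$e_{1j} = \frac{n_{1j}}{n_{1j} + n_{2j}} \times (m_{1j} + m_{2j})$

$e_{2j} = \frac{n_{2j}}{n_{1j} + n_{2j}} \times (m_{1j} + m_{2j})$

(first factor: proportion in risk set; second factor: # of failures over both groups)

$O_i - E_i = \sum_{j=1}^{\#\,\text{failure times}} (m_{ij} - e_{ij}), \quad i = 1, 2$

$O_1 - E_1 = -10.26$
$O_2 - E_2 = 10.26$

Log-rank statistic

$= \frac{(O_2 - E_2)^2}{\mathrm{Var}(O_2 - E_2)}$

Remark: We could also work with $O_1 - E_1$ and would get the same statistic! Why?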
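The statistic can be recomputed by hand in base R as a check. The sketch below rebuilds the risk-set and failure counts from the raw remission data and uses the usual hypergeometric variance at each failure time; that variance formula is not given on the slides and is an assumption added here.

```r
time1   <- c(6,6,6,7,10,13,16,22,23, 6,9,10,11,17,19,20,25,32,32,34,35)
status1 <- c(rep(1, 9), rep(0, 12))      # group 1: 1 = failure, 0 = censored
time2   <- c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)
status2 <- rep(1, 21)                    # group 2: no censoring

# Ordered distinct failure times over both groups
t_j <- sort(unique(c(time1[status1 == 1], time2[status2 == 1])))

n1 <- sapply(t_j, function(t) sum(time1 >= t))                  # risk set, group 1
n2 <- sapply(t_j, function(t) sum(time2 >= t))                  # risk set, group 2
m1 <- sapply(t_j, function(t) sum(time1 == t & status1 == 1))   # failures, group 1
m2 <- sapply(t_j, function(t) sum(time2 == t & status2 == 1))   # failures, group 2
n  <- n1 + n2
m  <- m1 + m2

e2        <- n2 / n * m                  # expected failures in group 2
O_minus_E <- sum(m2 - e2)                # O2 - E2

# Assumed variance: hypergeometric variance of m2 at each failure time, summed
V     <- sum(n1 * n2 * m * (n - m) / (n^2 * (n - 1)))
chisq <- O_minus_E^2 / V

round(c(O_minus_E = O_minus_E, Var = V, chisq = chisq), 2)
```

Because $m_{1j} + m_{2j} = e_{1j} + e_{2j}$ at every failure time, $O_1 - E_1 = -(O_2 - E_2)$; squaring removes the sign, so both groups give the same statistic. Small differences in the second decimal relative to the slide values can come from rounding of intermediate results.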

Distribution of log-rank statistic

$H_0$: no difference between survival curves

Log-rank statistic for two groups $= \frac{(O_2 - E_2)^2}{\mathrm{Var}(O_2 - E_2)}$

Idea of the Proof:

If $Z$ is standard normal distributed, then $Z^2$ has a $\chi^2$ distribution with 1 df
(assuming $Z$ to be one-dimensional)
Set $Z = \frac{O_2 - E_2}{\sqrt{\mathrm{Var}(O_2 - E_2)}}$

Then $Z$ is standardized and approximately normally distributed for large samples

Hence $Z^2$, which is exactly our statistic, has approximately a $\chi^2$ distribution with 1 df.

Log-Rank Test for Remission data

R-code
> library(survival)
> time <- c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35,1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)
> status <- c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> treatment <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
> fit <- survdiff(Surv(time, status) ~ treatment)

Result

The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one that was actually observed!

> fit
Call:
survdiff(formula = Surv(time, status) ~ treatment)

             N Observed Expected (O-E)^2/E (O-E)^2/V
treatment=1 21        9     19.3      5.46      16.8
treatment=2 21       21     10.7      9.77      16.8

 Chisq= 16.8  on 1 degrees of freedom, p= 4.17e-05

What does this tell us?
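The p-value follows from the chi-square statistic with one degree of freedom. A minimal sketch in base R, using the statistic value 16.79 reported on the slides:

```r
chisq   <- 16.79                                       # log-rank statistic from the slides
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)   # upper-tail probability under H0
p_value
```

A p-value this small means that a difference between the two KM curves as large as the observed one would be very unlikely if the true survival curves were identical, so $H_0$ is rejected at the usual significance levels.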

The Log-Rank Test for Several Groups

$H_0$: All survival curves are the same

Log-rank statistic for > 2 groups involves variances
and covariances of $O_i - E_i$
$G$ ($\ge 2$) groups:
log-rank statistic ~ $\chi^2$ with $G - 1$ df
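One common way to write the statistic for $G$ groups is as a quadratic form. The vector $d$ and matrix $V$ below are notation introduced here for illustration and do not appear on the slides.

```latex
% d collects the observed-minus-expected scores of G-1 groups; one group is
% dropped because the scores O_i - E_i sum to zero over all G groups.
\[
  d = \bigl(O_1 - E_1,\; \dots,\; O_{G-1} - E_{G-1}\bigr)^{\top},
  \qquad
  V = \widehat{\mathrm{Cov}}(d),
\]
\[
  \text{log-rank statistic} \;=\; d^{\top} V^{-1} d
  \;\;\overset{\text{approx.}}{\sim}\;\; \chi^2_{G-1}
  \quad \text{under } H_0 .
\]
% For G = 2 this reduces to (O_1 - E_1)^2 / Var(O_1 - E_1).
```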

Remarks

Alternatives to the Log-Rank Test

Wilcoxon
Tarone-Ware
Peto
Fleming-Harrington

Variations of the log-rank test, derived by applying different weights at the jth failure time

Weighting the test statistic:

$\frac{\left(\sum_j w(t_j)\,(m_{ij} - e_{ij})\right)^2}{\mathrm{Var}\left(\sum_j w(t_j)\,(m_{ij} - e_{ij})\right)}$

$w(t_j)$ = weight at the jth failure time
Remarks

Choosing a Test

Results of different weightings usually lead to similar conclusions
The best choice is the test with the most power
There may be a clinical reason to choose a particular weighting
The choice of weighting should be made a priori! Do not fish for a desired p-value!

Stratified log rank test

Variation of the log-rank test

Allows controlling for an additional (stratification) variable
Split the data into strata, depending on the value of the stratification variable
Calculate O - E scores within strata
Sum across strata
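In formula form, the steps above can be sketched as follows for two groups. The stratum index $s$ and the subscripted symbols are notation introduced here, assuming the strata are independent so that the variances add.

```latex
% O_{2s} - E_{2s} is the observed-minus-expected score for group 2,
% computed only from the subjects in stratum s.
\[
  \text{stratified log-rank statistic}
  \;=\;
  \frac{\Bigl(\sum_{s}\,(O_{2s} - E_{2s})\Bigr)^{2}}
       {\sum_{s}\,\mathrm{Var}\,(O_{2s} - E_{2s})}
  \;\;\overset{\text{approx.}}{\sim}\;\; \chi^2_{1}
  \quad \text{under } H_0 .
\]
```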

Stratified log rank test - Example

Remission data
Stratification variable: 3-level variable (LWBC3) indicating low, medium, or high log white blood cell count (coded 1, 2, and 3, respectively)

Treated Group: rx = 0
Placebo Group: rx = 1
Recall: Non-stratified test: $\chi^2$-value of 16.79 and corresponding p-value rounded to 0.0000

Stratified Log-Rank Test for Remission data

R-code
> library(survival)
> data <- read.table("http://www.sph.emory.edu/~dkleinb/surv2datasets/anderson.dat")
> lwbc3 <- c(1,1,1,2,1,2,2,1,1,1,3,2,2,2,2,2,3,3,2,3,3,1,2,2,1,1,3,3,1,3,3,2,3,3,3,3,2,3,3,3,2,3)
> fit <- survdiff(Surv(data$V1, data$V2) ~ data$V5 + strata(lwbc3))

Result
> fit
Call:
survdiff(formula = Surv(data$V1, data$V2) ~ data$V5 + strata(lwbc3))

           N Observed Expected (O-E)^2/E (O-E)^2/V
data$V5=0 21        9     16.4      3.33      10.1
data$V5=1 21       21     13.6      4.00      10.1

 Chisq= 10.1  on 1 degrees of freedom, p= 0.00145

Stratified vs. unstratified approach

Limitation: Sample size may be small within strata

In the next chapter: controlling for other explanatory variables!

References
KLEINBAUM, D.G. and KLEIN, M. (2005). Survival Analysis. A self-learning text. Springer.
MAATHUIS, M. (2007). Survival analysis for interval censored data. Part I.
