Seminar in Statistics: Survival Analysis
Chapter 2
Kaplan-Meier Survival
Curves and the LogRank Test
Linda Staub & Alexandros Gekenidis
March 7th, 2011
1 Review
Outcome variable of interest: time until an event
occurs
Time = survival time
Event = failure
Censoring: Dont know survival time exactly
True survival time
observed survival time
Right-censored
1 Review
= failure time with distribution , density
= censoring time with distribution , density
Assume that the censoring time is
independent of the variable of interest
= min(, ), = 1*+
We observe i.i.d. copies of (, )
Survivor function
= Pr( > )
Alternative (Ordered) Data Layout
Risk set: collection of individuals who have survived at least to time ()
2 Kaplan-Meier Curves
Example
The data: remission times (weeks) for two groups of
leukemia patients
Group 1 (n=21)
treatment
Group 2 (n=21)
placebo
6, 6, 6, 7, 10,
13, 16, 22, 23,
6+, 9+, 10+, 11+,
17+, 19+, 20+,
25+, 32+, 32+,
34+, 25+
1, 1, 2, 2, 3,
4, 4, 5, 5,
8, 8, 8, 8,
11, 11, 12, 12,
15, 17, 22, 23
+ denotes censored
Group 1
Group 2
# failed
# censored
Total
9
21
12
0
21
21
Descriptive statistic:
T1 ignoring 's 17.1, T2 8.6
Table of ordered failure times
Group 1 (treatment)
nj
t( j )
mj
qj
0
6
7
10
13
16
22
23
>23
0
1
1
2
0
3
0
5
-
21
21
17
15
12
11
7
6
-
Group 1 (treatment)
6, 6, 6, 7, 10,
13, 16, 22, 23,
6+, 9+, 10+, 11+,
17+, 19+, 20+,
25+, 32+, 32+,
34+, 25+
+ denotes
censored
0
3
1
1
1
1
1
1
-
Group 2 (placebo)
1, 1, 2, 2, 3,
4, 4, 5, 5,
8, 8, 8, 8,
11, 11, 12, 12,
15, 17, 22, 23
Group 2 (placebo)
nj
t( j )
mj
qj
0
1
2
3
4
5
8
11
12
15
17
22
23
0
0
0
0
0
0
0
0
0
0
0
0
0
21
21
19
17
16
14
12
8
6
4
3
2
1
0
2
2
1
2
2
4
2
2
1
1
1
1
Remark: no censorship in group 2
Computation of KM-curve for group 2 (no censoring)
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
Computation of KM-curve for group 2 (no censoring)
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
Computation of KM-curve for group 2 (no censoring)
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
Computation of KM-curve for group 2 (no censoring)
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
Computation of KM-curve for group 2 (no censoring)
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
KM Curve for Group 2 (Placebo)
> time2 <c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,
22,23)
> status2 <c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> fit2 <- survfit(Surv(time2, status2) ~ 1)
> plot(fit2,conf.int=0, col = 'red', xlab =
'Time (weeks)', ylab = 'Survival Probability')
> title(main='KM Curve for Group 2 (placebo)')
General KM formula
Alternative way to calculate the survival probabilities
KM formula = product limit formula
> () ()
=1
= (1) > () ()
Proof: blackboard
Computation of KM-curve for group 1
(treatment)
t( j )
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
Computation of KM-curve for group 1
(treatment)
t( j )
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
Computation of KM-curve for group 1
(treatment)
t( j )
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
Computation of KM-curve for group 1
(treatment)
t( j )
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
Computation of KM-curve for group 1
(treatment)
t( j )
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
KM-curve for group 1 (treatment)
> time1 <c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,
25,32,32,34,35)
> status1 <c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
> fit1 <- survfit(Surv(time1, status1) ~ 1)
> plot(fit1,conf.int=0, col = 'red', xlab =
'Time (weeks)', ylab = 'Survival
Probability')
> title(main='KM Curve for Group 1
(treatment)')
KM-estimator = Nonparametric MLE
Model
= failure time
distr. function , density
= censoring time
distr. function , density
Assume that is independent of
= min(, )
= 1*+
We observe i.i.d. copies of (, )
Derivation of the likelihood for
Claim
The density of observing (, 1) is:
()(1 ())
The density of observing (, 0) is:
()(1 ())
Proof of the Claim: Blackboard
Density of observing (, ) is:
1
=
The likelihood for and of i.i.d. observations (1 , 1 ), , ( , ) is:
=1
and independent Ignore part that involves
In order to find the nonparametric maximum likelihood estimator , we
need to maximize this expression over all possible distribution
functions (with corresponding density ).
Optimization problem
sup ()
where is the class of all distribution functions on and
=1
But: Problem is not well-defined!
Solution: Let be a density w.r.t. counting measure on the observed
failure times (instead of a density w.r.t. Lebesgue measure)
Replace ( ) by
survival function at
= , the jump of the distribution /
Parametrizing everything in terms of the survival function = 1 :
=
=1
And satisfies
= max , where is the space of all survival functions
One can show that the Kaplan-Meier estimator maximizes the likelihood
KM-estimator is the NPMLE
Comparison of KM Plots for Remission Data
> time1 <c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25
,32,32,34,35)
> status1 <c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
> time2 <c(1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,
22,23)
> status2 <c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> fit1 <- survfit(Surv(time1, status1) ~ 1)
> fit2 <- survfit(Surv(time2, status2) ~ 1)
> plot(fit1,conf.int=0, col = 'blue', xlab =
'Time (weeks)', ylab = 'Survival Probability')
> lines(fit2, col = 'red')
> legend(21,1,c('Group 1 (treatment)', 'Group
2 (placebo)'), col = c('blue','red'), lty = 1)
> title(main='KM-Curves for Remission Data')
Question: Do we have any reason to claim that group 1 (treatment)
has better survival prognosis than group 2?
3 The Log-Rank Test
We look at 2 groups extensions to several groups
possible
When are two KM curves statistically equivalent?
testing procedure compares the two curves
we dont have evidence to indicate that the true
survival curves are different
Nullhypothesis
H 0 : no difference between (true) survival curves
Goal: To find an expression (depending on the data)
from which we know the distribution (or at least
approximately) under the nullhypothesis
Derivation of test statistic
Remission data: n=42
# failures
# in risk set
t j
m1 j m2 j
n1 j
n2 j
1
2
3
4
5
6
7
8
10
11
12
13
15
16
17
22
23
0
0
0
0
0
3
1
0
1
0
0
1
0
1
0
1
1
21
21
21
21
21
21
17
16
15
13
12
12
11
11
10
7
6
21
19
17
16
14
12
12
12
8
8
6
4
4
3
3
2
1
2
2
1
2
2
0
0
4
0
2
12
0
1
0
1
1
1
Expected cell counts:
e1 j
n1 j
n n
2j
1j
Proportion
in risk set
e2 j
n2 j
n n
2j
1j
m1 j m2 j
# of failures
over both
groups
m1 j m2 j
Oi Ei
# failure times
j 1
ij
eij
O1 E1 10.26
O2 E2 10.26
Log-rank statistic
O2 E2 2
=
Var O2 E2
Remark: We could also work
with O1 E1 and would get the
same statistic! Why?
Distribution of log-rank statistic
H 0 : no difference between survival curves
Log-rank statistic for two groups =
O2 E2 2
Var O2 E2
12
Idea of the Proof:
If is standard normal disitributed then 2 has a 2 distribution with 1 df
(assuming to be one-dim)
Set =
2 2
2 2
Then is standardized and appr. normal distributed for large samples
Hence 2 , which is exactly our statistic, has appr. a 2 distribution.
Log-Rank Test for Remission data
R-code
> time <c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35,1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,
12,12,15,17,22,23)
> status <c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> treatment <c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
> fit <- survdiff(Surv(time, status) ~ treatment)
Result
p-value is the probability of obtaining a test
statistic at least as extreme as the one that
was actually observed!
> fit
Call:
survdiff(formula = Surv(time, status) ~ treatment)
N Observed Expected (O-E)^2/E (O-E)^2/V
treatment=1 21
9
19.3
5.46
16.8
treatment=2 21
21
10.7
9.77
16.8
Chisq = 16.8 on 1 degrees of freedom, p = 4.17e-05
What does this tell us?
The Log-Rank Test for Several
Groups
0 : All survival curves are the same
Log-rank statistics for > 2 groups involves variances
and covariances of
( 2) groups:
log-rank statistic ~ 2 with 1 df
Remarks
Alternatives to the Log-Rank Test
Wilcoxen
Tarone-Ware
Peto
Flemington-Harrington
Variations of the log
rank test, derived by
applying different
weights at the jth
failure time
Weighting the
Test statistic:
w(t j )(mij eij )
Var w(t j )(mij eij )
j
Weight at jth
failure time
Remarks
Choosing a Test
Results of different weightings usually
lead to similar conclusions
The best choice is test with most power
There may be a clinical reason to choose a particular
weighting
Choice of weighting should be a priori! Not fish for a
desired p-value!
Stratified log rank test
Variation of log rank test
Allows controlling for additional (stratified) variable
Split data into stratas, depending on value of
stratified variable
Calculate scores within strata
Sum across strata
Stratified log rank test - Example
Remission data
Stratified variable: 3-level variable (LWBC3) indicating
low, medium, or high log white blood cell count (coded 1,
2, and 3, respectively)
Treated Group: rx = 0
Placebo Group: rx = 1
Recall: Non-stratified test 2 -value of 16.79
and corresponding p-value rounded to 0.0000
Stratified Log-Rank Test for
Remission data
R-code
> data <- read.table("http://www.sph.emory.edu/~dkleinb/surv2datasets/anderson.dat")
> lwbc3 <c(1,1,1,2,1,2,2,1,1,1,3,2,2,2,2,2,3,3,2,3,3,1,2,2,1,1,3,3,1,3,3,2,3,3,3,3,2,3,3,3,2,3)
> fit <- survdiff(Surv(data$V1,data$V2)~data$V5+strata(lwbc3))
Result
> fit
Call:
survdiff(formula = Surv(data$V1, data$V2) ~ data$V5 + strata(lwbc3))
N Observed Expected (O-E)^2/E (O-E)^2/V
data$V5=0 21
9
16.4
3.33
10.1
data$V5=1 21
21
13.6
4.00
10.1
Chisq = 10.1
on 1 degrees of freedom, p = 0.00145
Stratified vs. unstratified approach
Stratified vs. unstratified approach
Limitation: Sample size may be
small within strata
Stratified vs. unstratified approach
Limitation: Sample size may be
small within strata
In next chapter: controlling for
other explanatory variables!
References
KLEINBAUM, D.G. and KLEIN, M. (2005).
Survival Analysis. A self-learning text.
Springer.
MAATHUIS, M. (2007). Survival analysis for
interval censored data. Part I.