Statistics 262:
Intermediate Biostatistics
Kaplan-Meier methods and Parametric Regression
methods
More on Kaplan-Meier estimator of S(t)
(product-limit estimator or KM estimator)
When there are no censored data, the KM estimator is simple and intuitive:
Estimated S(t) = proportion of observations with failure times > t.
For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%.
When there are censored data, KM provides an estimate of S(t) that takes censoring into account (see last week's lecture).
If the censored observation had actually been a failure: S(1 year) = 4/5 * 3/4 * 2/3 = 2/5 = 40%
KM estimator is defined only at times when events occur! (empirically defined)
KM (product-limit) estimator, formally
Observed event times: k distinct event times $t_1 < \dots < t_j < \dots < t_k$
At each event time $t_j$, there are $n_j$ individuals at risk.
$d_j$ is the number who have the event at time $t_j$.
$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left[ 1 - \frac{d_j}{n_j} \right]$$
The risk set $n_j$ at time $t_j$ consists of the original sample, minus all those who have been censored or had the event before $t_j$.
Typically $d_j = 1$ person, unless data are grouped in time intervals (e.g., everyone who had the event in the 3rd month).
$d_j/n_j$ = proportion that failed at the event time $t_j$.
$1 - d_j/n_j$ = proportion surviving the event time.
S(t) represents estimated survival probability at time t: P(T>t).
This formula gives the product-limit estimate of survival at each time an event happens: multiply the probability of surviving event time t with the probabilities of surviving all the previous event times.
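A minimal Python sketch of the product-limit formula, to make the bookkeeping of n_j and d_j concrete. The function name `kaplan_meier` and the 5-subject toy data are illustrative assumptions, not part of the lecture; a subject censored exactly at an event time is counted as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t_j, n_j, d_j, S(t_j))] for each distinct event time t_j.

    times  -- follow-up time for each subject
    events -- 1 if the subject had the event at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    rows = []
    for tj in event_times:
        # Risk set: everyone who has not failed or been censored before t_j.
        n_j = sum(1 for t in times if t >= tj)
        # Number of events exactly at t_j.
        d_j = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        # Multiply in the proportion surviving this event time.
        surv *= 1.0 - d_j / n_j
        rows.append((tj, n_j, d_j, surv))
    return rows


# Hypothetical toy data: 5 subjects, censored at t=2 and t=5.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
for tj, n_j, d_j, s in kaplan_meier(toy_times, toy_events):
    print(f"t={tj}: at risk={n_j}, events={d_j}, S(t)={s:.3f}")
```

Note that the estimate only changes at the three event times; the censored subjects contribute to the risk sets but never produce a step.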
Example 1: time-to-conception for subfertile women
"Failure" here is a good thing.
38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation.
All women were followed for up to 2 years to describe time-to-conception.
The event is conception, and women "survived" until they conceived.
Example from: BMJ, Dec 1998; 317: 1572 - 1580.
Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)
Conceived (event), n=25: 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 6, 6, 9, 9, 9, 10, 13, 16
Did not conceive (censored), n=13: 2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24
Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014
Corresponding Kaplan-Meier Curve
S(t) is estimated at 9 event times (step-wise function).
Corresponding Kaplan-Meier Curve
6 women conceived in 1st month (1st menstrual cycle).
Therefore, 32/38 survived pregnancy-free past 1 month.
S(t=1) = 32/38 = 84.2%
S(t) represents estimated survival probability: P(T>t). Here P(T>1).
Important detail of how the data were coded:
Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman survived her 2nd cycle pregnancy-free).
Thus, for calculating the KM estimator at 2 months, this person should still be included in the risk set.
Think of it as 2+ months, e.g., 2.1 months.
5 women conceive in 2nd month.
The risk set at event time 2 included 32 women.
Therefore, 27/32 = 84.4% survived event time 2 pregnancy-free.
S(t=2) = (84.2%)*(84.4%) = 71.1%
Can get an estimate of the hazard rate here, h(t=2) = 5/32 = 15.6%. Given that you didn't get pregnant in month 1, you have an estimated 5/32 chance of conceiving in the 2nd month.
And an estimate of the density (marginal probability of conceiving in month 2): f(2) = h(2)*S(1) = (.156)*(.842) = 13.2%, i.e., 5 of the original 38 women. (With discrete event times, the hazard multiplies the probability of surviving up to, not through, the event time.)
Risk set at 3 months includes 26 women (the 27 who survived event time 2, minus the 1 woman censored at 2+ months).
3 women conceive in the 3rd month.
The risk set at event time 3 included
26 women.
23/26=88.5% survived event time 3
pregnancy-free.
S(t=3) = ( 84.2%)*(84.4%)*(88.5%)=62.8%
Risk set at 4 months includes 22 women (the 23 who survived event time 3, minus the 1 woman censored at 3+ months).
3 women conceive in the 4th month, and 1 was censored between months 3 and 4.
The risk set at event time 4 included 22 women.
19/22 = 86.4% survived event time 4 pregnancy-free.
S(t=4) = (84.2%)*(84.4%)*(88.5%)*(86.4%) = 54.2%
Hazard rates (conditional chances of conceiving, e.g. 100%-84%) look similar over time.
And an estimate of the density (marginal probability of conceiving in month 4): f(4) = h(4)*S(3) = (3/22)*(.628) = 8.6%. (With discrete event times, the hazard multiplies the probability of surviving up to, not through, the event time.)
Risk set at 6 months includes 18 women (the 19 who survived event time 4, minus the 1 woman censored at 4+ months).
2 women conceive in the 6th month of the study, and one was censored between months 4 and 6.
The risk set at event time 5 included 18 women.
16/18 = 88.9% survived event time 5 pregnancy-free.
S(t=6) = (54.2%)*(88.9%) = 48.2%
Skipping ahead to the 9th and final event time (months=16)
S(t=13) ≈ 22% (eyeball approximation)
3 women are still at risk at 16 months (9th event time); 1 conceives, so 2 remain.
S(t=16) = (22%)*(2/3) = 15%
Tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!
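The month-by-month walk-through above can be reproduced in a few lines of Python. This is a sketch, not the SAS procedure itself: the helper name `km_table` is made up for illustration, and it uses the lecture's coding convention that a woman censored at month t is still in the risk set at month t.

```python
# Example 1 data (months), as listed on the raw-data slide.
conceived = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 3 + [6] * 2 + [9] * 3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]

times = conceived + censored
events = [1] * len(conceived) + [0] * len(censored)


def km_table(times, events):
    """Rows of (event time, n at risk, d events, hazard d/n, S(t))."""
    rows = []
    surv = 1.0
    for tj in sorted({t for t, e in zip(times, events) if e == 1}):
        n_j = sum(1 for t in times if t >= tj)  # censored-at-tj still at risk
        d_j = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        hazard = d_j / n_j
        surv *= 1.0 - hazard
        rows.append((tj, n_j, d_j, hazard, surv))
    return rows


table = km_table(times, events)
for tj, n_j, d_j, hazard, surv in table:
    print(f"month {tj:>2}: at risk={n_j:>2}, conceived={d_j}, "
          f"h={hazard:.3f}, S={surv:.4f}")
```

The printed risk sets (38, 32, 26, 22, 18, ...) and survival estimates should line up with the hand calculations on the preceding slides.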
Kaplan-Meier: SAS output
The LIFETEST Procedure
Product-Limit Survival Estimates

                                    Survival
                                    Standard    Number    Number
    time     Survival    Failure    Error       Failed    Left
   0.0000     1.0000      0          0             0        38
   1.0000      .           .          .            1        37
   1.0000      .           .          .            2        36
   1.0000      .           .          .            3        35
   1.0000      .           .          .            4        34
   1.0000      .           .          .            5        33
   1.0000     0.8421      0.1579     0.0592        6        32
   2.0000      .           .          .            7        31
   2.0000      .           .          .            8        30
   2.0000      .           .          .            9        29
   2.0000      .           .          .           10        28
   2.0000     0.7105      0.2895     0.0736       11        27
   2.0000*     .           .          .           11        26
   3.0000      .           .          .           12        25
   3.0000      .           .          .           13        24
   3.0000     0.6285      0.3715     0.0789       14        23
   3.0000*     .           .          .           14        22
   4.0000      .           .          .           15        21
   4.0000      .           .          .           16        20
   4.0000     0.5428      0.4572     0.0822       17        19
   4.0000*     .           .          .           17        18
   6.0000      .           .          .           18        17
   6.0000     0.4825      0.5175     0.0834       19        16
   7.0000*     .           .          .           19        15
   7.0000*     .           .          .           19        14
   8.0000*     .           .          .           19        13
   8.0000*     .           .          .           19        12
   9.0000      .           .          .           20        11
   9.0000      .           .          .           21        10
   9.0000     0.3619      0.6381     0.0869       22         9
   9.0000*     .           .          .           22         8
   9.0000*     .           .          .           22         7
   9.0000*     .           .          .           22         6
  10.0000     0.3016      0.6984     0.0910       23         5
  11.0000*     .           .          .           23         4
  13.0000     0.2262      0.7738     0.0944       24         3
  16.0000     0.1508      0.8492     0.0880       25         2
  24.0000*     .           .          .           25         1
  24.0000*     .           .          .           25         0

NOTE: The marked survival times are censored observations.
Monday Gut Check
Problem
Calculate the product-limit estimate of
survival for the following data (n=9):
Time-to-event (months)
Survival
(1=died/0=censored)
10
12
14
10
29
Not so easy to get a plot of the actual hazard function!
In SAS, need a complicated MACRO, and it depends on assumptions. Here's what I get from Paul Allison's macro for these data.
At best, you can get the cumulative hazard function:
$$S(t) = e^{-\int_0^t h(u)\,du}$$
$$-\log S(t) = \int_0^t h(u)\,du$$
Linear cumulative hazard function indicates a constant hazard.
See lecture 1 if you want more math!
Cumulative Hazard Function
$$-\log S(t) = \int_0^t h(u)\,du$$
If the hazard function is constant, e.g. h(t)=k, then the cumulative hazard function will be linear (and higher hazards will have steeper slopes):
$$\int_0^t k\,du = kt$$
If the hazard function is increasing with time, then the cumulative hazard function will be curved up, for example h(t)=kt gives a quadratic:
$$\int_0^t ku\,du = \frac{kt^2}{2}$$
If the hazard function is decreasing over time, e.g. h(t)=k/t, then the cumulative hazard function should be curved down, for example (taking the lower limit as 1, since k/u is not integrable at 0):
$$\int_1^t \frac{k}{u}\,du = k\log(t)$$
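As a rough check of the "linear means constant hazard" idea, the cumulative hazard can be estimated as -log S(t) from the KM estimates in the SAS output for Example 1. This is a sketch under that assumption; the variable names are illustrative only.

```python
import math

# (event time in months, KM estimate of S(t)) from the LIFETEST output above.
km_estimates = [
    (1, 0.8421), (2, 0.7105), (3, 0.6285), (4, 0.5428), (6, 0.4825),
    (9, 0.3619), (10, 0.3016), (13, 0.2262), (16, 0.1508),
]

# Estimated cumulative hazard: H(t) = -log S(t).
cum_hazard = [(t, -math.log(s)) for t, s in km_estimates]

for t, h in cum_hazard:
    # H(t)/t is the average hazard per month up to time t; if the hazard
    # were exactly constant, this ratio would be the same at every t.
    print(f"t={t:>2}  H(t)={h:.3f}  H(t)/t={h / t:.3f}")
```

Plotting H(t) against t (rather than reading the ratios) is the usual way to eyeball linearity.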
Kaplan-Meier: example 2
Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves.
Example from: BMJ 1998;317:468-469 (15 August)
Survival times (months) of 44 patients with chronic active hepatitis randomised to receive prednisolone or no treatment.
Prednisolone (n=22): 2, 6, 12, 54, 56*, 68, 89, 96, 96, 125*, 128*, 131*, 140*, 141*, 143, 145*, 146, 148*, 162*, 168, 173*, 181*
Control (n=22): 2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41, 54, 61, 63, 71, 127*, 140*, 146*, 158*, 167*, 182*
*=censored
Data from: BMJ 1998;317:468-469 (15 August)
Kaplan-Meier: example 2
Are these two curves different?
Big drops at the end of the curve indicate few patients left. E.g., only 2/3 (67%) survived this drop.
Misleading to the eye: apparent convergence by end of study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.
Control group:

                                    Survival
                                    Standard    Number    Number
    time     Survival    Failure    Error       Failed    Left
    0.000     1.0000      0          0             0        22
    2.000     0.9545      0.0455     0.0444        1        21
    3.000     0.9091      0.0909     0.0613        2        20
    4.000     0.8636      0.1364     0.0732        3        19
    7.000     0.8182      0.1818     0.0822        4        18
   10.000     0.7727      0.2273     0.0893        5        17
   22.000     0.7273      0.2727     0.0950        6        16
   28.000     0.6818      0.3182     0.0993        7        15
   29.000     0.6364      0.3636     0.1026        8        14
   32.000     0.5909      0.4091     0.1048        9        13
   37.000     0.5455      0.4545     0.1062       10        12
   40.000     0.5000      0.5000     0.1066       11        11
   41.000     0.4545      0.5455     0.1062       12        10
   54.000     0.4091      0.5909     0.1048       13         9
   61.000     0.3636      0.6364     0.1026       14         8
   63.000     0.3182      0.6818     0.0993       15         7
   71.000     0.2727      0.7273     0.0950       16         6
  127.000*     .           .          .           16         5
  140.000*     .           .          .           16         4
  146.000*     .           .          .           16         3
  158.000*     .           .          .           16         2
  167.000*     .           .          .           16         1
  182.000*     .           .          .           16         0

6 controls made it past 100 months.
Treated group:

                                    Survival
                                    Standard    Number    Number
    time     Survival    Failure    Error       Failed    Left
    0.000     1.0000      0          0             0        22
    2.000     0.9545      0.0455     0.0444        1        21
    6.000     0.9091      0.0909     0.0613        2        20
   12.000     0.8636      0.1364     0.0732        3        19
   54.000     0.8182      0.1818     0.0822        4        18
   56.000*     .           .          .            4        17
   68.000     0.7701      0.2299     0.0904        5        16
   89.000     0.7219      0.2781     0.0967        6        15
   96.000      .           .          .            7        14
   96.000     0.6257      0.3743     0.1051        8        13
  125.000*     .           .          .            8        12
  128.000*     .           .          .            8        11
  131.000*     .           .          .            8        10
  140.000*     .           .          .            8         9
  141.000*     .           .          .            8         8
  143.000     0.5475      0.4525     0.1175        9         7
  145.000*     .           .          .            9         6
  146.000     0.4562      0.5438     0.1285       10         5
  148.000*     .           .          .           10         4
  162.000*     .           .          .           10         3
  168.000     0.3041      0.6959     0.1509       11         2
  173.000*     .           .          .           11         1
  181.000*     .           .          .           11         0

At 146 months: 5/6 of 54% rapidly drops the curve to 45%.
At 168 months: 2/3 of 45% rapidly drops the curve to 30%.
Point-wise confidence intervals
We will not worry about the mathematical formula for confidence bands. The important point is that there is a confidence interval for each estimate of S(t). (SAS uses Greenwood's formula for the standard error.)
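For those who do want to see where the standard errors come from, here is a hedged sketch of Greenwood's formula, Var[S(t)] = S(t)^2 * sum over event times up to t of d_j / (n_j (n_j - d_j)), applied to the Example 1 conception data. The function name `km_with_greenwood` is an illustrative assumption; the output is only meant to be compared against the "Survival Standard Error" column of the LIFETEST output.

```python
import math

# Example 1 data (months): conception times and censoring times.
conceived = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 3 + [6] * 2 + [9] * 3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]
times = conceived + censored
events = [1] * len(conceived) + [0] * len(censored)


def km_with_greenwood(times, events):
    """Rows of (event time, S(t), Greenwood standard error of S(t))."""
    rows = []
    surv = 1.0
    gw_sum = 0.0  # running sum of d_j / (n_j * (n_j - d_j))
    for tj in sorted({t for t, e in zip(times, events) if e == 1}):
        n_j = sum(1 for t in times if t >= tj)
        d_j = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        surv *= 1.0 - d_j / n_j
        if n_j > d_j:
            gw_sum += d_j / (n_j * (n_j - d_j))
            se = surv * math.sqrt(gw_sum)
        else:
            se = float("nan")  # everyone at risk failed; formula undefined
        rows.append((tj, surv, se))
    return rows


greenwood_rows = km_with_greenwood(times, events)
for tj, surv, se in greenwood_rows:
    print(f"month {tj:>2}: S={surv:.4f}  SE={se:.4f}")
```

A simple point-wise 95% interval would be S(t) ± 1.96 × SE; other transformations exist, and which one a given package reports by default is not covered here.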
Log-rank test
Test of Equality over Strata

  Test         Chi-Square    DF    Pr > Chi-Square
  Log-Rank       4.6599       1        0.0309
  Wilcoxon       6.5435       1        0.0105
  -2Log(LR)      5.4096       1        0.0200

Chi-square test (with 1 df) of the (overall) difference between the two groups.
Groups appear significantly different.
Log-rank test
Log-rank test is just a Cochran-Mantel-Haenszel chi-square
Anyone remember (know) what this is?
CMH test of conditional independence
K strata = unique event times. At each event time k, form a 2x2 table:

              Event    No Event
  Group 1      a_k       b_k
  Group 2      c_k       d_k
                                   N_k (total)

$$\frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1$$
$$E(a_k) = \frac{(a_k+b_k)(a_k+c_k)}{N_k} = \frac{row1_k \times col1_k}{N_k}$$
$$Var(a_k) = \frac{(a_k+b_k)(c_k+d_k)(a_k+c_k)(b_k+d_k)}{N_k^2\,(N_k-1)} = \frac{row1_k \times row2_k \times col1_k \times col2_k}{N_k^2\,(N_k-1)}$$
How do you know that this is a chi-square with 1 df?
$$Z = \frac{\text{observed events} - E(\text{events})}{\text{standard deviation}} = \frac{\sum_{k} \left(a_k - E(a_k)\right)}{\sqrt{\sum_{k} Var(a_k)}}, \qquad Z^2 \sim \chi^2_1$$
Why is this the expected value in each stratum?
Variance is the variance of a hypergeometric distribution.
Event time 1 (2 months): the 1st event in each group is at month 2, with 22 at risk in each group.
Stratum 1 = event time 1
Event time 1: 1 died from each group (22 at risk in each group).

              Event    No Event    Total
  treated       1         21         22
  control       1         21         22
  Total         2         42         44

$$a_1 = 1$$
$$E(a_1) = \frac{(22)(2)}{44} = 1$$
$$Var(a_1) = \frac{(22)(22)(2)(42)}{44^2\,(43)} = .488$$

Event time 2 (3 months): the next event is at month 3, in the control group; there are no events in the treated group at 3 months.
Stratum 2 = event time 2
Event time 2: At 3 months, 1 died in the control group. At that time 21 from each group were at risk.

              Event    No Event    Total
  treated       0         21         21
  control       1         20         21
  Total         1         41         42

$$a_2 = 0$$
$$E(a_2) = \frac{(1)(21)}{42} = .5$$
$$Var(a_2) = \frac{(21)(21)(1)(41)}{42^2\,(41)} = .25$$

Event time 3 (4 months): 1 event at month 4, in the control group; none in the treated group.
Stratum 3 = event time 3 (4 months)
Event time 3: At 4 months, 1 died in the control group. At that time 21 from the treated group and 20 from the control group were at risk.

              Event    No Event    Total
  treated       0         21         21
  control       1         19         20
  Total         1         40         41

$$a_3 = 0$$
$$E(a_3) = \frac{(1)(21)}{41} = .51$$
$$Var(a_3) = \frac{(21)(20)(1)(40)}{41^2\,(40)} = .25$$

Etc. Summing over all distinct event times:
$$\frac{\left[\sum_{k} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k} Var(a_k)} = \frac{[(1-1) + (0-.5) + (0-.51) + \dots]^2}{.488 + .25 + .25 + \dots} = 4.66$$
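The stratum-by-stratum calculation can be carried through all event times with a short Python sketch. The function name `logrank` is illustrative, the data are the hepatitis trial survival times as they appear in the SAS output for each group, and subjects censored at an event time are kept in that time's risk set. This is a plain hypergeometric-variance log-rank calculation, not a reproduction of SAS internals.

```python
import math

# Hepatitis trial data (months); event flag 1 = died, 0 = censored.
treated_t = [2, 6, 12, 54, 56, 68, 89, 96, 96, 125, 128, 131, 140, 141,
             143, 145, 146, 148, 162, 168, 173, 181]
treated_e = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0,
             1, 0, 1, 0, 0, 1, 0, 0]
control_t = [2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41, 54, 61, 63, 71,
             127, 140, 146, 158, 167, 182]
control_e = [1] * 16 + [0] * 6


def logrank(t1, e1, t2, e2):
    """Return (chi-square, per-stratum rows of (time, a_k, E(a_k), Var(a_k)))."""
    event_times = sorted({t for t, e in zip(t1 + t2, e1 + e2) if e == 1})
    strata = []
    for tk in event_times:
        n1 = sum(1 for t in t1 if t >= tk)  # group 1 at risk
        n2 = sum(1 for t in t2 if t >= tk)  # group 2 at risk
        a = sum(1 for t, e in zip(t1, e1) if t == tk and e == 1)
        d = a + sum(1 for t, e in zip(t2, e2) if t == tk and e == 1)
        n = n1 + n2
        expected = n1 * d / n
        variance = n1 * n2 * d * (n - d) / (n * n * (n - 1)) if n > 1 else 0.0
        strata.append((tk, a, expected, variance))
    diff = sum(a - ex for _, a, ex, _ in strata)
    chi2 = diff ** 2 / sum(v for *_, v in strata)
    return chi2, strata


chi2, strata = logrank(treated_t, treated_e, control_t, control_e)
# Upper-tail probability of a chi-square with 1 df.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"log-rank chi-square = {chi2:.4f}, p = {p_value:.4f}")
for row in strata[:3]:
    print("t=%d  a=%d  E(a)=%.3f  Var(a)=%.3f" % row)
```

The first three printed strata correspond to event times 1 to 3 on the slides, and the overall statistic can be compared with the Log-Rank row of the SAS "Test of Equality over Strata" table.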
Log-rank test, et al.
Test of Equality over Strata

  Test         Chi-Square    DF    Pr > Chi-Square
  Log-Rank       4.6599       1        0.0309
  Wilcoxon       6.5435       1        0.0105
  -2Log(LR)      5.4096       1        0.0200

Wilcoxon is just a version of the log-rank test that weights strata by their size (giving more weight to earlier time points). More sensitive to differences at earlier time points.
Likelihood Ratio test is not ideal here because it assumes exponential distribution (constant hazard).
Log-rank test has most power to test differences that fit the proportional hazards model, so works well as a set-up for subsequent Cox regression.
[Figure: estimated -log(S(t))]
Maybe hazard function decreases a little then increases a little? Hard to say exactly.
[Figure: approximated h(t)]
One more graph from SAS
[Figure: log(-log(S(t))) = log(cumulative hazard)]
If group plots are parallel, this indicates that the proportional hazards assumption is valid.
Necessary assumption for calculation of Hazard Ratios
Uses of Kaplan-Meier
Commonly used to describe
survivorship of study population/s.
Commonly used to compare two
study populations.
Intuitive graphical presentation.
Limitations of Kaplan-Meier
Mainly descriptive
Doesn't control for covariates
Requires categorical predictors
SAS does let you easily discretize continuous variables for KM methods, for exploratory purposes.
Can't accommodate time-dependent variables
Parametric Models for the
hazard/survival function
The class of regression models
estimated by PROC LIFEREG is
known as the accelerated failure
time models.
Parameters of the Weibull
Shape parameter (inverse of the scale parameter):
<1: hazard rate is decreasing
>1: hazard rate is increasing
Constant hazard rate (special case of Weibull where shape parameter = 1.0)
Recall: two parametric models
Components:
A baseline hazard function (that may change over time).
A linear function of a set of k fixed covariates that when exponentiated (and a few other things) gives the relative risk.
Exponential model assumes fixed baseline hazard that we can estimate.
$$\log h_i(t) = \mu + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$
Weibull model models the baseline hazard as a function of time. Two parameters (baseline hazard and scale) must be estimated to describe the underlying hazard function over time.
$$\log h_i(t) = \mu + \alpha \log t + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$
To get Hazard Ratios (relative risk)
Weibull (and thus exponential) are proportional hazards models, so hazard ratio can be calculated.
For other parametric models, you cannot calculate hazard ratio (hazards are not necessarily proportional over time).
Exponential Model: $HR = e^{-\hat{\beta}}$
Weibull Model: $HR = e^{-\hat{\beta}/\text{scale}}$
(Here $\hat{\beta}$ is the estimate reported by PROC LIFEREG, which is on the log(time) scale, hence the sign change.)
More tricky to get confidence intervals
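A small Python sketch of the two conversions, assuming the coefficient is on the log(time) scale as LIFEREG reports it. The function names are made up for illustration, and the numbers plugged in are the group estimates from the hepatitis-trial LIFEREG output (exponential: 0.9008; Weibull: 1.0544 with scale 1.2673).

```python
import math


def hr_exponential(beta):
    """Hazard ratio from a log(time)-scale coefficient, exponential model."""
    return math.exp(-beta)


def hr_weibull(beta, scale):
    """Hazard ratio from a log(time)-scale coefficient, Weibull model."""
    return math.exp(-beta / scale)


hr_exp = hr_exponential(0.9008)
hr_wei = hr_weibull(1.0544, 1.2673)
print(f"exponential model HR (treated vs. control) = {hr_exp:.3f}")
print(f"Weibull model HR (treated vs. control)     = {hr_wei:.3f}")
```

Both values come out below 1, i.e., a lower hazard in the treated group, and the Weibull value is close to the hazard ratio from the Cox regression output (0.435).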
What's a hazard ratio?
Distinction between hazard/rate ratio and odds ratio/risk ratio:
Hazard/rate ratio: ratio of incidence rates
Odds/risk ratio: ratio of proportions
Example 1
Using data from pregnancy study
Recall: roughly, hazard rates were
similar over time
(implies exponential model should
be a good fit).
The LIFEREG Procedure
Analysis of Parameter Estimates

                              Standard      95% Confidence
  Parameter       Estimate    Error         Limits               ChiSquare   Pr > ChiSq
  Intercept        2.2636     0.2049        1.8621    2.6651      122.08      <.0001
  Scale            1.0217     0.1638        0.7462    1.3987
  Weibull Shape    0.9788     0.1569        0.7149    1.3401

Scale of 1.0 makes a Weibull an exponential, so looks exponential.
Parametric estimates of survival function based on a Weibull model (left) and exponential (right).
Compare to KM:
Example 2: 2 groups
Using data from hepatitis trial, I fit
exponential and Weibull models in
SAS using LIFEREG (Weibull is
default in LIFEREG)
-2 Log Likelihood = 2*68 = 136
The LIFEREG Procedure

  Dependent Variable           Log(time)
  Right Censored Values        17
  Left Censored Values
  Interval Censored Values
  Name of Distribution         Exponential
  Log Likelihood               -68.03461345

Scale parameter is set to 1, because it's exponential.

Analysis of Parameter Estimates

                              Standard      95% Confidence
  Parameter       Estimate    Error         Limits               ChiSquare   Pr > ChiSq
  Intercept        4.4886     0.2500        3.9986    4.9786      322.37      <.0001
  group            0.9008     0.3917        0.1332    1.6685        5.29      0.0214
  Scale            1.0000     0.0000        1.0000    1.0000
  Weibull Shape    1.0000     0.0000        1.0000    1.0000

P-value for group very similar to p-value from log-rank test.
Hazard ratio (treated vs. control): $e^{-0.9008} = 0.41$
Interpretation: median time to death is about 2.5 times longer in the treated group ($e^{0.9008} = 2.46$); or, equivalently, the mortality rate is about 60% lower in the treated group.
-2 Log Likelihood = 2*67 = 134
Model Information

  Dependent Variable           Log(time)
  Right Censored Values        17
  Left Censored Values
  Interval Censored Values
  Name of Distribution         Weibull
  Log Likelihood               -66.94904552

Comparison of models using Likelihood Ratio test:
-2LogLikelihood(simpler model) - (-2LogLikelihood(more complex)) = chi-square with 1 df (1 extra parameter estimated for Weibull model)
= 136 - 134 = 2 (NS)
No evidence that Weibull model is much better than exponential.
Scale parameter is greater than 1, indicating decreasing hazard with time.
P-value for group very similar to p-value from log-rank test and exponential model.

Analysis of Parameter Estimates

                              Standard      95% Confidence
  Parameter       Estimate    Error         Limits               ChiSquare   Pr > ChiSq
  Intercept        4.4811     0.3169        3.8601    5.1022      200.00      <.0001
  group            1.0544     0.5096        0.0556    2.0533        4.28      0.0385
  Scale            1.2673     0.2139        0.9103    1.7643
  Weibull Shape    0.7891     0.1332        0.5668    1.0985

Shape parameter is just 1/scale parameter!
Hazard ratio (treated vs. control): $e^{-1.0544/1.2673} = 0.435$
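The exponential-versus-Weibull comparison can be checked without rounding the log likelihoods. A minimal sketch, using the two log likelihoods from the LIFEREG output and the fact that the upper tail of a chi-square with 1 df equals erfc(sqrt(x/2)); the variable names are illustrative.

```python
import math

# Log likelihoods reported by PROC LIFEREG for the hepatitis trial data.
loglik_exponential = -68.03461345  # simpler model (scale fixed at 1)
loglik_weibull = -66.94904552      # one extra parameter (scale)

# Likelihood ratio statistic: difference in -2 log likelihood.
lr_stat = (-2 * loglik_exponential) - (-2 * loglik_weibull)

# Upper-tail p-value for a chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"LR chi-square (1 df) = {lr_stat:.3f}, p = {p_value:.3f}")
```

With unrounded inputs the statistic is slightly above 2, and the conclusion on the slide (not significant) is unchanged.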
Parametric estimates of cumulative survival based on Weibull model (left) and exponential (right), by group.
Compare to KM:
Compare to Cox regression:

              Parameter    Standard                               Hazard    95% Hazard Ratio
  Variable    Estimate     Error       Chi-Square   Pr > ChiSq    Ratio     Confidence Limits
  group       -0.83230     0.39739     4.3865       0.0362        0.435     0.200    0.948