Biostatistics: Kaplan-Meier Analysis

Statistics 262 covers intermediate biostatistical methods including Kaplan-Meier methods and parametric regression. The Kaplan-Meier method estimates the survival function S(t) by accounting for censored data and is defined at event times. An example analyzes time to conception for 38 subfertile women, with conception as the event. The Kaplan-Meier curve is calculated based on event and censored times to estimate the probability of surviving without conception at each time point. The estimated probability of surviving without conception at 16 months is 15%.

Uploaded by

anova12345

Statistics 262: Intermediate Biostatistics
Kaplan-Meier methods and Parametric Regression methods

More on the Kaplan-Meier estimator of S(t)
(product-limit estimator, or KM estimator)

When there are no censored data, the KM estimator is simple and intuitive:
Estimated S(t) = proportion of observations with failure times > t.
For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%.

When there are censored data, KM provides an estimate of S(t) that takes censoring into account (see last week's lecture).
If the censored observation had actually been a failure: S(1 year) = 4/5 * 3/4 * 2/3 = 2/5 = 40%.

The KM estimator is defined only at times when events occur (it is empirically defined).

KM (product-limit) estimator, formally

Observed event times: k distinct event times $t_1 < \dots < t_j < \dots < t_k$.
At each event time $t_j$, there are $n_j$ individuals at risk.
$d_j$ is the number who have the event at time $t_j$.

$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left[ 1 - \frac{d_j}{n_j} \right]$$

The risk set $n_j$ at time $t_j$ consists of the original sample minus all those who have been censored or had the event before $t_j$.
Typically $d_j = 1$ person, unless data are grouped in time intervals (e.g., everyone who had the event in the 3rd month).
$d_j/n_j$ = proportion that failed at the event time $t_j$.
$1 - d_j/n_j$ = proportion surviving the event time.
S(t) represents the estimated survival probability at time t: P(T>t).

This formula gives the product-limit estimate of survival at each time an event happens: multiply the probability of surviving event time t by the probabilities of surviving all the previous event times.
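The product-limit formula can be sketched in a few lines of Python. This is a minimal illustration, not the SAS implementation: the function name `kaplan_meier` and the ten follow-up times are hypothetical, chosen so that 3 of 10 patients die in the first year with no censoring. Subjects censored at time t are counted as still at risk at t.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, n_j, d_j, S(t_j))] for each distinct event time.

    times  -- observed follow-up times
    events -- 1 if the event occurred at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    rows = []
    for t in sorted(deaths):
        n_j = sum(1 for u in times if u >= t)  # at risk just before t
        d_j = deaths[t]                        # events at t
        surv *= 1.0 - d_j / n_j                # multiply in 1 - d_j/n_j
        rows.append((t, n_j, d_j, surv))
    return rows

# Hypothetical data: 10 patients, 3 die during year 1, no censoring.
times = [0.2, 0.5, 0.9, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
events = [1] * 10
km = kaplan_meier(times, events)

# Last estimate at or before 1 year: equals the plain proportion surviving.
s_at_1 = [s for t, n, d, s in km if t <= 1.0][-1]
print(round(s_at_1, 3))
```

With no censoring the product telescopes (9/10 * 8/9 * 7/8), so the estimate is just the proportion still alive.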

Example 1: time-to-conception for subfertile women

"Failure" here is a good thing.
38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation.
All women were followed for up to 2 years to describe time-to-conception.
The event is conception, and women "survived" until they conceived.
Example from: BMJ, Dec 1998; 317: 1572-1580.

Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)

Conceived (event), n=25:
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 6, 6, 9, 9, 9, 10, 13, 16

Did not conceive (censored), n=13:
2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24

Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014

Corresponding Kaplan-Meier curve

[Figure: Kaplan-Meier curve for time to conception.]
S(t) is estimated at 9 event times (step-wise function).


Corresponding Kaplan-Meier curve: 1st event time

6 women conceived in the 1st month (1st menstrual cycle). Therefore, 32/38 survived pregnancy-free past 1 month.

S(t=1) = 32/38 = 84.2%
S(t) represents the estimated survival probability, P(T>t); here, P(T>1).

Important detail of how the data were coded:

Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman survived her 2nd cycle pregnancy-free).
Thus, for calculating the KM estimator at 2 months, this person should still be included in the risk set.
Think of it as 2+ months, e.g., 2.1 months (in the raw data, read the censored time 2 as 2.1).

Corresponding Kaplan-Meier curve: 2nd event time

5 women conceive in the 2nd month.
The risk set at event time 2 included 32 women.
Therefore, 27/32 = 84.4% survived event time 2 pregnancy-free.

S(t=2) = (84.2%)*(84.4%) = 71.1%

Can get an estimate of the hazard rate here: h(t=2) = 5/32 = 15.6%. Given that you didn't get pregnant in month 1, you have an estimated 5/32 chance of conceiving in the 2nd month.
And an estimate of the density (marginal probability of conceiving in month 2): f(t) = h(t)*S(t) = (.156)*(.711) = 11%

Risk set at 3 months includes 26 women: the woman censored at 2 months (read as 2.1) has left the risk set, while the woman censored at 3 months (read as 3.1) is still in it.

Corresponding Kaplan-Meier curve: 3rd event time

3 women conceive in the 3rd month.
The risk set at event time 3 included 26 women.
23/26 = 88.5% survived event time 3 pregnancy-free.

S(t=3) = (84.2%)*(84.4%)*(88.5%) = 62.8%

Risk set at 4 months includes 22 women.

Corresponding Kaplan-Meier curve: 4th event time

3 women conceive in the 4th month, and 1 was censored between months 3 and 4.
The risk set at event time 4 included 22 women.
19/22 = 86.4% survived event time 4 pregnancy-free.

S(t=4) = (84.2%)*(84.4%)*(88.5%)*(86.4%) = 54.2%

Hazard rates (conditional chances of conceiving, e.g. 100%-84%) look similar over time.
And an estimate of the density (marginal probability of conceiving in month 4): f(t) = h(t)*S(t) = (.136)*(.542) = 7.4%
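The claim that the hazard rates look similar over time can be checked directly from the risk sets quoted on the slides. The sketch below is only an illustration; the `risk_table` list is transcribed by hand from the first four event times (number at risk, number conceiving).

```python
# (event time in months, n_j at risk, d_j events), as read off the slides
risk_table = [(1, 38, 6), (2, 32, 5), (3, 26, 3), (4, 22, 3)]

surv = 1.0
hazards = []
for t, n_j, d_j in risk_table:
    hazard = d_j / n_j       # conditional chance of conceiving in month t
    surv *= 1.0 - hazard     # running product-limit estimate
    hazards.append(hazard)
    print(f"month {t}: h = {hazard:.3f}, S = {surv:.3f}")
```

The four conditional probabilities all fall between roughly 0.11 and 0.16, which is what motivates trying a constant-hazard (exponential) model later in the lecture.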

Risk set at 6 months includes 18 women.

Corresponding Kaplan-Meier curve: 5th event time

2 women conceive in the 6th month of the study, and one was censored between months 4 and 6.
The risk set at event time 5 included 18 women.
16/18 = 88.9% survived event time 5 pregnancy-free.

S(t=6) = (54.2%)*(88.9%) = 48.2%

Skipping ahead to the 9th and final event time (months=16)

S(t=13) ≈ 22% (eyeball approximation)

3 women are still at risk at 16 months (9th event time); 1 conceives, leaving 2 remaining.

S(t=16) = (22%)*(2/3) = 15%

The tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!
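The whole walk-through can be reproduced from the raw data. The following is a sketch under the coding convention described earlier (a woman censored at month t is still at risk at month t); the variable names are illustrative, and the data are the 25 event times and 13 censoring times from the raw-data slide.

```python
# Months to conception (events) and to censoring, from the raw-data slide
conceived = [1]*6 + [2]*5 + [3]*3 + [4]*3 + [6]*2 + [9]*3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]
all_times = conceived + censored

surv = 1.0
km = {}
for t in sorted(set(conceived)):
    n_j = sum(1 for u in all_times if u >= t)  # risk set at event time t
    d_j = conceived.count(t)                   # conceptions at t
    surv *= 1 - d_j / n_j
    km[t] = round(surv, 4)
    print(f"t={t:>2}  n_j={n_j:>2}  d_j={d_j}  S(t)={surv:.4f}")
```

The nine estimates agree with the Survival column of the PROC LIFETEST output to four decimals.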

Kaplan-Meier: SAS output

The LIFETEST Procedure
Product-Limit Survival Estimates

                                   Survival
                                   Standard   Number   Number
   time      Survival   Failure    Error      Failed   Left
  0.0000     1.0000     0          0            0       38
  1.0000     .          .          .            1       37
  1.0000     .          .          .            2       36
  1.0000     .          .          .            3       35
  1.0000     .          .          .            4       34
  1.0000     .          .          .            5       33
  1.0000     0.8421     0.1579     0.0592       6       32
  2.0000     .          .          .            7       31
  2.0000     .          .          .            8       30
  2.0000     .          .          .            9       29
  2.0000     .          .          .           10       28
  2.0000     0.7105     0.2895     0.0736      11       27
  2.0000*    .          .          .           11       26
  3.0000     .          .          .           12       25
  3.0000     .          .          .           13       24
  3.0000     0.6285     0.3715     0.0789      14       23
  3.0000*    .          .          .           14       22
  4.0000     .          .          .           15       21
  4.0000     .          .          .           16       20
  4.0000     0.5428     0.4572     0.0822      17       19
  4.0000*    .          .          .           17       18
  6.0000     .          .          .           18       17
  6.0000     0.4825     0.5175     0.0834      19       16
  7.0000*    .          .          .           19       15
  7.0000*    .          .          .           19       14
  8.0000*    .          .          .           19       13
  8.0000*    .          .          .           19       12
  9.0000     .          .          .           20       11
  9.0000     .          .          .           21       10
  9.0000     0.3619     0.6381     0.0869      22        9
  9.0000*    .          .          .           22        8
  9.0000*    .          .          .           22        7
  9.0000*    .          .          .           22        6
 10.0000     0.3016     0.6984     0.0910      23        5
 11.0000*    .          .          .           23        4
 13.0000     0.2262     0.7738     0.0944      24        3
 16.0000     0.1508     0.8492     0.0880      25        2
 24.0000*    .          .          .           25        1
 24.0000*    .          .          .           25        0

NOTE: The marked survival times are censored observations.

Monday Gut Check Problem

Calculate the product-limit estimate of survival for the following data (n=9):

Time-to-event (months) | Survival (1=died/0=censored)

Not so easy to get a plot of the actual hazard function!
In SAS, you need a complicated macro, and the result depends on assumptions; here's what I get from Paul Allison's macro for these data.

At best, you can get the cumulative hazard function:

$$S(t) = e^{-\int_0^t h(u)\,du}$$

$$-\log S(t) = \int_0^t h(u)\,du$$

A linear cumulative hazard function indicates a constant hazard.
See lecture 1 if you want more math!

Cumulative Hazard Function: $-\log S(t) = \int_0^t h(u)\,du$

If the hazard function is constant, e.g. h(t)=k, then the cumulative hazard function will be linear (and higher hazards will have steeper slopes):

$$\int_0^t k\,du = kt$$

If the hazard function is increasing with time, then the cumulative hazard function will be curved up; for example, h(t)=kt gives a quadratic:

$$\int_0^t ku\,du = \frac{kt^2}{2}$$

If the hazard function is decreasing over time, e.g. h(t)=k/t, then the cumulative hazard function should be curved down, for example:

$$\int^{t} \frac{k}{u}\,du = k\log(t)$$
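The three closed forms can be checked numerically. The sketch below uses a simple midpoint rule; the helper `cumulative_hazard`, the values k=0.5 and t=4, and the lower limit of 1 for the decreasing hazard are assumptions made for illustration (the integral of k/u from 0 diverges, so some positive lower limit is needed).

```python
import math

def cumulative_hazard(h, t, lower=0.0, steps=100_000):
    """Midpoint-rule approximation of the integral of h(u) from lower to t."""
    width = (t - lower) / steps
    return sum(h(lower + (i + 0.5) * width) for i in range(steps)) * width

k, t = 0.5, 4.0
H_const = cumulative_hazard(lambda u: k, t)                # compare with k*t
H_incr = cumulative_hazard(lambda u: k * u, t)             # compare with k*t**2/2
H_decr = cumulative_hazard(lambda u: k / u, t, lower=1.0)  # compare with k*log(t)

print(H_const, k * t)
print(H_incr, k * t**2 / 2)
print(H_decr, k * math.log(t))
print("S(t) under constant hazard:", math.exp(-H_const))
```

Plotting the cumulative hazard against t for each case would show a straight line, an upward curve, and a downward curve respectively.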

Kaplan-Meier: example 2
Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves.

Example from: BMJ 1998;317:468-469 (15 August)

Survival times (months) of 44 patients with chronic active hepatitis randomised to receive prednisolone or no treatment.

Prednisolone (n=22):
2, 6, 12, 54, 56*, 68, 89, 96, 96, 125*, 128*, 131*, 140*, 141*, 143, 145*, 146, 148*, 162*, 168, 173*, 181*

Control (n=22):
2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41, 54, 61, 63, 71, 127*, 140*, 146*, 158*, 167*, 182*

*=censored

Data from: BMJ 1998;317:468-469 (15 August)

Kaplan-Meier: example 2
Are these two curves different?

[Figure: Kaplan-Meier survival curves for the prednisolone and control groups.]

Big drops at the end of the curve indicate few patients left. E.g., only 2/3 (67%) survived this drop.

Misleading to the eye: apparent convergence by the end of the study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.

Control group:

                                   Survival
                                   Standard   Number   Number
   time      Survival   Failure    Error      Failed   Left
   0.000     1.0000     0          0            0       22
   2.000     0.9545     0.0455     0.0444       1       21
   3.000     0.9091     0.0909     0.0613       2       20
   4.000     0.8636     0.1364     0.0732       3       19
   7.000     0.8182     0.1818     0.0822       4       18
  10.000     0.7727     0.2273     0.0893       5       17
  22.000     0.7273     0.2727     0.0950       6       16
  28.000     0.6818     0.3182     0.0993       7       15
  29.000     0.6364     0.3636     0.1026       8       14
  32.000     0.5909     0.4091     0.1048       9       13
  37.000     0.5455     0.4545     0.1062      10       12
  40.000     0.5000     0.5000     0.1066      11       11
  41.000     0.4545     0.5455     0.1062      12       10
  54.000     0.4091     0.5909     0.1048      13        9
  61.000     0.3636     0.6364     0.1026      14        8
  63.000     0.3182     0.6818     0.0993      15        7
  71.000     0.2727     0.7273     0.0950      16        6
 127.000*    .          .          .           16        5
 140.000*    .          .          .           16        4
 146.000*    .          .          .           16        3
 158.000*    .          .          .           16        2
 167.000*    .          .          .           16        1
 182.000*    .          .          .           16        0

6 controls made it past 100 months.

Treated group:

                                   Survival
                                   Standard   Number   Number
   time      Survival   Failure    Error      Failed   Left
   0.000     1.0000     0          0            0       22
   2.000     0.9545     0.0455     0.0444       1       21
   6.000     0.9091     0.0909     0.0613       2       20
  12.000     0.8636     0.1364     0.0732       3       19
  54.000     0.8182     0.1818     0.0822       4       18
  56.000*    .          .          .            4       17
  68.000     0.7701     0.2299     0.0904       5       16
  89.000     0.7219     0.2781     0.0967       6       15
  96.000     .          .          .            7       14
  96.000     0.6257     0.3743     0.1051       8       13
 125.000*    .          .          .            8       12
 128.000*    .          .          .            8       11
 131.000*    .          .          .            8       10
 140.000*    .          .          .            8        9
 141.000*    .          .          .            8        8
 143.000     0.5475     0.4525     0.1175       9        7
 145.000*    .          .          .            9        6
 146.000     0.4562     0.5438     0.1285      10        5
 148.000*    .          .          .           10        4
 162.000*    .          .          .           10        3
 168.000     0.3041     0.6959     0.1509      11        2
 173.000*    .          .          .           11        1
 181.000*    .          .          .           11        0

At 146 months, 5/6 of 54% rapidly drops the curve to 45%.
At 168 months, 2/3 of 45% rapidly drops the curve to 30%.

Point-wise confidence intervals

We will not worry about the mathematical formula for confidence bands. The important point is that there is a confidence interval for each estimate of S(t). (SAS uses Greenwood's formula.)
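For readers who do want the formula, Greenwood's standard error is S(t) times the square root of the running sum of d_j / (n_j (n_j - d_j)). The sketch below applies it to the first five event times of the conception example; the plain interval S ± 1.96·SE is an assumption made for illustration, since software may report a transformed interval instead.

```python
import math

# (event time, n_j at risk, d_j events) for the conception example
risk_table = [(1, 38, 6), (2, 32, 5), (3, 26, 3), (4, 22, 3), (6, 18, 2)]

surv, gw_sum = 1.0, 0.0
results = []
for t, n_j, d_j in risk_table:
    surv *= 1 - d_j / n_j
    gw_sum += d_j / (n_j * (n_j - d_j))   # Greenwood's running sum
    se = surv * math.sqrt(gw_sum)
    lower, upper = surv - 1.96 * se, surv + 1.96 * se  # illustrative Wald interval
    results.append((t, surv, se, lower, upper))
    print(f"t={t}: S={surv:.4f}  SE={se:.4f}  95% CI=({lower:.3f}, {upper:.3f})")
```

The standard errors match the Survival Standard Error column of the LIFETEST output for months 1, 2, 3, 4 and 6.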

Log-rank test

Test of Equality over Strata

Test         Chi-Square    DF    Pr > Chi-Square
Log-Rank       4.6599       1        0.0309
Wilcoxon       6.5435       1        0.0105
-2Log(LR)      5.4096       1        0.0200

Chi-square test (with 1 df) of the (overall) difference between the two groups.
Groups appear significantly different.

Log-rank test

The log-rank test is just a Cochran-Mantel-Haenszel chi-square test.
Anyone remember (know) what this is?

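CMH test of conditional independence

K strata = unique event times. At each stratum k, form the 2x2 table:

            Event    No Event
Group 1      a_k       b_k
Group 2      c_k       d_k
                                 N_k

$$\frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1$$

$$E(a_k) = \frac{(a_k + b_k)(a_k + c_k)}{N_k}$$

$$Var(a_k) = \frac{(a_k + b_k)(c_k + d_k)(a_k + c_k)(b_k + d_k)}{N_k^2 (N_k - 1)}$$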

CMH test of conditional independence, in terms of row and column totals

$$E(a_k) = \frac{row1_k \cdot col1_k}{N_k}$$

$$Var(a_k) = \frac{row1_k \cdot row2_k \cdot col1_k \cdot col2_k}{N_k^2 (N_k - 1)}$$

CMH test of conditional independence

How do you know that this is a chi-square with 1 df?

$$Z = \frac{\text{observed} - \text{expected}}{\text{standard deviation}} = \frac{\sum_{k} \left(\text{events}_k - E(\text{events}_k)\right)}{\sqrt{\sum_{k} Var(\text{events}_k)}}, \qquad Z^2 \sim \chi^2_1$$

where the sums run over the k event times.

Why is this the expected value in each stratum?

$$E(a_k) = \frac{row1_k \cdot col1_k}{N_k}$$

$$Var(a_k) = \frac{row1_k \cdot row2_k \cdot col1_k \cdot col2_k}{N_k^2 (N_k - 1)}$$

Variance is the variance of a hypergeometric distribution.
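The per-stratum expected value and hypergeometric variance are easy to compute from the four cells of the 2x2 table. The function name `stratum_moments` and the example stratum (20 at risk in each group, a single event in group 2) are hypothetical, used only to illustrate the formulas above.

```python
def stratum_moments(a, b, c, d):
    """Expected value and hypergeometric variance of cell a in a 2x2 table.

    Layout:          Event   No event
        Group 1        a        b
        Group 2        c        d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d        # group sizes (numbers at risk)
    col1, col2 = a + c, b + d        # events and non-events
    expected = row1 * col1 / n       # events split in proportion to group size
    variance = row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    return expected, variance

# Hypothetical stratum: 20 at risk per group, one event, occurring in group 2.
e, v = stratum_moments(0, 20, 1, 19)
print(e, v)
```

Under the null hypothesis each person at risk is equally likely to be the one who has the event, so with equal group sizes group 1 expects half an event; with a single event the variance reduces to p(1-p), where p is group 1's share of the risk set.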

Stratum 1 = event time 1 (2 months)

In both the control group and the treated group, the 1st event is at month 2, with 22 at risk in each group (see the product-limit tables for each group).

Event time 1: 1 died from each group (22 at risk in each group).

            Event    No Event    Total
treated       1         21         22
control       1         21         22
Total         2         42         44

$$a_1 = 1$$

$$E(a_1) = \frac{(22)(2)}{44} = 1$$

$$Var(a_1) = \frac{(22)(22)(2)(42)}{44^2 (43)} = .488$$

Stratum 2 = event time 2 (3 months)

The next event is at month 3, in the control group; there are no events at 3 months in the treated group. At that time 21 from each group were at risk.

Event time 2: at 3 months, 1 died in the control group.

            Event    No Event    Total
treated       0         21         21
control       1         20         21
Total         1         41         42

$$a_2 = 0$$

$$E(a_2) = \frac{(1)(21)}{42} = .5$$

$$Var(a_2) = \frac{(21)(21)(1)(41)}{42^2 (41)} = .25$$

Stratum 3 = event time 3 (4 months)

Event time 3: at 4 months, 1 died in the control group. At that time 21 from the treated group and 20 from the control group were at risk.

            Event    No Event    Total
treated       0         21         21
control       1         19         20
Total         1         40         41

$$a_3 = 0$$

$$E(a_3) = \frac{(1)(21)}{41} = .51$$

$$Var(a_3) = \frac{(21)(20)(1)(40)}{41^2 (40)} = .25$$

Etc.

$$\frac{\left[\sum_{k} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k} Var(a_k)} = \frac{\left[(1-1) + (0-.5) + (0-.51) + \dots\right]^2}{.488 + .25 + .25 + \dots} = 4.66$$

Log-rank test, et al.

Test of Equality over Strata

Test         Chi-Square    DF    Pr > Chi-Square
Log-Rank       4.6599       1        0.0309
Wilcoxon       6.5435       1        0.0105
-2Log(LR)      5.4096       1        0.0200

Wilcoxon is just a version of the log-rank test that weights strata by their size (giving more weight to earlier time points). It is more sensitive to differences at earlier time points.

The Likelihood Ratio test is not ideal here because it assumes an exponential distribution (constant hazard).

The log-rank test has the most power to test differences that fit the proportional hazards model, so it works well as a set-up for subsequent Cox regression.
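Both statistics can be recomputed from the hepatitis data with one stratum per distinct event time. This is a sketch, not the LIFETEST source: the function `weighted_logrank` is an illustrative name, a weight of 1 gives the log-rank test, and a weight equal to the total number at risk gives the Wilcoxon (Gehan) version. Subjects censored at time t are treated as still at risk at t.

```python
# (months, event) pairs; event=1 died, 0 censored. From the SAS tables above.
treated = [(2,1),(6,1),(12,1),(54,1),(56,0),(68,1),(89,1),(96,1),(96,1),
           (125,0),(128,0),(131,0),(140,0),(141,0),(143,1),(145,0),(146,1),
           (148,0),(162,0),(168,1),(173,0),(181,0)]
control = [(2,1),(3,1),(4,1),(7,1),(10,1),(22,1),(28,1),(29,1),(32,1),(37,1),
           (40,1),(41,1),(54,1),(61,1),(63,1),(71,1),(127,0),(140,0),(146,0),
           (158,0),(167,0),(182,0)]

def weighted_logrank(g1, g2, weight):
    """Chi-square (1 df) comparing two groups, one 2x2 stratum per event time."""
    event_times = sorted({t for t, e in g1 + g2 if e == 1})
    num, den = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for u, _ in g1 if u >= t)            # at risk, group 1
        n2 = sum(1 for u, _ in g2 if u >= t)            # at risk, group 2
        d1 = sum(1 for u, e in g1 if u == t and e == 1)  # events, group 1
        d2 = sum(1 for u, e in g2 if u == t and e == 1)  # events, group 2
        n, d = n1 + n2, d1 + d2
        if n < 2:
            continue  # variance undefined with a single subject at risk
        expected = d * n1 / n
        variance = n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))
        w = weight(n)
        num += w * (d1 - expected)
        den += w ** 2 * variance
    return num ** 2 / den

logrank = weighted_logrank(treated, control, lambda n: 1.0)
wilcoxon = weighted_logrank(treated, control, lambda n: n)
print(f"log-rank = {logrank:.4f}, Wilcoxon = {wilcoxon:.4f}")
```

Because the Wilcoxon weights shrink as the risk set shrinks, the early strata (where the control group has nearly all of the deaths) dominate, which is why its chi-square is larger here.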

[Figure: Estimated -log(S(t)).]
Maybe the hazard function decreases a little then increases a little? Hard to say exactly.

[Figure: Approximated h(t).]

One more graph from SAS

[Figure: log(-log(S(t))) = log(cumulative hazard), by group.]
If group plots are parallel, this indicates that the proportional hazards assumption is valid.
Necessary assumption for calculation of Hazard Ratios.

Uses of Kaplan-Meier

Commonly used to describe survivorship of study population/s.
Commonly used to compare two study populations.
Intuitive graphical presentation.

Limitations of Kaplan-Meier

Mainly descriptive.
Doesn't control for covariates.
Requires categorical predictors. (SAS does let you easily discretize continuous variables for KM methods, for exploratory purposes.)
Can't accommodate time-dependent variables.

Parametric Models for the hazard/survival function

The class of regression models estimated by PROC LIFEREG is known as the accelerated failure time models.

[Figure: Parameters of the Weibull.]
Shape parameter (inverse of the scale parameter):
<1: hazard rate is decreasing
>1: hazard rate is increasing
Constant hazard rate: special case of Weibull where shape parameter = 1.0

Recall: two parametric models

Components:
A baseline hazard function (that may change over time).
A linear function of a set of k fixed covariates that when exponentiated (and a few other things) gives the relative risk.

Exponential model assumes fixed baseline hazard that we can estimate.

$$\log h_i(t) = \mu + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

Weibull model models the baseline hazard as a function of time. Two parameters (baseline hazard and scale) must be estimated to describe the underlying hazard function over time.

$$\log h_i(t) = \mu + \alpha \log t + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

To get Hazard Ratios (relative risk)

Weibull (and thus exponential) are proportional hazards models, so the hazard ratio can be calculated.
For other parametric models, you cannot calculate a hazard ratio (hazards are not necessarily proportional over time).

Exponential model: $HR = e^{-\beta}$

Weibull model: $HR = e^{-\beta / \text{scale}}$

More tricky to get confidence intervals.
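As a quick check of these conversions, the snippet below plugs in the `group` coefficients and the Weibull scale reported in the LIFEREG output for the hepatitis trial. The helper name `hazard_ratio` is illustrative; the sign flip reflects that LIFEREG models log(time), so a positive coefficient means longer survival and therefore a lower hazard.

```python
import math

def hazard_ratio(beta, scale=1.0):
    """Convert an accelerated-failure-time coefficient into a hazard ratio."""
    return math.exp(-beta / scale)

# Estimates for the hepatitis trial (treated vs. control)
hr_exponential = hazard_ratio(0.9008)               # scale fixed at 1
hr_weibull = hazard_ratio(1.0544, scale=1.2673)

print(f"exponential HR = {hr_exponential:.3f}")
print(f"Weibull HR     = {hr_weibull:.3f}")
```

The Weibull value lands very close to the hazard ratio reported by the Cox regression for the same data.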

What's a hazard ratio?

Distinction between hazard/rate ratio and odds ratio/risk ratio:
Hazard/rate ratio: ratio of incidence rates
Odds/risk ratio: ratio of proportions

Example 1
Using data from the pregnancy study.
Recall: roughly, hazard rates were similar over time (implies the exponential model should be a good fit).

The LIFEREG Procedure

Analysis of Parameter Estimates

                          Standard     95% Confidence
Parameter       Estimate  Error        Limits             ChiSquare   Pr > ChiSq
Intercept       2.2636    0.2049       1.8621   2.6651    122.08      <.0001
Scale           1.0217    0.1638       0.7462   1.3987
Weibull Shape   0.9788    0.1569       0.7149   1.3401

A scale of 1.0 makes a Weibull an exponential, so this looks exponential.

[Figure: Parametric estimates of the survival function based on a Weibull model (left) and exponential (right).]

Compare to KM:

Example 2: 2 groups
Using data from the hepatitis trial, I fit exponential and Weibull models in SAS using LIFEREG (Weibull is the default in LIFEREG).

Exponential model

-2Log Likelihood = 2*68 = 136

The LIFEREG Procedure

Model Information
Dependent Variable          Log(time)
Right Censored Values
Left Censored Values
Interval Censored Values
Name of Distribution        Exponential
Log Likelihood              -68.03461345

Scale parameter is set to 1, because it's exponential.

Analysis of Parameter Estimates

                          Standard     95% Confidence
Parameter       Estimate  Error        Limits             ChiSquare   Pr > ChiSq
Intercept       4.4886    0.2500       3.9986   4.9786    322.37      <.0001
group           0.9008    0.3917       0.1332   1.6685      5.29      0.0214
Scale           1.0000    0.0000       1.0000   1.0000
Weibull Shape   1.0000    0.0000       1.0000   1.0000

P-value for group is very similar to the p-value from the log-rank test.

Hazard ratio (treated vs. control): $HR = e^{-0.9008} = 0.41$

Interpretation: the mortality rate is 60% lower in the treated group; or, equivalently under the exponential model, median time to death is about 2.5 times longer in the treated group ($e^{0.9008} = 2.46$).

Weibull model

-2Log Likelihood = 2*67 = 134

The LIFEREG Procedure

Model Information
Dependent Variable          Log(time)
Right Censored Values       17
Left Censored Values
Interval Censored Values
Name of Distribution        Weibull
Log Likelihood              -66.94904552

Comparison of models using the Likelihood Ratio test:
-2LogLikelihood(simpler model) - (-2LogLikelihood(more complex)) = chi-square with 1 df (1 extra parameter estimated for the Weibull model)
= 136 - 134 = 2, NS
No evidence that the Weibull model is much better than the exponential.

Analysis of Parameter Estimates

                          Standard     95% Confidence
Parameter       Estimate  Error        Limits             ChiSquare   Pr > ChiSq
Intercept       4.4811    0.3169       3.8601   5.1022    200.00      <.0001
group           1.0544    0.5096       0.0556   2.0533      4.28      0.0385
Scale           1.2673    0.2139       0.9103   1.7643
Weibull Shape   0.7891    0.1332       0.5668   1.0985

Scale parameter is greater than 1, indicating decreasing hazard with time.
Shape parameter is just 1/scale parameter!
P-value for group is very similar to the p-value from the log-rank test and the exponential model.

Hazard ratio (treated vs. control): $HR = e^{-1.0544/1.2673} = 0.44$
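The likelihood ratio comparison of the two fits can be reproduced from the two reported log likelihoods. This sketch uses only the standard library; the identity that the upper tail of a chi-square with 1 df equals erfc(sqrt(x/2)) stands in for a statistics package.

```python
import math

# Log likelihoods reported by PROC LIFEREG for the hepatitis trial
loglik_exponential = -68.03461345   # simpler model (scale fixed at 1)
loglik_weibull = -66.94904552       # one extra parameter (scale)

# LR statistic: difference in -2 log likelihood, chi-square with 1 df
lr_stat = 2 * (loglik_weibull - loglik_exponential)

# Upper-tail probability of a chi-square(1) variable
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"LR chi-square = {lr_stat:.3f}, p = {p_value:.3f}")
```

Using the unrounded log likelihoods gives a statistic slightly above the rounded 136 - 134 = 2, and the conclusion is the same: not significant at the 0.05 level.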

[Figure: Parametric estimates of cumulative survival based on a Weibull model (left) and exponential (right), by group.]

Compare to KM:

Compare to Cox regression:

            Parameter   Standard                              Hazard   95% Hazard Ratio
Variable    Estimate    Error      Chi-Square   Pr > ChiSq    Ratio    Confidence Limits
group       -0.83230    0.39739    4.3865       0.0362        0.435    0.200   0.948
