
Statistics 262: Intermediate Biostatistics: Kaplan-Meier Methods and Parametric Regression Methods

Statistics 262 covers intermediate biostatistical methods including Kaplan-Meier methods and parametric regression. The Kaplan-Meier method estimates the survival function S(t) by accounting for censored data and is defined at event times. An example analyzes time to conception for 38 subfertile women, with conception as the event. The Kaplan-Meier curve is calculated based on event and censored times to estimate the probability of surviving without conception at each time point. The estimated probability of surviving without conception at 16 months is 15%.

Uploaded by

anova12345

Statistics 262:

Intermediate Biostatistics
Kaplan-Meier methods and Parametric Regression
methods

More on Kaplan-Meier estimator of S(t)
(product-limit estimator or KM estimator)

When there are no censored data, the KM estimator is simple and intuitive:
Estimated S(t) = proportion of observations with failure times > t.
For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%.

When there are censored data, KM provides an estimate of S(t) that takes censoring into account (see last week's lecture).
If the censored observation had actually been a failure: S(1 year) = 4/5*3/4*2/3 = 2/5 = 40%

KM estimator is defined only at times when events occur! (empirically defined)

KM (product-limit) estimator, formally

Observed event times: k distinct event times $t_1 < \dots < t_j < \dots < t_k$
At each event time $t_j$, there are $n_j$ individuals at risk.
$d_j$ is the number who have the event at time $t_j$.

$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left[ 1 - \frac{d_j}{n_j} \right]$$

The risk set $n_j$ at time $t_j$ consists of the original sample minus all those who have been censored or had the event before $t_j$.
Typically $d_j = 1$ person, unless data are grouped in time intervals (e.g., everyone who had the event in the 3rd month).
$d_j/n_j$ = proportion that failed at the event time $t_j$
$1 - d_j/n_j$ = proportion surviving the event time
S(t) represents estimated survival probability at time t: P(T>t)
Multiply the probability of surviving event time t with the probabilities of surviving all the previous event times.
This formula gives the product-limit estimate of survival at each time an event happens.
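The product-limit formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SAS implementation: the function name `kaplan_meier` is made up here, and subjects censored at an event time are kept in that time's risk set (the convention this lecture uses). The data are the time-to-conception values from Example 1 in this lecture.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, n_j, d_j, S(t_j))] for each distinct event time.

    times  : observed times (event or censoring)
    events : 1 if the event occurred at that time, 0 if censored
    Assumption: a subject censored at t_j still counts as at risk at t_j.
    """
    d_at = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    table = []
    for t in sorted(d_at):
        n_at_risk = sum(1 for u in times if u >= t)  # n_j
        d = d_at[t]                                  # d_j
        surv *= 1.0 - d / n_at_risk                  # multiply in [1 - d_j/n_j]
        table.append((t, n_at_risk, d, surv))
    return table

# Time-to-conception data (months) from Example 1 of this lecture
conceived = [1]*6 + [2]*5 + [3]*3 + [4]*3 + [6]*2 + [9]*3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]
times = conceived + censored
events = [1]*len(conceived) + [0]*len(censored)

km_table = kaplan_meier(times, events)
for t, n, d, s in km_table:
    print(f"t={t:>2}  at risk={n:>2}  events={d}  S(t)={s:.4f}")
```

Each printed row corresponds to one step of the Kaplan-Meier curve; the estimate is unchanged between event times.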

Example 1: time-to-conception for subfertile women

"Failure" here is a good thing.
38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation.
All women were followed for up to 2 years to describe time-to-conception.
The event is conception, and women "survived" until they conceived.
Example from: BMJ, Dec 1998; 317: 1572 - 1580.

Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)

Conceived (event), n=25:
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 6, 6, 9, 9, 9, 10, 13, 16

Did not conceive (censored), n=13:
2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24

Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014

Corresponding Kaplan-Meier Curve

S(t) is estimated at 9 event times (step-wise function).


Corresponding Kaplan-Meier Curve

6 women conceived in 1st month (1st menstrual cycle). Therefore, 32/38 survived pregnancy-free past 1 month.

S(t=1) = 32/38 = 84.2%
S(t) represents estimated survival probability: P(T>t). Here P(T>1).

Important detail of how the data were coded:

Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman survived her 2nd cycle pregnancy-free).
Thus, for calculating KM estimator at 2 months, this person should still be included in the risk set.
Think of it as 2+ months, e.g., 2.1 months (the censored time of 2 is shown as 2.1 in the raw data).

Corresponding Kaplan-Meier Curve

5 women conceive in 2nd month.
The risk set at event time 2 included 32 women.
Therefore, 27/32 = 84.4% survived event time 2 pregnancy-free.

S(t=2) = (84.2%)*(84.4%) = 71.1%
Can get an estimate of the hazard rate here, h(t=2) = 5/32 = 15.6%. Given that you didn't get pregnant in month 1, you have an estimated 5/32 chance of conceiving in the 2nd month.
And estimate of density (marginal probability of conceiving in month 2), using survival just before month 2:
f(t) = h(t)*S(t-1) = (.156)*(.842) = 13% (= 5/38)
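The hazard and density calculations can be checked numerically. The sketch below assumes discrete monthly event times and multiplies each hazard by the survival probability just before that event time, so the density is the unconditional chance of conceiving in that month. The function name `discrete_hazard_density` is hypothetical, chosen for illustration.

```python
def discrete_hazard_density(times, events):
    """For each distinct event time return (t, h, f), where
    h = d_j / n_j          (conditional chance of the event at t_j)
    f = h * S(t_{j-1})     (marginal chance of the event at t_j).
    Assumption: subjects censored at t_j are still at risk at t_j.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv_before = 1.0  # S just before the current event time
    rows = []
    for t in event_times:
        n = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        h = d / n
        rows.append((t, h, h * surv_before))
        surv_before *= 1.0 - h
    return rows

# Time-to-conception data (months) from Example 1 of this lecture
conceived = [1]*6 + [2]*5 + [3]*3 + [4]*3 + [6]*2 + [9]*3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]
times = conceived + censored
events = [1]*len(conceived) + [0]*len(censored)

hazard_rows = discrete_hazard_density(times, events)
for t, h, f in hazard_rows:
    print(f"month {t:>2}: hazard={h:.3f}  density={f:.3f}")
```

Because no one is censored before month 2, the month-2 density equals the raw proportion 5/38; after censoring starts, the density is no longer a simple count over 38.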

Risk set at 3 months includes 26 women: 38 minus the 11 who conceived in months 1 and 2 and the 1 censored at 2 months. The woman censored at 3 months (think 3.1) is still in the risk set.

Corresponding Kaplan-Meier Curve

3 women conceive in the 3rd month.
The risk set at event time 3 included 26 women.
23/26 = 88.5% survived event time 3 pregnancy-free.

S(t=3) = (84.2%)*(84.4%)*(88.5%) = 62.9%

Risk set at 4 months includes 22 women: the 26 at risk at 3 months minus the 3 who conceived and the 1 censored at 3 months.

Corresponding Kaplan-Meier Curve

3 women conceive in the 4th month, and 1 was censored between months 3 and 4.
The risk set at event time 4 included 22 women.
19/22 = 86.4% survived event time 4 pregnancy-free.

S(t=4) = (84.2%)*(84.4%)*(88.5%)*(86.4%) = 54.3%

Hazard rates (conditional chances of conceiving, e.g. 100%-84%) look similar over time.
And estimate of density (marginal probability of conceiving in month 4), using survival just before month 4:
f(t) = h(t)*S(t-1) = (.136)*(.629) = 8.6%

Risk set at 6 months includes 18 women: the 22 at risk at 4 months minus the 3 who conceived and the 1 censored at 4 months.

Corresponding Kaplan-Meier Curve

2 women conceive in the 6th month of the study, and one was censored between months 4 and 6.
The risk set at event time 5 (6 months) included 18 women.
16/18 = 88.9% survived event time 5 pregnancy-free.

S(t=6) = (54.3%)*(88.9%) = 48.3%

Skipping ahead to the 9th and final event time (months=16)

S(t=13) ≈ 22% (eyeball approximation)

2 remaining at 16 months (9th event time): of the 3 women still at risk, 1 conceives and the 2 censored at 24 months remain.

Skipping ahead to the 9th and final event time (months=16)

S(t=16) = (22%)*(2/3) = 15%

Tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!

Kaplan-Meier: SAS output

The LIFETEST Procedure
Product-Limit Survival Estimates

                                     Survival
                                     Standard     Number     Number
     time     Survival    Failure       Error     Failed       Left
   0.0000       1.0000          0           0          0         38
   1.0000            .          .           .          1         37
   1.0000            .          .           .          2         36
   1.0000            .          .           .          3         35
   1.0000            .          .           .          4         34
   1.0000            .          .           .          5         33
   1.0000       0.8421     0.1579      0.0592          6         32
   2.0000            .          .           .          7         31
   2.0000            .          .           .          8         30
   2.0000            .          .           .          9         29
   2.0000            .          .           .         10         28
   2.0000       0.7105     0.2895      0.0736         11         27
   2.0000*           .          .           .         11         26
   3.0000            .          .           .         12         25
   3.0000            .          .           .         13         24
   3.0000       0.6285     0.3715      0.0789         14         23
   3.0000*           .          .           .         14         22
   4.0000            .          .           .         15         21
   4.0000            .          .           .         16         20
   4.0000       0.5428     0.4572      0.0822         17         19
   4.0000*           .          .           .         17         18

Kaplan-Meier: SAS output

                                     Survival
                                     Standard     Number     Number
     time     Survival    Failure       Error     Failed       Left
   6.0000            .          .           .         18         17
   6.0000       0.4825     0.5175      0.0834         19         16
   7.0000*           .          .           .         19         15
   7.0000*           .          .           .         19         14
   8.0000*           .          .           .         19         13
   8.0000*           .          .           .         19         12
   9.0000            .          .           .         20         11
   9.0000            .          .           .         21         10
   9.0000       0.3619     0.6381      0.0869         22          9
   9.0000*           .          .           .         22          8
   9.0000*           .          .           .         22          7
   9.0000*           .          .           .         22          6
  10.0000       0.3016     0.6984      0.0910         23          5
  11.0000*           .          .           .         23          4
  13.0000       0.2262     0.7738      0.0944         24          3
  16.0000       0.1508     0.8492      0.0880         25          2
  24.0000*           .          .           .         25          1
  24.0000*           .          .           .         25          0

NOTE: The marked survival times are censored observations.

Monday Gut Check Problem

Calculate the product-limit estimate of survival for the following data (n=9):

Time-to-event (months)     Survival (1=died/0=censored)

10

12

14

10


Not so easy to get a plot of the actual hazard function!
In SAS, need a complicated MACRO, and depends on assumptions. Here's what I get from Paul Allison's macro for these data.

At best, you can get the cumulative hazard function:

$$S(t) = e^{-\int_0^t h(u)\,du}$$

$$-\log S(t) = \int_0^t h(u)\,du$$

Linear cumulative hazard function indicates a constant hazard.

See lecture 1 if you want more math!

Cumulative Hazard Function

$$-\log S(t) = \int_0^t h(u)\,du$$

If the hazard function is constant, e.g. h(t)=k, then the cumulative hazard function will be linear (and higher hazards will have steeper slopes):

$$\int_0^t k\,du = kt$$

If the hazard function is increasing with time, e.g. h(t)=kt, then the cumulative hazard function will be curved up; h(t)=kt gives a quadratic:

$$\int_0^t ku\,du = \frac{kt^2}{2}$$

If the hazard function is decreasing over time, e.g. h(t)=k/t, then the cumulative hazard function should be curved down, for example (integrating from time 1):

$$\int_1^t \frac{k}{u}\,du = k\log(t)$$
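The relationship between S(t) and the cumulative hazard can be applied directly to Kaplan-Meier estimates. The sketch below is illustrative only: it takes the survival estimates for the time-to-conception data as reported in the LIFETEST output in this lecture and converts them with -log S(t). The helper name `cumulative_hazard` is an assumption made for this example.

```python
import math

def cumulative_hazard(km_survival):
    """Map {t: S(t)} to {t: -log S(t)}, the estimated cumulative hazard."""
    return {t: -math.log(s) for t, s in km_survival.items()}

# KM estimates for the time-to-conception data (LIFETEST output in this lecture)
km = {1: 0.8421, 2: 0.7105, 3: 0.6285, 4: 0.5428, 6: 0.4825,
      9: 0.3619, 10: 0.3016, 13: 0.2262, 16: 0.1508}

H = cumulative_hazard(km)
for t in sorted(H):
    # Under a constant hazard k, H(t) = k*t, so H(t)/t should stay roughly flat
    print(f"t={t:>2}  H(t)={H[t]:.3f}  H(t)/t={H[t] / t:.3f}")
```

A roughly constant ratio H(t)/t is informal evidence for a constant hazard; a ratio that drifts up or down suggests an increasing or decreasing hazard.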

Kaplan-Meier: example 2

Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves.

Example from: BMJ 1998;317:468-469 (15 August)

Survival times (months) of 44 patients with chronic active hepatitis randomised to receive prednisolone or no treatment.

Prednisolone (n=22):
2, 6, 12, 54, 56*, 68, 89, 96, 96, 125*, 128*, 131*, 140*, 141*, 143, 145*, 146, 148*, 162*, 168, 173*, 181*

Control (n=22):
2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41, 54, 61, 63, 71, 127*, 140*, 146*, 158*, 167*, 182*

*=censored

Data from: BMJ 1998;317:468-469 (15 August)

Kaplan-Meier: example 2

Are these two curves different?

Big drops at the end of the curve indicate few patients left. E.g., only 2/3 (67%) survived this drop.

Misleading to the eye: apparent convergence by end of study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.

Control group:

                                     Survival
                                     Standard     Number     Number
     time     Survival    Failure       Error     Failed       Left
    0.000       1.0000          0           0          0         22
    2.000       0.9545     0.0455      0.0444          1         21
    3.000       0.9091     0.0909      0.0613          2         20
    4.000       0.8636     0.1364      0.0732          3         19
    7.000       0.8182     0.1818      0.0822          4         18
   10.000       0.7727     0.2273      0.0893          5         17
   22.000       0.7273     0.2727      0.0950          6         16
   28.000       0.6818     0.3182      0.0993          7         15
   29.000       0.6364     0.3636      0.1026          8         14
   32.000       0.5909     0.4091      0.1048          9         13
   37.000       0.5455     0.4545      0.1062         10         12
   40.000       0.5000     0.5000      0.1066         11         11
   41.000       0.4545     0.5455      0.1062         12         10
   54.000       0.4091     0.5909      0.1048         13          9
   61.000       0.3636     0.6364      0.1026         14          8
   63.000       0.3182     0.6818      0.0993         15          7
   71.000       0.2727     0.7273      0.0950         16          6
  127.000*           .          .           .         16          5
  140.000*           .          .           .         16          4
  146.000*           .          .           .         16          3
  158.000*           .          .           .         16          2
  167.000*           .          .           .         16          1
  182.000*           .          .           .         16          0

6 controls made it past 100 months.

Treated group:

                                     Survival
                                     Standard     Number     Number
     time     Survival    Failure       Error     Failed       Left
    0.000       1.0000          0           0          0         22
    2.000       0.9545     0.0455      0.0444          1         21
    6.000       0.9091     0.0909      0.0613          2         20
   12.000       0.8636     0.1364      0.0732          3         19
   54.000       0.8182     0.1818      0.0822          4         18
   56.000*           .          .           .          4         17
   68.000       0.7701     0.2299      0.0904          5         16
   89.000       0.7219     0.2781      0.0967          6         15
   96.000            .          .           .          7         14
   96.000       0.6257     0.3743      0.1051          8         13
  125.000*           .          .           .          8         12
  128.000*           .          .           .          8         11
  131.000*           .          .           .          8         10
  140.000*           .          .           .          8          9
  141.000*           .          .           .          8          8
  143.000       0.5475     0.4525      0.1175          9          7
  145.000*           .          .           .          9          6
  146.000       0.4562     0.5438      0.1285         10          5
  148.000*           .          .           .         10          4
  162.000*           .          .           .         10          3
  168.000       0.3041     0.6959      0.1509         11          2
  173.000*           .          .           .         11          1
  181.000*           .          .           .         11          0

5/6 of 54% rapidly drops the curve to 45%.
2/3 of 45% rapidly drops the curve to 30%.

Point-wise confidence intervals

We will not worry about the mathematical formula for confidence bands. The important point is that there is a confidence interval for each estimate of S(t). (SAS uses Greenwood's formula.)
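For readers who do want to see the calculation, here is a minimal sketch of Greenwood's standard error applied to the time-to-conception data from Example 1. It is an illustration, not SAS's code: the function name `km_with_greenwood` is hypothetical, and the plain interval S(t) ± 1.96·SE used here is an assumption (software may apply a transformation before forming the interval).

```python
import math

def km_with_greenwood(times, events, z=1.96):
    """KM estimate with Greenwood standard error at each event time.

    Returns rows (t, S, SE, lower, upper), where
    SE = S(t) * sqrt( sum over t_j <= t of d_j / (n_j * (n_j - d_j)) ).
    The interval is the untransformed S +/- z*SE, clipped to [0, 1].
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv, var_sum, rows = 1.0, 0.0, []
    for t in event_times:
        n = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - d / n
        if n > d:  # term is undefined if everyone at risk has the event
            var_sum += d / (n * (n - d))
        se = surv * math.sqrt(var_sum)
        rows.append((t, surv, se, max(0.0, surv - z * se), min(1.0, surv + z * se)))
    return rows

# Time-to-conception data (months) from Example 1 of this lecture
conceived = [1]*6 + [2]*5 + [3]*3 + [4]*3 + [6]*2 + [9]*3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]
times = conceived + censored
events = [1]*len(conceived) + [0]*len(censored)

greenwood_rows = km_with_greenwood(times, events)
for t, s, se, lo, hi in greenwood_rows:
    print(f"t={t:>2}  S={s:.4f}  SE={se:.4f}  95% CI=({lo:.3f}, {hi:.3f})")
```

The standard errors for the first four event times can be compared with the "Survival Standard Error" column of the LIFETEST output for these data.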

Log-rank test

Test of Equality over Strata

Test          Chi-Square     DF     Pr > Chi-Square
Log-Rank          4.6599      1              0.0309
Wilcoxon          6.5435      1              0.0105
-2Log(LR)         5.4096      1              0.0200

Chi-square test (with 1 df) of the (overall) difference between the two groups.
Groups appear significantly different.

Log-rank test

Log-rank test is just a Cochran-Mantel-Haenszel chi-square test.
Anyone remember (know) what this is?

CMH test of conditional independence

K strata = unique event times. At each event time k, form a 2x2 table:

             Event     No Event
Group 1      a_k       b_k
Group 2      c_k       d_k          (total N_k)

$$\frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} \mathrm{Var}(a_k)} \sim \chi^2_1$$

$$E(a_k) = \frac{(a_k + b_k)(a_k + c_k)}{N_k} = \frac{\text{row1}_k \cdot \text{col1}_k}{N_k}$$

$$\mathrm{Var}(a_k) = \frac{(a_k + b_k)(c_k + d_k)(a_k + c_k)(b_k + d_k)}{N_k^2 (N_k - 1)} = \frac{\text{row1}_k \cdot \text{row2}_k \cdot \text{col1}_k \cdot \text{col2}_k}{N_k^2 (N_k - 1)}$$

How do you know that this is a chi-square with 1 df?

$$Z = \frac{\text{observed} - \text{expected}}{\text{standard deviation}} = \frac{\sum_k \text{events} - E\left(\sum_k \text{events}\right)}{\sqrt{\mathrm{Var}\left(\sum_k \text{events}\right)}}, \qquad Z^2 \sim \chi^2_1$$

Why is this the expected value in each stratum?
Variance is the variance of a hypergeometric distribution.

Event time 1 (2 months), control group: 1st event at month 2. At risk = 22.

Event time 1 (2 months), treated group: 1st event at month 2. At risk = 22.

Stratum 1 = event time 1

Event time 1: 1 died from each group. (22 at risk in each group)

             Event     No Event     Total
treated      1         21           22
control      1         21           22
                                    44

$$a_1 = 1$$

$$E(a_1) = \frac{(22)(2)}{44} = 1$$

$$\mathrm{Var}(a_1) = \frac{(22)(22)(2)(42)}{44^2 (43)} = .488$$

Event time 2 (3 months), control group: Next event at month 3. At risk = 21.

Event time 2 (3 months), treated group: No events at 3 months. At risk = 21.

Stratum 2 = event time 2

Event time 2: At 3 months, 1 died in the control group. At that time 21 from each group were at risk.

             Event     No Event     Total
treated      0         21           21
control      1         20           21
                                    42

$$a_2 = 0$$

$$E(a_2) = \frac{(1)(21)}{42} = .5$$

$$\mathrm{Var}(a_2) = \frac{(21)(21)(1)(41)}{42^2 (41)} = .25$$

Event time 3 (4 months), control group: 1 event at month 4. At risk = 20.

Event time 3 (4 months), treated group: At risk = 21.

Stratum 3 = event time 3 (4 months)

Event time 3: At 4 months, 1 died in the control group. At that time 21 from the treated group and 20 from the control group were at risk.

             Event     No Event     Total
treated      0         21           21
control      1         19           20
                                    41

$$a_3 = 0$$

$$E(a_3) = \frac{(1)(21)}{41} = .51$$

$$\mathrm{Var}(a_3) = \frac{(21)(20)(1)(40)}{41^2 (40)} = .25$$

Etc.

$$\frac{\left[\sum_{k=1}^{24} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{24} \mathrm{Var}(a_k)} = \frac{[(1-1) + (0-.5) + (0-.51) + \dots]^2}{.488 + .25 + .25 + \dots} = 4.66$$
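The stratum-by-stratum sum can be carried through all event times with a short script. This is a sketch under stated assumptions: the function name `logrank_chi2` is made up for illustration, subjects censored at an event time are treated as still at risk at that time, and the data are the hepatitis trial times as listed in the LIFETEST output in this lecture (1 = died, 0 = censored).

```python
def logrank_chi2(times1, events1, times2, events2):
    """Log-rank chi-square (1 df) comparing group 1 with group 2.

    At each distinct event time, a = events in group 1, E(a) = n1*d/N,
    Var(a) = n1*n2*d*(N-d) / (N^2 * (N-1))  (hypergeometric variance).
    """
    all_times = times1 + times2
    all_events = events1 + events2
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    sum_o_minus_e, sum_var = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for u in times1 if u >= t)  # at risk, group 1
        n2 = sum(1 for u in times2 if u >= t)  # at risk, group 2
        a = sum(1 for u, e in zip(times1, events1) if u == t and e == 1)
        c = sum(1 for u, e in zip(times2, events2) if u == t and e == 1)
        n, d = n1 + n2, a + c
        sum_o_minus_e += a - n1 * d / n
        if n > 1:
            sum_var += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))
    return sum_o_minus_e ** 2 / sum_var

treated_times = [2, 6, 12, 54, 56, 68, 89, 96, 96, 125, 128, 131,
                 140, 141, 143, 145, 146, 148, 162, 168, 173, 181]
treated_events = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0,
                  0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
control_times = [2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41,
                 54, 61, 63, 71, 127, 140, 146, 158, 167, 182]
control_events = [1] * 16 + [0] * 6

chi2 = logrank_chi2(treated_times, treated_events, control_times, control_events)
print(f"log-rank chi-square = {chi2:.4f}")
```

The result can be compared with the Log-Rank row of the "Test of Equality over Strata" table.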

Log-rank test, et al.

Test of Equality over Strata

Test          Chi-Square     DF     Pr > Chi-Square
Log-Rank          4.6599      1              0.0309
Wilcoxon          6.5435      1              0.0105
-2Log(LR)         5.4096      1              0.0200

Wilcoxon is just a version of the log-rank test that weights strata by their size (giving more weight to earlier time points). More sensitive to differences at earlier time points.

Likelihood Ratio test is not ideal here because it assumes exponential distribution (constant hazard).

Log-rank test has most power to test differences that fit the proportional hazards model, so works well as a set-up for subsequent Cox regression.

Estimated log(S(t))

Maybe hazard function decreases a little then increases a little? Hard to say exactly.

Approximated h(t)

One more graph from SAS

log(-log(S(t))) = log(cumulative hazard)
If group plots are parallel, this indicates that the proportional hazards assumption is valid.
Necessary assumption for calculation of Hazard Ratios.

Uses of Kaplan-Meier

Commonly used to describe survivorship of study population/s.
Commonly used to compare two study populations.
Intuitive graphical presentation.

Limitations of Kaplan-Meier

Mainly descriptive
Doesn't control for covariates
Requires categorical predictors
SAS does let you easily discretize continuous variables for KM methods, for exploratory purposes.
Can't accommodate time-dependent variables

Parametric Models for the hazard/survival function

The class of regression models estimated by PROC LIFEREG is known as the accelerated failure time models.

Parameters of the Weibull

Shape parameter (inverse of the scale parameter):
<1: hazard rate is decreasing
>1: hazard rate is increasing

Constant hazard rate (special case of Weibull where shape parameter = 1.0)

Recall: two parametric models

Components:
A baseline hazard function (that may change over time).
A linear function of a set of k fixed covariates that when exponentiated (and a few other things) gives the relative risk.

Exponential model assumes fixed baseline hazard that we can estimate.

$$\log h_i(t) = \mu + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

Weibull model models the baseline hazard as a function of time. Two parameters (baseline hazard and scale) must be estimated to describe the underlying hazard function over time.

$$\log h_i(t) = \mu + \alpha \log t + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

To get Hazard Ratios (relative risk)

Weibull (and thus exponential) are proportional hazards models, so hazard ratio can be calculated.
For other parametric models, you cannot calculate hazard ratio (hazards are not necessarily proportional over time).

Exponential model: $HR = e^{-\beta}$
Weibull model: $HR = e^{-\beta/\text{scale}}$
(Here β is the coefficient LIFEREG reports on the log(time) scale.)

More tricky to get confidence intervals.
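The conversion from a LIFEREG coefficient to a hazard ratio can be sketched as a one-line function. This assumes the coefficient is on the log(time) scale (accelerated failure time parameterization), so a positive coefficient means longer survival and a hazard ratio below 1. The function name `aft_to_hazard_ratio` is hypothetical; the inputs are the group estimates reported in the LIFEREG output for the hepatitis trial in this lecture.

```python
import math

def aft_to_hazard_ratio(beta, scale=1.0):
    """Hazard ratio implied by a Weibull or exponential AFT coefficient.

    beta  : coefficient on the log(time) scale
    scale : Weibull scale parameter (1.0 for the exponential model)
    """
    return math.exp(-beta / scale)

# Group coefficients from the hepatitis trial LIFEREG output in this lecture
hr_exponential = aft_to_hazard_ratio(0.9008)            # scale fixed at 1
hr_weibull = aft_to_hazard_ratio(1.0544, scale=1.2673)  # estimated scale

print(f"exponential HR (treated vs. control) = {hr_exponential:.3f}")
print(f"Weibull HR (treated vs. control)     = {hr_weibull:.3f}")
```

The Weibull value can be compared with the hazard ratio from the Cox regression output for the same data.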

What's a hazard ratio?

Distinction between hazard/rate ratio and odds ratio/risk ratio:
Hazard/rate ratio: ratio of incidence rates
Odds/risk ratio: ratio of proportions

Example 1

Using data from pregnancy study.
Recall: roughly, hazard rates were similar over time (implies exponential model should be a good fit).

The LIFEREG Procedure

Analysis of Parameter Estimates

                             Standard       95% Confidence
Parameter         Estimate      Error           Limits           ChiSquare    Pr > ChiSq
Intercept           2.2636     0.2049      1.8621     2.6651        122.08        <.0001
Scale               1.0217     0.1638      0.7462     1.3987
Weibull Shape       0.9788     0.1569      0.7149     1.3401

Scale of 1.0 makes a Weibull an exponential, so looks exponential.

Parametric estimates of survival function based on a Weibull model (left) and exponential (right).

Compare to KM:

Example 2: 2 groups

Using data from hepatitis trial, I fit exponential and Weibull models in SAS using LIFEREG (Weibull is default in LIFEREG).

-2Log Likelihood = 2*68 = 136

The LIFEREG Procedure

Dependent Variable            Log(time)
Right Censored Values         17
Left Censored Values          0
Interval Censored Values      0
Name of Distribution          Exponential
Log Likelihood                -68.03461345

Analysis of Parameter Estimates

                             Standard       95% Confidence
Parameter         Estimate      Error           Limits           ChiSquare    Pr > ChiSq
Intercept           4.4886     0.2500      3.9986     4.9786        322.37        <.0001
group               0.9008     0.3917      0.1332     1.6685          5.29        0.0214
Scale               1.0000     0.0000      1.0000     1.0000
Weibull Shape       1.0000     0.0000      1.0000     1.0000

Scale parameter is set to 1, because it's exponential.
P-value for group very similar to p-value from log-rank test.

Hazard ratio (treated vs. control): HR = e^{-0.9008} = 0.41
Interpretation: expected (and median) time to death is about 2.5 times longer (e^{0.9008} = 2.46) in the treated group; or, equivalently, mortality rate is about 60% lower in treated group.
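With a single binary covariate, the exponential model's estimates have a closed form, which makes the output easy to check by hand. The sketch below is an illustration, not what LIFEREG does internally (it maximizes the likelihood numerically): for each group, the estimated log mean survival time is log(total follow-up time / number of deaths). The function name `exponential_log_mean` is hypothetical, and the data are the hepatitis trial times as listed in the LIFETEST output in this lecture.

```python
import math

def exponential_log_mean(times, events):
    """log of the exponential MLE of mean survival time:
    log(total follow-up time / number of events)."""
    return math.log(sum(times) / sum(events))

treated_times = [2, 6, 12, 54, 56, 68, 89, 96, 96, 125, 128, 131,
                 140, 141, 143, 145, 146, 148, 162, 168, 173, 181]
treated_events = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0,
                  0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
control_times = [2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41,
                 54, 61, 63, 71, 127, 140, 146, 158, 167, 182]
control_events = [1] * 16 + [0] * 6

# Intercept = control group (group = 0); group effect = treated minus control
intercept = exponential_log_mean(control_times, control_events)
group_effect = exponential_log_mean(treated_times, treated_events) - intercept

print(f"Intercept = {intercept:.4f}")
print(f"group     = {group_effect:.4f}")
print(f"time ratio (treated/control) = {math.exp(group_effect):.2f}")
```

The two printed estimates can be compared with the Intercept and group rows of the exponential LIFEREG output.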

-2Log Likelihood = 2*67 = 134

Model Information

Dependent Variable            Log(time)
Right Censored Values         17
Left Censored Values          0
Interval Censored Values      0
Name of Distribution          Weibull
Log Likelihood                -66.94904552

Analysis of Parameter Estimates

                             Standard       95% Confidence
Parameter         Estimate      Error           Limits           ChiSquare    Pr > ChiSq
Intercept           4.4811     0.3169      3.8601     5.1022        200.00        <.0001
group               1.0544     0.5096      0.0556     2.0533          4.28        0.0385
Scale               1.2673     0.2139      0.9103     1.7643
Weibull Shape       0.7891     0.1332      0.5668     1.0985

Comparison of models using Likelihood Ratio test:
[-2LogLikelihood(simpler model)] - [-2LogLikelihood(more complex)] = chi-square with 1 df (1 extra parameter estimated for Weibull model).
= 136 - 134 = 2, NS
No evidence that Weibull model is much better than exponential.

Scale parameter is greater than 1, indicating decreasing hazard with time.
Shape parameter is just 1/scale parameter!
P-value for group very similar to p-value from log-rank test and exponential model.

Hazard ratio (treated vs. control): HR = e^{-1.0544/1.2673} = 0.44
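The likelihood ratio comparison of the exponential and Weibull fits can be reproduced from the two reported log likelihoods. This is a minimal sketch using only the standard library; the function name `likelihood_ratio_test_1df` is an assumption for illustration, and the p-value uses the identity that the upper tail of a chi-square with 1 df equals erfc(sqrt(x/2)).

```python
import math

def likelihood_ratio_test_1df(loglik_simple, loglik_complex):
    """Likelihood ratio statistic and p-value for nested models that
    differ by one parameter (chi-square with 1 df)."""
    stat = 2.0 * (loglik_complex - loglik_simple)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # P(chi-square_1 > stat)
    return stat, p_value

# Log likelihoods from the exponential and Weibull LIFEREG fits in this lecture
stat, p_value = likelihood_ratio_test_1df(-68.03461345, -66.94904552)
print(f"LR chi-square = {stat:.2f}, p = {p_value:.2f}")
```

A p-value well above 0.05 agrees with the slide's conclusion that the extra Weibull parameter is not needed here.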

Parametric estimates of cumulative survival based on Weibull model (left) and exponential (right), by group.

Compare to KM:

Compare to Cox regression:

              Parameter     Standard                                  Hazard     95% Hazard Ratio
Variable       Estimate        Error     Chi-Square     Pr > ChiSq     Ratio     Confidence Limits
group          -0.83230      0.39739         4.3865         0.0362     0.435      0.200     0.948