
1. The Basics of Survival Analysis


Special features of survival analysis
Censoring mechanisms
Basic functions and quantities in survival analysis
Models for survival analysis
1.1. Special features of survival analysis
Application fields of survival analysis
Medicine, Public health, Epidemiology, Engineering, etc.
Time-to-event
The main variable of interest in survival analysis is time-to-event.
Time-to-event is a positive random variable.
Examples of time-to-event:
Times to death of patients with a certain disease
Remission duration of a certain disease in clinical trials
Incubation times of certain diseases, such as AIDS, hepatitis B, SARS, etc.
Failure times of certain manufactured products
Lifetimes of elderly people in particular social programs

Incomplete observation of time-to-event


Example 1.1. Survival time of HIV+ patients

Times-to-event are not always completely observable. These times are subject to censoring and truncation. For a censored or a truncated time-to-event, only partial information is available.
Types of censoring:
left censoring, right censoring, interval censoring.
Types of truncation:
left truncation, right truncation, interval truncation.

1.2 Censoring mechanisms


Right censoring
Right censoring includes ordinary type I censoring, progressive
type I censoring, generalized type I censoring, random censoring
and type II censoring. We will focus only on ordinary and generalized type I censoring and random censoring.
Ordinary Type I censoring:
The censoring time is prespecified and is the same for all individuals.
This kind of censoring is usually used in animal studies and clinical
trials.
Figure 1.1 Illustration of ordinary Type I censoring

Generalized Type I censoring


In generalized Type I censoring, each individual has a specific fixed
censoring time.
Figure 1.2: Illustration of generalized Type I censoring

Example 1.2. Remission duration from a clinical trial for acute leukemia
Purpose of the trial: investigate the effect of the drug 6-MP.
Design of study: 21 pairs of children matched by either
complete remission or partial remission. For each pair,
the drug 6-MP and a placebo are randomly assigned to
the children in the pair. The patients are followed until
relapse or until the end of the study.

Random censoring
Each individual is censored at random.
Random censoring can be described by a random variable C which
is independent of X. An individual is censored if C < X.
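As a minimal sketch of random right censoring, the following simulation draws a lifetime X and an independent censoring time C for each individual and records the observed time together with a censoring indicator. The exponential distributions, the rates, the recording convention (T, delta) with T = min(X, C), and the function name are assumptions made only for illustration; they are not taken from the notes.

```python
import random

def simulate_right_censored(n, event_rate, censor_rate, seed=0):
    """Return n pairs (T, delta) with T = min(X, C) and delta = 1 if X <= C, else 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.expovariate(event_rate)    # lifetime X (assumed exponential)
        c = rng.expovariate(censor_rate)   # censoring time C, drawn independently of X
        t = min(x, c)                      # what is actually observed
        delta = 1 if x <= c else 0         # 0 means the individual is censored (C < X)
        data.append((t, delta))
    return data

sample = simulate_right_censored(1000, event_rate=1.0, censor_rate=0.5)
censored_fraction = sum(1 for _, d in sample if d == 0) / len(sample)
```

With these assumed rates, the probability that C < X is 0.5 / (1.0 + 0.5) = 1/3, so roughly a third of the simulated individuals should be censored.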
Left censoring
A lifetime $X$ associated with an individual in a study is considered
to be left censored if it is less than a censoring time $C_l$. The data
observed on the individual can be recorded as $(T, \varepsilon)$, where
$$T = \max\{X, C_l\}, \qquad \varepsilon = \begin{cases} 1, & \text{if } T = X, \\ 0, & \text{if } T = C_l. \end{cases}$$
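The recording rule for a left-censored lifetime can be sketched in a few lines. The function name and the ages used below are hypothetical and serve only to illustrate the rule: the observed time is the larger of the lifetime and the censoring time, and the indicator is 1 when the lifetime itself is observed.

```python
def record_left_censored(x, c_l):
    """Return (T, indicator): T = max(X, C_l); indicator is 1 if T = X, 0 if T = C_l."""
    if x >= c_l:
        return (x, 1)    # lifetime observed exactly
    return (c_l, 0)      # left censored: only X < C_l is known

# Hypothetical ages in years: the child enters the learning center at age 3.0.
observed = record_left_censored(4.5, 3.0)   # task learned after entry
censored = record_left_censored(2.0, 3.0)   # task already learned before entry
```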
Example 1.3. Childhood learning
Time-to-event: the age at which a child learns to accomplish certain tasks in children learning centers.
Left censoring occurs if children can already perform the
tasks when they start their study at the centers.

Example 1.4. Time to first use of marijuana
Data are collected through a survey by asking "When did
you first use marijuana?" The answers are:
a. Exact age.
b. I never used it.
c. I used it but cannot remember when the first
time was.
Answer c gives a left censored observation.
Interval censoring
When a lifetime is only known to fall within an interval, it is referred
to as interval censored. Interval censoring occurs in clinical trials
where patients have periodic follow-ups, and in industrial experiments where equipment items are inspected periodically, etc.
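To illustrate how periodic follow-up produces interval-censored data, the sketch below maps a true event time to the interval between the last visit before the event and the first visit at or after it. The visit schedule, the convention that observation starts at time 0, and the function name are assumptions for illustration only.

```python
from bisect import bisect_left

def censoring_interval(x, visit_times):
    """Return (left, right] containing x, given sorted visit times.

    right is infinity if the event has not occurred by the last visit.
    """
    i = bisect_left(visit_times, x)                    # index of first visit time >= x
    left = visit_times[i - 1] if i > 0 else 0.0        # last visit strictly before the event
    right = visit_times[i] if i < len(visit_times) else float("inf")
    return (left, right)

visits = [6.0, 12.0, 18.0, 24.0]           # hypothetical follow-up schedule, in months
interval = censoring_interval(7.5, visits)  # the event at 7.5 months is first seen at month 12
```

An event that occurs after the last visit yields an interval with an infinite right endpoint, which corresponds to a right-censored observation.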
Example 1.5. Time to cosmetic deterioration of breast cancer
patients
To compare the cosmetic effect of two treatments on early
breast cancer patients, (i) radiotherapy and (ii) radiotherapy plus chemotherapy, patients were observed at
intervals. The event of interest is the first time breast
retraction is observed.

Breast cancer data:

Truncation
Truncation is a different phenomenon from censoring. Truncation
is due to a sampling bias: only those individuals whose lifetimes
lie within a certain interval $[Y_L, Y_R]$ can be observed.
Left truncation: $Y_R = \infty$.
Example 1.6. Death times of elderly residents of a retirement
community
The time at which a resident died or left the community is observed, and the
entry time is recorded. Only elderly people above a certain age
can be admitted into the community. People who died before this age
cannot be observed.
Right truncation: $Y_L = 0$.
Example 1.7. Time to AIDS
258 adults and 37 children who were infected with the AIDS virus after
April 1, 1978 and developed AIDS by June 30, 1986 were included
in a study. Those who were infected in this period but have not
yet developed AIDS are not observed.
The event of interest is the induction time (the period from the time of
infection to the time of onset of AIDS).
For people infected at a particular infection time, the induction
time is right truncated at the length of the period from the
infection time to June 30, 1986.
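The sampling mechanism in this example can be imitated with a small simulation, given here as a sketch under stated assumptions. Infection times are taken as uniform over the study window and induction times as exponential; both distributions, the mean induction time, and the function name are hypothetical. Only the window length (8.25 years, from April 1, 1978 to June 30, 1986) comes from the example.

```python
import random

def sample_right_truncated(n, window, mean_induction, seed=0):
    """Simulate n infected individuals and keep only those who develop the disease in the window."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        infection = rng.uniform(0.0, window)                 # infection time within the window
        induction = rng.expovariate(1.0 / mean_induction)    # induction time X (assumed exponential)
        limit = window - infection                           # right truncation bound for this person
        if induction <= limit:                               # otherwise the person is never sampled
            observed.append((infection, induction))
    return observed

cases = sample_right_truncated(5000, window=8.25, mean_induction=5.0)
```

Individuals with long induction times are systematically missing from `cases`, which is the sampling bias that distinguishes truncation from censoring.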

The time-to-AIDS data


1.3. Basic functions and quantities in survival analysis


Let X denote the random variable time-to-event.
Besides the usual probability density function f (x) and cumulative
distribution function F (x), the distribution of X can be described by
several equivalent functions. They are:
Survival function, Hazard function, Cumulative hazard function, and so on.
Survival function
$$S(x) = \Pr(X > x) = 1 - F(x) = \begin{cases} \int_x^{\infty} f(t)\,dt, & \text{if } X \text{ is continuous}, \\ \sum_{x_j > x} p(x_j), & \text{if } X \text{ is discrete}, \end{cases}$$
where $x_j$ are the points with positive mass.
Basic properties of survival function:
1. $S(0) = 1$, $S(\infty) = 0$;
2. S(x) is non-increasing.


3. For discrete $X$,
$$S(x) = \prod_{x_j \le x} \frac{S(x_j)}{S(x_{j-1})},$$
where $S(x_0) = 1$.
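The product form of the discrete survival function can be checked numerically against the direct definition as a sum of masses over points greater than x. The following is a minimal sketch; the support points, the probabilities, and the function names are hypothetical, and the convention S(x_0) = 1 is assumed for the first factor.

```python
def survival_discrete(x, points, masses):
    """S(x) as the sum of p(x_j) over all mass points x_j > x."""
    return sum(p for xj, p in zip(points, masses) if xj > x)

def survival_by_product(x, points, masses):
    """S(x) as the product of S(x_j) / S(x_{j-1}) over x_j <= x, with S(x_0) = 1."""
    result = 1.0
    previous = 1.0                      # S(x_0) = 1 by convention
    for xj in points:                   # points must be sorted in increasing order
        if xj > x:
            break
        current = survival_discrete(xj, points, masses)
        result *= current / previous    # conditional probability of surviving past x_j
        previous = current
    return result

points = [1, 2, 3, 4]            # hypothetical support
masses = [0.1, 0.2, 0.3, 0.4]    # hypothetical probabilities p(x_j)
```

Each factor is the conditional probability of surviving past x_j given survival past x_{j-1}, so the product telescopes to S at the largest mass point not exceeding x.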

Examples of parametric distribution families for survival analysis:


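Exponential distribution:
$$f(x) = \lambda e^{-\lambda x}, \qquad S(x) = e^{-\lambda x}.$$
Weibull distribution:
$$f(x) = \alpha \lambda x^{\alpha - 1} e^{-\lambda x^{\alpha}}, \qquad S(x) = e^{-\lambda x^{\alpha}}.$$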

Hazard rate function


$$h(x) = \lim_{\Delta x \to 0} \frac{\Pr[x \le X < x + \Delta x \mid X \ge x]}{\Delta x} = \frac{f(x)}{S(x)} = -\frac{d \ln[S(x)]}{dx}.$$


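The hazard rate function can be viewed as the instantaneous rate at which an
individual of age $x$ experiences the event: for small $\Delta x$, $h(x)\,\Delta x$ is approximately the probability that an individual who has survived to age $x$ experiences the event in the next instant.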
Examples:
Exponential distribution:
$$h(x) = \lambda.$$
Weibull distribution:
$$h(x) = \alpha \lambda x^{\alpha - 1}.$$
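The relations between density, survival function and hazard rate can be verified numerically for the Weibull family. This is a sketch assuming the parametrization with survival function exp(-lam * x**alpha); the parameter values and function names are hypothetical. The exponential distribution is the special case alpha = 1.

```python
import math

def weibull_pdf(x, lam, alpha):
    """Density f(x) = alpha * lam * x^(alpha - 1) * exp(-lam * x^alpha)."""
    return alpha * lam * x ** (alpha - 1) * math.exp(-lam * x ** alpha)

def weibull_survival(x, lam, alpha):
    """Survival function S(x) = exp(-lam * x^alpha)."""
    return math.exp(-lam * x ** alpha)

def weibull_hazard(x, lam, alpha):
    """Hazard rate h(x) = alpha * lam * x^(alpha - 1)."""
    return alpha * lam * x ** (alpha - 1)

def numerical_hazard(survival, x, dx=1e-6):
    """Approximate Pr[x <= X < x + dx | X >= x] / dx from a survival function."""
    return (survival(x) - survival(x + dx)) / (dx * survival(x))

lam, alpha, x = 0.5, 1.5, 2.0   # hypothetical parameter values and evaluation point
ratio_form = weibull_pdf(x, lam, alpha) / weibull_survival(x, lam, alpha)   # f(x) / S(x)
limit_form = numerical_hazard(lambda u: weibull_survival(u, lam, alpha), x)
```

Both `ratio_form` and `limit_form` should agree with `weibull_hazard(x, lam, alpha)` up to numerical error.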
Generic types of hazard rates:
1. Constant
Unrealistic.
2. Increasing: convex or concave
Arising from natural aging or wear.
3. Decreasing: convex or concave
Lifetimes of electronic devices; patients experiencing
certain types of transplants; etc.
4. Convex bathtub shape
Appropriate for populations followed from birth;
mortality data; manufactured equipment; etc.
5. Concave bump shape
Survival after successful surgery; etc.

Figure 1.3 Shapes of generic hazard functions


Cumulative hazard function


Continuous case:
$$H(x) = \int_0^x h(u)\,du = -\int_0^x \frac{d \ln[S(u)]}{du}\,du = -\ln[S(x)].$$
Discrete case:
$$H(x) = \sum_{x_j \le x} h(x_j).$$
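For the continuous case, the identity between the cumulative hazard and the negative log of the survival function can be checked by numerical integration. The sketch below uses a midpoint rule and a Weibull hazard with hypothetical parameter values; the function name and the number of integration steps are arbitrary choices.

```python
import math

def cumulative_hazard(hazard, x, steps=10000):
    """Midpoint-rule approximation of the integral of hazard(u) over [0, x]."""
    width = x / steps
    return sum(hazard((i + 0.5) * width) for i in range(steps)) * width

lam, alpha = 0.5, 1.5   # hypothetical Weibull parameters

def hazard(u):
    return alpha * lam * u ** (alpha - 1)    # Weibull hazard rate

def survival(u):
    return math.exp(-lam * u ** alpha)       # Weibull survival function

H_numeric = cumulative_hazard(hazard, 2.0)   # integral of h over [0, 2]
H_exact = -math.log(survival(2.0))           # -ln S(2)
```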

1.4. Regression models for survival data


Regression models account for covariates that cause heterogeneity.
Covariates could be quantitative, e.g., blood pressure, temperature, age, weight, etc.
Covariates could also be qualitative, e.g., gender, race, treatment, disease status, etc.
Covariates may or may not be time-dependent. The covariates
are denoted by
$$Z^t(x) = (Z_1(x), \dots, Z_p(x)).$$


Log survival time regression


Let $Y = \ln(X)$. $Y$ is modeled as
$$Y = \mu + \gamma^t Z + \sigma W,$$
where $W$ is a random variable with mean 0 and variance 1.
Choices of the distribution of $W$:
Normal, corresponding to a log-normal distribution for $X$;
Extreme-value, corresponding to a Weibull distribution for $X$;
Logistic, corresponding to a log-logistic distribution for $X$.
Survival function under the model:
$$\begin{aligned} S(x \mid Z) &= \Pr[X > x \mid Z] = \Pr[Y > \ln x \mid Z] \\ &= \Pr[\mu + \sigma W > \ln x - \gamma^t Z \mid Z] \\ &= \Pr[e^{\mu + \sigma W} > x \exp\{-\gamma^t Z\} \mid Z] \\ &= S_0(x \exp\{-\gamma^t Z\}), \end{aligned}$$
where $S_0$ is the survival function of the baseline failure time
$\exp\{\mu + \sigma W\}$. This model is called the accelerated failure-time
model, since the failure time is accelerated by a factor $\exp\{-\gamma^t Z\}$.

Hazard function


$$h(x \mid Z) = h_0(x \exp\{-\gamma^t Z\}) \exp\{-\gamma^t Z\}.$$



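Example:
Suppose $W$ follows an extreme value distribution with density
function $f_W(w) = \exp\{w - e^w\}$. Let $Y = \ln X = \gamma^t Z + \sigma W$.
Then
$$S(x \mid Z) = S_0(x \exp(-\gamma^t Z)) = \exp\{-[x \exp(-\gamma^t Z)]^{1/\sigma}\},$$
$$h(x \mid Z) = h_0(x \exp(-\gamma^t Z)) \exp(-\gamma^t Z) = \frac{1}{\sigma} [x \exp(-\gamma^t Z)]^{1/\sigma - 1} \exp(-\gamma^t Z),$$
where $S_0(t) = e^{-t^{1/\sigma}}$ and $h_0(t) = \frac{1}{\sigma} t^{1/\sigma - 1}$ are the survival and
hazard functions of $X_0 = e^{\sigma W}$.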
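The example can be checked numerically: under the accelerated failure-time model, the hazard obtained by rescaling the baseline hazard should equal the negative derivative of the log survival function. The sketch below assumes the extreme-value error distribution of the example with zero intercept; the function names, the scalar `linear_predictor` standing for the covariate effect on ln X, and the numerical values are hypothetical.

```python
import math

def baseline_survival(t, sigma):
    """S_0(t) = exp(-t^(1/sigma)), the survival function of X_0 = exp(sigma * W)."""
    return math.exp(-t ** (1.0 / sigma))

def baseline_hazard(t, sigma):
    """h_0(t) = (1/sigma) * t^(1/sigma - 1)."""
    return (1.0 / sigma) * t ** (1.0 / sigma - 1.0)

def aft_survival(x, linear_predictor, sigma):
    """S(x|Z) = S_0(x * exp(-eta)), where eta is the covariate effect on ln X."""
    return baseline_survival(x * math.exp(-linear_predictor), sigma)

def aft_hazard(x, linear_predictor, sigma):
    """h(x|Z) = h_0(x * exp(-eta)) * exp(-eta)."""
    scale = math.exp(-linear_predictor)
    return baseline_hazard(x * scale, sigma) * scale

def hazard_from_survival(x, linear_predictor, sigma, dx=1e-5):
    """Central-difference approximation of -d ln S(x|Z) / dx."""
    upper = math.log(aft_survival(x + dx, linear_predictor, sigma))
    lower = math.log(aft_survival(x - dx, linear_predictor, sigma))
    return -(upper - lower) / (2.0 * dx)

x, eta, sigma = 1.5, 0.4, 0.8   # hypothetical values
direct = aft_hazard(x, eta, sigma)
numeric = hazard_from_survival(x, eta, sigma)
```

The agreement between `direct` and `numeric` depends on the extra factor exp(-eta) in the hazard, which comes from the chain rule when the time axis is rescaled.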

Hazard rate regression


It is more flexible to model the hazard rate by a regression function
of the covariates.
Multiplicative hazard models
The hazard rate is modeled as
$$h(x \mid Z) = h_0(x)\, c(\beta^t Z),$$
where $h_0(x)$ is a baseline hazard function and $c(\cdot)$ is a positive
function.
The multiplicative model has the feature that, if all the covariates
are fixed at time zero, the hazard rates of two individuals with
different covariate values are proportional. It is easy to see that
$$\frac{h(x \mid Z_1)}{h(x \mid Z_2)} = \frac{c(\beta^t Z_1)}{c(\beta^t Z_2)},$$
which does not depend on $x$.

The survival function of the multiplicative hazard model:
$$S(x \mid Z) = S_0(x)^{c(\beta^t Z)}.$$

In the particular case of the Cox proportional hazard rate model,
$$c(\beta^t Z) = \exp(\beta^t Z).$$
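The proportionality property can be illustrated with a short sketch of the Cox form of the multiplicative model. The coefficient vector, the covariate values, the baseline hazard and the function names below are all hypothetical; the baseline survival function is chosen to match the assumed baseline hazard.

```python
import math

def relative_risk(z, beta):
    """c(beta' Z) = exp(beta' Z) for the Cox model."""
    return math.exp(sum(b * zi for b, zi in zip(beta, z)))

def cox_hazard(x, z, beta, baseline_hazard):
    """h(x|Z) = h_0(x) * exp(beta' Z)."""
    return baseline_hazard(x) * relative_risk(z, beta)

def cox_survival(x, z, beta, baseline_survival):
    """S(x|Z) = S_0(x) raised to the power exp(beta' Z)."""
    return baseline_survival(x) ** relative_risk(z, beta)

beta = [0.7, -0.2]                  # hypothetical regression coefficients
z1, z2 = [1.0, 3.0], [0.0, 1.0]     # hypothetical covariate vectors, fixed at time zero

def h0(x):
    return 0.3 * x                  # hypothetical baseline hazard

def S0(x):
    return math.exp(-0.15 * x ** 2) # baseline survival: exp of minus the integral of h0

ratios = [cox_hazard(x, z1, beta, h0) / cox_hazard(x, z2, beta, h0) for x in (0.5, 1.0, 4.0)]
```

Every entry of `ratios` equals the ratio of the two relative risks, whatever the time point, which is the sense in which the hazards are proportional.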
A remark on the estimation
In principle, the regression models can be estimated by the maximum
likelihood method. However, because of the computational difficulties involved
in using the exact likelihood, certain alternatives in the spirit of the
likelihood principle will be considered for the estimation.
