The Basics of Survival Analysis

1. The Basics of Survival Analysis

Special features of survival analysis
Censoring mechanisms
Basic functions and quantities in survival analysis
Models for survival analysis
1.1. Special features of survival analysis
Application fields of survival analysis
Medicine, Public health, Epidemiology, Engineering, etc.
Time-to-event
The main variable of interest in survival analysis is time-to-event.
Time-to-event is a positive random variable.
Examples of time-to-event:
Times to death of patients with a certain disease
Remission duration of a certain disease in clinical trials
Incubation times of certain diseases, such as AIDS, hepatitis B, SARS, etc.
Failure times of certain manufactured products
Lifetimes of the elderly in particular social programs
Incomplete observation of time-to-event

Example 1.1. Survival time of HIV+ patients
Times-to-event are not always completely observable. These times are subject to censoring and truncation. For a censored or a truncated time-to-event, only partial information is available.
Types of censoring:
left censoring, right censoring, interval censoring.
Types of truncation:
left truncation, right truncation, interval truncation.
1.2. Censoring mechanisms

Right censoring
Right censoring includes ordinary type I censoring, progressive
type I censoring, generalized type I censoring, random censoring
and type II censoring. We will focus only on ordinary and generalized type I censoring and random censoring.
Ordinary Type I censoring:
The censoring time is prespecified and the same for all individuals.
This kind of censoring is usually used in animal studies and clinical trials.
Figure 1.1 Illustration of ordinary Type I censoring
Generalized Type I censoring

In generalized Type I censoring, each individual has a specific fixed censoring time.
Figure 1.2: Illustration of generalized Type I censoring
Example 1.2. Remission duration from a clinical trial for acute leukemia
Purpose of the trial: investigate the effect of the drug 6-MP.
Design of study: 21 pairs of children matched by remission status
(complete or partial). Within each pair, the drug 6-MP and a
placebo are randomly assigned to the two children. The patients
are followed until relapse or until the end of the study.
Random censoring
Each individual is censored at random.
Random censoring can be described by a random variable C which
is independent of X. An individual is censored if C < X.
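The rule "censored if C < X" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pair (T, delta) with T = min(X, C) is the usual way to record right-censored data but is not defined in the text above, the function names are hypothetical, and the exponential distributions for X and C are arbitrary choices made for the simulation.

```python
import random


def right_censor(x, c):
    """Record a lifetime x under censoring time c as (T, delta).

    T = min(x, c); delta = 1 if the event is observed (x <= c),
    and delta = 0 if the individual is censored (c < x).
    """
    t = min(x, c)
    delta = 1 if x <= c else 0
    return t, delta


def simulate_random_censoring(n, event_rate=1.0, censor_rate=0.5, seed=0):
    """Draw n independent (X, C) pairs and return the observed (T, delta).

    Assumption for illustration only: X and C are both exponential.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.expovariate(event_rate)   # true lifetime X
        c = rng.expovariate(censor_rate)  # censoring time C, independent of X
        data.append(right_censor(x, c))
    return data
```

Only T and delta would be available to the analyst; the true X of a censored individual is known only to exceed T.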
Left censoring
A lifetime X associated with an individual in a study is considered
to be left censored if it is less than a censoring time $C_l$. The data
observed on the individual can be recorded as $(T, \delta)$, where
$$T = \max\{X, C_l\}, \qquad \delta = \begin{cases} 1, & \text{if } T = X, \\ 0, & \text{if } T = C_l. \end{cases}$$
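A minimal Python sketch of this recording rule follows. The function name `left_censor` and the symbol delta for the indicator are assumptions made for illustration.

```python
def left_censor(x, c_l):
    """Record a lifetime x under left-censoring time c_l as (T, delta).

    T = max(x, c_l); delta = 1 if the exact lifetime is observed (T = x),
    and delta = 0 if the lifetime is only known to be below c_l.
    """
    t = max(x, c_l)
    delta = 1 if x >= c_l else 0
    return t, delta
```

For instance, an event at time 3 with censoring time 5 is recorded as (5, 0): all that is known is that the event happened before time 5.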
Example 1.3. Childhood learning
Time-to-event: the age at which a child learns to accomplish certain tasks in children learning centers.
Left censoring occurs if children can already perform the
tasks when they start their study at the centers.
Example 1.4. Time to first use of marijuana
Data are collected through a survey by asking "When did
you first use marijuana?" The answers are:
a. Exact age
b. I never used it.
c. I used it but cannot remember when the first
time was.
Answer c gives a left censored observation.
Interval censoring
When a lifetime is only known to fall within an interval, it is referred
to as interval censored. Interval censoring occurs in clinical trials
where patients have periodic follow-ups, in industrial experiments where equipment items are inspected periodically, etc.
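Periodic inspection can be sketched as mapping an unobserved event time to the pair of visits that bracket it. The code below is an illustration under assumptions not stated in the text: follow-up starts at time 0, visit times are sorted in increasing order, the event is detected at the first visit at or after it occurs, and the function name is hypothetical.

```python
import bisect
import math


def interval_censor(x, visit_times):
    """Return (left, right) such that left < x <= right.

    visit_times must be sorted in increasing order. If the event
    happens after the last visit, right is infinity, which amounts
    to right censoring at the last visit.
    """
    # Index of the first visit at or after the event time x.
    i = bisect.bisect_left(visit_times, x)
    left = visit_times[i - 1] if i > 0 else 0.0
    right = visit_times[i] if i < len(visit_times) else math.inf
    return left, right
```

With visits at times 4, 8 and 12, an event at time 7 is recorded only as the interval (4, 8].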
Example 1.5. Time to cosmetic deterioration of breast cancer patients
To compare the cosmetic effect of two treatments on early
breast cancer patients, (i) radiotherapy and (ii) radiotherapy plus chemotherapy, patients were observed at
intervals. The event of interest is the first time breast
retraction is observed.
Breast cancer data:
Truncation
Truncation is a different phenomenon from censoring. Truncation
is due to sampling bias: only those individuals whose lifetimes
lie within a certain interval $[Y_L, Y_R]$ can be observed.
Left truncation: $Y_R = \infty$.
Example 1.6. Death times of elderly residents of a retirement community
The time at which a resident died or left the community is observed. The
entry time is recorded. Only elderly people of a certain age
can be admitted into the community. People who died before this age
cannot be observed.
Right truncation: $Y_L = 0$.
Example 1.7. Time to AIDS
258 adults and 37 children who were infected with the AIDS virus after
April 1, 1978 and developed AIDS by June 30, 1986 were included
in a study. Those who were infected in this period but had not
yet developed AIDS are not observed.
The event of interest is the induction time (the period from the time of
infection to the time of onset of AIDS).
For people infected at a particular infection time, the induction
time is truncated at the length of the period from the
infection time to June 30, 1986.
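The truncation bound for each individual is plain date arithmetic. The sketch below is illustrative only: the function names are hypothetical, induction time is measured in days for convenience, and the infection date used in the note after the code is made up rather than taken from the study data.

```python
from datetime import date

# End of the observation window, as stated in Example 1.7.
STUDY_END = date(1986, 6, 30)


def max_observable_induction_days(infection_date, study_end=STUDY_END):
    """Longest induction time (in days) that could enter the sample."""
    return (study_end - infection_date).days


def is_observed(infection_date, induction_days, study_end=STUDY_END):
    """True if AIDS onset falls on or before the study end date."""
    return induction_days <= max_observable_induction_days(infection_date, study_end)
```

Someone infected on January 1, 1986 could appear in the sample only with an induction time of at most 180 days; longer induction times from that infection date are right truncated.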
The time-to-AIDS data
1.3. Basic functions and quantities in survival analysis

Let X denote the random variable time-to-event.
Besides the usual probability density function f(x) and cumulative
distribution function F(x), the distribution of X can be described by
several equivalent functions. They are:
survival function, hazard function, cumulative hazard function, and so on.
Survival function
$$S(x) = \Pr(X > x) = 1 - F(x) = \begin{cases} \int_x^{\infty} f(t)\,dt, & \text{if } X \text{ is continuous}, \\ \sum_{x_j > x} p(x_j), & \text{if } X \text{ is discrete}, \end{cases}$$
where $x_j$ are the points with positive mass.
Basic properties of survival function:
1. $S(0) = 1$, $S(\infty) = 0$;
2. $S(x)$ is non-increasing.
3. For discrete X,
$$S(x) = \prod_{x_j \le x} \frac{S(x_j)}{S(x_{j-1})}.$$
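Property 3 is a telescoping product, which a small numerical check makes concrete. The sketch below assumes the convention S(x_0) = 1 for a starting point x_0 below all mass points (not spelled out above), and the function names and the example distribution are hypothetical.

```python
def discrete_survival(points, probs, x):
    """S(x) = sum of p(x_j) over mass points x_j > x."""
    return sum(p for xj, p in zip(points, probs) if xj > x)


def survival_by_product(points, probs, x):
    """S(x) as the product of S(x_j) / S(x_{j-1}) over x_j <= x.

    points must be sorted in increasing order; S(x_0) is taken to be 1.
    """
    result = 1.0
    previous = 1.0  # S(x_0) = 1 by convention
    for xj in points:
        if xj > x:
            break
        current = discrete_survival(points, probs, xj)
        result *= current / previous
        previous = current
    return result
```

For mass points 1, 2, 3, 4 with probabilities 0.1, 0.2, 0.3, 0.4, both functions give S(2) = 0.7 up to floating-point rounding, since every intermediate factor cancels.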
Examples of parametric distribution families for survival analysis:

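Exponential distribution:
$f(x) = \lambda e^{-\lambda x}$
$S(x) = e^{-\lambda x}$.
Weibull distribution:
$f(x) = \alpha\lambda x^{\alpha-1} e^{-\lambda x^{\alpha}}$
$S(x) = e^{-\lambda x^{\alpha}}$.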
Hazard rate function

$$h(x) = \lim_{\Delta x \to 0} \frac{\Pr[x \le X < x + \Delta x \mid X \ge x]}{\Delta x} = \frac{f(x)}{S(x)} = -\frac{d \ln[S(x)]}{dx}.$$
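The hazard rate function can be viewed as the instantaneous rate at which an
individual who has survived to age x experiences the event: for small $\Delta x$, $h(x)\,\Delta x$ is approximately the probability that the event occurs within the next $\Delta x$ units of time.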
Examples:
Exponential distribution:
$h(x) = \lambda$.
Weibull distribution:
$h(x) = \alpha\lambda x^{\alpha-1}$.
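These closed forms can be checked numerically from the definitions h = f/S and h = -d ln S/dx. The sketch assumes the Weibull parametrization with rate lam and shape alpha, in which S(x) = exp(-lam * x**alpha) and alpha = 1 gives the exponential distribution; the function names are hypothetical.

```python
import math


def weibull_pdf(x, lam, alpha):
    """f(x) = alpha * lam * x**(alpha - 1) * exp(-lam * x**alpha)."""
    return alpha * lam * x ** (alpha - 1) * math.exp(-lam * x ** alpha)


def weibull_survival(x, lam, alpha):
    """S(x) = exp(-lam * x**alpha)."""
    return math.exp(-lam * x ** alpha)


def weibull_hazard(x, lam, alpha):
    """Hazard computed directly from the definition h = f / S."""
    return weibull_pdf(x, lam, alpha) / weibull_survival(x, lam, alpha)


def hazard_from_log_survival(x, lam, alpha, eps=1e-6):
    """Hazard as -d ln S / dx, using a central finite difference."""
    upper = math.log(weibull_survival(x + eps, lam, alpha))
    lower = math.log(weibull_survival(x - eps, lam, alpha))
    return -(upper - lower) / (2 * eps)
```

With alpha = 1 the ratio f/S is constant in x, which is the constant hazard of the exponential distribution; with alpha > 1 it increases with x, and with alpha < 1 it decreases.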
Generic types of hazard rates:
1. Constant
Unrealistic.
2. Increasing: Convex or concave
Arising from natural aging or wear
3. Decreasing: Convex or concave
Lifetimes of electronic devices; patients experiencing
certain types of transplants; etc.
4. Convex bathtub shape
Appropriate for populations followed from birth;
mortality data; manufactured equipment; etc.
5. Concave bump shape
Survival after successful surgery; etc.
Figure 1.3 Shapes of generic hazard functions
Cumulative hazard function

Continuous case:
$$H(x) = \int_0^x h(u)\,du = -\int_0^x \frac{d \ln[S(u)]}{du}\,du = -\ln[S(x)].$$
Discrete case:
$$H(x) = \sum_{x_j \le x} h(x_j).$$
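The continuous-case identity H(x) = -ln S(x) can be verified numerically. The sketch below is an illustration under assumptions: it uses a Weibull hazard h(u) = alpha * lam * u**(alpha - 1), for which -ln S(x) = lam * x**alpha, and approximates the integral with a midpoint rule; the function names are hypothetical.

```python
def weibull_hazard(u, lam, alpha):
    """Assumed Weibull hazard h(u) = alpha * lam * u**(alpha - 1)."""
    return alpha * lam * u ** (alpha - 1)


def cumulative_hazard(x, lam, alpha, steps=10000):
    """Midpoint-rule approximation of the integral of h(u) from 0 to x.

    Midpoints avoid evaluating the hazard at u = 0, where it is
    unbounded when alpha < 1.
    """
    width = x / steps
    total = sum(weibull_hazard((i + 0.5) * width, lam, alpha) for i in range(steps))
    return total * width


def minus_log_survival(x, lam, alpha):
    """-ln S(x) for the Weibull survival function exp(-lam * x**alpha)."""
    return lam * x ** alpha
```

For lam = 0.5 and alpha = 2, both functions evaluate to 2 at x = 2 (up to rounding), in line with the identity.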
1.4. Regression models for survival data

Regression models account for covariates that cause heterogeneity.
Covariates could be quantitative, e.g., blood pressure, temperature, age, weight, etc.
Covariates could also be qualitative, e.g., gender, race, treatment, disease status, etc.
Covariates may or may not be time-dependent. The covariates
are denoted by
$Z^t(x) = (Z_1(x), \ldots, Z_p(x))$.
Log survival time regression

Let $Y = \ln(X)$. Y is modeled as
$$Y = \mu + \gamma^t Z + \sigma W,$$
where W is a random variable with mean 0 and variance 1.
Choices of the distribution of W:
Normal, corresponding to a log-normal distribution for X,
Extreme-value, corresponding to a Weibull distribution for X,
Logistic, corresponding to a log-logistic distribution for X.
Survival function under the model:
$$\begin{aligned} S(x|Z) &= \Pr[X > x|Z] = \Pr[Y > \ln x|Z] \\ &= \Pr[\mu + \sigma W > \ln x - \gamma^t Z|Z] \\ &= \Pr[e^{\mu + \sigma W} > x \exp\{-\gamma^t Z\}|Z] \\ &= S_0(x \exp\{-\gamma^t Z\}), \end{aligned}$$
where $S_0$ is the survival function of the baseline failure time
$\exp\{\mu + \sigma W\}$. This model is called the accelerated failure-time
model, since the failure time is accelerated by a factor $\exp\{-\gamma^t Z\}$.
Hazard function

$$h(x|Z) = h_0(x \exp\{-\gamma^t Z\}) \exp\{-\gamma^t Z\}.$$
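This hazard follows from the survival function of the model by differentiation. A short sketch of the step, assuming that $\gamma$ denotes the coefficient vector (the symbol is an assumption), that $S(x|Z) = S_0(x e^{-\gamma^t Z})$, and that $h_0 = -d \ln S_0 / dt$ is the baseline hazard:

```latex
\begin{aligned}
h(x|Z) &= -\frac{d}{dx} \ln S(x|Z)
        = -\frac{d}{dx} \ln S_0\!\left(x e^{-\gamma^t Z}\right) \\
       &= h_0\!\left(x e^{-\gamma^t Z}\right) e^{-\gamma^t Z}
        \qquad \text{(chain rule, inner derivative } e^{-\gamma^t Z}\text{)}.
\end{aligned}
```

The extra factor $e^{-\gamma^t Z}$ is the derivative of the rescaled time argument, so covariates change both the time scale of the baseline hazard and its level.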

Example:
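Suppose W follows an extreme value distribution with density
function $f_W(w) = \exp\{w - e^w\}$. Let $Y = \ln X = \gamma^t Z + \sigma W$.
Then
$$S(x|Z) = S_0(x \exp(-\gamma^t Z)) = \exp\{-[x \exp(-\gamma^t Z)]^{1/\sigma}\},$$
$$h(x|Z) = h_0(x \exp(-\gamma^t Z)) \exp(-\gamma^t Z) = \frac{1}{\sigma}[x \exp(-\gamma^t Z)]^{1/\sigma - 1} \exp(-\gamma^t Z),$$
where $S_0(t) = e^{-t^{1/\sigma}}$ and $h_0(t) = \frac{1}{\sigma} t^{1/\sigma - 1}$ are the survival and
hazard functions of $X_0 = e^{\sigma W}$.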
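A quick simulation can check the Weibull form of the survival function in this example. The sketch is illustrative and rests on several assumptions: a single scalar covariate z, coefficient gamma and scale sigma as hypothetical parameter names, and the fact that exp(W) is a standard exponential variable when W has density exp{w - e^w}, so that X = exp(gamma * z) * E**sigma with E standard exponential.

```python
import math
import random


def aft_weibull_survival(x, z, gamma, sigma):
    """S(x|z) = exp(-[x * exp(-gamma * z)] ** (1 / sigma))."""
    return math.exp(-((x * math.exp(-gamma * z)) ** (1.0 / sigma)))


def simulate_aft_lifetimes(n, z, gamma, sigma, seed=0):
    """Draw n lifetimes X = exp(gamma * z + sigma * W).

    exp(W) is standard exponential, so exp(sigma * W) = E ** sigma.
    """
    rng = random.Random(seed)
    scale = math.exp(gamma * z)
    return [scale * rng.expovariate(1.0) ** sigma for _ in range(n)]


def empirical_survival(sample, x):
    """Fraction of simulated lifetimes exceeding x."""
    return sum(1 for value in sample if value > x) / len(sample)
```

With a few thousand draws the empirical survival fraction at a given x should land close to `aft_weibull_survival` at the same x, up to simulation noise.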
Hazard rate regression

It is more flexible to model the hazard rate by a regression function
of the covariates.
Multiplicative hazard models
The hazard rate is modeled as
$$h(x|Z) = h_0(x)\,c(\beta^t Z),$$
where $h_0(x)$ is a baseline hazard function and $c(\cdot)$ is a positive
function.
The multiplicative model has the feature that, if all the covariates
are fixed at time zero, the hazard rates of two individuals with
different covariate values are proportional. It is easy to see that
$$\frac{h(x|Z_1)}{h(x|Z_2)} = \frac{c(\beta^t Z_1)}{c(\beta^t Z_2)}.$$
The survival function of the multiplicative hazard model:

$$S(x|Z) = S_0(x)^{c(\beta^t Z)}.$$
In particular, in the Cox proportional hazards model,
$$c(\beta^t Z) = \exp(\beta^t Z).$$
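The proportionality property and the power form of the survival function can be illustrated with a toy baseline. Everything specific in this sketch is an assumption made for illustration: the baseline hazard h0(x) = 0.1 x (so that S0(x) = exp(-0.05 x^2)), the symbol beta for the coefficient vector, and the function names.

```python
import math


def baseline_hazard(x):
    """Hypothetical baseline hazard h0(x) = 0.1 * x."""
    return 0.1 * x


def baseline_survival(x):
    """S0(x) = exp(-H0(x)) with H0(x) = 0.05 * x**2 for the hazard above."""
    return math.exp(-0.05 * x ** 2)


def linear_predictor(z, beta):
    """Inner product of the coefficient vector beta and covariates z."""
    return sum(b * zi for b, zi in zip(beta, z))


def cox_hazard(x, z, beta):
    """h(x|z) = h0(x) * exp(beta' z)."""
    return baseline_hazard(x) * math.exp(linear_predictor(z, beta))


def cox_survival(x, z, beta):
    """S(x|z) = S0(x) ** exp(beta' z)."""
    return baseline_survival(x) ** math.exp(linear_predictor(z, beta))
```

For two covariate vectors z1 and z2, the ratio `cox_hazard(x, z1, beta) / cox_hazard(x, z2, beta)` equals exp(beta' (z1 - z2)) at every x, because the baseline hazard cancels.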
A remark on the estimation
In principle, the regression models can be estimated by the maximum
likelihood method. However, because of the computational difficulties involved
in using the exact likelihood, certain alternatives will be considered
for the estimation in light of the likelihood principle.