
Survival Analysis

In many biomedical applications the primary endpoint of interest is time to a
certain event.
Examples are
• time to death;
• time it takes for a patient to respond to a therapy;
• time from response until disease relapse (i.e., disease returns); etc.
We may be interested in characterizing the distribution of “time to event” for a
given population as well as comparing this “time to event” among different groups
(e.g., treatment vs. control in a clinical trial or an observational study), or modeling the
relationship of “time to event” to other covariates (sometimes called prognostic factors or
predictors). Typically, in biomedical applications the data are collected over a finite
period of time and consequently the “time to event” may not be observed for all the
individuals in our study population (sample). This results in what is called censored data.
That is, the “time to event” for those individuals who have not experienced the event
under study is censored (by the end of study). It is also common that the amount of
follow-up varies from subject to subject. The combination of
censoring and differential follow-up creates some unusual difficulties in the analysis of such
data that cannot be handled properly by the standard statistical methods. Because of this, a
new research area in statistics has emerged which is called Survival Analysis or Censored
Survival Analysis.
 
Some examples
• Survival time (in general): measured from birth to death for an individual. This is the
survival time we need to investigate in a life expectancy study.
• Survival time of a treatment for a population with certain disease: measured from the time
of treatment initiation until death.
• Survival time due to heart disease (the event is death from heart disease): measured
from birth (or other time point such as treatment initiation for heart disease patients) to
death caused by heart disease. (This may be a bit tricky if individuals die from other causes.
This is a competing risk problem. That is, other risks are competing with heart disease to
produce an event: death.)
The time of interest may be time to something “good” happening. For example,
we may be interested in how long it takes to eradicate an infection after treatment with
antibiotics.
Survival curves plot percent survival as a function of time. The figure represented
below shows a simple survival curve. Fifteen patients were followed for 36 months.
Nine patients died at known times, and six were still alive at the end of the study.
 

Figure. A simple survival curve. Fifteen subjects were followed for 36 months. Nine of
the subjects died. You can see each death as a downward step in the curve. Two subjects
died at 19 months, so the drop is twice as large. Note that time 0 does not have to be
any particular day or year. Time 0 is the time that each subject was enrolled in the trial.
 

Time zero is not some specified calendar date; rather, it is the time that each patient
entered the study. In many clinical studies, "time zero" spans several calendar years
as patients are enrolled. At time zero, by definition, all patients are alive, so Y =
100%. Whenever a patient dies, the percent surviving decreases. If the study (and thus
the X axis) were extended far enough, Y would eventually reach 0. This study ended
at 36 months with 40% (6/15) of the patients still alive.
 
Each patient's death is clearly visible as a downward jump in the curve. When the
first patient died the percent survival dropped from 100.0% to 93.3% (14/15). When
the next patient died, the percent survival dropped again to 86.7% (13/15). At 19 months, two
patients died, so the downward step is twice as large.
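
The arithmetic behind these steps can be sketched in a few lines of Python. This is only an illustration for the case with no censored subjects; the function name percent_surviving is made up for this example and does not come from any library.

```python
def percent_surviving(n_start, deaths_so_far):
    """Percent of subjects still alive when nobody has been censored."""
    return 100.0 * (n_start - deaths_so_far) / n_start


n = 15  # subjects enrolled in the example study
for deaths in (0, 1, 2, 9):
    print(f"after {deaths} deaths: {percent_surviving(n, deaths):.1f}% surviving")
```

With no censoring, every death lowers the curve by the same amount (1/15 of the starting group), which is why two deaths at the same time produce a step twice as large.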
 
The term survival curve is a bit misleading, as "survival" curves can plot time to any
well-defined end point, such as occlusion of a vascular graft, date of first metastasis,
or rejection of a transplanted kidney. The event does not have to be dire. The event
could be restoration of renal function, discharge from a hospital, or graduation. The
event must be a one-time event. Recurring events should not be analyzed with survival
curves.
 
CENSORED SURVIVAL DATA
 
In the previous example, we knew that all subjects either died before 36 months or
survived longer than 36 months (the right end of our curve). Real data are rarely so
simple. In most survival studies, some surviving subjects are not followed for the entire
span of the curve. This can happen in two ways:
 
• Some subjects are still alive at the end of the study but were not followed for
the entire span of the curve. Many studies enroll patients over a period of
several years. The patients who enroll later are not followed for as many
years as patients who enroll early. Imagine a study that enrolls patients between
1985 and 1989, and that ends in 1991. Patient A enrolled in 1989 and is still
alive at the end of the study. Even though the study lasted 6 years, we only
know that patient A survived at least 2 years.
• Some drop out of the study early. Perhaps they moved to a different city or
got fed up with university hospitals. Patient B enrolled in 1986 but moved to
another city (and stopped following the protocol) in 1988. We know that this
subject survived at least 2 years on the protocol but can't evaluate survival after
that.
 
In either case, you know that the subject survived up to a certain time but have no
useful information about what happened after that. Information about these patients
is said to be censored. Before the censored time, you know they were alive and
following the experimental protocol, so these subjects contribute useful information.
After they are censored, you can't use any information on the subjects. Either we
don't have information beyond the censoring day (because the data weren't or can't
be collected) or we have information but can't use it (because the patient no longer
was following the experimental protocol). The word censor has a negative ring to it. It
sounds like the subject has done something bad. Not so. It's the data that have been
censored, not the subject!
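
One common way to record censored data is as a pair for each subject: the length of follow-up, and a flag saying whether the event was actually observed. The sketch below assumes calendar-year resolution and a study ending in 1991, as in the example above. The function name follow_up and the second and third subjects are hypothetical, added only for illustration.

```python
def follow_up(enrolled, study_end, died=None, dropped_out=None):
    """Return (years_followed, event_observed) for one subject.

    enrolled, study_end, died, dropped_out are calendar years.
    A death counts as an observed event only if it happened while the
    subject was still being followed on the protocol.
    """
    last_followed = study_end if dropped_out is None else min(dropped_out, study_end)
    if died is not None and died <= last_followed:
        return died - enrolled, True       # event observed
    return last_followed - enrolled, False  # censored


STUDY_END = 1991

# Patient B from the text: enrolled 1986, moved away in 1988.
print(follow_up(1986, STUDY_END, dropped_out=1988))  # censored observation
# Hypothetical subject who died during follow-up.
print(follow_up(1985, STUDY_END, died=1987))
# Hypothetical subject still alive when the study ended.
print(follow_up(1987, STUDY_END))
```

The flag is what lets later calculations tell "died at 2 years" apart from "known to be alive at 2 years, then lost to follow-up".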
 
CREATING A SURVIVAL CURVE
 
There are two slightly different methods to create a survival curve. With the actuarial
method, the X axis is divided up into regular intervals, perhaps months or years, and
survival is calculated for each interval. With the Kaplan-Meier method, survival is
recalculated every time a patient dies. The Kaplan-Meier method is preferred, unless the
number of patients is huge. The term life-table analysis is used inconsistently, but usually
includes both methods. You should recognize all three names.
 
The Kaplan-Meier method is logically simple but tedious. Since computer programs
can do the calculations for you, the details will not be presented here. The idea is pretty
simple. To calculate the fraction of patients who survived on a particular day, simply
divide the number alive at the end of the day by the number alive at the
beginning of the day (excluding any who were censored on that day from both the
numerator and denominator). This gives you the fraction of patients who were alive at
the beginning of a particular day who were still alive at the beginning of the next day.
To calculate the fraction of patients who survive from day 0 until a particular day,
multiply the fraction of patients who survive day 1, times the fraction of those patients
who survive day 2, times the fraction of those patients who survive day 3 ... times the
fraction who survive day k. This method automatically accounts for censored
patients, as both the numerator and denominator are reduced on the day a patient is
censored. Because we calculate the product of many survival fractions, this method
is also called the product-limit method. Note that day refers to day of the study, not a
particular day on the calendar. Day 1 is the first day of the study for each subject.
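
The product-limit idea described above can be sketched as follows. This is a minimal illustration, not a replacement for a statistics package. The function name kaplan_meier and the six-subject data set are invented for this example. The code follows the rule stated in the text (subjects censored on a given day are left out of that day's fraction); some programs instead keep them in the denominator for that day, which only matters when a death and a censoring share the same time.

```python
def kaplan_meier(times, events):
    """Return [(time, fraction_surviving)] at each time a death occurred.

    times  : follow-up time for each subject
    events : True if the subject died at that time, False if censored
    """
    at_risk = len(times)   # subjects still being followed
    survival = 1.0         # running product of the survival fractions
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        censored = sum(1 for ti, died in zip(times, events) if ti == t and not died)
        # Per the text: exclude subjects censored at this time from both
        # the numerator and the denominator.
        denominator = at_risk - censored
        if deaths > 0:
            survival *= (denominator - deaths) / denominator
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve


# Hypothetical data: six subjects, two of them censored (False).
times = [2, 4, 6, 7, 9, 10]
events = [True, False, True, True, False, True]
for t, s in kaplan_meier(times, events):
    print(f"time {t}: {s:.3f} surviving")
```

In this made-up data set the first death removes 1/6 of the curve, while the second death (after one subject has been censored) removes 1/4 of what remains, so the step is larger even though only one subject died.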
 
Figure 6.2 shows a survival curve with censored data. The study started with 15
patients. Nine died during the study (same as the previous example) and six were
censored at various times during the study. On the left panel, each censored patient is
denoted by an upward blip in the survival curve. On the right panel, each censored patient
is denoted by a symbol in the middle of a horizontal part of the survival curve. At the
time a patient is censored, the survival curve does not dip down as no one has died.
When the next patient dies, the step downward is larger because the denominator (the
number of patients still being followed) has shrunk.

Figure. A survival curve with censored subjects. A subject is censored at a certain time
for one of two reasons: either the subject stopped following the study protocol at that
time, or the trial ended with the subject still alive. In the left panel, censored subjects
are shown as upward blips. In the right panel, censored subjects are shown as solid circles
in a horizontal portion of the curve. You'll see both kinds of graphs frequently.
 
CONFIDENCE INTERVAL OF A SURVIVAL CURVE
 
In order to extrapolate from our knowledge of a sample to the overall population, a
survival curve is far more informative when it includes a 95% CI. Calculating CIs is not
straightforward and is best left to computer programs. The interpretation of the
95% CI for a survival curve should be clear to you by now. We have measured
survival exactly in a sample but don't know what the survival curve for the entire
population looks like. We can be 95% sure that the true population survival curve lies
within the 95% CI shown on our graph at all times.
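
To give a feel for what such a program does, here is a sketch using Greenwood's formula with a normal approximation. This is one widely used approach, and it is an assumption here: the text does not say which method produced its figure. The function name km_with_greenwood and the data are hypothetical. Intervals computed this way apply to each time point separately and are simply clipped to the range 0 to 1.

```python
import math


def km_with_greenwood(times, events, z=1.96):
    """Return [(time, survival, lower, upper)] at each death time.

    Uses Greenwood's formula for the standard error:
        SE = S(t) * sqrt( sum of d / (n * (n - d)) over death times <= t )
    where n is the number at risk and d the number of deaths at each time.
    """
    at_risk = len(times)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        censored = sum(1 for ti, died in zip(times, events) if ti == t and not died)
        n = at_risk - censored  # same convention as the text's description
        if deaths > 0:
            survival *= (n - deaths) / n
            if n > deaths:
                greenwood_sum += deaths / (n * (n - deaths))
                se = survival * math.sqrt(greenwood_sum)
            else:
                se = 0.0  # everyone at risk died; the formula is undefined here
            lower = max(0.0, survival - z * se)
            upper = min(1.0, survival + z * se)
            rows.append((t, survival, lower, upper))
        at_risk -= deaths + censored
    return rows


# Hypothetical data: six subjects, two of them censored (False).
times = [2, 4, 6, 7, 9, 10]
events = [True, False, True, True, False, True]
for t, s, lo, hi in km_with_greenwood(times, events):
    print(f"time {t}: {s:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With only six subjects the intervals are very wide, which matches the point made about small samples.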

Figure. A survival curve with 95% CIs. The solid line shows the survival curve for the
sample of 15 subjects. You can be 95% sure that the overall survival curve for the
entire population lies within the dotted lines. The CIs are wide because the sample is
so small.
 
Comparing Two Survival Curves
 
 
 
 
This is chapter 33 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky.
Copyright © 1995 by Oxford University Press Inc. All rights reserved. You may order
the book from GraphPad Software with a software purchase, from any academic
bookstore, or from amazon.com.
 
 
 
 
You've already learned (Chapter 6) how to interpret survival curves. It is common to
compare two survival curves to compare two treatments. Compare two survival
curves using the log-rank test. This test calculates a P value testing the null hypothesis
that the survival curves are identical in the two populations. If that assumption is true,
the P value is the probability of randomly selecting subjects whose survival curves
are at least as different as those actually observed. (You will sometimes see
survival curves compared with the method of Mantel-Haenszel, rather than the log-rank
test. The two methods are essentially equivalent.)
 
Example
 
Rosman and colleagues investigated whether diazepam would prevent febrile seizures
in children (NP Rosman, T Colton, J Labazzo, PL Gilbert, NB Gardella, EM Kaye, C
Van Bennekom, MR Winter. A controlled trial of diazepam administered during febrile
illnesses to prevent recurrence of febrile seizures. N Engl J Med 329:79-84, 1993).
They recruited about 400 children who had had at least one febrile seizure.
Their parents were instructed to give medication to the children whenever they had a fever.
Half were given diazepam and half were given placebo. They analyzed the data in several
ways, including survival analysis. Here the term survival is a bit misleading, as they compared
time until the first seizure, not time until death. When they compared the times to first seizure
with the log-rank test, the placebo-treated subjects tended to have seizures earlier and the P
value was 0.06. The difference in survival curves was small. If diazepam were really no more
effective than placebo, you'd expect that 6% of experiments this size would find a difference
this large or larger. The authors did not reach a conclusion from this analysis because they
analyzed the data in a fancier way, which we will discuss later in the chapter.
 
 
ASSUMPTIONS OF THE LOG-RANK TEST
 
 
The log-rank test depends on these assumptions:
 
 
• The subjects are randomly sampled from, or at least are representative of, larger populations.
 
• The subjects were chosen independently.
 
• Consistent criteria. If patients are enrolled in the study over a period of years, it is important
that the entry criteria don't change over time and that the definition of survival is consistent.
This is obvious if the end point is death, but survival methods are often used for other end
points.
 
• Baseline survival rate is not changing over time. If the subjects are enrolled in the study over
a period of years, you must assume that the survival of the control patients enrolled early
in the study would be the same (on average) as the survival of those enrolled later (otherwise
you have to do a fancier analysis to adjust for the difference).
 
• The survival of the censored subjects would be the same, on average, as the survival of the
remaining subjects.
 
 
 
The calculations of the log-rank test are tedious and best left to a computer. The idea is pretty
simple. For each time interval, compare the observed number of deaths in each group with
the expected number of deaths if the null hypothesis were true. Combine all the observed and
expected values into one chi-square statistic and determine the P value from that.
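
The idea can be sketched in Python as shown below. This is an illustrative version only. It assumes the usual form of the test, in which the observed-minus-expected total for one group is squared and divided by a variance obtained from the hypergeometric distribution, and the result is compared with a chi-square distribution with one degree of freedom. The function name log_rank and the data are hypothetical. Subjects censored at a given time are counted as still at risk at that time, which is a common convention but is not specified in the text.

```python
import math


def log_rank(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-group log-rank comparison."""
    death_times = sorted(
        {t for t, died in zip(times_a, events_a) if died}
        | {t for t, died in zip(times_b, events_b) if died}
    )
    observed_a = expected_a = variance = 0.0
    for t in death_times:
        # Number still being followed in each group just before time t.
        n_a = sum(1 for ti in times_a if ti >= t)
        n_b = sum(1 for ti in times_b if ti >= t)
        # Deaths in each group at time t.
        d_a = sum(1 for ti, died in zip(times_a, events_a) if ti == t and died)
        d_b = sum(1 for ti, died in zip(times_b, events_b) if ti == t and died)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        # Under the null hypothesis, deaths are shared in proportion to
        # the number at risk in each group.
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical data: group A dies early, group B dies late, no censoring.
chi_square, p_value = log_rank([1, 2, 3], [True] * 3, [10, 11, 12], [True] * 3)
print(f"chi-square = {chi_square:.2f}, P = {p_value:.3f}")
```

If the two groups have identical follow-up times and outcomes, the observed and expected counts agree, the chi-square statistic is 0, and the P value is 1.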
 
