Response-Adaptive Randomisation in
Clinical Trials with Binary Responses
2014-15
Mateusz Matyjaszczyk
140293481
Supervisor: Dr. D.S. Coad
Abstract
Randomisation is a fundamental concept in experimental design as it is
the best known way of removing unwanted bias. The classical approach to
randomisation is to balance the number of patients receiving each treatment.
However, in a clinical trial this has an ethical disadvantage as it could lead
to a high number of treatment failures. We explore various response-adaptive
randomisation schemes which aim to assign more patients to the superior
treatment in order to reduce the number of treatment failures. In this dissertation we only consider clinical trials with binary responses.
We start by introducing the randomised play-the-winner (RPW) rule. The
RPW rule has many statistical disadvantages and a previous application in
a clinical trial led to disastrous results. We therefore introduce three different
randomisation rules: the drop-the-loser (DL) rule, the odds ratio based design
(ORBD) and the doubly adaptive biased coin design (DBCD). For these rules to
be applicable in a realistic setting, each one is extended to (i) allow any
number of treatments, (ii) allow delayed responses and (iii) incorporate covariates.
We then analyse the efficient randomised adaptive design (ERADE), which
attains the Cramér-Rao lower bound on the asymptotic variance.
The final section compares the randomisation rules mentioned. For designs with
K = 2 and K = 3 treatments, we compare the allocation proportions and define
a hypothesis test which we then use to simulate power and significance level.
Then the same methods are used to compare the randomisation rules under
delayed responses and when incorporating covariates.
We find that in general there is an inverse relationship between more ethical allocation and power. A suitable response-adaptive randomisation scheme
needs to have a good balance between these two criteria and thus such a
randomisation procedure should be tailored to a specific clinical trial.
Contents
1 Introduction
4 Conclusion
A R code used in simulations
A.1 RPW
A.2 DL
A.3 GDL
A.4 DLC
A.5 ORBD
A.6 DBCD
A.7 RDBCD
A.8 ERADE
1 Introduction
Suppose that a new treatment has been developed and we wish to compare it to
an existing treatment through a clinical trial. In this trial, the patients arrive
sequentially and each patient is assigned to either of the treatments. If the treatment
assignment is systematic, then the physician or medical examiner may be able to
predict the next assignment and choose a patient that they would prefer to receive
the corresponding treatment. This is called selection bias and is highly undesirable
as it can invalidate the trial. The best known way of removing such bias is through
randomisation.
Complete randomisation is the most basic form of randomisation, where we assign
each treatment to a patient with equal probability. In an experiment with two
treatments this can be compared to a toss of a fair coin. Although such a scheme
minimises selection bias and is quite easy to implement, it has many undesirable
properties, e.g. it does not take the medical histories of the patients into account.
Suppose that we have a covariate under which the treatment can have various
effects. Then, it is highly desirable for this covariate to be equally represented in each
treatment, as unbalanced treatments could have an effect on any statistical inference
performed later on. Randomisation that balances covariates in this way is called
covariate-adaptive randomisation and historically is the most widely used type of
randomisation in clinical trials.
Next, consider the responses to this experiment to be binary (i.e. treatment was
successful or unsuccessful) and instantaneous (i.e. available before the next patient is
randomised). Suppose that, after some patients have been randomised and their
responses obtained, one of the treatments has shown a lower proportion of treatment
failures than the other treatments. For ethical reasons, we wish to assign more patients to
this superior treatment in order to minimise the number of treatment failures. This
type of randomisation is called response-adaptive randomisation (RAR). Although
Thompson (1933) proposed such adaptive designs as early as the 1930s, they have had
very limited use in practice, with the randomised play-the-winner (RPW) rule being
the design most often applied. In Section 2.1 we will briefly investigate this design
and its limitations. One of the most severe limitations is that when the success
probability is high on both treatments then the variance is unbounded. This in turn
means that the allocation proportion of patients to each treatment heavily depends
on the initial settings of the scheme. We mention the ECMO trial in which a bad
choice of the initial settings of the RPW scheme led to disastrous results.
For a given randomisation scheme to be applicable in a practical setting, we need
to consider the following limitations:
• So far we assumed the responses to be instantaneous. In practice, this is rarely
  the case as new patients may be assigned to a treatment before a response is
  available for all the previous patients. Hence for an adaptive design to be
  applicable to a wide range of clinical trials, the design should allow delayed
  responses.
• Covariate imbalance may be an issue, just as in complete randomisation.
  Thus, for an adaptive design to be practical, the design should take into
  consideration the response history so far as well as the covariate balance of the
  treatments.
• We also assumed that there are only two treatments. However, this may not be
  the case in many clinical trials. For example, when comparing a new treatment
  to an existing one, we may wish to include a placebo group. Similarly, if the
  newly proposed treatment is a drug, patients can be assigned to treatments
  with different dosages in order to find the optimum dose.
In Sections 2.2-2.4 three well studied RAR designs are introduced: the drop-the-loser
(DL) rule, the odds-ratio based design (ORBD) and the doubly adaptive biased-coin
design (DBCD). Each of these designs is extended to address the three limitations
above: delayed responses, covariate adaptiveness and K > 2 treatments. Of note
is the extension of the ORBD to K = 3 treatments as this has not been previously
explored in the literature. With the exception of the DL rule, none of the rules
mentioned are known to attain the lower bound on the asymptotic variance and
thus in Section 2.5 we introduce the efficient randomised adaptive design (ERADE)
that attains this lower bound.
Finally, in Chapter 3 we compare the statistical properties of the randomisation
schemes. Section 3.2 compares the allocation and failure proportions for all the
rules with K = 2 treatments. We find that in general the DL, DBCD and ERADE
rules have the least variable allocation proportions while the ORBD assigns the most
patients to the superior treatment, resulting in the lowest failure proportion. In fact,
we notice the inverse relationship between these two criteria. We then define the
Wald test that can be used to test the hypothesis of no treatment difference. We use
this test to simulate the power and significance level of different randomisation rules
and confirm the well-known inverse relationship between power and variability. That
is, a more variable rule in general leads to reduced power. Thus, the DL, DBCD
and ERADE are the most powerful.
In Section 3.3 we explore the DL, ORBD and DBCD when extended to K = 3
treatments. The allocation and failure proportions are investigated for each rule and
the results reflect the findings of the previous section. That is, the DL and DBCD
are found to be the least variable but assign fewer patients to the superior treatment
than the ORBD. We then define the contrast test of homogeneity which allows us to
compare one treatment (usually the placebo) to the other treatments. We use this
test to simulate the power and significance level for the DL, ORBD and DBCD rules
and it is shown that the DL and DBCD maintain the highest power, when compared
to complete randomisation. Note that a simulation of the ORBD and DBCD when
extended to K = 3 treatments is not reported in the literature.
We then consider the DL, ORBD and DBCD rules under delayed responses in
Section 3.4. Note that out of these rules, the literature only investigates the DL under delayed responses and there is no investigation of power under delayed responses
for any of the rules mentioned. The investigation into ORBD and DBCD with delayed responses is the first known investigation of this type that has been reported.
We note that moderate delay does not have an effect on allocation proportion. We
also simulate power and significance level and find that delayed responses may lower
the power of some designs. We perform such an investigation for K = 2 and K = 3
treatments. Finally, we briefly discuss the performance of RAR designs under severe
delay.
In Section 3.5 we consider extensions of the DL, ORBD and DBCD rules to incorporate
covariates. We see that incorporating covariates can significantly reduce the ethical
allocation, which results in higher failure rates. We also perform an investigation
into the significance level and power of these designs and find that the power can
be severely reduced for these designs. Note that this is the first investigation of the
ORBD and DBCD incorporating covariates that has been reported in the literature.
The DL incorporating covariates with K = 3 has also been studied for the first time.
We conclude that RAR designs can be statistically and ethically desirable. However,
we also find that under some realistic assumptions, i.e. delayed responses and
covariate-adaptiveness, these RAR designs do not perform as well. Thus, a given
RAR design should be chosen in such a way that we obtain satisfactory statistical
properties whilst maximising ethical advantages. We then consider extensions to
the work presented here.
2
2.1
2.1.1
predictable as the physician is able to guess the next treatment assignment if the
response and treatment assignment of the previous patient is known.
Wei and Durham (1978) extended the above idea by proposing the randomised
play-the-winner (RPW) rule. In this design sequentially arriving patients are
assigned to a treatment by drawing a ball from an urn. We assume that there are
two treatments, i = 1, 2. Initially the urn contains $\alpha_i$ balls of the colour
corresponding to treatment i. We would usually choose $\alpha_1 = \alpha_2$ so that
the urn is balanced in the beginning. Here, this will always be the case so we let
$\alpha = \alpha_1 = \alpha_2$. When a patient is ready to be
randomised, a ball is drawn from the urn. The patient is assigned the corresponding
treatment, the ball is replaced and a response is observed. If the response of the
treatment for this patient was a success, we add $\beta$ balls of the corresponding colour
to the urn. However, if the response was a failure, then $\beta$ balls of the opposite kind
are added to the urn. This way we skew the probability of assignment towards the
more successful treatment so far. This process continues until a suitable stopping
criterion has been reached, e.g. a sufficient number of patients have been randomised.
We denote this design by RPW($\alpha$, $\beta$).
The above design deals with many of the limitations of the PW rule. Firstly,
the design is less predictable than the PW rule since the allocation probability depends
on the whole response history, rather than just the last response. RPW also allows
the responses to be delayed as the urn can be updated once a response is available,
which was not the case with the PW rule.
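The urn mechanics described above can be sketched in a few lines. The following is a minimal Python sketch, not the dissertation's own R program from Section A.1. The function names `simulate_rpw` and `mean_allocation`, the parameter names `alpha` (initial balls per treatment) and `beta` (balls added after each response), and the restriction to instantaneous responses are all illustrative assumptions.

```python
import random


def simulate_rpw(p, n, alpha=1, beta=1, rng=None):
    """Simulate one two-treatment trial under an RPW-type urn.

    p     : (p1, p2) success probabilities (assumed known only to the simulator)
    n     : number of patients
    alpha : initial number of balls of each colour (assumed parameter name)
    beta  : balls added after each response (assumed parameter name)
    Returns (assignments, responses); arm 0 is treatment 1, arm 1 is treatment 2.
    """
    rng = rng or random.Random()
    urn = [alpha, alpha]
    assignments, responses = [], []
    for _ in range(n):
        # Draw with replacement: P(treatment 1) = share of type-1 balls.
        arm = 0 if rng.random() < urn[0] / (urn[0] + urn[1]) else 1
        success = rng.random() < p[arm]
        # A success rewards the same arm, a failure rewards the other arm.
        urn[arm if success else 1 - arm] += beta
        assignments.append(arm)
        responses.append(success)
    return assignments, responses


def mean_allocation(p, n=100, reps=2000, alpha=1, beta=1, seed=1):
    """Average proportion of patients assigned to treatment 1 over `reps` trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        assignments, _ = simulate_rpw(p, n, alpha, beta, rng)
        total += assignments.count(0) / n
    return total / reps
```

For example, `mean_allocation((0.8, 0.2))` estimates the average share of patients sent to the better treatment when n = 100; the exact figure depends on the seed and the number of replications.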
2.1.2
We now list some interesting statistical properties of the RPW rule. Consider a
trial with treatments i = 1, 2, binary responses and n patients to be assigned.
The treatments have success probabilities $0 < p_i < 1$. Then, the probabilities of
treatment failure are given by $q_i = 1 - p_i$ and $N_i(n)$ is the number of patients
assigned to treatment i. Wei and Durham (1978) have shown that
$$\frac{N_i(n)}{n} \to \frac{1/q_i}{1/q_1 + 1/q_2} \tag{1}$$
as $n \to \infty$.
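Atkinson and Biswas (2013) give the probability that the (j + 1)th patient is assigned
to treatment i = 1 as
$$P(\delta_{j+1} = 1) = \frac{1}{2} + d_{j+1},$$
where
$$d_{j+1} = \frac{\beta j (p_1 - p_2)}{2(2\alpha + \beta j)} + \frac{\beta (p_1 + p_2 - 1)}{2\alpha + \beta j} \sum_{k=1}^{j} d_k.$$
For $q_1 + q_2 > 1/2$ the allocation proportion is asymptotically normal,
$$n^{1/2}\left(\frac{N_1(n)}{n} - \frac{q_2}{q_1 + q_2}\right) \to N(0, v),$$
where
$$v = \frac{q_1 q_2 \left(5 - 2(q_1 + q_2)\right)}{\left(2(q_1 + q_2) - 1\right)(q_1 + q_2)^2}.$$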
Hu et al. (2006) then showed that the lower bound on this asymptotic variance is
not attained by the RPW rule. This is an undesirable result as a reduced variability
of a randomisation rule is directly linked to a gain in statistical power of the
design, as has been shown by Melfi and Page (1998) and Hu and Rosenberger (2003).
Thus, designs that attain the lower bound on the asymptotic variance are highly
regarded.

(p1, p2)      n
(0.8, 0.8)    100
(0.8, 0.6)    100
(0.8, 0.4)    100
(0.8, 0.2)    100
(0.6, 0.6)    100
(0.6, 0.4)    100
(0.6, 0.2)    100
(0.4, 0.4)    100
(0.4, 0.2)    100
(0.2, 0.2)    100

Table 1: Allocation proportion of the RPW rule for different choices of the initial
urn composition. This simulation used 5,000 replications.
We now investigate some choices of the initial urn composition under different $p_i$
through a simulation. For each choice of $p_i$, n = 100 patients were assigned using
the corresponding RPW rule. The results are given in Table 1. We investigated four
choices of parameters, with the corresponding values reported. The table shows
the average allocation to treatment i = 1 over 5,000 replications with the standard
deviation of this allocation proportion given in brackets. The allocation to i = 2
is not given as it can be obtained by subtracting the allocation to i = 1 from one. It
can be seen that when $p_1 = p_2$ the allocation proportion is equal for all choices of the
parameters. However, the corresponding standard deviation is smaller for models
with smaller $\beta$ (i.e. higher $\alpha$). When $p_1 \neq p_2$ the RPW rule assigns more patients
to the superior treatment, which is always the first treatment in the table. The
proportion of patients assigned to the superior treatment increases as the difference
between the two treatments grows, leading to an ethical advantage. It can also be
noticed that the choice of $\beta$ can have an effect on the allocation proportion when
$p_1 \neq p_2$. For higher values of $\beta$ we allocate more patients to the superior treatment
but this has a trade-off as the urn models with lower $\beta$ have a smaller standard
deviation. Due to the earlier mentioned relation between variance and power, the
models with smaller $\alpha$ may prove to be less powerful. The code written in the
programming language R used for the RPW rule can be seen in Section A.1.
Overall, the RPW rule exhibits many undesirable properties meaning that a
practical application is usually troublesome. Despite this, much work has been done
on the topic. Bandyopadhyay and Biswas (1999) extend the rule to incorporate
covariates, while Biswas (1999) studies the rule under delayed responses. Such
extensions are not studied here and we focus on other designs with more desirable
properties.
2.2
2.2.1
Ivanova (2003) proposed the following urn model. Consider a clinical trial of K
treatments with instantaneous, binary responses. We start with an urn containing
K + 1 types of balls. The ball types i = 1, ..., K correspond to the K
treatments while the balls of type 0 are the so-called immigration balls. The initial
urn composition is given by $Z_0 = \{Z_{0,0}, \ldots, Z_{K,0}\}$ while after $\nu$ draws it is given by
$Z_\nu = \{Z_{0,\nu}, \ldots, Z_{K,\nu}\}$. When a patient is ready to be randomised, a ball is drawn
from the urn. If the ball is an immigration ball (of type 0) then no treatment is
assigned and the ball is replaced together with one ball for each of the K treatments.
This process is repeated until a treatment ball (of type i) is drawn and then the
patient is assigned the corresponding treatment. The response to the treatment is
observed. If it is a success, the ball is replaced and the urn composition is unchanged
and so we let $Z_{\nu+1} = Z_\nu$. However, if the response is a failure, then the ball is not
replaced. The urn composition becomes $Z_{i,\nu+1} = Z_{i,\nu} - 1$ and $Z_{j,\nu+1} = Z_{j,\nu}$, $j \neq i$.
The process continues until a suitable stopping rule has been triggered, such as all
available patients having been assigned. The inclusion of the immigration balls is an
important feature of the DL rule as it prevents a treatment from dying out even if it
has a very small success probability.
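The draw, immigration and drop steps of the DL urn can be illustrated with a short simulation. This is a minimal Python sketch under stated assumptions rather than the R program of Section A.2: the name `simulate_dl`, the default initial urn of one ball of every type, and the requirement that the urn holds at least one immigration ball are illustrative choices.

```python
import random


def simulate_dl(p, n, z0=None, rng=None):
    """Simulate one trial under a drop-the-loser urn with K = len(p) treatments.

    p  : success probabilities, one per treatment
    n  : number of patients
    z0 : initial urn [immigration, treatment 1, ..., treatment K];
         assumed to contain at least one immigration ball
    Returns the number of patients assigned to each treatment.
    """
    rng = rng or random.Random()
    K = len(p)
    urn = list(z0) if z0 else [1] * (K + 1)
    counts = [0] * K
    for _ in range(n):
        while True:
            ball = rng.choices(range(K + 1), weights=urn)[0]
            if ball != 0:
                break
            # Immigration ball: it is replaced and one ball per treatment is added.
            for i in range(1, K + 1):
                urn[i] += 1
        counts[ball - 1] += 1
        if rng.random() >= p[ball - 1]:
            # Failure: the drawn treatment ball is not returned to the urn.
            urn[ball] -= 1
    return counts
```

Because the immigration balls are never removed, a treatment whose balls have all been dropped can re-enter the urn, which is the property highlighted in the text.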
The design proposed above is a discrete time process. The technique of embedding
an urn model in a continuous time birth and death process, proposed by Ivanova
and Flournoy (2001), was used by Ivanova (2003) to obtain some useful statistical
properties of this design. The limiting allocation to treatment i = 1, ..., K is given
by
$$\frac{N_i(n)}{n} \to \frac{1/q_i}{1/q_1 + \cdots + 1/q_K} \tag{2}$$
almost surely as $n \to \infty$. Note that when K = 2 this limiting proportion is equal
to the allocation of the RPW rule given in (1). Thus, so far we have two RAR
procedures that both target urn allocation.
Ivanova (2003) showed that
$$n^{1/2}\left(\frac{N_1(n)}{n} - \frac{q_2}{q_1 + q_2}\right) \to N(0, v),$$
where
$$v = \frac{q_1 q_2 (p_1 + p_2)}{(q_1 + q_2)^3}.$$
Hu et al. (2006) then demonstrated that the DL attains the lower bound on this
asymptotic variance, unlike the RPW rule. Thus, it could be said that although
both the rules so far target the same allocation proportion, the DL has a theoretical
advantage as it is able to obtain the minimum variance. In fact the overall variability
of this rule is known to be lower than that of the RPW rule, as shown by Ivanova
(2003) and Hu and Rosenberger (2003).
Section A.2 reports an R program that was used to simulate the DL rule.
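To get a feel for the size of the gap between the two rules, the asymptotic variance expressions quoted for the RPW and DL rules can be evaluated numerically. The sketch below is in Python rather than R, and the function names are hypothetical. The guard `q1 + q2 > 0.5` in the RPW function is an assumption added here so that the denominator of the RPW expression stays positive.

```python
def rpw_asymptotic_variance(p1, p2):
    """v for the RPW rule: q1*q2*(5 - 2(q1+q2)) / ((2(q1+q2) - 1) * (q1+q2)^2)."""
    q1, q2 = 1 - p1, 1 - p2
    s = q1 + q2
    if s <= 0.5:
        # Assumed guard: the expression is only meaningful for q1 + q2 > 1/2.
        raise ValueError("expression requires q1 + q2 > 1/2")
    return q1 * q2 * (5 - 2 * s) / ((2 * s - 1) * s ** 2)


def dl_asymptotic_variance(p1, p2):
    """v for the DL rule: q1*q2*(p1 + p2) / (q1 + q2)^3."""
    q1, q2 = 1 - p1, 1 - p2
    return q1 * q2 * (p1 + p2) / (q1 + q2) ** 3
```

For $p_1 = 0.6$ and $p_2 = 0.4$ these give approximately 0.72 for the RPW rule and 0.24 for the DL rule, so in this case the DL variance is a third of the RPW variance.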
2.2.2
Zhang et al. (2007) extended the DL rule in order to study delayed responses. Such
a design is called the generalised drop-the-loser (GDL) rule. Sun et al. (2007) then
extended this rule to K > 2 treatments and here we will deal with this extension.
Similarly to the DL rule, we start with K + 1 types of balls. Balls of type 0 are
immigration balls and balls of type i = 1, ..., K are balls corresponding to the
treatments. The initial urn composition is given by $Z_0 = \{Z_{0,0}, \ldots, Z_{0,K}\}$ while the
urn composition after $\nu$ draws have been made is given by $Z_\nu = \{Z_{\nu,0}, \ldots, Z_{\nu,K}\}$.
We then let $Z^+_{\nu,k} = \max(0, Z_{\nu,k})$, k = 0, ..., K and $Z^+_\nu = \{Z^+_{\nu,0}, \ldots, Z^+_{\nu,K}\}$. This step
is required as we now allow the urn to have a fractional or a negative number of balls.
Then, the probability of selecting a ball of type i is
$$\frac{Z^+_{\nu,i}}{\sum_{c=0}^{K} Z^+_{\nu,c}}.$$
If the ball selected is of type 0 (i.e. an immigration ball) then no treatment is
assigned and the ball is returned to the urn together with $a_i$ balls of each treatment
type i = 1, ..., K. If a treatment type ball is drawn then the subject is assigned the
corresponding treatment i. Since we wish to allow delayed responses, the ball is not
replaced immediately. Instead we continue to allocate treatments until a response
is available. Once we obtain the response, we alter the urn by adding $D_{\nu,i} > 0$ balls
to the urn if the treatment was a success, leaving the urn unchanged otherwise. We
continue until a suitable stopping criterion has been reached. This design reduces to
the DL urn when $a_i = 1$ and $D_{\nu,i} = 0$.
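The way the GDL urn handles a response that arrives late can be sketched as follows. This is a minimal Python sketch and not the R program of Section A.3. It assumes a fixed delay measured in patients (a response becomes available once `delay` further patients have been randomised), a constant number `d` of balls added on a success, and at least one immigration ball in the urn; the name `simulate_gdl` and these parameters are illustrative.

```python
import random
from collections import deque


def simulate_gdl(p, n, a, z0, delay=0, d=1.0, rng=None):
    """Simulate one trial under a GDL-type urn with delayed responses.

    p     : success probabilities, one per treatment
    a     : balls added per treatment when an immigration ball is drawn
    z0    : initial urn [immigration, treatment 1, ..., treatment K]
    delay : assumed fixed delay, in patients, before a response is seen
    d     : assumed number of balls added on a success (d = 1 returns the ball)
    Returns the number of patients assigned to each treatment.
    """
    rng = rng or random.Random()
    K = len(p)
    urn = [float(z) for z in z0]
    counts = [0] * K
    pending = deque()  # entries: (patient index when response arrives, arm, success)
    for j in range(n):
        # Update the urn with every response that has become available.
        while pending and pending[0][0] <= j:
            _, arm, success = pending.popleft()
            if success:
                urn[arm + 1] += d
        while True:
            # Negative or fractional counts are truncated at zero for the draw.
            weights = [max(0.0, z) for z in urn]
            ball = rng.choices(range(K + 1), weights=weights)[0]
            if ball != 0:
                break
            for i in range(K):
                urn[i + 1] += a[i]
        urn[ball] -= 1.0  # the drawn ball is withheld until the response arrives
        counts[ball - 1] += 1
        success = rng.random() < p[ball - 1]
        pending.append((j + 1 + delay, ball - 1, success))
    return counts
```

With `delay=0` every response is processed before the next patient is randomised, so the sketch behaves like an urn with instantaneous responses; larger values of `delay` let several patients be allocated on stale information.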
2.2.3
By embedding the GDL rule in a continuous time process, Zhang et al. (2007) have
shown that the asymptotic allocation proportion of the GDL rule is given by
$$\frac{N_i(n)}{n} \to \frac{a_i/q_i}{a_1/q_1 + \cdots + a_K/q_K}. \tag{3}$$
By carefully choosing the value of $a_i$, we can alter (3) so that the rule targets a
different allocation proportion. For example, for K treatments with success
probabilities $p_i$ the allocation proportion of
$$\frac{N_i(n)}{n} \to \frac{\sqrt{p_i}}{\sqrt{p_1} + \cdots + \sqrt{p_K}}, \quad i = 1, \ldots, K \tag{4}$$
is of particular interest. Rosenberger et al. (2001) have shown that this allocation
proportion minimises the expected number of treatment failures, assuming a fixed
variance of the estimator for the treatment difference. We will refer to this allocation
proportion as RSIHR allocation, named after the initials of the authors. We can
target this allocation by letting
$$a_k = C\,\frac{\sqrt{p_k}}{\sqrt{p_1} + \cdots + \sqrt{p_K}}, \quad k = 1, \ldots, K. \tag{5}$$
(6)
We now compare the GDL rule targeting urn allocation proportion (equivalent
to the DL rule) to the GDL rule targeting RSIHR allocation. We also investigate two
choices of initial urn composition. For each combination of target allocation, initial
urn composition and $p_i$ we run the rule 5,000 times. The mean allocation proportion
to treatment i = 1 was obtained as well as the standard deviation of this allocation.
The results of this simulation are given in Table 2 with the initial urn composition
given in brackets in the column name. Section A.3 reports an R program that
was used to simulate the GDL rule.
When $p_1 = p_2$ all the rules seem to have a similar mean allocation proportion
but the standard deviation is lower for the GDL rules targeting RSIHR allocation.
When $p_1 \neq p_2$ the GDL rule targeting RSIHR on average assigns fewer patients to the
superior treatment than the standard DL rule. However, the rule targeting RSIHR
is much less variable, which in turn leads to an increase in power. There is little
difference between initial urn compositions in terms of the variability.
                      Allocation Proportion (Standard Deviation)
                      Urn allocation target        RSIHR allocation target
(p1, p2)      n       GDL(2,2,2)   GDL(5,5,5)      GDL(2,2,2)   GDL(5,5,5)
(0.8, 0.8)    100     0.50 (0.06)  0.50 (0.05)     0.50 (0.02)  0.50 (0.02)
(0.8, 0.6)    100     0.60 (0.05)  0.57 (0.05)     0.53 (0.02)  0.53 (0.02)
(0.8, 0.4)    100     0.68 (0.04)  0.63 (0.04)     0.58 (0.03)  0.57 (0.03)
(0.8, 0.2)    100     0.73 (0.03)  0.68 (0.03)     0.64 (0.04)  0.63 (0.04)
(0.6, 0.6)    100     0.50 (0.05)  0.50 (0.04)     0.50 (0.03)  0.50 (0.03)
(0.6, 0.4)    100     0.58 (0.04)  0.56 (0.04)     0.54 (0.03)  0.54 (0.03)
(0.6, 0.2)    100     0.64 (0.04)  0.62 (0.03)     0.61 (0.04)  0.60 (0.04)
(0.4, 0.4)    100     0.50 (0.04)  0.50 (0.04)     0.50 (0.04)  0.50 (0.04)
(0.4, 0.2)    100     0.56 (0.03)  0.56 (0.03)     0.57 (0.04)  0.56 (0.04)
(0.2, 0.2)    100     0.50 (0.03)  0.50 (0.03)     0.50 (0.05)  0.50 (0.05)

Table 2: Allocation proportion of the GDL rules for different choices of the initial
urn composition and target allocation. This simulation used 5,000 replications.
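The limiting values that the two targets in Table 2 aim for can be computed directly and set against the simulated means for n = 100. The Python sketch below uses hypothetical function names. It assumes that urn allocation is proportional to $1/q_i$ and that RSIHR allocation takes the square-root form, proportional to $\sqrt{p_i}$, which is the form the simulated values in the table are close to.

```python
from math import sqrt


def urn_target(p):
    """Urn allocation: proportional to 1/q_i = 1/(1 - p_i)."""
    weights = [1 / (1 - pi) for pi in p]
    total = sum(weights)
    return [w / total for w in weights]


def rsihr_target(p):
    """RSIHR allocation, assumed here to be proportional to sqrt(p_i)."""
    weights = [sqrt(pi) for pi in p]
    total = sum(weights)
    return [w / total for w in weights]
```

For $(p_1, p_2) = (0.8, 0.2)$ the urn target for treatment 1 is 0.8 and the RSIHR target is about 0.667. The simulated means in Table 2 sit below both limits, as one would expect when only 100 patients are available for the urn to adapt.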
2.2.4 Incorporating covariates
Bandyopadhyay et al. (2009) proposed an extension of the DL rule that incorporates
covariates within treatments. For each sequentially entering patient
j = 1, ..., n the level of the covariate $U_j \in \{0, \ldots, G\}$ is obtained. We use 0 for
the most favourable condition and G for the least favourable one. For example,
if $U_j$ is the initial size of a tumour, then a lower category represents a favourable
condition to treat, i.e. a smaller tumour. Let $\pi_0 < \pi_1 < \cdots < \pi_G$ with $\pi_G = 1$ be a
set of probabilities representing the probability of success under the corresponding
grade $U_j$. In practice, the probabilities $\pi_k$, k = 0, ..., G may be unknown and so
we may use $(k+1)/(G+1)$ or another suitable function to obtain an estimate of the
success probabilities.
As before, we start the urn with the composition $Z_0 = \{Z_{0,0}, \ldots, Z_{0,K}\}$.
We draw a ball from the urn. If the ball drawn is of type 0 then no treatment is
assigned and we return the ball together with K balls, one ball for each treatment. If
a treatment ball is drawn, then the patient is assigned the corresponding treatment.
We note the grade k of this patient and the response is observed. If the response is a
success, then we replace the ball with the probability $\pi_k$. Otherwise, if the response
is a failure then we replace the ball with the probability $1 - \pi_{G-k}$.
However, treatments are likely to have different success probabilities under different
covariate grades. That is, a given treatment is more likely to be successful when
the corresponding patient has the grade 0 than when the grade is G. Therefore, we
determine the success probability of a treatment by
$$P(Z_j = 1 \mid i, k) = a^{U_j} p_{i,j}$$
where $Z_j$ is the success or failure for the jth patient, $p_{i,j}$ is the success probability
for the jth patient receiving treatment i and $a \in (0, 1)$ is the so-called prognostic factor
index that can be estimated. Note that this definition is an extension of the one
originally proposed by Bandyopadhyay et al. (2009) to any number of treatments as
the authors only considered a trial with K = 2 treatments.
The DLC design has some strong disadvantages. Defining the number of grades
G can be troublesome. Using the tumour size example above, we may be able to
define G grades that tumours fall into, depending on their size. However, it might
also be possible to make the grade boundaries narrower, increasing the number of
grades. Grades of equal length might also not always be ideal. The number of
grades is likely to have an effect on the allocation proportion and so needs to be chosen
carefully.
Similarly, not all covariates can be split into grades. Clinical trials are often
balanced by institution. In such a case, we might be unable to rank institutions
in terms of grades. Even if such a grading were possible, the grades would be likely
to have very similar success probabilities. Finally, the DLC is not able to incorporate
multiple covariates.
Section A.4 reports an R program that can be used to simulate the DLC rule.
In the next section we define a randomisation procedure that allows a much more
flexible incorporation of covariates.
2.3
2.3.1
Rosenberger et al. (2001) proposed the following way to allow covariate balance
within treatments. Considering two treatments i = 1, 2, let $T_j$ be the treatment
indicator ($T_j = 1$ if the treatment is 1 and $T_j = 0$ if the treatment is 2) for the jth patient
with j = 1, ..., n and let $z_j$ be the covariate information for the given patient. We
then define the standard logistic regression model
$$\operatorname{logit}(p_j) = \alpha + \beta T_j + z_j'\gamma + T_j z_j'\delta \tag{7}$$
where $p_j$ is the probability of success for the jth patient, $\alpha$ is the global mean,
$\beta$ is the treatment main effect, $\gamma$ is the vector of covariate main effects and $\delta$ is the
vector of treatment-covariate interactions. Throughout this dissertation we used a
generalised linear model (GLM) to fit this regression model with a logit link function.
It is also possible to consider fitting (7) using a GLM with a probit link function or
another suitable choice.
The design works in the following way. Patients are assigned using another
randomisation scheme (e.g. block randomisation) until the regression model in
(7) can be fitted using the data for all patients so far, i.e. all the maximum
likelihood estimates are available. Then, the covariate-adjusted odds ratio is given
by $\hat\theta = \exp(\hat\beta + z_{j+1}'\hat\delta)$ where $\hat\beta$ is the current estimate of $\beta$, $\hat\delta$ is the current estimate
of $\delta$ and $z_{j+1}$ is the covariate information for the (j + 1)th patient. Since $\hat\theta \in (0, \infty)$,
we need to transform this function to (0, 1) in order to represent a probability. We
use the transformation $f(\theta) = 1/(1 + \theta)$. We thus assign patients to treatment i = 1
with the probability
$$\frac{1}{1 + \exp(\hat\beta + z_{j+1}'\hat\delta)}.$$
This design has many advantages over the DLC design seen in Section 2.2.4. Firstly,
we no longer need to order the covariates into grades as the logistic model allows us
to have covariates that are continuous. We can also have categorical or binary
covariates. The logistic model in (7) can be extended to incorporate multiple covariates
as well as interactions between them.
Rosenberger et al. (2001) obtained the limiting allocation to treatment 1 to be
$$\frac{N_1(n)}{n} \to \frac{1}{1 + \exp(\beta + z_0'\delta)} \tag{8}$$
where $z_0$ is a fixed vector of covariates. Unfortunately, the ORBD procedure is only
able to target this allocation proportion.
It is worth noting that the logistic model in (7) can be modified to not take the
covariates into account. Then, we instead build the regression model
$$\operatorname{logit}(p_j) = \alpha + \beta T_j$$
and assign patients to treatment i = 1 with the probability
$$\frac{1}{1 + \exp(\hat\beta)}.$$
2.3.2
Atkinson and Biswas (2013) consider the extension of the two treatment ORBD
model without covariates to three treatments. Let i = 1, 2, 3 be the treatments
with the respective unknown success and failure probabilities $p_i$ and $q_i = 1 - p_i$.
As before, we assign patients using another randomisation scheme to one
of the i = 1, 2, 3 treatments until the logistic model can be estimated. Once the
logistic model
$$\operatorname{logit}(p_j) = \alpha + \beta T_j$$
can be built on the data for all patients so far, we assign patients to treatment
i = 1, 2, 3 with the respective probabilities
$$\frac{1}{1 + \exp(\hat\beta_2 + \hat\beta_3)}, \quad \frac{\exp(\hat\beta_2)}{1 + \exp(\hat\beta_2 + \hat\beta_3)}, \quad \frac{\exp(\hat\beta_3)}{1 + \exp(\hat\beta_2 + \hat\beta_3)},$$
where $\hat\beta_2$ and $\hat\beta_3$ are the estimates of $\beta$ from the logistic model. We observe the
response of the patient to the treatment and update the logistic model accordingly.
We may also extend this K = 3 design to incorporate covariates. Such an
extension is not reported in the literature. That is, we now use the same logistic
regression model as in (7) with all terms defined as previously. We assign patients
using complete randomisation until the regression model is estimable. We then
assign patients to treatment i = 1, 2, 3 with the respective probabilities
$$\frac{1}{1 + \exp(\hat\beta_2 + \hat\beta_3 + z_{j+1}'\hat\delta)}, \quad \frac{\exp(\hat\beta_2)}{1 + \exp(\hat\beta_2 + \hat\beta_3 + z_{j+1}'\hat\delta)}, \quad \frac{\exp(\hat\beta_3)}{1 + \exp(\hat\beta_2 + \hat\beta_3 + z_{j+1}'\hat\delta)}.$$
We can further extend the model above to clinical trials with K > 3 treatments in
a similar way.
Thus, the ORBD gives a randomisation scheme with much flexibility. We are
able to incorporate multiple covariates of different types (e.g. continuous,
categorical, binary) and even to include interactions between
covariates. Section A.5 reports an R program that was used to simulate the ORBD
rule incorporating covariates.
The ORBD is also able to incorporate delayed responses. In such a case, the
logistic model uses the data for all patients so far and the model is updated whenever
a response is obtained.
However, there are also some drawbacks of the ORBD. Firstly, we are only able to
target one allocation proportion, namely (8). Also, the design is relatively variable
when compared to the RPW and DL rules. This is mainly caused by the use of
the logistic model and the variability associated with each estimate in the model.
Because a new model is built for each patient, the variances for all these estimates
add up and this results in a highly variable procedure overall.
2.4
2.4.1
With the exception of the GDL rule, all designs so far have been able to target
only one allocation proportion. Eisele (1994) and Eisele and Woodroofe (1995)
introduced the doubly adaptive biased-coin design (DBCD) which overcomes this
problem by allowing the target allocation proportion y(p1 , p2 ) to be specified.
The design relies heavily on the function g(x, y) from $[0, 1]^2$ to $[0, 1]$. This function
takes the current allocation proportion x and the target allocation proportion y
and returns the probability of assigning the next patient to treatment 1.
Selection of g is often problematic due to the very restrictive rules it must follow,
as defined by Eisele (1994). In fact, Melfi et al. (2001) pointed out that the original
choice of g violates one of these rules. Hu and Zhang (2004) propose a more relaxed
set of conditions:
• g is jointly continuous
• g(x, x) = x
• g(x, y) is strictly decreasing in x and strictly increasing in y
• g has bounded derivatives in x and y.
Hu and Zhang (2004) chose g(x, y) to be
\[
g(x, y) =
\begin{cases}
1 & \text{if } x = 0, \\[4pt]
\dfrac{y (y/x)^{\gamma}}{y (y/x)^{\gamma} + (1 - y)\left((1 - y)/(1 - x)\right)^{\gamma}} & \text{if } 0 < x < 1, \\[4pt]
0 & \text{if } x = 1,
\end{cases}
\]
where $\gamma \ge 0$ is a tuning parameter. Patient j + 1 is assigned to treatment 1 with probability $g(N_1(j)/j,\; y(\hat{p}_1, \hat{p}_2))$, where $y(\hat{p}_1, \hat{p}_2)$ is the current estimate of the target allocation proportion using $\hat{p}_i$, $N_i(j)$ is
the current number of patients assigned to treatment i and j is the number of
patients assigned so far, as defined previously. For example, if we wish to target urn
allocation given in (1), we let
\[
y(p_1, p_2) = \frac{1/(1 - p_1)}{1/(1 - p_1) + 1/(1 - p_2)}.
\]
The rule works by skewing the probability of assignment towards the treatment
that is the furthest away from its target allocation. We continue to assign patients
until all patients have been assigned or until a suitable stopping rule has been
triggered.
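The allocation function above can be sketched in a few lines of Python. This is only an illustration and not the R program of Section A.6: the names `urn_target` and `g` are ours, the boundary values at x = 0 and x = 1 follow the usual Hu and Zhang (2004) definition, and we assume success probabilities strictly below 1 so that the urn target is defined.

```python
def urn_target(p1, p2):
    """Urn allocation target for treatment 1: (1/q1) / (1/q1 + 1/q2).

    Assumes p1 < 1 and p2 < 1 so that neither failure probability is zero.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    return (1.0 / q1) / (1.0 / q1 + 1.0 / q2)


def g(x, y, gamma=2.0):
    """Probability of assigning the next patient to treatment 1.

    x     : current allocation proportion to treatment 1
    y     : current estimate of the target allocation proportion, 0 < y < 1
    gamma : tuning parameter (gamma >= 0)
    """
    if x <= 0.0:
        return 1.0  # treatment 1 has no patients yet
    if x >= 1.0:
        return 0.0  # treatment 2 has no patients yet
    num = y * (y / x) ** gamma
    den = num + (1.0 - y) * ((1.0 - y) / (1.0 - x)) ** gamma
    return num / den


# Treatment 1 is below its target, so the coin is biased towards it.
y_hat = urn_target(0.8, 0.6)
prob = g(0.4, y_hat, gamma=2.0)
```

When x = y the function returns y itself, and a larger γ pushes the assignment probability further towards the under-represented treatment.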
We now compare the DBCD rule under different target allocations and different
choices of γ. We considered γ = 2 and γ = 4, and the choice of this parameter
is given as DBCD(γ). We compare urn allocation in (1) to RSIHR allocation in
(4). For all of the rules we set n0 = 2. We obtained the allocation proportion to
i = 1 and its standard deviation using a similar simulation as was used for the
RPW and GDL rules and the results are shown in Table 3. In general, there is no
significant difference between the mean allocation proportions when p1 = p2. However,
the allocation proportion is much more variable for the DBCD rule targeting urn
allocation for all but the smallest success probabilities. When p1 ≠ p2, DBCD targeting urn allocation assigns more patients
to the superior treatment than DBCD targeting RSIHR allocation. This has a
trade-off, as the rules targeting urn allocation have a higher standard deviation,
which may translate to a loss in power. This trade-off between a more skewed allocation
proportion and its variance has been seen in all simulations performed so far and
will be fully investigated in Chapter 3. This simulation has also shown that there is no
significant difference between rules with γ = 2 and γ = 4 in terms of mean allocation
proportion, but the standard deviation of the allocation proportion is slightly lower
when γ = 4.
Section A.6 reports an R program that was used to simulate the DBCD rule.
Hu and Rosenberger (2003) demonstrated that for the DBCD rule targeting urn allocation
\[
n^{1/2}\left(\frac{N_1(n)}{n} - y(p_1, p_2)\right) \to N(0, v)
\]
in distribution, where
\[
v = \frac{q_1 q_2 \left((1 + 2\gamma)(p_1 + p_2) + 2\right)}{(1 + 2\gamma)(q_1 + q_2)^3}.
\]
Hu et al. (2006) then showed that the DBCD rule does not obtain the lower bound on
this asymptotic variance. Thus, we could say that the DL rule has a theoretical advantage
over the DBCD targeting urn allocation as it is able to obtain its minimum variance.
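As a rough numerical check of the variance expression above, the following Python sketch evaluates v for given success probabilities and γ. The function names are ours; `limit_variance` is simply the limit of v as γ grows, included for comparison, and is not claimed here to be anything more than that limit.

```python
from math import sqrt


def dbcd_urn_variance(p1, p2, gamma):
    """Asymptotic variance v of sqrt(n) * (N1(n)/n - y) for the DBCD
    targeting urn allocation, with q_i = 1 - p_i."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    num = q1 * q2 * ((1.0 + 2.0 * gamma) * (p1 + p2) + 2.0)
    return num / ((1.0 + 2.0 * gamma) * (q1 + q2) ** 3)


def limit_variance(p1, p2):
    """Limit of v as gamma tends to infinity."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q1 * q2 * (p1 + p2) / (q1 + q2) ** 3


def implied_sd(p1, p2, gamma, n):
    """Large-sample standard deviation of the allocation proportion at size n."""
    return sqrt(dbcd_urn_variance(p1, p2, gamma) / n)


v2 = dbcd_urn_variance(0.8, 0.6, gamma=2.0)
v4 = dbcd_urn_variance(0.8, 0.6, gamma=4.0)
sd2 = implied_sd(0.8, 0.6, gamma=2.0, n=100)
```

For (p1, p2) = (0.8, 0.6) and γ = 2 this gives v = 2/3, so the implied standard deviation at n = 100 is about 0.08, in line with the simulated value in Table 3. The variance decreases as γ increases but stays above its limit for any finite γ.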
2.4.2
The DBCD design can be generalised in order to allow K > 2 treatments, i = 1, . . . , K, as demonstrated by Hu and Zhang (2004). Let v = {v1 , . . . , vK } be the
vector of current allocation proportions for each treatment out of the j patients randomised so far. We define g(x, y) = {g1 (x, y), . . . , gK (x, y)}, with sum{x} = 1 and
sum{y} = 1, to be a vector of functions from $[0, 1]^K \times [0, 1]^K$ to $[0, 1]^K$ with the conditions:
(i) $g(v, v) = v$ and $g(x, y) - g(x, v) \to 0$ as $y \to v$.
(ii) For every i,
\[
\frac{g_i(x, v) - g_i(v, v)}{x_i - v_i} \le \lambda_0 \quad \text{for all } x_i > v_i,
\]
where $0 \le \lambda_0 < 1$ is a constant.
(iii) g(x, y) is strictly decreasing in x and strictly increasing in y.
(iv) g(x, y) has bounded derivatives in x and y.
Hu and Zhang (2004) propose gi (x, y) to be
\[
g_i(x, y) = \frac{\left(y_i (y_i/x_i)^{\gamma}\right)^{L}}{\sum_{c=1}^{K} \left(y_c (y_c/x_c)^{\gamma}\right)^{L}}, \qquad i = 1, \ldots, K.
\]

                  Allocation Proportion (Standard Deviation)
                  Target: Urn                Target: RSIHR
(p1, p2)    n     DBCD(2)     DBCD(4)       DBCD(2)     DBCD(4)
(0.8,0.8)   100   0.50(0.10)  0.50(0.10)    0.50(0.03)  0.50(0.02)
(0.8,0.6)   100   0.66(0.08)  0.66(0.07)    0.54(0.03)  0.54(0.02)
(0.8,0.4)   100   0.74(0.06)  0.74(0.05)    0.59(0.04)  0.59(0.03)
(0.8,0.2)   100   0.79(0.05)  0.79(0.04)    0.67(0.04)  0.67(0.04)
(0.6,0.6)   100   0.50(0.07)  0.50(0.06)    0.50(0.03)  0.50(0.03)
(0.6,0.4)   100   0.60(0.05)  0.60(0.05)    0.56(0.04)  0.56(0.03)
(0.6,0.2)   100   0.66(0.04)  0.66(0.04)    0.64(0.05)  0.64(0.04)
(0.4,0.4)   100   0.50(0.05)  0.51(0.04)    0.50(0.04)  0.50(0.04)
(0.4,0.2)   100   0.57(0.04)  0.57(0.04)    0.59(0.05)  0.59(0.05)
(0.2,0.2)   100   0.50(0.03)  0.50(0.03)    0.50(0.06)  0.51(0.06)
Table 3: Allocation proportion of the DBCD rule for different choices of the target
allocation and with n0 = 5. This simulation used 5,000 replications.
2.4.3 Incorporating covariates
Baldi Antognini and Zagoraiou (2012) suggest an extension of the DBCD to include
covariate information, called the reinforced doubly adaptive biased coin design (RDBCD). We start by defining the function g(x, y, z) with the properties:
(i) g is decreasing in x and increasing in y for any z ∈ (0, 1);
(ii) g(x, x, z) = x for any z ∈ (0, 1);
(iii) g is decreasing in z if x < y and increasing in z if x > y;
(iv) g(x, y, z) = 1 − g(1 − x, 1 − y, z) for any z ∈ (0, 1).
Baldi Antognini and Zagoraiou (2012) suggest
\[
g(x, y, z) = \frac{y (y/x)^{\gamma z}}{y (y/x)^{\gamma z} + (1 - y)\left[(1 - y)/(1 - x)\right]^{\gamma z}}
\]
as a suitable choice of g(x, y, z), with γ > 0 having a similar role as before. Note that
z in the above function corresponds to the covariate information for the patient we
wish to randomise. Since the function is defined for z ∈ (0, 1), we also
need to transform the covariates so that they lie in this range. A transformation
might also be needed such that high values of z will correspond to a higher value of
g than when z is small. We denote such a transformation by H(z).
The workings of this rule are similar to the K = 2 version of the DBCD rule.
We start by assigning n0 patients to each treatment using another randomisation
method. Since we wish to balance covariates, a covariate-adaptive rule might be
suitable but throughout this dissertation we use complete randomisation. Once
m = 2n0 patients have been assigned, we obtain the estimates $\hat{p}_i$ for each treatment
using (6). Given the (m + 1)th patient with covariate information $z_{m+1}$, we assign this
patient to treatment i with the probability
\[
g\left(\frac{N_i(m)}{m},\; y(\hat{p}_1, \hat{p}_2),\; z_{m+1}\right).
\]
We then update $\hat{p}_i$ and assign the next patient using the same method. The rule
continues until all patients have been assigned or a suitable stopping rule has been
triggered.
The RDBCD has several disadvantages. Firstly, it is only able to deal with a
single covariate. The covariate also needs to be defined in such a way that low z is
favourable to treat, as this produces a larger g(x, y, z) value.
In this respect the RDBCD procedure suffers from a similar problem to the DLC rule. That is, it
requires the covariate to be defined in such a way that a certain value is favourable
to treat when compared to another one. This may often not be the case in practice.
2.5
2.5.1
Recall that the DL rule has been the only rule that is able to obtain the lower bound
on its asymptotic variance. The RPW and DBCD rules are not able to obtain it,
whilst there is no literature on whether the ORBD and GDL obtain their respective
lower bounds. Although the DL attains this lower bound, it has some disadvantages
such as only being able to target urn allocation. We now define a randomisation
procedure that is able to obtain the lower bound on its asymptotic variance and is
able to target any given allocation proportion.
Hu et al. (2009) proposed the following randomisation procedure. We start by
assigning n0 patients to each treatment, similarly as for the DBCD. Once m = 2n0
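patients have been assigned and their responses observed, we obtain the estimates
$\hat{p}_i$ of $p_i$ for each treatment i = 1, 2 using (6). We then obtain the value of the target
allocation proportion using these estimates, that is $y(\hat{p}_1, \hat{p}_2)$. For example, if we
wish to target urn allocation we use the function
\[
y(p_1, p_2) = \frac{1/(1 - p_i)}{1/(1 - p_1) + 1/(1 - p_2)},
\]
similarly as for the DBCD rule. We then assign the (m + 1)th patient to treatment
i with the probability
\[
\begin{cases}
\alpha\, y(\hat{p}_1, \hat{p}_2) & \text{if } \dfrac{N_i(m)}{m} > y(\hat{p}_1, \hat{p}_2), \\[6pt]
y(\hat{p}_1, \hat{p}_2) & \text{if } \dfrac{N_i(m)}{m} = y(\hat{p}_1, \hat{p}_2), \\[6pt]
1 - \alpha + \alpha\, y(\hat{p}_1, \hat{p}_2) & \text{if } \dfrac{N_i(m)}{m} < y(\hat{p}_1, \hat{p}_2),
\end{cases}
\]
where $0 \le \alpha < 1$ is a constant controlling the degree of randomisation.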
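The three-case assignment rule above is simple to express in code. The following Python sketch is illustrative only and is not the R program of Section A.8; the function name `erade_probability` and the default value of the constant (written `alpha` here, following Hu et al. (2009)) are our assumptions.

```python
def erade_probability(n_i, m, y_hat, alpha=0.5):
    """Probability of assigning patient m + 1 to treatment i under ERADE.

    n_i   : number of the first m patients assigned to treatment i
    m     : number of patients assigned so far (m > 0)
    y_hat : current estimate of the target allocation proportion for i
    alpha : constant in [0, 1); smaller values give a more forceful correction
    """
    current = n_i / m
    if current > y_hat:
        # Treatment i is over-represented: shrink its probability.
        return alpha * y_hat
    if current < y_hat:
        # Treatment i is under-represented: inflate its probability.
        return 1.0 - alpha * (1.0 - y_hat)
    return y_hat


over = erade_probability(6, 10, y_hat=0.5)   # 0.6 > 0.5
under = erade_probability(4, 10, y_hat=0.5)  # 0.4 < 0.5
exact = erade_probability(5, 10, y_hat=0.5)  # on target
```

Note that 1 − α(1 − y) is the same quantity as 1 − α + αy, and that the probabilities for the two treatments sum to one in each case.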
extensions will not be considered here and so we will only consider it for K = 2
trials.
Section A.8 reports an R program that can be used to execute the ERADE rule.
3 Comparing designs
3.1 Introduction
\[
\frac{N_i(n)}{n} = \frac{\sqrt{p_i q_i}}{\sqrt{p_1 q_1} + \sqrt{p_2 q_2}}. \tag{9}
\]
This allocation proportion is known as Neyman allocation and the closer a design
is to this allocation, the higher the power in general. Unfortunately, using the
Neyman allocation has an ethical disadvantage as it can assign more patients to the
inferior treatment (for example, whenever p1 > p2 > 1/2). The Neyman allocation is the reason for the trade-off
between a more skewed allocation proportion and its variance that we have seen in the simulations so far;
more patients assigned to the superior treatment resulted in higher variance, which
in turn meant lower power. Thus, we can say that a suitable RAR design should be
balanced between maintaining suitable power and having an ethically advantageous
allocation proportion.
The main reason why we wish to assign more patients to the superior treatment
is to lower the number of treatment failures. Thus, for a design that assigns more
patients to the superior treatment, we expect a lower proportion of treatment failures. Due to (9) we also expect a design that has a lower failure proportion to have
lower power. Once again, we wish to balance the ethical advantages of a design with
its statistical properties.
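To make the trade-off concrete, the target allocations discussed so far can be compared directly. The Python sketch below is illustrative: the function names are ours, and we assume that allocation (4) is the usual RSIHR form √p1 / (√p1 + √p2), which is not restated in this chapter. The expected failure proportion is the allocation-weighted average of the failure probabilities.

```python
from math import sqrt


def neyman(p1, p2):
    """Neyman allocation to treatment 1, as in (9)."""
    s1, s2 = sqrt(p1 * (1.0 - p1)), sqrt(p2 * (1.0 - p2))
    return s1 / (s1 + s2)


def rsihr(p1, p2):
    """RSIHR allocation to treatment 1 (assumed form of allocation (4))."""
    return sqrt(p1) / (sqrt(p1) + sqrt(p2))


def urn(p1, p2):
    """Urn allocation to treatment 1, as in (1). Assumes p1, p2 < 1."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return (1.0 / q1) / (1.0 / q1 + 1.0 / q2)


def expected_failure(rho, p1, p2):
    """Expected failure proportion when a fraction rho receives treatment 1."""
    return rho * (1.0 - p1) + (1.0 - rho) * (1.0 - p2)


targets = {
    name: (rule(0.8, 0.6), expected_failure(rule(0.8, 0.6), 0.8, 0.6))
    for name, rule in [("Neyman", neyman), ("RSIHR", rsihr), ("Urn", urn)]
}
```

For (p1, p2) = (0.8, 0.6) the Neyman target lies below 0.5, the RSIHR target slightly above it and the urn target at 2/3, so the expected failure proportion falls as we move from Neyman to RSIHR to urn allocation. For (0.8, 0.2) the urn target is 0.8 with an expected failure proportion of 0.32.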
3.2 K = 2 treatments
3.2.1 Allocation proportion
                  CR                        RPW                       DL
(p1, p2)    n     AP(SD)      FP(SD)        AP(SD)      FP(SD)        AP(SD)      FP(SD)
(0.8,0.8)   100   0.50(0.05)  0.20(0.04)    0.50(0.11)  0.20(0.04)    0.50(0.05)  0.20(0.04)
(0.8,0.6)   100   0.50(0.05)  0.30(0.05)    0.60(0.09)  0.28(0.05)    0.59(0.05)  0.28(0.04)
(0.8,0.4)   100   0.50(0.05)  0.40(0.05)    0.67(0.07)  0.33(0.05)    0.66(0.04)  0.34(0.05)
(0.8,0.2)   100   0.50(0.05)  0.50(0.05)    0.73(0.06)  0.36(0.06)    0.71(0.03)  0.37(0.05)
(0.6,0.6)   100   0.50(0.05)  0.40(0.05)    0.50(0.08)  0.40(0.05)    0.50(0.05)  0.40(0.05)
(0.6,0.4)   100   0.50(0.05)  0.50(0.05)    0.58(0.07)  0.48(0.05)    0.57(0.04)  0.48(0.05)
(0.6,0.2)   100   0.50(0.05)  0.60(0.05)    0.63(0.06)  0.55(0.06)    0.63(0.03)  0.55(0.05)
(0.4,0.4)   100   0.50(0.05)  0.60(0.05)    0.50(0.06)  0.60(0.05)    0.50(0.04)  0.60(0.05)
(0.4,0.2)   100   0.50(0.05)  0.70(0.05)    0.56(0.05)  0.69(0.05)    0.56(0.03)  0.69(0.05)
(0.2,0.2)   100   0.50(0.05)  0.80(0.04)    0.50(0.04)  0.80(0.04)    0.50(0.03)  0.80(0.04)

                  ORBD                      DBCD                      ERADE
(p1, p2)    n     AP(SD)      FP(SD)        AP(SD)      FP(SD)        AP(SD)      FP(SD)
(0.8,0.8)   100   0.50(0.15)  0.20(0.04)    0.50(0.11)  0.20(0.04)    0.51(0.10)  0.20(0.04)
(0.8,0.6)   100   0.67(0.13)  0.27(0.05)    0.66(0.08)  0.27(0.05)    0.65(0.07)  0.27(0.05)
(0.8,0.4)   100   0.78(0.10)  0.29(0.06)    0.75(0.06)  0.30(0.05)    0.73(0.05)  0.31(0.05)
(0.8,0.2)   100   0.84(0.06)  0.29(0.05)    0.80(0.05)  0.32(0.06)    0.78(0.04)  0.33(0.06)
(0.6,0.6)   100   0.50(0.16)  0.40(0.05)    0.51(0.07)  0.40(0.05)    0.51(0.06)  0.40(0.05)
(0.6,0.4)   100   0.66(0.13)  0.47(0.06)    0.60(0.06)  0.48(0.05)    0.60(0.05)  0.48(0.05)
(0.6,0.2)   100   0.78(0.09)  0.49(0.06)    0.67(0.05)  0.53(0.06)    0.66(0.04)  0.53(0.05)
(0.4,0.4)   100   0.50(0.15)  0.60(0.05)    0.50(0.05)  0.60(0.05)    0.50(0.04)  0.60(0.05)
(0.4,0.2)   100   0.67(0.12)  0.67(0.05)    0.57(0.04)  0.68(0.05)    0.57(0.03)  0.69(0.05)
(0.2,0.2)   100   0.50(0.14)  0.80(0.04)    0.50(0.03)  0.80(0.04)    0.50(0.03)  0.80(0.04)
Table 4: Comparison of allocation proportion (AP) and failure proportion (FP) for
some response-adaptive designs targeting urn allocation. The simulation used 5,000
replications.
the average and standard deviation. The results of this simulation are shown in Table
4. As before, the table only reports the allocation proportion to treatment i = 1,
as the allocation to treatment i = 2 can be obtained by subtraction.
We start by considering the case p1 = p2. For all rules, the allocation proportion is roughly equal, which results in very similar failure proportions. The only
significant difference between designs is the standard deviation of the allocation
proportion. When the success probability is high, the DL rule displays the lowest variability, very similar to CR. As we decrease the success probability, the DL rule
becomes less variable than complete randomisation. The ORBD is the most variable,
with a high standard deviation for all choices of pi. The DBCD and ERADE procedures perform similarly. Their behaviour is interesting as they are highly variable
for high pi and their variability reduces for small pi. In fact, for small pi both rules
have a lower variability than CR, comparable to that of the DL rule. We do not see a
significant difference in the variability of the failure proportion between designs.
We now consider the case p1 ≠ p2. With the exception of CR, all rules assign
more patients to the better treatment. The bigger the difference between p1 and p2,
the more patients are assigned to the better treatment. The ORBD design performs
the best in this respect, with the highest proportion assigned to the best treatment
for all choices of pi. DBCD and ERADE come next and have a similar AP to each
other. Finally, RPW and DL have the least skewed allocation, with allocation proportions
that differ from each other by at most 0.02. We
now compare the variability of the allocation proportion for these procedures. The
highly skewed allocation proportion of ORBD translates to a very high variability
for all choices of pi. The RPW also shows high variability and it is worth noting
that although the allocation proportions of RPW and DL were similar, the DL is
much less variable. In fact the DL is no more variable than CR for all choices
of pi. The DBCD and ERADE designs show a similar pattern as mentioned above,
i.e. they are highly variable for high pi and their variance decreases for lower pi,
becoming very similar to the DL rule.
Finally, we compare the failure proportion between the designs. The ORBD
displays the lowest failure proportion out of all the rules and this is mostly caused
by the highly skewed allocation proportion. We notice that the failure proportion
for the RPW and DL is similar, which is most likely caused by the similar allocation
proportion. The DL still seems superior due to its less variable allocation proportion.
The DBCD and ERADE have a similar failure proportion to each other, which is
slightly smaller than the one for the RPW and DL rules. We can say that all the
designs succeed in a more ethical allocation, as the failure proportion is always lower
than under CR for all p1 ≠ p2. The standard deviation of the failure proportion differs insignificantly
between the designs.
In Table 2 and Table 3 we considered designs that targeted RSIHR allocation,
namely the GDL and DBCD rules; the ERADE rule can target it as well. We now perform an investigation
comparing these three rules when targeting RSIHR allocation. Recall that this allocation
proportion is of significant importance as it minimises the expected number of
treatment failures. The rules chosen are:
GDL In Table 2 we saw that the initial urn composition has little effect on the
allocation proportion or its variance. Thus, we choose Z0 = {3, 3, 3}. We also
                  GDL                       DBCD                      ERADE
(p1, p2)    n     AP(SD)      FP(SD)        AP(SD)      FP(SD)        AP(SD)      FP(SD)
(0.8,0.8)   100   0.50(0.02)  0.20(0.04)    0.50(0.03)  0.20(0.04)    0.50(0.02)  0.20(0.04)
(0.8,0.6)   100   0.53(0.02)  0.29(0.04)    0.54(0.03)  0.29(0.04)    0.54(0.02)  0.29(0.04)
(0.8,0.4)   100   0.57(0.03)  0.37(0.04)    0.59(0.04)  0.36(0.04)    0.59(0.03)  0.36(0.04)
(0.8,0.2)   100   0.64(0.04)  0.42(0.04)    0.68(0.05)  0.39(0.04)    0.66(0.04)  0.40(0.04)
(0.6,0.6)   100   0.50(0.03)  0.40(0.05)    0.50(0.03)  0.40(0.05)    0.50(0.02)  0.40(0.05)
(0.6,0.4)   100   0.54(0.03)  0.49(0.05)    0.56(0.04)  0.49(0.05)    0.55(0.03)  0.49(0.05)
(0.6,0.2)   100   0.61(0.04)  0.56(0.05)    0.65(0.06)  0.54(0.05)    0.63(0.04)  0.55(0.05)
(0.4,0.4)   100   0.50(0.04)  0.60(0.05)    0.50(0.04)  0.60(0.05)    0.50(0.03)  0.60(0.05)
(0.4,0.2)   100   0.57(0.04)  0.69(0.05)    0.60(0.06)  0.68(0.05)    0.58(0.05)  0.68(0.05)
(0.2,0.2)   100   0.50(0.05)  0.80(0.04)    0.50(0.07)  0.80(0.04)    0.51(0.05)  0.80(0.04)
Table 5: Comparison of allocation proportion (AP) and failure proportion (FP) for
some response-adaptive designs targeting RSIHR allocation. The simulation used
5,000 replications.
the variance of all the rules targeting RSIHR allocation seems to increase as we
decrease pi , which is the opposite of what happened for the same rules targeting urn
allocation. The failure proportion seems to be the same for all the rules.
We now consider the cases when p1 ≠ p2. We can see that all the rules assign
more patients to the superior treatment, with all three rules having a very similar
allocation proportion. However, this allocation proportion is smaller than for all
the rules considered in Table 4. On the other hand, the rules targeting RSIHR
allocation have a smaller standard deviation. Amongst the three rules, ERADE and
GDL seem to have a very similar variability, with DBCD only slightly more variable.
Overall, we can say that on average rules targeting urn allocation seem to have a
more ethical allocation, whilst the rules targeting RSIHR allocation are less variable.
3.2.2
CR
RPW
DL
Urn
100
100
100
100
0.04
0.04
0.05
0.04
0.04
0.05
0.04
0.04
0.04
0.05
0.05
0.04
206
62
27
256
57
217
0.88
0.91
0.89
0.90
0.87
0.90
0.85
0.88
0.85
0.89
0.86
0.89
0.87
0.90
0.88
0.89
0.87
0.90
(p1 , p2 )
(0.8,0.8)
(0.6,0.6)
(0.4,0.4)
(0.2,0.2)
(0.8,0.6)
(0.8,0.4)
(0.8,0.2)
(0.6,0.4)
(0.6,0.2)
(0.4,0.2)
DBCD ERADE
RSIHR
Urn
ERADE
RSIHR
0.04
0.05
0.05
0.05
0.04
0.05
0.04
0.04
0.05
0.04
0.04
0.05
0.88
0.91
0.89
0.90
0.87
0.90
0.88
0.90
0.84
0.90
0.83
0.88
0.88
0.91
0.89
0.90
0.87
0.90
Table 6: Simulated power and significance of various RAR designs for a clinical trial
with K = 2 treatments. The results were obtained using a simulation with 5,000
replications.
To test these hypotheses we may use the Wald test used by Hu and Rosenberger
(2003) with the test statistic given by
\[
Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}},
\]
where $n_i$ is the number of patients assigned to treatment i. Then, $Z^2$ is asymptotically chi-squared distributed with 1 degree of freedom.
Recall that the significance level is defined as the probability that we incorrectly
reject H0 when it is true (type I error), while statistical power is the probability
that the test correctly rejects H0 when it is false (one minus the probability of a type II error). We now simulate
significance level and power for all the designs considered in Table 4 and Table 5.
For each value of pi, we start by assigning n patients to treatments i = 1, 2 using
each of the rules. We repeat this 5,000 times and for each repetition we obtain
the value of the test statistic Z. Then, we calculate the proportion of values of Z^2
that exceed 3.841, which is the critical value of the test statistic at the α = 0.05 level of
significance. When p1 = p2 we will obtain the significance level, while when p1 ≠ p2
we will obtain the power. The results of such a simulation can be seen in Table 6.
To ease the analysis we choose n such that the power of complete randomisation is
roughly 0.90, whilst we set n = 100 when simulating the significance level. For rules
that can target multiple allocation proportions, the line below the rule name gives
the allocation proportion that a given rule targets. We also kept all the parameters
the same as previously.
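The simulation just described can be sketched for complete randomisation as follows. This Python code is only an illustration of the procedure and not the program used to produce Table 6: the function names, the seed and the handling of a zero variance estimate (which the text does not specify) are our assumptions.

```python
import random


def wald_z2(s1, n1, s2, n2):
    """Squared Wald statistic for H0: p1 = p2 from success counts s_i out of n_i."""
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption: no test is possible if one arm is empty
    p1, p2 = s1 / n1, s2 / n2
    var = p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2
    if var == 0.0:
        # Assumption: with a degenerate variance estimate, reject only if
        # the two observed proportions differ.
        return float("inf") if p1 != p2 else 0.0
    return (p1 - p2) ** 2 / var


def rejection_rate(p1, p2, n, reps=5000, seed=1):
    """Proportion of simulated trials under complete randomisation in which
    Z^2 exceeds 3.841 (chi-squared, 1 degree of freedom, 0.05 level)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        n1 = s1 = s2 = 0
        for _ in range(n):
            if rng.random() < 0.5:          # fair coin: treatment 1
                n1 += 1
                s1 += rng.random() < p1     # binary response
            else:                           # treatment 2
                s2 += rng.random() < p2
        if wald_z2(s1, n1, s2, n - n1) > 3.841:
            rejections += 1
    return rejections / reps


size = rejection_rate(0.4, 0.4, n=100, reps=2000)   # estimates significance level
power = rejection_rate(0.8, 0.4, n=62, reps=2000)   # estimates power
```

Replacing the fair coin by any of the response-adaptive rules gives the corresponding entries for that rule; only the line that chooses the treatment changes.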
We start by comparing the significance level of the designs. We see that the
simulated significance level is very close to the nominal α = 0.05 that we have used
for the test. Thus, we can say that the significance level for all
these procedures is very similar.
It can be seen that CR maintains the highest power of all the randomisation schemes.
This is most likely because its allocation proportion is the closest to
Neyman allocation, for which power is maximised. However, various designs maintain a very high level of power. We notice that for DBCD and ERADE targeting
RSIHR allocation the power matches that of CR. The GDL targeting
RSIHR also maintains a very high level of power. The rules targeting urn allocation
perform slightly worse, with the RPW resulting in a considerable drop in power for
some pi. The ORBD has the lowest power out of all the designs.
We can see that the rules that were the least variable (GDL, DBCD and ERADE)
also seem to maintain the highest power. We have also previously noticed that the
less variable rules assign fewer patients to the superior treatment. Thus we can say
that there seems to be an inverse relationship between a more ethical allocation
and power. This confirms the simulations previously carried out by Melfi and Page
(1998) and Hu and Rosenberger (2003).
3.3 K = 3 treatments
3.3.1 Allocation proportion
                       CR                                      DL
(p1, p2, p3)     n     AP(SD) to i = 1, 2      FP(SD)          AP(SD) to i = 1, 2      FP(SD)
(0.8,0.8,0.8)    100   0.33(0.05)  0.33(0.05)  0.20(0.04)      0.33(0.05)  0.33(0.05)  0.20(0.04)
(0.8,0.8,0.6)    100   0.33(0.05)  0.33(0.05)  0.27(0.04)      0.36(0.05)  0.36(0.05)  0.25(0.04)
(0.8,0.8,0.4)    100   0.33(0.05)  0.33(0.05)  0.33(0.05)      0.39(0.05)  0.39(0.05)  0.29(0.04)
(0.8,0.8,0.2)    100   0.33(0.05)  0.33(0.05)  0.40(0.05)      0.40(0.05)  0.40(0.05)  0.31(0.04)
(0.6,0.6,0.6)    100   0.33(0.05)  0.33(0.05)  0.40(0.05)      0.33(0.04)  0.33(0.04)  0.40(0.05)
(0.6,0.6,0.4)    100   0.33(0.05)  0.33(0.05)  0.47(0.05)      0.36(0.04)  0.36(0.04)  0.46(0.05)
(0.6,0.6,0.2)    100   0.33(0.05)  0.33(0.05)  0.53(0.05)      0.38(0.04)  0.38(0.04)  0.50(0.05)
(0.4,0.4,0.4)    100   0.33(0.05)  0.33(0.05)  0.60(0.05)      0.33(0.03)  0.33(0.03)  0.60(0.05)
(0.4,0.4,0.2)    100   0.33(0.05)  0.33(0.05)  0.67(0.05)      0.36(0.03)  0.36(0.03)  0.66(0.05)
(0.2,0.2,0.2)    100   0.33(0.05)  0.33(0.05)  0.80(0.04)      0.33(0.02)  0.33(0.02)  0.80(0.04)

                       ORBD                                    DBCD
(p1, p2, p3)     n     AP(SD) to i = 1, 2      FP(SD)          AP(SD) to i = 1, 2      FP(SD)
(0.8,0.8,0.8)    100   0.33(0.11)  0.33(0.11)  0.20(0.04)      0.33(0.09)  0.34(0.09)  0.20(0.04)
(0.8,0.8,0.6)    100   0.39(0.12)  0.39(0.13)  0.24(0.04)      0.39(0.09)  0.39(0.09)  0.24(0.04)
(0.8,0.8,0.4)    100   0.42(0.13)  0.42(0.13)  0.26(0.05)      0.42(0.10)  0.42(0.10)  0.26(0.05)
(0.8,0.8,0.2)    100   0.43(0.13)  0.43(0.13)  0.28(0.05)      0.43(0.10)  0.43(0.10)  0.28(0.05)
(0.6,0.6,0.6)    100   0.33(0.12)  0.34(0.12)  0.40(0.05)      0.33(0.06)  0.33(0.06)  0.40(0.05)
(0.6,0.6,0.4)    100   0.39(0.13)  0.39(0.13)  0.45(0.05)      0.37(0.06)  0.37(0.06)  0.45(0.05)
(0.6,0.6,0.2)    100   0.42(0.13)  0.42(0.13)  0.46(0.05)      0.40(0.06)  0.39(0.06)  0.49(0.05)
(0.4,0.4,0.4)    100   0.33(0.12)  0.33(0.12)  0.60(0.05)      0.33(0.04)  0.33(0.04)  0.60(0.05)
(0.4,0.4,0.2)    100   0.39(0.12)  0.39(0.12)  0.64(0.05)      0.36(0.04)  0.36(0.04)  0.66(0.05)
(0.2,0.2,0.2)    100   0.33(0.10)  0.33(0.10)  0.80(0.04)      0.33(0.03)  0.33(0.03)  0.80(0.04)
                       GDL                                     DBCD
(p1, p2, p3)     n     AP(SD) to i = 1, 2      FP(SD)          AP(SD) to i = 1, 2      FP(SD)
(0.8,0.8,0.8)    100   0.33(0.02)  0.33(0.02)  0.20(0.04)      0.33(0.02)  0.33(0.02)  0.20(0.04)
(0.8,0.8,0.6)    100   0.35(0.02)  0.35(0.02)  0.26(0.04)      0.35(0.03)  0.35(0.03)  0.26(0.04)
(0.8,0.8,0.4)    100   0.36(0.02)  0.36(0.02)  0.31(0.04)      0.37(0.03)  0.37(0.03)  0.30(0.04)
(0.8,0.8,0.2)    100   0.39(0.02)  0.38(0.02)  0.34(0.04)      0.40(0.03)  0.40(0.03)  0.32(0.04)
(0.6,0.6,0.6)    100   0.33(0.02)  0.33(0.02)  0.40(0.05)      0.33(0.03)  0.33(0.03)  0.40(0.05)
(0.6,0.6,0.4)    100   0.35(0.03)  0.35(0.03)  0.46(0.05)      0.36(0.03)  0.36(0.03)  0.46(0.05)
(0.6,0.6,0.2)    100   0.37(0.03)  0.37(0.03)  0.50(0.05)      0.39(0.04)  0.39(0.04)  0.49(0.05)
(0.4,0.4,0.4)    100   0.33(0.03)  0.33(0.03)  0.60(0.05)      0.33(0.04)  0.33(0.04)  0.60(0.05)
(0.4,0.4,0.2)    100   0.36(0.03)  0.36(0.03)  0.66(0.05)      0.37(0.04)  0.37(0.04)  0.65(0.05)
(0.2,0.2,0.2)    100   0.33(0.04)  0.33(0.04)  0.80(0.04)      0.33(0.05)  0.33(0.05)  0.80(0.04)
Table 8: Comparison of allocation proportion (AP) and failure proportion (FP) for
some response-adaptive designs targeting RSIHR allocation. The simulation used
5,000 replications.
than the CR rule.
When treatment success probabilities are no longer equal, both rules assign more
patients to the better treatments, which results in reduced treatment failures. However, this proportion is much lower than that of the rules mentioned in Table 7.
The variability of the allocation proportion for both the rules is no higher than for CR.
Interestingly, the variability of these procedures increases as pi decreases, which is
opposite to what was happening for the rules targeting urn allocation. This means
that when pi is high, rules targeting RSIHR allocation are less variable, while when
pi is low, urn allocation seems to be slightly less variable out of the two. We saw a
similar behaviour for these two allocation proportions for the K = 2 case. Once again
we do not notice a significant difference in the variability of the failure proportion.
To conclude, we have seen that allocation and failure proportions for the K = 3
case seem to behave similarly to those of the rules with K = 2 treatments. In
general, it can be noticed that a more ethical allocation usually results in a higher
variability of the allocation proportion. We have also noticed that the rules targeting
RSIHR allocation are less variable, but have a less ethical allocation than
rules targeting urn allocation. In addition, the RSIHR allocation seems to be more
suitable when pi is high, while the urn allocation seems to perform slightly better
for small pi.
3.3.2
We now define a suitable statistical test for the K = 3 case. Often in a clinical trial
we wish to compare K − 1 treatments to a control. We wish to test the hypothesis
of no difference between the control and the other treatments against the hypothesis
that there is a significant difference between the K − 1 treatments and the control.
From now on, we will consider p3 to be the control. Formally, we can formulate the
hypotheses as
\[
H_0 : p_1 - p_3 = 0,\; p_2 - p_3 = 0
\]
and
\[
H_1 : p_1 - p_3 \neq 0,\; p_2 - p_3 \neq 0,
\]
respectively. Hu and Rosenberger (2003) consider the contrast test of homogeneity
to test the above hypotheses. The contrast of interest is defined as
\[
p_c = (p_1 - p_3,\; p_2 - p_3)'
\]
with the respective estimator
\[
\hat{p}_c = (\hat{p}_1 - \hat{p}_3,\; \hat{p}_2 - \hat{p}_3)'.
\]
We let
\[
\hat{\Sigma} =
\begin{bmatrix}
\hat{p}_1 \hat{q}_1 / N_1 + \hat{p}_3 \hat{q}_3 / N_3 & \hat{p}_3 \hat{q}_3 / N_3 \\
\hat{p}_3 \hat{q}_3 / N_3 & \hat{p}_2 \hat{q}_2 / N_2 + \hat{p}_3 \hat{q}_3 / N_3
\end{bmatrix}
\]
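For illustration, the contrast and its estimated covariance matrix can be combined into a test statistic as in the Python sketch below. We assume here the usual Wald quadratic form, that is the estimated contrast pre- and post-multiplying the inverse of the matrix above, compared with a chi-squared distribution with 2 degrees of freedom (critical value 5.991 at the 0.05 level). The function name and the treatment of a singular matrix are ours.

```python
def contrast_statistic(successes, counts):
    """Wald-type statistic for H0: p1 - p3 = 0, p2 - p3 = 0 (treatment 3 = control).

    successes, counts : sequences of length 3 holding the number of successes
                        and the number of patients N_i on each treatment.
    """
    p = [s / n for s, n in zip(successes, counts)]
    var = [pi * (1.0 - pi) / n for pi, n in zip(p, counts)]

    # Estimated contrast (p1 - p3, p2 - p3).
    d1, d2 = p[0] - p[2], p[1] - p[2]

    # Entries of the symmetric 2 x 2 covariance matrix [[a, b], [b, c]].
    a = var[0] + var[2]
    b = var[2]
    c = var[1] + var[2]

    det = a * c - b * b
    if det == 0.0:
        return 0.0  # assumption: treat a singular estimate as no evidence

    # Quadratic form d' S^{-1} d using the closed-form inverse of a 2 x 2 matrix.
    return (c * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det


stat = contrast_statistic([40, 40, 20], [50, 50, 50])
reject = stat > 5.991
```

With observed success proportions (0.8, 0.8, 0.4) on 50 patients per arm the statistic is 25, well above the critical value, whereas identical observed proportions give a statistic of zero.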
(p1 , p2 , p3 )
CR
(0.8,0.8,0.8)
(0.6,0.6,0.6)
(0.4,0.4,0.4)
(0.2,0.2,0.2)
100
100
100
100
0.04
0.05
0.04
0.04
(0.8,0.8,0.6)
(0.8,0.8,0.4)
(0.8,0.8,0.2)
(0.6,0.6,0.4)
(0.6,0.6,0.2)
(0.4,0.4,0.2)
290 0.89
84 0.88
42 0.93
338 0.88
75 0.86
285 0.90
DL
GDL ORBD
Urn RSIHR
Significance Level
0.04
0.04
0.04
0.05
0.04
0.04
0.04
0.05
0.05
0.04
0.05
0.04
Power
0.83
0.88
0.74
0.84
0.87
0.71
0.88
0.91
0.83
0.85
0.87
0.82
0.81
0.84
0.67
0.87
0.87
0.79
DBCD
Urn
DBCD
RSIHR
0.04
0.05
0.05
0.04
0.04
0.05
0.05
0.05
0.79
0.83
0.83
0.84
0.81
0.86
0.87
0.87
0.89
0.87
0.82
0.88
Table 9: Simulated power and significance level of various RAR designs for a clinical
trial with K = 3 treatments. The results were obtained using a simulation with
5, 000 replications.
allocations than ORBD but the allocations are less variable and the rules maintain a
high level of power. It can be said that the rules targeting RSIHR allocation are less
variable for high pi while rules targeting urn allocation exhibit lower variability for
low pi . In terms of power, the RSIHR allocation is much more suitable, maintaining
a very high level of power. Overall, so far we have seen RAR designs that (i) assign
more patients to the superior treatment, (ii) are less variable than CR, (iii) have a lower
failure proportion than CR and (iv) maintain a high level of power. Which RAR design
is able to meet all these criteria often depends on pi.
We end the investigation of multi-treatment RAR designs here. Since DL, GDL,
ORBD and DBCD have been extended to any number of treatments, it is possible
to investigate how the rules behave for K > 3. The contrast test of homogeneity
can also be extended to K > 3 treatments, as shown by Hu and Rosenberger (2003).
However, such an investigation is not performed here and we now focus on delayed
responses.
3.4 Delayed responses
3.4.1 K = 2 treatments
We start by introducing a way of incorporating delayed responses for all the rules
being investigated. We allow patients to arrive sequentially, one patient at each
time unit. That is, the first patient arrives at time unit 1, the second patient
arrives at time unit 2 and so on. We also define the vector d = {d1 , . . . , di } which
corresponds to the mean delay in response for each treatment given in time units
defined previously. For example, d = {5, 1} means that the mean delay for treatment
                  GDL (Urn)         GDL (RSIHR)       ORBD              DBCD (Urn)        DBCD (RSIHR)
(p1, p2)    n     AP(SD)      FP    AP(SD)      FP    AP(SD)      FP    AP(SD)      FP    AP(SD)      FP
d = {1, 1}
(0.8,0.8)   100   0.50(0.05)  0.20  0.50(0.02)  0.20  0.50(0.15)  0.20  0.51(0.10)  0.20  0.51(0.03)  0.20
(0.8,0.6)   100   0.57(0.05)  0.29  0.53(0.02)  0.29  0.68(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)   100   0.63(0.04)  0.35  0.57(0.03)  0.37  0.78(0.10)  0.29  0.74(0.06)  0.31  0.59(0.03)  0.36
(0.8,0.2)   100   0.68(0.03)  0.39  0.63(0.04)  0.42  0.84(0.06)  0.29  0.79(0.05)  0.33  0.67(0.04)  0.40
(0.6,0.6)   100   0.50(0.04)  0.40  0.50(0.03)  0.40  0.50(0.16)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)   100   0.56(0.04)  0.49  0.54(0.03)  0.49  0.66(0.14)  0.47  0.60(0.05)  0.48  0.55(0.04)  0.49
(0.6,0.2)   100   0.62(0.03)  0.55  0.60(0.04)  0.56  0.78(0.09)  0.49  0.66(0.05)  0.53  0.63(0.05)  0.55
(0.4,0.4)   100   0.50(0.03)  0.60  0.50(0.04)  0.60  0.49(0.15)  0.60  0.51(0.05)  0.60  0.51(0.04)  0.60
(0.4,0.2)   100   0.56(0.03)  0.69  0.56(0.04)  0.68  0.67(0.12)  0.66  0.57(0.04)  0.69  0.59(0.05)  0.68
(0.2,0.2)   100   0.50(0.03)  0.80  0.50(0.05)  0.80  0.50(0.14)  0.80  0.51(0.04)  0.80  0.50(0.06)  0.80
d = {5, 1}
(0.8,0.8)   100   0.48(0.05)  0.20  0.50(0.02)  0.20  0.48(0.15)  0.20  0.50(0.10)  0.20  0.50(0.03)  0.20
(0.8,0.6)   100   0.55(0.05)  0.29  0.53(0.02)  0.29  0.66(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)   100   0.61(0.04)  0.35  0.57(0.03)  0.37  0.77(0.09)  0.29  0.74(0.06)  0.30  0.59(0.03)  0.36
(0.8,0.2)   100   0.66(0.03)  0.40  0.63(0.04)  0.42  0.83(0.06)  0.30  0.79(0.05)  0.33  0.67(0.04)  0.40
(0.6,0.6)   100   0.49(0.04)  0.40  0.50(0.03)  0.40  0.50(0.15)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)   100   0.55(0.04)  0.49  0.54(0.03)  0.49  0.65(0.14)  0.47  0.60(0.06)  0.48  0.56(0.04)  0.49
(0.6,0.2)   100   0.60(0.03)  0.56  0.60(0.04)  0.56  0.77(0.09)  0.49  0.66(0.05)  0.53  0.63(0.05)  0.54
(0.4,0.4)   100   0.49(0.04)  0.60  0.50(0.04)  0.60  0.52(0.15)  0.60  0.51(0.05)  0.60  0.51(0.04)  0.60
(0.4,0.2)   100   0.55(0.03)  0.69  0.57(0.04)  0.69  0.67(0.11)  0.67  0.57(0.04)  0.69  0.59(0.05)  0.68
(0.2,0.2)   100   0.50(0.02)  0.80  0.51(0.05)  0.80  0.51(0.14)  0.80  0.51(0.03)  0.80  0.51(0.06)  0.80
d = {5, 5}
(0.8,0.8)   100   0.50(0.04)  0.20  0.50(0.02)  0.20  0.50(0.15)  0.20  0.51(0.11)  0.20  0.51(0.03)  0.20
(0.8,0.6)   100   0.56(0.04)  0.29  0.53(0.02)  0.29  0.66(0.13)  0.27  0.66(0.08)  0.27  0.54(0.03)  0.29
(0.8,0.4)   100   0.61(0.04)  0.36  0.57(0.03)  0.37  0.77(0.10)  0.29  0.74(0.06)  0.30  0.59(0.04)  0.36
(0.8,0.2)   100   0.66(0.03)  0.40  0.62(0.04)  0.42  0.83(0.06)  0.30  0.79(0.05)  0.33  0.67(0.05)  0.40
(0.6,0.6)   100   0.50(0.04)  0.40  0.50(0.03)  0.40  0.50(0.15)  0.40  0.51(0.07)  0.40  0.51(0.03)  0.40
(0.6,0.4)   100   0.56(0.04)  0.49  0.54(0.03)  0.49  0.65(0.13)  0.47  0.60(0.05)  0.48  0.56(0.04)  0.49
(0.6,0.2)   100   0.61(0.03)  0.56  0.60(0.04)  0.56  0.75(0.09)  0.50  0.66(0.05)  0.54  0.63(0.05)  0.54
(0.4,0.4)   100   0.50(0.03)  0.60  0.50(0.04)  0.60  0.50(0.14)  0.60  0.51(0.05)  0.60  0.50(0.04)  0.60
(0.4,0.2)   100   0.55(0.03)  0.69  0.56(0.04)  0.69  0.66(0.12)  0.67  0.57(0.04)  0.69  0.58(0.05)  0.69
(0.2,0.2)   100   0.50(0.02)  0.80  0.50(0.05)  0.80  0.50(0.14)  0.80  0.51(0.04)  0.80  0.51(0.06)  0.80
Table 10: Comparison of allocation proportion (AP) and its standard deviation
for some response-adaptive designs with K = 2 treatments and delayed responses.
Three different values for the response delay were investigated. The simulation used
5,000 replications.
difference can be seen between these two rules especially for the rules targeting
urn allocation. Once again we see that the rules targeting urn allocation are more
appropriate for low pi whilst urns targeting RSIHR allocation are more appropriate
for high pi . There does not seem to be any significant difference between the three
settings of d or between this table and the results found for the same rules with
instantaneous responses, as seen in Table 4 and Table 5.
We now perform a simulation of significance level and power, much like the one
performed in Table 3. We still use the Wald test at 0.05 level of significance and all
parameters are as before. The results can be seen in Table 11.
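As a point of reference for the power figures, the following R sketch shows one way a two-sample Wald test and a rejection-rate simulation under complete randomisation could be written. It is a minimal illustration under stated assumptions, not the code used for the tables: the names wald_test and cr_rejection_rate, the unpooled variance estimate and the convention of not rejecting when the statistic is undefined are all assumptions.

```r
# Two-sided Wald test of H0: p1 = p2 with an unpooled variance estimate (assumed form).
wald_test <- function(s1, n1, s2, n2, alpha = 0.05) {
  p1_hat <- s1 / n1
  p2_hat <- s2 / n2
  se <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
  # Empty arm or no variability: statistic undefined, so do not reject (assumed convention)
  if (!is.finite(se) || se == 0) return(FALSE)
  abs(p1_hat - p2_hat) / se > qnorm(1 - alpha / 2)
}

# Proportion of rejections over 'reps' simulated trials under complete randomisation.
cr_rejection_rate <- function(n, p, reps = 2000) {
  one_trial <- function() {
    arm <- sample(1:2, n, replace = TRUE)  # fair coin for every patient
    y <- rbinom(n, 1, p[arm])              # binary responses
    wald_test(sum(y[arm == 1]), sum(arm == 1), sum(y[arm == 2]), sum(arm == 2))
  }
  mean(replicate(reps, one_trial()))
}

set.seed(1)
cr_rejection_rate(206, c(0.8, 0.6))  # estimates power
cr_rejection_rate(100, c(0.4, 0.4))  # estimates significance level
```

When the two success probabilities are equal the rejection rate estimates the significance level, and otherwise it estimates the power.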
We notice that the significance level does not differ significantly between the
procedures and d values. In fact, it is very similar to the significance level of the
procedures with instantaneous responses reported in Table 6.
When compared to CR, the GDL rule maintains the highest power. The power is
high for this design for both target allocations considered, often remaining similar
to that of CR. The DBCD targeting RSIHR allocation is also highly powerful. The
behaviour of the DBCD targeting urn allocation is particularly interesting, with the rule
maintaining high power when the difference between the p_i is small. When the difference
between the p_i is largest, i.e. p = (0.8, 0.2), the design has the lowest power out of all
the designs. The ORBD has the lowest power out of all the designs for all other p_i
values. We notice only a slight difference between the different choices of d, with no
clear pattern.
                          DBCD, delay setting 1   DBCD, delay setting 2   DBCD, delay setting 3
(p1,p2)      n     CR     Urn      RSIHR          Urn      RSIHR          Urn      RSIHR
(0.8,0.8)    100   0.04   0.04     0.05           0.04     0.05           0.03     0.03
(0.6,0.6)    100   0.04   0.05     0.05           0.04     0.04           0.04     0.04
(0.4,0.4)    100   0.05   0.06     0.06           0.04     0.05           0.05     0.05
(0.2,0.2)    100   0.04   0.05     0.05           0.04     0.05           0.04     0.06
(0.8,0.6)    206   0.88   0.88     0.88           0.86     0.87           0.85     0.87
(0.8,0.4)    62    0.91   0.84     0.90           0.85     0.90           0.85     0.91
(0.8,0.2)    27    0.89   0.76     0.88           0.76     0.87           0.76     0.89
(0.6,0.4)    256   0.90   0.89     0.90           0.88     0.90           0.89     0.90
(0.6,0.2)    57    0.87   0.85     0.87           0.86     0.87           0.85     0.87
(0.4,0.2)    217   0.90   0.90     0.90           0.89     0.90           0.89     0.88
Table 11: Simulated power of various RAR designs for a clinical trial with K = 2
treatments and delayed responses. Three different values for the response delay were
investigated. The results were obtained using a simulation with 5,000 replications.
3.4.2 K = 3 treatments
reduced when one pi is smaller than the others. When pi = (0.8, 0.8, 0.2), the DBCD
targeting urn allocation is actually the least powerful. For all other settings, the
ORBD has the lowest power. It is worth noting that the true power of the ORBD
may be higher due to the slightly smaller significance level that was simulated, when
compared to the 0.05 significance level at which we performed the Wald test.
To conclude, we have seen that a moderate delay in responses does not have a
significant effect on the allocation proportion and its variability. However, we have
noticed that the power can be affected by delayed responses for the DBCD targeting
urn allocation. We have seen that for this particular design, when the difference
between the success probabilities p_i is the greatest, the design has the lowest power
when compared to the other designs. For all other designs there is no significant
drop in power due to delayed responses.
We now consider the case of a clinical trial with a large delay. Since all RAR
designs considered require some responses in order to skew the allocation proportion,
an RAR design becomes less effective as the delay grows. For example, in the case
when all patients are assigned before any responses are received, all RAR designs
investigated would allocate patients in the same manner as complete randomisation.
Thus, none of the RAR designs would be suitable for trials where the delay is very
large, e.g. survival trials.
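The effect of the delay can be illustrated with a small R sketch of a DBCD for K = 2 targeting RSIHR allocation. It is a simplified illustration rather than the procedure behind the tables, and it rests on several assumptions: a response becomes available only after `delay` further patients have been randomised, the allocation function takes the Hu and Zhang (2004) form with gamma = 2, and the success probabilities are estimated with 0.5 pseudo-counts. The function name dbcd_delayed is hypothetical.

```r
# Sketch: DBCD (K = 2, RSIHR target) with a fixed response delay; all settings are assumptions.
dbcd_delayed <- function(n, p, delay, gamma = 2) {
  arm <- integer(n)  # treatment assigned to each patient (1 or 2)
  y <- integer(n)    # binary responses (1 = success)
  for (j in seq_len(n)) {
    # Patients whose responses have arrived by the time patient j is randomised
    seen <- seq_len(max(0, j - 1 - delay))
    # Success-probability estimates with 0.5 pseudo-counts (avoids 0/0 early on)
    p_hat <- sapply(1:2, function(k) {
      on_k <- arm[seen] == k
      (sum(y[seen][on_k]) + 0.5) / (sum(on_k) + 1)
    })
    rho <- sqrt(p_hat[1]) / (sqrt(p_hat[1]) + sqrt(p_hat[2]))  # estimated target
    x <- if (j == 1) 0.5 else mean(arm[seq_len(j - 1)] == 1)   # current share on treatment 1
    if (x == 0) {
      prob1 <- 1
    } else if (x == 1) {
      prob1 <- 0
    } else {
      num <- rho * (rho / x)^gamma
      prob1 <- num / (num + (1 - rho) * ((1 - rho) / (1 - x))^gamma)
    }
    arm[j] <- if (runif(1) < prob1) 1L else 2L
    y[j] <- rbinom(1, 1, p[arm[j]])
  }
  mean(arm == 1)  # allocation proportion to treatment 1
}

set.seed(1)
mean(replicate(200, dbcd_delayed(100, c(0.8, 0.2), delay = 5)))    # moderate delay
mean(replicate(200, dbcd_delayed(100, c(0.8, 0.2), delay = 100)))  # no response arrives in time
```

With delay = 100 and n = 100 no response is ever used, so the estimated target stays at 0.5 and the design cannot favour the better treatment.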
However, in practice there are often ways of overcoming the problem of a large
delay. For example, Tamura et al. (1994) explored an application of a RAR design
in a study investigating the effect of fluoxetine in patients with a depressive disorder.
In this study, the time between the first and final measurement was approximately
8 weeks. The researchers decided that this delay was too large and chose
to use a surrogate response instead. The surrogate response was defined
to be a success if the patient exhibited at least a 50% reduction in HAMD (a scale
measuring severity of depression) in two consecutive visits after 3 weeks of therapy. A
similar surrogate response might be possible for other clinical trials where otherwise
an RAR design would be ruled out on the basis of a large delay.
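A surrogate response of this kind is easy to express in code. The R sketch below is purely illustrative: the function name surrogate_success, the visit schedule, the use of a baseline score and the reading of "after 3 weeks" as week 3 onwards are assumptions, not details taken from Tamura et al. (1994).

```r
# Hypothetical helper: binary surrogate response from repeated HAMD scores.
# baseline: score at entry; scores: score at each visit; weeks: week of each visit.
surrogate_success <- function(baseline, scores, weeks, min_week = 3, reduction = 0.5) {
  eligible <- weeks >= min_week                               # visits from week 3 onwards (assumed)
  responder <- scores[eligible] <= (1 - reduction) * baseline # at least a 50% reduction
  # Success if any two consecutive eligible visits both show the required reduction
  length(responder) >= 2 && any(responder[-1] & responder[-length(responder)])
}

surrogate_success(24, c(22, 18, 13, 12, 11), weeks = 1:5)  # TRUE: weeks 4 and 5 qualify
surrogate_success(24, c(22, 18, 11, 14, 12), weeks = 1:5)  # FALSE: no two consecutive visits
```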
Note that Zhang et al. (2007) compared the allocation proportion of the GDL
and DBCD under delayed responses. They only considered K = 3 treatments
and did not investigate the power of those designs. However, their
comparison also investigated non-uniform patient entry times, which were not explored
here. The literature does not report any investigation of the ORBD under delayed
responses. This is also the first known instance of the power of the GDL, ORBD
and DBCD being investigated under delayed responses.
AP(SD) to treatments i = 1 and i = 2; n = 100 in every row.

d = {1, 1, 1}
                 GDL Urn                  GDL RSIHR                ORBD
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.04)  0.33(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.11)  0.34(0.11)
(0.8,0.8,0.6)    0.36(0.04)  0.36(0.04)   0.35(0.02)  0.35(0.02)   0.39(0.12)  0.39(0.13)
(0.8,0.8,0.4)    0.38(0.04)  0.38(0.04)   0.36(0.02)  0.36(0.02)   0.42(0.13)  0.42(0.13)
(0.8,0.8,0.2)    0.40(0.04)  0.40(0.04)   0.38(0.02)  0.38(0.02)   0.43(0.13)  0.43(0.13)
(0.6,0.6,0.6)    0.33(0.04)  0.33(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.12)  0.33(0.12)
(0.6,0.6,0.4)    0.36(0.04)  0.36(0.04)   0.35(0.03)  0.35(0.03)   0.39(0.13)  0.38(0.13)
(0.6,0.6,0.2)    0.38(0.04)  0.38(0.04)   0.37(0.03)  0.37(0.03)   0.42(0.13)  0.42(0.13)
(0.4,0.4,0.4)    0.33(0.03)  0.33(0.03)   0.33(0.03)  0.33(0.03)   0.33(0.12)  0.33(0.12)
(0.4,0.4,0.2)    0.36(0.03)  0.36(0.03)   0.36(0.03)  0.36(0.03)   0.39(0.12)  0.39(0.12)
(0.2,0.2,0.2)    0.33(0.02)  0.33(0.02)   0.33(0.04)  0.33(0.04)   0.33(0.10)  0.34(0.10)

                 DBCD Urn                 DBCD RSIHR
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.09)  0.33(0.09)   0.33(0.03)  0.33(0.02)
(0.8,0.8,0.6)    0.39(0.10)  0.39(0.10)   0.35(0.03)  0.35(0.03)
(0.8,0.8,0.4)    0.41(0.10)  0.42(0.10)   0.37(0.03)  0.37(0.03)
(0.8,0.8,0.2)    0.43(0.10)  0.44(0.10)   0.39(0.03)  0.39(0.03)
(0.6,0.6,0.6)    0.33(0.06)  0.33(0.06)   0.33(0.03)  0.33(0.03)
(0.6,0.6,0.4)    0.37(0.06)  0.37(0.06)   0.35(0.03)  0.35(0.03)
(0.6,0.6,0.2)    0.40(0.06)  0.39(0.06)   0.38(0.03)  0.38(0.03)
(0.4,0.4,0.4)    0.33(0.04)  0.33(0.04)   0.33(0.04)  0.33(0.04)
(0.4,0.4,0.2)    0.36(0.04)  0.36(0.04)   0.37(0.04)  0.36(0.04)
(0.2,0.2,0.2)    0.33(0.03)  0.33(0.03)   0.33(0.05)  0.33(0.05)

d = {5, 1, 1}
                 GDL Urn                  GDL RSIHR                ORBD
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.04)  0.34(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.11)  0.34(0.11)
(0.8,0.8,0.6)    0.34(0.04)  0.37(0.05)   0.34(0.02)  0.35(0.02)   0.38(0.12)  0.39(0.12)
(0.8,0.8,0.4)    0.36(0.04)  0.40(0.04)   0.36(0.02)  0.36(0.02)   0.41(0.13)  0.42(0.13)
(0.8,0.8,0.2)    0.38(0.04)  0.42(0.04)   0.38(0.02)  0.38(0.02)   0.43(0.13)  0.43(0.13)
(0.6,0.6,0.6)    0.32(0.04)  0.34(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.12)  0.33(0.12)
(0.6,0.6,0.4)    0.35(0.04)  0.37(0.04)   0.35(0.03)  0.35(0.02)   0.38(0.13)  0.39(0.12)
(0.6,0.6,0.2)    0.37(0.04)  0.39(0.04)   0.37(0.03)  0.37(0.03)   0.42(0.12)  0.42(0.12)
(0.4,0.4,0.4)    0.33(0.03)  0.34(0.03)   0.33(0.03)  0.33(0.03)   0.34(0.12)  0.33(0.12)
(0.4,0.4,0.2)    0.35(0.03)  0.36(0.03)   0.36(0.03)  0.36(0.03)   0.39(0.12)  0.39(0.12)
(0.2,0.2,0.2)    0.33(0.02)  0.33(0.02)   0.34(0.04)  0.33(0.04)   0.34(0.10)  0.33(0.10)

                 DBCD Urn                 DBCD RSIHR
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.10)  0.33(0.10)   0.33(0.02)  0.33(0.02)
(0.8,0.8,0.6)    0.39(0.09)  0.39(0.09)   0.35(0.03)  0.35(0.03)
(0.8,0.8,0.4)    0.42(0.10)  0.42(0.10)   0.37(0.03)  0.37(0.03)
(0.8,0.8,0.2)    0.43(0.10)  0.43(0.10)   0.39(0.03)  0.40(0.03)
(0.6,0.6,0.6)    0.33(0.06)  0.33(0.06)   0.33(0.03)  0.33(0.03)
(0.6,0.6,0.4)    0.37(0.06)  0.37(0.06)   0.35(0.03)  0.35(0.03)
(0.6,0.6,0.2)    0.39(0.06)  0.39(0.06)   0.38(0.03)  0.38(0.03)
(0.4,0.4,0.4)    0.33(0.04)  0.33(0.04)   0.33(0.04)  0.33(0.04)
(0.4,0.4,0.2)    0.36(0.04)  0.36(0.04)   0.36(0.04)  0.37(0.04)
(0.2,0.2,0.2)    0.33(0.03)  0.33(0.03)   0.33(0.05)  0.33(0.05)

d = {5, 5, 5}
                 GDL Urn                  GDL RSIHR                ORBD
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.04)  0.33(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.11)  0.33(0.11)
(0.8,0.8,0.6)    0.36(0.05)  0.36(0.04)   0.35(0.02)  0.35(0.02)   0.38(0.12)  0.38(0.12)
(0.8,0.8,0.4)    0.38(0.05)  0.39(0.05)   0.36(0.02)  0.36(0.02)   0.41(0.13)  0.41(0.12)
(0.8,0.8,0.2)    0.40(0.05)  0.40(0.05)   0.38(0.02)  0.38(0.02)   0.42(0.12)  0.43(0.12)
(0.6,0.6,0.6)    0.33(0.04)  0.33(0.04)   0.33(0.02)  0.33(0.02)   0.33(0.12)  0.33(0.12)
(0.6,0.6,0.4)    0.36(0.04)  0.36(0.04)   0.35(0.03)  0.35(0.03)   0.39(0.12)  0.38(0.12)
(0.6,0.6,0.2)    0.38(0.04)  0.38(0.04)   0.37(0.03)  0.37(0.03)   0.41(0.12)  0.41(0.12)
(0.4,0.4,0.4)    0.33(0.03)  0.33(0.03)   0.33(0.03)  0.33(0.03)   0.33(0.11)  0.33(0.11)
(0.4,0.4,0.2)    0.36(0.03)  0.36(0.03)   0.36(0.03)  0.36(0.03)   0.38(0.11)  0.39(0.11)
(0.2,0.2,0.2)    0.33(0.02)  0.33(0.02)   0.33(0.04)  0.33(0.04)   0.33(0.10)  0.33(0.10)

                 DBCD Urn                 DBCD RSIHR
(p1,p2,p3)       i = 1       i = 2        i = 1       i = 2
(0.8,0.8,0.8)    0.33(0.09)  0.34(0.09)   0.33(0.03)  0.33(0.02)
(0.8,0.8,0.6)    0.39(0.09)  0.39(0.09)   0.35(0.03)  0.35(0.03)
(0.8,0.8,0.4)    0.42(0.10)  0.42(0.10)   0.37(0.03)  0.37(0.03)
(0.8,0.8,0.2)    0.44(0.10)  0.43(0.10)   0.39(0.03)  0.39(0.03)
(0.6,0.6,0.6)    0.33(0.06)  0.33(0.06)   0.33(0.03)  0.33(0.03)
(0.6,0.6,0.4)    0.37(0.06)  0.37(0.06)   0.35(0.03)  0.35(0.03)
(0.6,0.6,0.2)    0.39(0.06)  0.39(0.06)   0.38(0.03)  0.38(0.03)
(0.4,0.4,0.4)    0.33(0.04)  0.33(0.04)   0.33(0.04)  0.33(0.04)
(0.4,0.4,0.2)    0.36(0.04)  0.36(0.04)   0.37(0.04)  0.37(0.04)
(0.2,0.2,0.2)    0.33(0.03)  0.33(0.03)   0.33(0.05)  0.33(0.05)
Table 12: Comparison of allocation proportion (AP) and its standard deviation for some response-adaptive designs with K = 3 treatments and
delayed responses. Three different values for the response delay were investigated. The simulation used 5,000 replications.
                              DBCD, delay setting 1   DBCD, delay setting 2   DBCD, delay setting 3
(p1,p2,p3)       n     CR     Urn      RSIHR          Urn      RSIHR          Urn      RSIHR
(0.8,0.8,0.8)    100   0.04   0.04     0.03           0.04     0.04           0.04     0.04
(0.6,0.6,0.6)    100   0.04   0.03     0.05           0.04     0.05           0.04     0.05
(0.4,0.4,0.4)    100   0.05   0.06     0.05           0.05     0.05           0.04     0.05
(0.2,0.2,0.2)    100   0.04   0.04     0.06           0.04     0.06           0.04     0.04
(0.8,0.8,0.6)    290   0.88   0.79     0.88           0.80     0.88           0.79     0.89
(0.8,0.8,0.4)    84    0.89   0.69     0.89           0.73     0.88           0.74     0.87
(0.8,0.8,0.2)    42    0.93   0.74     0.88           0.74     0.90           0.77     0.91
(0.6,0.6,0.4)    338   0.90   0.86     0.87           0.84     0.86           0.85     0.86
(0.6,0.6,0.2)    75    0.87   0.79     0.81           0.75     0.85           0.79     0.83
(0.4,0.4,0.2)    285   0.91   0.88     0.88           0.90     0.88           0.86     0.87
Table 13: Simulated power of various RAR designs for a clinical trial with K = 3
treatments and delayed responses. Three different values for the response delay were
investigated. The results were obtained using a simulation with 5,000 replications.
3.5 Covariates
3.5.1 K = 2 treatments
So far, we have extended the DL, ORBD and DBCD to allow the incorporation of
covariates. The extended version of the DL rule is known as the DLC, while the extended
DBCD is known as the RDBCD. It is worth noting that the way the ORBD handles
covariates differs from that of the DLC and RDBCD. The ORBD balances
covariates in the traditional sense, meaning that it aims to assign patients in such a way that
each covariate value is equally represented in each treatment. It does this by skewing the
assignment probability towards the treatment in which the current covariate value
is under-represented. However, it must also serve the ethical objective, so it
additionally skews the probability of assignment towards the treatment performing best so
far. On the other hand, the DLC and RDBCD incorporate the covariates on a more
ethical basis. These rules skew the probability in such a way that the best treatment
is assigned to the patient with the most favourable condition.
In this section we aim to compare these two ways of incorporating covariates.
The ORBD will be compared to the DLC and RDBCD. Recall that the DLC allows
the probability of success to depend on the covariate level: the j-th patient, if
assigned to treatment i, responds successfully with probability a^{U_j} p_i, where
a ∈ (0, 1) is the so-called prognostic factor index and U_j ∈ {0, ..., G} is the
covariate level for the j-th patient. We extended the ORBD and RDBCD to also have
this probability of success. Although this change does not alter the inner workings
of the rules, it allows us to compare the rules fairly in a more realistic setting. This
is because in practice a treatment might have a different probability of success
depending on the value of the covariate, and we always want to balance the treatments
on a covariate that is likely to have an impact on the treatment outcome. Here, we
investigated G = 1. We generate n random numbers from the standard uniform
distribution, denoted by z_j, with each value corresponding to the j-th patient. The
z_j values were then categorised into G + 1 levels U_j. Since the z_j values come from
the uniform distribution, we expect that on average every level of U_j will contain the
same number of patients. Note that the lower the U_j level, the higher the probability of
a success.
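The covariate set-up described above can be sketched in R as follows. The equal-width binning of z_j into G + 1 levels is an assumption (the text only states that the values were categorised), and the helper names covariate_levels and success_prob are illustrative.

```r
# Assumed categorisation: equal-width bins of z in [0, 1] give levels 0, ..., G.
covariate_levels <- function(z, G = 1) {
  pmin(floor(z * (G + 1)), G)  # pmin guards the edge case z == 1
}

# Success probability a^U * p_i for a patient with covariate level U on treatment i.
success_prob <- function(u, p_i, a = 0.7) a^u * p_i

set.seed(1)
z <- runif(1000)                   # continuous covariate values z_j
u <- covariate_levels(z, G = 1)    # discrete levels U_j
table(u)                           # roughly equal counts per level
success_prob(0:1, p_i = 0.8)       # 0.80 for U = 0 and 0.56 for U = 1
```

With a = 0.7 and G = 1, a patient at the less favourable level U = 1 has the success probability of every treatment scaled by 0.7.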
Recall that the ORBD and RDBCD were able to incorporate continuous covariates, but the DLC was not. We thus use the z_j values rather than U_j for the two
former rules. That is, for the ORBD the regression will use the actual covariate
values and the U_j will only be used for obtaining the response outcome. Similarly,
the RDBCD will use z_j with the transformation H(z) = 1 - z. This is because the
g function gives a higher probability of assignment when z is high, and so we need
to introduce this transformation in order to correctly reflect that the probability of
success is high for low U_j.
A simulation was performed using the above adjustments to the rules and the
results are given in Table 14. We chose a = 0.7, as suggested by Bandyopadhyay
et al. (2009). We also chose the RDBCD to target RSIHR allocation. This is because
the DLC can only target urn allocation and so this will also give us a comparison
between the two target allocations. In addition, we have seen that in general the
DL is less variable than the DBCD when targeting urn allocation with no covariate
information taken into account. The table reports the allocation proportion to
i = 1 and its standard deviation, the failure proportion and covariate information
(CI). The latter represents the average value of z_j in treatment i = 1. We do not
report the standard deviation for the failure proportion or covariate information as
there was no significant difference between the rules and p_i values.

                   DLC                       ORBD                      RDBCD
(p1,p2)      n     AP(SD)      CI    FP      AP(SD)      CI    FP      AP(SD)      CI    FP
(0.8,0.8)    100   0.50(0.05)  0.50  0.32    0.50(0.15)  0.50  0.32    0.50(0.09)  0.49  0.32
(0.8,0.6)    100   0.53(0.04)  0.50  0.40    0.64(0.14)  0.50  0.38    0.52(0.09)  0.49  0.40
(0.8,0.4)    100   0.56(0.04)  0.50  0.47    0.74(0.11)  0.50  0.40    0.55(0.09)  0.48  0.47
(0.8,0.2)    100   0.59(0.04)  0.50  0.53    0.81(0.06)  0.50  0.42    0.59(0.09)  0.48  0.52
(0.6,0.6)    100   0.50(0.04)  0.50  0.49    0.50(0.15)  0.50  0.49    0.50(0.09)  0.49  0.49
(0.6,0.4)    100   0.53(0.04)  0.50  0.57    0.63(0.14)  0.50  0.55    0.52(0.09)  0.48  0.57
(0.6,0.2)    100   0.56(0.04)  0.50  0.64    0.74(0.09)  0.50  0.58    0.57(0.09)  0.48  0.63
(0.4,0.4)    100   0.50(0.04)  0.50  0.66    0.50(0.14)  0.50  0.66    0.50(0.09)  0.49  0.66
(0.4,0.2)    100   0.53(0.04)  0.50  0.74    0.65(0.12)  0.50  0.72    0.54(0.09)  0.48  0.74
(0.2,0.2)    100   0.50(0.03)  0.50  0.83    0.50(0.13)  0.50  0.83    0.50(0.10)  0.49  0.83

Table 14: The allocation proportion and its standard deviation, covariate information and failure proportion for the DLC, ORBD and RDBCD with K = 2 treatments. We chose a = 0.7 and RDBCD to target RSIHR allocation. This simulation
used 5,000 replications.
When p1 = p2 the allocation proportion is roughly the same for all choices of pi .
The standard deviation of this allocation proportion is the highest for the ORBD
rule, as was seen for rules not incorporating covariates. The RDBCD is also highly
variable, while the DLC exhibits the lowest variability out of all the rules. We
also notice that the failure proportion is similar for all the rules. The covariate
information is the same for the DLC and ORBD but the RDBCD seems to have
slightly unbalanced treatments in terms of covariates.
In the case p1 ≠ p2 we see that all the rules assign more patients to the better
treatment. The ORBD has the most ethically desirable allocation. The DLC and
RDBCD skew the allocation only slightly, even for a large difference in treatment
success probability. This has the usual translation into the failure proportion. When
the difference between the p_i is large, the failure proportion for the ORBD is much lower
than for the other rules. When this difference is small, e.g. p = (0.4, 0.2), the
gap in failure proportion is also small. We notice that the covariate information for the DLC and
ORBD is equal for all p_i, whilst for the RDBCD it is slightly skewed towards the
worse treatment.
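The link between allocation proportion and failure proportion can be checked with a short calculation. The R sketch below is a rough approximation that assumes the covariate level is independent of the treatment assigned and that the G + 1 levels are equally likely; expected_fp is a hypothetical helper, and the allocation proportions are the ones reported in Table 14 for p = (0.8, 0.2).

```r
# Approximate failure proportion given the allocation proportion to treatment 1.
expected_fp <- function(ap1, p, a = 0.7, G = 1) {
  mean_factor <- mean(a^(0:G))  # average of a^U over equally likely levels
  1 - mean_factor * (ap1 * p[1] + (1 - ap1) * p[2])
}

round(expected_fp(0.81, c(0.8, 0.2)), 2)  # ORBD allocation: 0.42
round(expected_fp(0.59, c(0.8, 0.2)), 2)  # DLC allocation: 0.53
```

Both values agree with the simulated failure proportions in Table 14, which shows that the lower failure proportion of the ORBD follows directly from its more skewed allocation.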
We now perform an investigation into the power of these designs. We used the
Wald test as before and the results are shown in Table 15. We use the same n
values as before. We can see that the significance level for all the rules is near the
nominal 0.05 level we used to perform the test. We see that in general the ORBD
is the least powerful. The RDBCD and DLC maintain slightly higher power than
the ORBD.
(p1,p2)      n     DLC    ORBD   RDBCD
(0.8,0.8)    100   0.05   0.06   0.07
(0.6,0.6)    100   0.05   0.06   0.05
(0.4,0.4)    100   0.04   0.06   0.05
(0.2,0.2)    100   0.04   0.04   0.03
(0.8,0.6)    206   0.70   0.65   0.73
(0.8,0.4)    62    0.77   0.68   0.78
(0.8,0.2)    27    0.76   0.68   0.76
(0.6,0.4)    256   0.79   0.75   0.79
(0.6,0.2)    57    0.77   0.72   0.78
(0.4,0.2)    217   0.82   0.77   0.82
Table 15: Simulated power of various RAR designs for a clinical trial with K = 2
treatments and incorporating covariates. The results were obtained using a simulation
with 5,000 replications.
3.5.2 K = 3 treatments
We now consider covariates for rules with K = 3 treatments. Thus, we will compare
the DLC and ORBD. Recall that when we introduced the RDBCD rule in Section
2.4.3, we did not define the g function for K > 2 treatments so we cannot use this
rule. We used a similar approach as before to obtain the allocation proportion and
its standard deviation as well as the failure proportion. The results can be seen in
Table 16. We no longer include the column for covariate information as this was the
same for both the rules. As before, the table only reports the allocation to
the first two treatments as the allocation proportion of the i = 3 treatment can be
obtained by subtraction.
We start the analysis by considering the rules when pi are equal. We notice that
the allocation is the same for both the rules, as expected. The standard deviation
of this allocation is lower for the DLC on both treatments for all pi . The failure
proportion is also very similar.
When pi are unequal, the ORBD assigns more patients than the DLC to the
best treatment. For the DLC the amount the allocation is skewed by is very small.
We notice a similar pattern as in the K = 2 case, that is the DLC maintains a
comparable level of treatment failures as the ORBD, despite a less ethical allocation.
Once again, the only difference of note is when the difference between pi is the largest,
that is pi = (0.8, 0.8, 0.2). We also notice that the DLC is much less variable than
the ORBD.
We now consider the power and significance level of the ORBD and DLC incorporating covariates and with K = 3 treatments. A simulation was run using
all parameters as before, and the results are shown in Table 17. As before, for the
power simulation we used the same n values as were used for the equivalent rules that are not
covariate-adaptive.
                       DLC                             ORBD
                       AP(SD) to i = 1, 2              AP(SD) to i = 1, 2
(p1,p2,p3)       n     i = 1       i = 2       FP      i = 1       i = 2       FP
(0.8,0.8,0.8)    100   0.33(0.04)  0.33(0.04)  0.32    0.33(0.13)  0.30(0.09)  0.32
(0.8,0.8,0.6)    100   0.35(0.04)  0.34(0.04)  0.37    0.37(0.14)  0.33(0.10)  0.37
(0.8,0.8,0.4)    100   0.36(0.04)  0.36(0.04)  0.42    0.41(0.14)  0.36(0.09)  0.40
(0.8,0.8,0.2)    100   0.37(0.04)  0.37(0.04)  0.45    0.42(0.14)  0.37(0.09)  0.42
(0.6,0.6,0.6)    100   0.34(0.04)  0.33(0.04)  0.49    0.33(0.14)  0.29(0.10)  0.49
(0.6,0.6,0.4)    100   0.35(0.04)  0.34(0.04)  0.54    0.38(0.14)  0.33(0.09)  0.54
(0.6,0.6,0.2)    100   0.35(0.04)  0.36(0.04)  0.59    0.40(0.14)  0.36(0.09)  0.57
(0.4,0.4,0.4)    100   0.33(0.04)  0.33(0.04)  0.66    0.33(0.13)  0.29(0.09)  0.66
(0.4,0.4,0.2)    100   0.34(0.03)  0.34(0.04)  0.71    0.38(0.13)  0.34(0.09)  0.71
(0.2,0.2,0.2)    100   0.33(0.03)  0.33(0.03)  0.83    0.33(0.10)  0.31(0.07)  0.83
Table 16: The allocation proportion and its standard deviation and the failure proportion for the DLC and ORBD with K = 3 treatments. We chose a = 0.7. This simulation
used 5,000 replications.
(p1,p2,p3)       n     DLC    ORBD
(0.8,0.8,0.8)    100   0.05   0.06
(0.6,0.6,0.6)    100   0.03   0.06
(0.4,0.4,0.4)    100   0.04   0.05
(0.2,0.2,0.2)    100   0.04   0.03
(0.8,0.8,0.6)    206   0.66   0.60
(0.8,0.8,0.4)    62    0.70   0.59
(0.8,0.8,0.2)    27    0.65   0.61
(0.6,0.6,0.4)    256   0.75   0.67
(0.6,0.6,0.2)    57    0.74   0.63
(0.4,0.4,0.2)    217   0.81   0.71
Table 17: Simulated power of various RAR designs for a clinical trial with K = 3
treatments and incorporating covariates. The results were obtained using a simulation
with 5,000 replications.
We start by noticing that there is no meaningful difference in the significance
level between the designs. When comparing the power, we can say that the DLC
maintains higher power than the ORBD.
We conclude that under covariate-adaptiveness, the ORBD has the most ethical
allocation proportion which results in the smallest failure proportion. However, the
trade-off is that this rule is the most variable and has the lowest power. The DLC
and RDBCD have less ethical allocations, but are less variable and have higher
power. Out of these two rules, the DLC is much less variable although the two rules
have similar power.
Conclusion
The initial intention of this dissertation was to define some well-studied RAR
designs, explore the extensions of these RAR designs to practical settings and then
investigate the behaviour of these designs under those practical settings. We
have achieved this by considering the extensions of a number of rules to multiple
treatments, delayed responses and covariate-adaptiveness.
Throughout this investigation, it has been seen that the performance of a RAR
scheme depends significantly on a number of factors such as (i) the target allocation,
(ii) the values of p_i, (iii) the delay in response and (iv) covariates. This means that there is no
universal design that is superior to the others in all aspects. We have seen that for
the K = 2 treatment design with no delay and not incorporating covariates, the
rules that are able to target any given allocation proportion, i.e. the GDL, DBCD and
ERADE, perform the best in terms of maintaining high power and having a less
variable allocation proportion. This can be seen particularly in the case when these
rules target RSIHR allocation. However, the ORBD performs the best in terms of
the most ethical allocation. We also found that rules targeting urn allocation are
less variable when p_i is low whilst rules targeting RSIHR allocation are less variable
for high p_i, meaning that a suitable target allocation should also be chosen depending
on the approximate p_i levels expected. Similar results have been shown for the same
designs with K = 3 treatments.
We then investigated the behaviour of the rules under delayed responses. We
saw that in general the performance of the procedures is not significantly affected
by delayed responses, as long as the delay is moderate.
When RAR procedures incorporate covariates, the ORBD has the most favourable
allocation and failure proportion, but is highly variable. On the other hand, the DLC
is much less variable but skews the allocation only slightly. One limitation of the
comparison made here is that the rules were extended so that the success
probability varies across covariate levels. Although this is a more realistic setting in practice, it meant that we were unable to compare the covariate-adaptive
procedures to the same procedures without covariates. Thus, if this investigation
were to be repeated, keeping the success probability fixed across covariate levels could be a suitable alternative.
The subject of randomisation in clinical trials is a rapidly growing field of research and therefore many extensions to the work presented here could have been
considered. One rule that could have been covered here in more detail is the ERADE.
We saw that for K = 2 treatments it performed very well, so investigating the rule
under all criteria considered for the other rules (e.g. delayed responses, covariate-adaptiveness) would be a suitable extension. It would also be interesting to consider
all the rules incorporating both covariates and delayed responses, as this could be a common scenario in practice. If more time were available, responses that are not binary
could also have been considered.
References
Atkinson, A. C. and A. Biswas (2013). Randomised Response-Adaptive Designs
in Clinical Trials. Chapman & Hall/CRC Monographs on Statistics & Applied
Probability. Boca Raton: Taylor & Francis Group.
Baldi Antognini, A. and M. Zagoraiou (2012). Multi-objective optimal designs in
comparative clinical trials with covariates: The reinforced doubly adaptive biased
coin design. Ann. Statist. 40 (3), 1315-1345.
Bandyopadhyay, U. and A. Biswas (1999). Allocation by randomized play-the-winner
rule in the presence of prognostic factors. Sankhyā: The Indian Journal of
Statistics, Series B (1960-2002) 61 (3), 397-412.
Bandyopadhyay, U., A. Biswas, and R. Bhattacharya (2009). Drop-the-loser design
in the presence of covariates. Metrika 69 (1), 1-15.
Bartlett, R. H., D. W. Roloff, R. G. Cornell, A. F. Andrews, P. W. Dillon, and J. B.
Zwischenberger (1985). Extracorporeal circulation in neonatal respiratory failure:
A prospective randomized study. Pediatrics 76 (4), 479-487.
Biswas, A. (1999). Delayed response in randomized play-the-winner rule revisited.
Communications in Statistics - Simulation and Computation 28 (3), 715-731.
Eisele, J. R. (1994). The doubly adaptive biased coin design for sequential clinical
trials. Journal of Statistical Planning and Inference 38 (2), 249-261.
Eisele, J. R. and M. B. Woodroofe (1995). Central limit theorems for doubly adaptive
biased coin designs. Ann. Statist. 23 (1), 234-254.
Hu, F. and W. F. Rosenberger (2003). Optimality, variability, power. Journal of
the American Statistical Association 98, 671-678.
Hu, F., W. F. Rosenberger, and L.-X. Zhang (2006). Asymptotically best response-adaptive
randomization procedures. Journal of Statistical Planning and Inference 136 (6), 1911-1922.
Hu, F. and L.-X. Zhang (2004). Asymptotic properties of doubly adaptive biased
coin designs for multitreatment clinical trials. Ann. Statist. 32 (1), 268-301.
Hu, F., L.-X. Zhang, and X. He (2009). Efficient randomized-adaptive designs. Ann.
Statist. 37 (5A), 2543-2560.
Ivanova, A. (2003). A play-the-winner-type urn design with reduced variability.
Metrika 58 (1), 1-13.
Ivanova, A. and C. Flournoy (2001). A birth and death urn for ternary outcomes:
Stochastic processes applied to urn models. In C. A. Charalambides, M. V.
Koutras, and N. Balakrishnan (Eds.), Probability and Statistical Models with Applications, pp. 583-600. Boca Raton: Chapman and Hall/CRC Press.
Matthews, P. C. and W. F. Rosenberger (1997). Variance in randomized play-the-winner
clinical trials. Statistics & Probability Letters 35 (3), 233-240.
Note
The code given in the sections below performs a simple execution of all rules considered throughout this dissertation. Each rule was defined in such a way that it allows
any number of treatments, apart from RPW, RDBCD and ERADE which were not
extended to K > 2 treatments. The designs given here also do not allow delayed
responses, but a description of how to extend the procedures is given in Section 3.4.
The ORBD given here allows covariate adaptiveness but can be altered to disregard
this, as was outlined in Section 2.3.1.
A.1 RPW
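As an illustration of the mechanics of this rule, a minimal R sketch of an RPW(u, beta) urn for K = 2 might look as follows. It is not the dissertation's own listing: the function name rpw, the arguments u and beta, and the named return value are assumptions chosen to mirror the AP and FP output of the other functions.

```r
# Sketch of an RPW(u, beta) urn for K = 2 treatments (illustrative only).
rpw <- function(n, p, u = 1, beta = 1) {
  urn <- c(u, u)          # number of balls of type 1 and type 2
  allocated <- integer(n) # treatment assigned to each patient
  f <- c(0, 0)            # failures per treatment
  for (i in seq_len(n)) {
    allocated[i] <- sample(1:2, 1, prob = urn)  # draw a ball with replacement
    k <- allocated[i]
    if (runif(1) < p[k]) {
      urn[k] <- urn[k] + beta          # success: add balls of the same type
    } else {
      f[k] <- f[k] + 1                 # update failures
      urn[3 - k] <- urn[3 - k] + beta  # failure: add balls of the other type
    }
  }
  c(AP = mean(allocated == 1), FP = sum(f) / n)  # return AP and FP
}

set.seed(1)
rowMeans(replicate(500, rpw(100, c(0.8, 0.2))))  # average AP and FP over 500 trials
```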
A.2 DL
A.3 GDL
}
}else{
z[allocated[i]]<-z[allocated[i]]-1 #draw a ball
}
}
#Response
t_choice<-runif(1)
if (t_choice < p[allocated[i]-1]){ #success
s[allocated[i]-1]<-s[allocated[i]-1]+1 #update successes
z[allocated[i]] <- z[allocated[i]] + D #update urn composition
#if target allocation is rsihr D=0, if urn allocation D=1
}else{ #failure
f[allocated[i]-1]<-f[allocated[i]-1]+1 #update failures
}
}
return(append(sum(allocated==2)/n,sum(f)/n)) #return AP and FP
}
A.4 DLC
if (t_choice<((a^(u_lvl[i]-1))*(p[allocated[i]-1]))){
#success, using a parameter
s[allocated[i]-1]<-s[allocated[i]-1] + 1 #update success
if (replace_dec < pi_est[u_lvl[i]]){
#replace ball with corresponding probability pi
z[allocated[i]] <- z[allocated[i]] +1
}
}else{ #failure
f[allocated[i]-1]<-f[allocated[i]-1] + 1 #update failure
#replace ball with corresponding probability 1-pi(G-j)
if (replace_dec < (1-pi_est[(G+2)-u_lvl[i]])){
z[allocated[i]] <- z[allocated[i]] +1
}
}
}
}
}
return(append(sum(allocated==2)/n,sum(f)/n)) #return AP and FP
}
A.5 ORBD
}
t_outcome<-runif(1); #random number to decide outcome
if(t_outcome<p[allocated[i]]){ #success
s[allocated[i]]<-s[allocated[i]]+1 #update successes
}else{ #failure
f[allocated[i]]<-f[allocated[i]]+1 #update failures
}
#update variables for the logistic model
tmnt<-append(tmnt,toString(allocated[i]))
response<-append(response, (s[allocated[i]]/
(s[allocated[i]]+f[allocated[i]])))
cov_tbl<-append(cov_tbl, u[i])
}
return(append(sum(allocated==1)/n,sum(f)/n)) #return AP and FP
}
A.6 DBCD
for(k in 1:t){
if(fair_coin>sum(bounds[1:k]) && fair_coin<sum(bounds[1:(k+1)])){
#assign tmnt based on probabilities above
allocated[i]<-k
}
}
response_dec <- runif(1) #random number to obtain response
if(response_dec < p[allocated[i]]){ #success
s[allocated[i]]<-s[allocated[i]] + 1 #update number of successes
}else{ #failure
f[allocated[i]]<-f[allocated[i]] + 1 #update number of failures
}
}
return(append(sum(allocated==1)/n,sum(f)/n)) #return AP and FP
}
A.7 RDBCD
A.8 ERADE