
Computer Methods and Programs in Biomedicine 70 (2003) 21–35

www.elsevier.com/locate/cmpb

Assessment of average, population and individual bioequivalence in two- and four-period crossover studies
Herman P. Wijnand *
Sibeliuspark 754, 5343 BS Oss, The Netherlands
Received 3 July 2001; received in revised form 18 October 2001; accepted 25 January 2002

Abstract

The aim of bioequivalence studies is to assess the equivalence of two pharmaceutical formulations of the same
active drug substance. Currently three types of bioequivalence are distinguished: average, population and individual
bioequivalence. Average and population bioequivalence can be assessed in two-period (non-replicated) crossover
studies, whereas individual bioequivalence requires three- or four-period replicated studies, with a preference for
four-period studies. The PC-program BIOEQV80 is presented for the statistical analysis of average and population
bioequivalence from two-period crossover studies. The program BIOEQ2X2 is presented for the statistical analysis of
all three types of bioequivalence from four-period replicated crossover studies. The statistical aspects of population
and individual bioequivalence are based on a recent Guidance issued by the US Food and Drug Administration.
© 2002 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Average bioequivalence; Population bioequivalence; Individual bioequivalence; Two-period crossovers; Bootstrapping;
Four-period crossovers

* Tel./fax: +31-412-626693.
E-mail address: dr.wijnand@planet.nl (H.P. Wijnand).

0169-2607/02/$ - see front matter © 2002 Elsevier Science Ireland Ltd. All rights reserved.
PII: S0169-2607(02)00019-6

1. Introduction

In bioequivalence studies a new pharmaceutical formulation (the Test, further abbreviated as T) of an
active drug substance is compared with a standard formulation (the Reference, further referred to as R).
If a 90%-confidence interval for the median T/R ratio of a parameter of interest lies fully within the
predetermined bioequivalence range (usually 80–125%), the formulations are concluded to be bioequivalent
with respect to that parameter. Since the decision depends on a comparison of average values, the result
is termed average bioequivalence (ABE). In the early 1990s the regulatory requirements for ABE were
formalised, starting in the USA [1] and the European Union [2].

During the last decade two new concepts, population bioequivalence (PBE) and individual bioequivalence
(IBE), have given rise to a lively discussion in the scientific community. This resulted in numerous
publications on study designs and statistical models. In 1997 the US Food and Drug Administration (FDA)
issued the first Draft Guidance on this matter [3], mainly based on a
publication by Schall and Luus [4]. A second Draft Guidance [5] followed in 1999. The final FDA General
Considerations [6] for orally administered drug products appeared in October 2000, including criteria
for the selection of ABE, PBE or IBE. It was followed by a separate Guidance on statistical approaches
to replicate studies [7] in January 2001.

Both ABE and PBE can be assessed by two-treatment two-period crossover studies, for which the PC-program
BIOEQV80 is presented. All three types of bioequivalence can be assessed by four-period replicate
crossover studies, for which the PC-program BIOEQ2X2 is presented.

2. Relevance for pharmaceutical companies

Replicate study designs, such as four-period studies in which both pharmaceutical formulations are
administered twice to the same subject, are recommended [6] for modified-release dosage forms and for
highly variable drug products.

PBE is important for pharmaceutical companies when developing new drug substances. This is because, in
popular terms, as long as a new drug substance has not been marketed, bioequivalence studies on
different pharmaceutical formulations of the same active drug substance should satisfy the requirements
of PBE. In such cases it is not necessary to satisfy the requirements of ABE based on 90%-confidence
intervals. The much weaker requirement [6,7] is that the point estimate of the geometric mean T/R ratio
falls within 80–125%. The additional requirement of PBE should of course be satisfied. It is interesting
to note that in such cases two-period studies may be sufficient.

IBE is important as soon as the active drug substance has already been marketed. In this situation the
same weaker requirements of ABE hold [6,7] and the additional requirement of IBE should be satisfied.
This implies the use of study designs with more than two periods, preferably four.

The selection of the bioequivalence criterion (ABE, PBE or IBE) should be documented in the study
protocol [6].

3. Theory for two-period bioequivalence studies

3.1. Average bioequivalence from two-period studies

All theoretical aspects of statistical models for the assessment of ABE from two-period crossover
studies have been published earlier in this Journal in a series of papers, the last of which [8]
appeared in 1994.

3.2. Population bioequivalence from two-period studies

In the two Draft Guidances [3,5] and in the final Guidance [7] issued by the US FDA, the statistical
criteria for PBE have remained essentially the same. In all equations to be given in this paper, Greek
letters denote population statistics and roman letters denote sample statistics. For log-transformed
observations (e.g. the Area-Under-the-plasma-concentration-Curve (AUC), or the plasma peak
concentration, Cmax), two criteria are distinguished.

Reference-scaled:

$$\theta_1 = \frac{(\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2)}{\sigma_{TR}^2} \le \theta_P \tag{1}$$

Constant-scaled:

$$\theta_1 = \frac{(\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2)}{\sigma_{T0}^2} \le \theta_P \tag{2}$$

where $\mu_T$ is the population average response of the log-transformed measure for T; $\mu_R$, the
population average response of the log-transformed measure for R; $\sigma_{TT}^2$, the total variance
(sum of within- and between-subject variances) of T; $\sigma_{TR}^2$, the total variance (sum of within-
and between-subject variances) of R; $\sigma_{T0}^2$, the specified constant total variance; and
$\theta_P$ is the PBE limit.

In Eq. (1), reference-scaling means that the criterion used is scaled to the variability of the
reference product, which effectively widens the bioequivalence limit for more variable reference
products. This could unnecessarily narrow the bioequivalence limit for drugs or drug products with low
variability but wide therapeutic range.
Therefore, a form of constant scaling has been introduced in Eq. (2), allowing larger values for the
reference variance.

The determination of both $\theta_P$ and $\sigma_{T0}^2$ is based on the consideration of the ABE
criterion and the addition of variance terms to the PBE criterion:

$$\theta_P = \frac{\text{ABE limit} + \text{variance factor}}{\text{variance}} = \frac{(\ln 1.25)^2 + \varepsilon_P}{\sigma_{T0}^2} \tag{3}$$

The US FDA recommends [7] contacting them for further information on $\varepsilon_P$ and
$\sigma_{T0}^2$. In the appendices to the Draft Guidances [3,5] and the final Guidance [7], values of
0.02 for $\varepsilon_P$ and 0.04 for $\sigma_{T0}^2$ have been used in a number of numerical examples.

In actual practice the population parameters $\mu_T$, $\mu_R$, $\sigma_{TT}$ and $\sigma_{TR}$ are
unknown. They are estimated by the sample statistics $M_T$, $M_R$, $s_T$ and $s_R$, respectively. Let
$n_1$ denote the number of subjects with sequence RT and $n_2$ the number of subjects with sequence TR,
and let $n = n_1 + n_2$. Then $M_T$ and $M_R$, adjusted for unequal group sizes $n_1$ and $n_2$, are
unbiased estimates of $\mu_T$ and $\mu_R$, respectively. The sample statistics $s_T^2$ and $s_R^2$,
pooled across sequences and each, therefore, known with $n_1 + n_2 - 2$ degrees of freedom, are
unbiased estimates of $\sigma_{TT}^2$ and $\sigma_{TR}^2$, respectively.

As stated explicitly by Schall and Luus [4], an unbiased estimate of the two identical numerators in
Eqs. (1) and (2) is $(M_T - M_R)^2 - s_d^2/n + s_T^2 - s_R^2$, so that:

$$E\left\{(M_T - M_R)^2 - \frac{s_d^2}{n} + s_T^2 - s_R^2\right\} = (\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2) \tag{4}$$

where $s_d^2$ is the sample variance of the intra-subject differences. The criteria set by Eqs. (1) and
(2) now transfer into:

Reference-scaled:

$$\theta_1 = \frac{(M_T - M_R)^2 - s_d^2/n + s_T^2 - s_R^2}{s_R^2} \le \theta_P \tag{5}$$

Constant-scaled:

$$\theta_1 = \frac{(M_T - M_R)^2 - s_d^2/n + s_T^2 - s_R^2}{\sigma_{T0}^2} \le \theta_P \tag{6}$$

3.3. Computational methods for PBE from two-period studies

In FDA's first Draft Guidance [3], as well as in Schall and Luus' publication [4], the method of
bootstrapping has been recommended for evaluating the criteria for PBE. In FDA's second Draft Guidance
[5] and in the final Guidance [7] calculation schemes based on the method of moments were given for a
four-period replicated crossover design, but not for a two-period non-replicated crossover design. This
leaves bootstrapping as the validated method for evaluating the criteria for PBE. A lucid introduction
to bootstrapping was given by Efron and Tibshirani [9].

Eqs. (5) and (6) are scaled with respect to the variance of the reference formulation, whereas Schall
and Luus' original equation for PBE [4] is an unscaled criterion. These authors, however, clearly
discussed the possibility of extending their criterion to scaling with respect to the variance of the
reference. Schall also explored this in a later publication on scaling [10]. The procedures of
bootstrapping specified by the FDA [3] and those given by Schall and Luus [4] are essentially equal.

1. Generate a bootstrap sample (often also called bootstrap replicate) by sampling with replacement
   $n_1$ pairs of observations (e.g. AUC or Cmax) from sequence RT and $n_2$ pairs of observations from
   sequence TR. This means that for each subject the intra-subject pair of observations remains
   unchanged. Furthermore, a subject may occur more than once, or may not occur at all, in a bootstrap
   replicate, due to sampling with replacement.
2. Calculate the estimate of $\theta_1$ in Eqs. (5) and (6), where the denominator $s_R^2$ or
   $\sigma_{T0}^2$ should keep its value from the original non-bootstrapped observations or the
   specified constant variance, respectively.
3. Repeat steps 1–2 a large number of times (some thousands of replicates are usually adequate). Let B
   denote the total number of bootstrap replicates.
4. Sort the B estimates of $\theta_1$ in ascending order. The $100(1 - \alpha)$ percentile of the B ordered
   values of $\theta_1$ is the upper bound of a one-sided $100(1 - \alpha)$ per cent confidence interval
   for $\theta_1$. Usually $\alpha$ is chosen as 0.05, which results in a one-sided 95%-confidence
   interval.

3.4. The influence of period effects

The statistical distribution function of log-transformed characteristics is multiplicative in the
untransformed domain. A period effect in the untransformed domain originates when all values of one
period are multiplied by the same factor. Their log-transforms then become the logarithms of the
original values plus or minus the logarithm of a constant. If a cell is defined as a set of values in a
unique combination of sequence and period, the variance of the cell values will not change by adding a
constant to each value. Since the sample variances $s_T^2$ and $s_R^2$ are pooled across sequences, they
will not change either. Similarly, $s_d^2/n$ will not change.

The sample means $M_T$ and $M_R$ do change in the logarithmic domain. They are, however, obtained from
two pairs of cells. Within each pair of cells, one cell mean changes by the addition of the same
constant, whereas the other cell mean remains unchanged. Therefore, the difference of the sample means,
$M_T - M_R$, remains unchanged.

It follows that, by pooling across sequences, the denominator and all terms in the numerator of Eqs. (5)
and (6) remain unaffected by fixed period effects, so that the criteria for PBE are not sensitive to
such effects.

This can easily be verified numerically. Table 1 shows the AUC-values of theophylline used as an example
in papers referenced earlier [8]. The analysis of variance (ANOVA) on log-transforms given in Table 2
shows the absence of a significant period effect. In ten data sets significant period effects were
introduced by multiplying all AUC-values of the second period by 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 1.3,
1.4 and 1.5. In this way the Period 2/Period 1 ratio, which originally was 1.07, ranged from 0.54 to
1.61 in the ten simulated data sets, with probabilities as low as < 0.00001. In the logarithmic domain
the sample mean $M_T$ ranged from 5.08399 to 5.63329 and the sample mean $M_R$ ranged from 5.08209 to
5.63139, but their difference was 0.00190 for all data sets. Similarly, the CV remained 13.75%, and the
variances $s_T^2$ and $s_R^2$ remained 0.036253 and 0.051972, respectively.

Each data set was bootstrapped in ten runs, each with 8000 bootstrap replicates. The resulting mean
$\theta_1(0.95)$ from 80 000 bootstrap replicates was approximately 0.37 for all ten data sets (plus the
original data set). No statistically significant differences between data sets were found, given the
variability between runs within data sets.

Table 1
Theophylline AUC(0-inf) data from a two-period crossover study [8]

Subject   Reference    Test         Ratio
          (Period 1)   (Period 2)
 2        339.03       329.76       0.97266
 4        242.64       258.19       1.06409
 5        249.94       201.56       0.80643
 8        184.32       249.64       1.35438
12        209.30       231.98       1.10836
13        207.40       234.19       1.12917
15        239.84       241.25       1.00588
16        211.24       255.60       1.21000
18        230.36       256.55       1.11369
          (Period 2)   (Period 1)
 1        288.79       228.04       0.78964
 3        343.37       288.21       0.83936
 6        225.77       217.97       0.96545
 7        235.89       133.13       0.56437
 9        215.14       213.78       0.99368
10        245.48       248.98       1.01426
11        134.89       163.93       1.21529
14        223.39       245.92       1.10086
17        169.70       188.05       1.10813

3.5. The influence of group ties

In small sequence groups an interesting type of tie can be found. For example, the probability that
sampling with replacement from a sequence group of four subjects generates a bootstrap replicate of four
identical subjects is $(1/4)^{4-1} = 0.0156$. In general, for sequence groups with $n_i$ subjects the
probability that a group replicate will consist of $n_i$ identical subjects is $(1/n_i)^{n_i - 1}$.
Since bootstrapping with the aim of obtaining confidence
intervals commonly employs some thousands of bootstrap replicates [9], the occurrence of bootstrapped
sequence groups containing only one re-sampled subject is almost inevitable when handling small groups.
In such cases all inter-subject differences and variances vanish and the cell mean reduces to the
re-sampled subject's Test or Reference value. Such replicates of a sequence group are considered as
biased and non-informative. In the program to be discussed in Section 5.1 they are rejected and sampling
with replacement is restarted.

Fortunately, such unwanted group ties can only occur when handling small sequence groups. Current
standards of bioequivalence testing with two-period crossover studies require a minimum number of 12
evaluable subjects [7]. With two groups each of six subjects the probability of a group tie to occur is
$2(1/6)^5 = 0.00026$, so that only one group tie in 3900 bootstrap replicates is expected. With two
groups each of seven subjects, one group tie in no less than 59 000 bootstrap replicates is expected.

Obviously such small numbers of rejected and re-sampled sequence groups cannot seriously bias the final
results of bootstrapping. They might, however, be of interest when re-analysing older studies with fewer
than six subjects per group. It may add to the completeness of information that the number of group ties
be stated in the results of each bootstrap run.

Table 2
ANOVA on ln-transforms of the theophylline AUC(0-inf) data given in Table 1, with 90%- and
95%-confidence intervals for the median treatment and period ratios, as obtained by the program BIOEQV80

Arithmetic means of ln-transforms:
  Treatment number 1 (Reference)   +5.42866E+00
  Treatment number 2 (Test)        +5.43056E+00

Geometric means of untransformed values:
  Treatment number 1 (Reference)   +2.27844E+02
  Treatment number 2 (Test)        +2.28277E+02   100.19% of reference mean
  Period 1                         +2.20167E+02
  Period 2                         +2.36237E+02   107.30% of period 1 mean

Variable source   DF   Sum of squares   Mean square      F-value   Prob(F)
Subjects          17   +1.208092E+00    +7.106423E-02    3.7915    0.00530
  Subject/Gr.     16   +1.111719E+00    +6.948243E-02    3.7071    0.00626
  Groups           1   +9.637288E-02    +9.637288E-02    1.3870    0.25613
Periods            1   +4.466680E-02    +4.466680E-02    2.3831    0.14220
Treatments         1   +3.247940E-05    +3.247940E-05    0.0017    >0.30
Residue           16   +2.998917E-01    +1.874323E-02
Total             35   +1.552683E+00

Residual variation coefficient: 13.75% of untransformed reference mean

                          Test-to-Reference ratio   Period 2-to-Period 1 ratio
Point estimate            1.0019                    1.0730
90%-Confidence Interval   0.9252; 1.0850            0.9908; 1.1620
95%-Confidence Interval   0.9095; 1.1037            0.9740; 1.1820
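The group-tie figures quoted in Section 3.5 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and, like the text, it treats the chance of a tie in either of two equal-sized groups as twice the single-group probability.

```python
def group_tie_probability(n):
    """Probability that n draws with replacement from n subjects
    all select the same subject: (1/n)**(n - 1)."""
    return (1.0 / n) ** (n - 1)


def replicates_per_tie(n, groups=2):
    """Approximate number of bootstrap replicates per expected group tie,
    assuming `groups` sequence groups of n subjects each (hypothetical helper)."""
    return 1.0 / (groups * group_tie_probability(n))


if __name__ == "__main__":
    for n in (4, 6, 7):
        print(n, group_tie_probability(n), replicates_per_tie(n))
```

For six subjects per group this gives roughly one tie per 3900 replicates, and for seven subjects roughly one per 59 000, in line with the values stated in the text.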
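The bootstrap procedure of Section 3.3, with the rejection of group ties from Section 3.5, can be sketched as follows for the reference-scaled criterion of Eq. (5), using the theophylline data of Table 1. This is a minimal illustration and not the BIOEQV80 implementation: all function names are hypothetical, the variance of the intra-subject differences is assumed to be pooled within sequences, and the percentile rule is one simple reading of step 4.

```python
import math
import random

# Table 1 (theophylline AUC): (Reference, Test) per subject, by sequence group.
SEQ_RT = [(339.03, 329.76), (242.64, 258.19), (249.94, 201.56),
          (184.32, 249.64), (209.30, 231.98), (207.40, 234.19),
          (239.84, 241.25), (211.24, 255.60), (230.36, 256.55)]
SEQ_TR = [(288.79, 228.04), (343.37, 288.21), (225.77, 217.97),
          (235.89, 133.13), (215.14, 213.78), (245.48, 248.98),
          (134.89, 163.93), (223.39, 245.92), (169.70, 188.05)]


def ln_pairs(group):
    """Log-transform every (Reference, Test) pair."""
    return [(math.log(r), math.log(t)) for r, t in group]


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def sample_stats(g1, g2):
    """Return (M_T - M_R, s2_T, s2_R, s2_d) from two lists of (lnR, lnT)."""
    df = len(g1) + len(g2) - 2
    ref = [[r for r, _ in g] for g in (g1, g2)]
    tst = [[t for _, t in g] for g in (g1, g2)]
    dif = [[t - r for r, t in g] for g in (g1, g2)]
    # Means adjusted for unequal group sizes: average of the sequence means.
    m_t = mean([mean(x) for x in tst])
    m_r = mean([mean(x) for x in ref])

    def pooled(cells):
        # Assumption: every variance is pooled within sequences (n1 + n2 - 2 DF).
        return sum(sum_sq(x) for x in cells) / df

    return m_t - m_r, pooled(tst), pooled(ref), pooled(dif)


def theta1(stats, n, denom):
    """Numerator of Eqs. (5)/(6) divided by the chosen scaling variance."""
    d, s2t, s2r, s2d = stats
    return (d * d - s2d / n + s2t - s2r) / denom


def resample(group, rng):
    """Sample subjects with replacement; reject and redraw group ties."""
    n = len(group)
    while True:
        idx = [rng.randrange(n) for _ in range(n)]
        if len(set(idx)) > 1:
            return [group[i] for i in idx]


def theta_p(eps_p=0.02, sigma2_t0=0.04):
    """PBE limit of Eq. (3), with the values used in the FDA examples."""
    return (math.log(1.25) ** 2 + eps_p) / sigma2_t0


def bootstrap_upper_bound(g1, g2, b=2000, alpha=0.05, seed=1):
    """One-sided 100(1 - alpha)% upper bound of the reference-scaled theta1."""
    rng = random.Random(seed)
    n = len(g1) + len(g2)
    denom = sample_stats(g1, g2)[2]  # s2_R of the original data, kept fixed
    thetas = sorted(
        theta1(sample_stats(resample(g1, rng), resample(g2, rng)), n, denom)
        for _ in range(b)
    )
    return thetas[math.ceil((1 - alpha) * b) - 1]


if __name__ == "__main__":
    g1, g2 = ln_pairs(SEQ_RT), ln_pairs(SEQ_TR)
    print("theta1(0.95) =", round(bootstrap_upper_bound(g1, g2), 4))
    print("critical theta_P =", round(theta_p(), 4))
```

PBE would be concluded when the bootstrap upper bound does not exceed the critical value from Eq. (3). The exact bound depends on the seed and on the number of replicates, so repeat runs are needed to judge its precision.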

4. Theory for four-period studies

4.1. General

In four-period two-treatment crossover studies with replication of the formulations T and R, the six
possible sequence groups are RTTR, TRRT, TRTR, RTRT, RRTT and TTRR. For the sake of completeness, the
program to be discussed in Section 5.2 allows the inclusion of all possible sequences, consecutively
numbered 1 to 6, with group sizes $n_1$ to $n_6$ and total study size $N = \sum n_i$, with $N \le 300$.
Within each participating subject no missing values are allowed. The number of non-empty groups is
denoted as $s$. It is obvious that the analysis requires that $2 \le s \le 6$.

The US FDA [7] recommends studies with only two sequence groups, TRTR and RTRT. In this case,
$N = n_3 + n_4$ and $s = 2$.

The analyses of the three types of bioequivalence (on ln-transforms) have in common that the formulation
means, $M_T$ and $M_R$, are adjusted for unequal sequence group sizes by dividing the sum of the
sequence group means of either formulation by the number of non-empty groups, $s$. This results in
unbiased estimates of $\mu_T$ and $\mu_R$. It should be noted that calculation schemes for statistics
other than $M_T$ and $M_R$ in the three types of bioequivalence analysis are fully independent of one
another.

4.2. Average bioequivalence from four-period studies

The assessment of ABE consists of performing an ANOVA on the ln-transformed characteristics (e.g. AUC,
Cmax), followed by computing the point estimate of the geometric mean T/R ratio and the 90%-confidence
interval of this ratio.

In the ANOVA the total variance (with DF = degrees of freedom = $4N - 1$) is split up into the variance
between subjects (DF = $N - 1$) and the variance within subjects (DF = $3N$).

The variance between subjects distinguishes the variance between groups (DF = $s - 1$), tested in the
F-test against the variance within groups (DF = $N - s$).

The variance within subjects distinguishes the variance between formulations (DF = 1), the variance
between periods (DF = 3), and the Subject-by-Formulation interaction variance across sequences
(DF = $N - s$), all tested in the F-test against the residual variance (DF = $2N + s - 4$).

The point estimate of the difference of the mean Test and the mean Reference, as well as the
90%-confidence interval of this difference, are calculated in the ln-logarithmic domain. Back
transformation by exponentiation results in the geometric mean T/R ratio and its 90%-confidence
interval.

4.3. Population bioequivalence from four-period studies

In the two Draft Guidances [3,5] and in the final Guidance [7] issued by the US FDA, the statistical
criteria for PBE from four-period studies have remained essentially the same. For population statistics
from four-period studies the equations are the same as Eqs. (1) and (2) given in Section 3.2 for
two-period studies. The same holds true for the numerical values of the specified constant total
variance $\sigma_{T0}^2$ and the PBE limit $\theta_P$. For four-period studies, Eqs. (1) and (2) are
rearranged to give the following linearised criteria:

Reference-scaled:

$$\eta_1 = (\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2) - \theta_P \sigma_{TR}^2 \le 0 \tag{7}$$

Constant-scaled:

$$\eta_2 = (\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2) - \theta_P \sigma_{T0}^2 \le 0 \tag{8}$$

In actual practice the population parameters $\mu_T$, $\mu_R$, $\sigma_{TT}^2$ and $\sigma_{TR}^2$ are
unknown. They are estimated by the sample statistics $M_T$, $M_R$, $s_{TT}^2$ and $s_{TR}^2$,
respectively. In addition to the sample statistics $s_{TT}^2$ and $s_{TR}^2$, the following sample
variances are required for four-period studies [7]: $s_{WT}^2$ and $s_{WR}^2$ are the within-subject
variances of T and R, respectively, and $s_{BT}^2$ and $s_{BR}^2$ are the between-subject variances of T
and R, respectively. Each of these variances is obtained by pooling across sequence groups and is,
therefore, known with $N - s$ degrees of freedom.

Since $s_{TT}^2 = s_{BT}^2 + 0.5 s_{WT}^2$ [7] and similarly $s_{TR}^2 = s_{BR}^2 + 0.5 s_{WR}^2$, the
linearised criteria of Eqs. (7) and (8) are estimated by:

Reference-scaled:

$$\mathrm{eta}_1 = (M_T - M_R)^2 + s_{BT}^2 + 0.5 s_{WT}^2 - (1 + \theta_P) s_{BR}^2 - 0.5 (1 + \theta_P) s_{WR}^2 \le 0 \tag{9}$$

Constant-scaled:

$$\mathrm{eta}_2 = (M_T - M_R)^2 + s_{BT}^2 + 0.5 s_{WT}^2 - s_{BR}^2 - 0.5 s_{WR}^2 - \theta_P \sigma_{T0}^2 \le 0 \tag{10}$$

As specified in Appendix F of FDA's Guidance [7], the calculation of the 95% upper confidence bound for
eta1 or eta2 starts with calculating the 95% upper confidence bound $H_i$ for each of the five terms in
the expression of eta1 in Eq. (9) or for each of the first five terms in the expression of eta2 in
Eq. (10). The central t-distribution is used for the term $(M_T - M_R)^2$ and the
$\chi^2$-distribution is used for the variance terms. Each term itself is taken as point estimate $E_i$.
Next, for each term the value of $U_i = (H_i - E_i)^2$ is calculated. Ultimately, the 95% upper
confidence bounds, designated H(eta1) and H(eta2) for PBE, are obtained as follows:

Reference-scaled:

$$H(\mathrm{eta}_1) = \sum E_i + \left(\sum U_i\right)^{1/2} \tag{11}$$

Constant-scaled:

$$H(\mathrm{eta}_2) = \sum E_i + \left(\sum U_i\right)^{1/2} - \theta_P \sigma_{T0}^2 \tag{12}$$

Using the mixed-scaling approach [7], the selection of either the reference-scaled or the
constant-scaled approach depends on the study estimate of the total variance of the reference, estimated
by $s_{BR}^2 + 0.5 s_{WR}^2$ in the four-period design. If this variance is $\le \sigma_{T0}^2$ (in many
examples of the FDA a value of 0.04 has been used), the constant-scaled criterion and its confidence
bound H(eta2) should be computed. Otherwise the reference-scaled criterion and its confidence bound
H(eta1) should be computed. If the upper bound for the appropriate criterion is $\le 0$, PBE is
concluded. Otherwise PBE is not concluded.

Table 3
Results of one bootstrapping run for the assessment of PBE of the theophylline AUC(0-inf) data given in
Table 1, as obtained by the program BIOEQV80

Calculations are Reference-scaled (ln-metric)                 Run number 1
Variance of original Ref. formulation (ln-metric) (DF = 16)   5.19723E-02
Variance of original Test formulation (ln-metric) (DF = 16)   3.62534E-02
Test-to-Reference variance ratio (ln-metric)                  0.6976
Variance factor = Epsilon (for critical Theta)                0.020
Constant total variance (for critical Theta)                  0.040
Theta1(Min.)                                                  -1.3623
Theta1(0.025)                                                 -0.9213
Theta1(0.050)                                                 -0.8330
Theta1(0.500)                                                 -0.2897
Theta1(0.950)                                                 0.3705 (critical value = 1.7448)
Theta1(0.975)                                                 0.5199
Theta1(Max.)                                                  1.4254

Total number of 8000 bootstrap samples successfully processed (plus 0 rejected sequence groups
containing only one repeated subject).

4.4. Individual bioequivalence from four-period studies

In Appendix G of FDA's Guidance on replicate models and analysis [7] the statistical criteria for IBE
have been specified, based on a publication by Hyslop et al. [11]. For ln-transformed observations (e.g.
AUC, Cmax) two criteria are distinguished, based on a reparametrisation. This reparametrisation makes
the analysis for IBE fully independent of the analyses for ABE and PBE described in the previous
sections.

Reference-scaled:

$$\frac{(\mu_T - \mu_R)^2 + \sigma_{S \times F}^2 + (\sigma_{WT}^2 - \sigma_{WR}^2)}{\sigma_{WR}^2} \le \theta_I \tag{13}$$
Constant-scaled:

$$\frac{(\mu_T - \mu_R)^2 + \sigma_{S \times F}^2 + (\sigma_{WT}^2 - \sigma_{WR}^2)}{\sigma_{W0}^2} \le \theta_I \tag{14}$$

where $\mu_T$ is the population average response of the ln-transformed measure for T; $\mu_R$, the
population average response of the ln-transformed measure for R; $\sigma_{S \times F}^2$, the
subject-by-formulation interaction variance component; $\sigma_{WT}^2$, the within-subject variance of
T; $\sigma_{WR}^2$, the within-subject variance of R; $\sigma_{W0}^2$, the specified constant
within-subject variance; and $\theta_I$ is the IBE limit.

The determination of both $\theta_I$ and $\sigma_{W0}^2$ is based on the consideration of the ABE
criterion and the addition of variance terms to the IBE criterion:

$$\theta_I = \frac{\text{ABE limit} + \text{variance factor}}{\text{variance}} = \frac{(\ln 1.25)^2 + \varepsilon_I}{\sigma_{W0}^2} \tag{15}$$

The US FDA [7] recommends $\varepsilon_I = 0.05$ and $\sigma_{W0}^2 = 0.04$. These numerical values
result in $\theta_I = 2.495$.

Table 4
Erythromycin AUC(0-inf) data from a two-period crossover study [19]

Subject   Reference    Test         Ratio
          (Period 1)   (Period 2)
10         4.98        3.19         0.64056
11         7.14        9.83         1.37675
12         1.81        2.91         1.60773
13         7.34        4.58         0.62398
14         4.25        7.05         1.65882
15         6.66        3.41         0.51201
16         4.76        2.49         0.52311
17         7.16        6.18         0.86313
18         5.52        2.85         0.51630
          (Period 2)   (Period 1)
01         5.47        2.52         0.46069
02         4.84        8.87         1.83264
03         2.25        0.79         0.35111
04         1.82        1.68         0.92308
05         7.87        6.95         0.88310
06         3.25        1.05         0.32308
07        12.39        0.99         0.07990
08         4.77        5.60         1.17400
09         1.88        3.16         1.68085

Eqs. (13) and (14) are rearranged to give the following linearised criteria:

Reference-scaled:

$$\eta_1 = (\mu_T - \mu_R)^2 + \sigma_{S \times F}^2 + (\sigma_{WT}^2 - \sigma_{WR}^2) - \theta_I \sigma_{WR}^2 \le 0 \tag{16}$$

Constant-scaled:

$$\eta_2 = (\mu_T - \mu_R)^2 + \sigma_{S \times F}^2 + (\sigma_{WT}^2 - \sigma_{WR}^2) - \theta_I \sigma_{W0}^2 \le 0 \tag{17}$$

The criteria $\eta_1$ and $\eta_2$ are estimated by eta1 and eta2 as follows, where all variances are
pooled across sequence groups (the notation is similar to that for PBE).

Reference-scaled:

$$\mathrm{eta}_1 = (M_T - M_R)^2 + s_{S \times F}^2 + 0.5 s_{WT}^2 - (1.5 + \theta_I) s_{WR}^2 \le 0 \tag{18}$$

Constant-scaled:

$$\mathrm{eta}_2 = (M_T - M_R)^2 + s_{S \times F}^2 + 0.5 s_{WT}^2 - 1.5 s_{WR}^2 - \theta_I \sigma_{W0}^2 \le 0 \tag{19}$$

Similar to the calculations for PBE, the calculation of the 95% upper confidence bound for eta1 and eta2
of IBE starts with calculating the 95% upper confidence bound $H_i$ for each of the four terms in the
expression of eta1 (Eq. (18)) or for each of the first four terms in the expression of eta2 (Eq. (19)).
The central t-distribution is used for the term $(M_T - M_R)^2$ and the $\chi^2$-distribution is used
for the variance terms. Each term is taken as point estimate $E_i$. Next, for each term the value of
$U_i = (H_i - E_i)^2$ is calculated and the 95% upper confidence bounds H(eta1) and H(eta2) for IBE are
obtained as follows:

Reference-scaled:

$$H(\mathrm{eta}_1) = \sum E_i + \left(\sum U_i\right)^{1/2} \tag{20}$$
Constant-scaled:

$$H(\mathrm{eta}_2) = \sum E_i + \left(\sum U_i\right)^{1/2} - \theta_I \sigma_{W0}^2 \tag{21}$$

Using the mixed-scaling approach [7], the selection of either the reference-scaled or the
constant-scaled approach depends on the study estimate of the within-subject variance of the reference
product, estimated by $s_{WR}^2$ in the four-period design. If the study estimate $s_{WR}^2$ is smaller
than $\sigma_{W0}^2$ (usually taken as 0.04), the constant-scaled criterion and its associated
confidence bound should be computed. Otherwise, the reference-scaled criterion and its confidence bound
should be computed. If the upper confidence bound for the appropriate criterion is $\le 0$, IBE is
concluded. If the upper bound is $> 0$, IBE is not concluded.

Table 5
ANOVA on ln-transforms of the erythromycin AUC(0-inf) data given in Table 4, with 90%- and
95%-confidence intervals for the median treatment and period ratios, as obtained by the program BIOEQV80

Arithmetic means of ln-transforms:
  Treatment number 1 (Reference)   +1.52090E+00
  Treatment number 2 (Test)        +1.17982E+00

Geometric means of untransformed values:
  Treatment number 1 (Reference)   +4.57633E+00
  Treatment number 2 (Test)        +3.25380E+00   71.10% of reference mean
  Period 1                         +3.58443E+00
  Period 2                         +4.15420E+00   115.90% of period 1 mean

Variable source   DF   Sum of squares   Mean square      F-value   Prob(F)
Subjects          17   +9.648715E+00    +5.675715E-01    1.8368    0.11543
  Subject/Gr.     16   +8.343429E+00    +5.214643E-01    1.6876    0.15275
  Groups           1   +1.305286E+00    +1.305286E+00    2.5031    0.13319
Periods            1   +1.958572E-01    +1.958572E-01    0.6338    >0.30
Treatments         1   +1.046995E+00    +1.046995E+00    3.3883    0.08428
Residue           16   +4.943980E+00    +3.089988E-01
Total             35   +1.583555E+01

Residual variation coefficient: 13.75% of untransformed reference mean

                          Test-to-Reference ratio   Period 2-to-Period 1 ratio
Point estimate            0.7110                    1.1590
90%-Confidence Interval   0.5145; 0.9826            0.8386; 1.6016
95%-Confidence Interval   0.4800; 1.0531            0.7825; 1.7166

5. Description of programs

5.1. Program BIOEQV80 for two-period studies

The procedures outlined in Section 3 have been implemented in a new version of the PC-program BIOEQV36
published earlier [8] for the statistical analysis of two-treatment two-period crossover studies for the
assessment of ABE. The latter program has been updated several times, ultimately resulting in Version
7.0 (program BIOEQV70). This version included a statistical analysis published by Gould [12]. This
author claimed that, by using individual residues per sequence-by-period cell, all three types of
bioequivalence could be assessed from two-period crossover studies. It
Table 6
Results of ten bootstrap runs each with 8000 replicates for the assessment of PBE, as obtained by the program BIOEQV80, of the
erythromycin AUC(0-inf) data given in Table 4

Calculations are reference-scaled (ln-metric)

Total variance of reference formulation 3.13274E01


(ln-metric)
Variance factor Epsilon (for critical 0.020
Theta)
Constant total variance (for critical 0.040
Theta)
Number of bootstrap samples 8000 in each of
ten runs
Mean Standard Variation coefficient Variation coefficient of
deviation single single run (%) mean (%)
run

Theta1 (Min.) 0.8754 0.1187 13.56 4.29


Theta1 (0.025) 0.1138 0.0094 8.30 2.62
Theta1 (0.050) 0.0349 0.0115 32.80 10.37
Theta1 (0.500) 0.9291 0.0054 0.59 0.19
Theta1 (0.950) 1.9175 0.0064 0.33 0.11
Theta1 (0.975) 2.1213 0.0183 0.86 0.27
Theta1 (Max.) 3.8703 0.2896 7.48 2.37

may, however, be assumed from FDAs final ply out of use: Westlakes symmetrical intervals
Guidance [7], that Goulds approach has not [13], Mandallaz and Maus analysis [14], Lockes
found regulatory approval. The program version confidence intervals for untransformed character-
BIOEQV70 is still available for those interested in istics [15], Rodda and Davies probabilities [16],
exploring Goulds approach. Hauck and Andersons probabilities [17], and the
The new program BIOEQV80 is suitable for the two one-sided tests [18], the latter because confi-
analysis of both average and PBE from two-pe- dence intervals are of equal value in decision
riod studies, following the procedures outlined in making. If any of these currently excluded tech-
Section 3. It does not include Goulds analysis. niques is desired, the PC-program BIOEQV62,
The maximum total study size (n1 +n2) is 60. All which is the latest update containing these proce-
results are displayed screen by screen and can dures, is still available from the author.
optionally be saved to a user-defined results file. In the program BIOEQV80 the minimum num-
ANOVA on ln-transforms is performed on ber of bootstrap replicates per run is 200, but the
parameters such as AUC and Cmax, followed by precision of the ultimate result is then very poor.
back-transformation by exponentiation. Estab- The maximum number of bootstrap replicates
lishing 90%-confidence intervals for the median per run is 8000; in many cases this will be suffi-
Test-to-Reference ratio assesses ABE. Similar cient.
confidence intervals can be obtained for charac- The final result of the bootstrapping procedure
teristics supposed to be normally distributed and, not only consists of the upper limit of the one-
therefore, without ln-transformation. All analyses sided 95%-confidence interval, which can be sym-
for ABE can also be performed by non-paramet- bolised as q1(0.95). Each bootstrapping run of
ric methods. log-transforms, resulting in a total number of B
A number of statistical analyses for ABE from ordered bootstrap replicates ranging from
two-period studies are not included any longer, q1(min.) to q1(max.), provides the following
since they may be considered as obsolete or sim- values:

q1(min.), q1(0.025), q1(0.05), q1(0.50), q1(0.95), q1(0.975), q1(max.),
and the critical value of qP(0.95).
If q1(0.95) ≤ qP(0.95), the two formulations are concluded to be population bioequivalent.
In order to obtain an estimate of the precision of the bootstrapping process, repeat runs using the same original data and the same auxiliary settings are possible, from which overall means and coefficients of variation of q1(min.) to q1(max.) are obtained.
Bootstrapping can also be performed by BIOEQV80 with untransformed characteristics. This similarly results in values ranging from q1(min.) to q1(max.). However, no valid conclusions can be drawn, since critical values of qP(0.95) have not yet been established for characteristics without logarithmic transformation.
The table of values ranging from q1(min.) to q1(max.) may already give some idea about the distribution of q1. Complete frequency distributions of each bootstrapping run can optionally be obtained in two different ways: by automatic and by user-defined lower and upper boundaries and class width. In the automatic procedure nothing has to be specified, but the resulting mid-class values are almost never smooth, so that frequency distributions of repeat runs cannot easily

Table 7
Ln-transformed Cmax-values from Shumaker & Metzler's phenytoin study [20]

Sequence (group) Subject code Test 1 Test 2 Reference 1 Reference 2

(Period 2) (Period 3) (Period 1) (Period 4)


RTTR 1 0.4383 0.7885 0.4886 0.7372
2 0.9163 0.6831 0.8154 0.8796
5 0.8329 0.8242 0.6780 0.6313
8 1.0543 0.7793 0.6152 0.8372
9 0.5933 0.6471 0.7747 0.7975
12 0.8544 0.8372 0.6881 0.5481
14 0.8020 0.6831 0.5933 0.7655
15 0.4447 0.4383 0.4511 0.5878
17 0.4187 0.4574 0.3293 0.2852
19 0.9969 0.9042 0.9243 0.6419
20 0.7655 0.8502 0.6627 0.5539
24 0.7372 0.6259 0.4383 0.5596
25 0.4383 0.7561 0.5481 0.4187

(Period 1) (Period 4) (Period 2) (Period 3)


TRRT 3 0.4055 0.4187 0.5423 0.3148
4 0.9708 0.8459 0.8838 0.7885
6 0.4824 0.7655 0.6471 0.5247
7 1.0403 1.2179 1.1725 0.9555
10 0.5539 0.7747 0.5068 0.8020
11 0.8329 0.9282 0.8109 0.9555
13 0.6523 0.7227 0.6981 0.6098
16 0.7419 0.8629 0.9478 0.8416
18 0.7975 1.0332 0.7467 0.5822
21 0.8198 0.7080 0.6729 0.6313
22 0.8065 0.9282 0.8154 0.5822
23 0.5128 0.9083 0.4824 0.7178
26 0.5653 0.4700 0.5481 0.3716
Grand mean test           +7.3718E-01
Grand mean ref.           +6.6159E-01
Delta (Test - Reference)  +7.5588E-02

Table 8
ANOVA on the ln-transformed data of Table 7, as obtained by the program BIOEQ2X2

Source of variation              Sum of squares   DF    Mean square   F-ratio   Tail probability

Between subjects                 +2.6190E+00      25    +1.0476E-01
  Within groups                  +2.5266E+00      24    +1.0527E-01
  Between groups                 +9.2438E-02       1    +9.2438E-02    0.8781   0.358
Within subjects                  +1.2222E+00      78    +1.5669E-02
  Residual                       +7.2786E-01      50    +1.4557E-02
  Between formulations           +1.4855E-01       1    +1.4855E-01   10.2047   0.002
  Between periods                +6.9183E-02       3    +2.3061E-02    1.5842   0.205
  Subject × formulations (Seq)   +2.7657E-01      24    +1.1524E-02    0.7916   0.714
Grand total                      +3.8412E+00     103

Testing for ABE (median Test-to-Reference formulation ratio, after back-transformation by exponentiation):
Point estimate = 107.85%; 90%-confidence interval = 103.66%, 112.22%
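The ABE result under Table 8 can be retraced from the tabulated values. The following is a minimal sketch, not the BIOEQ2X2 code: it assumes a balanced design (each of the 26 subjects dosed twice with each formulation), takes the residual mean square as the error term, and hard-codes the tabulated 95th percentile of Student's t for 50 degrees of freedom. The function and constant names are illustrative only.

```python
import math

# Inputs read from Table 8 (natural-log scale).
DELTA = 7.5588e-02        # adjusted mean difference, Test minus Reference
MS_RESIDUAL = 1.4557e-02  # residual mean square (50 degrees of freedom)
N_SUBJECTS = 26           # 13 subjects in each of the two sequences
T_CRIT = 1.6759           # tabulated 95th percentile of Student's t, 50 df


def abe_ratio_interval(delta, ms_residual, n_subjects, t_crit):
    """Return (point, lower, upper) of the median T/R ratio, in percent."""
    # Assumption: balanced design, every subject dosed twice with each
    # formulation, so each formulation mean averages 2 * n_subjects values
    # and the variance of their difference is ms_residual / n_subjects.
    se = math.sqrt(ms_residual / n_subjects)
    half_width = t_crit * se
    # Back-transform the ln-scale estimate and its limits by exponentiation.
    return tuple(100.0 * math.exp(x)
                 for x in (delta, delta - half_width, delta + half_width))


point, lower, upper = abe_ratio_interval(DELTA, MS_RESIDUAL, N_SUBJECTS, T_CRIT)
print(f"point = {point:.2f}%, 90% CI = {lower:.2f}% to {upper:.2f}%")
```

With these inputs the sketch agrees with the reported point estimate and confidence limits to within about 0.01 percentage points; the remaining difference is attributable to rounding of the tabulated inputs.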

Table 9
Analysis for PBE of the ln-transformed data of Table 7, according to Appendix F of the January 2001 Guidance of the US FDA, as obtained by the program BIOEQ2X2

Delta = Adjusted mean formulation difference      +7.5588E-02
DF for variances pooled across Sequences          24
Formula × subject (Seq.) interaction variance     1.1524E-02
Intersubject variance of Test (across Seq.)       3.1415E-02
Intersubject variance of Ref. (across Seq.)       2.6983E-02
Intrasubject variance of Test (across Seq.)       1.4639E-02
Intrasubject variance of Ref. (across Seq.)       1.4113E-02
Total variance of reference (across Seq.)         3.4040E-02
Analysis is constant-scaled
This results in criterion ThetaP                  1.7448E+00
Specified constant variance                       4.0000E-02
Variance factor (= Epsilon)                       2.0000E-02

Parameter function (S2 = variance)   95% Upper confidence bound (H)   Point estimate (E)   U = (H - E)²

Delta                                +1.2456E-02                      +5.7135E-03          +4.5463E-05
Intersubject S2 Test                 +5.4444E-02                      +3.1415E-02          +5.3033E-04
Intrasubject S2 Test                 +1.2685E-02                      +7.3193E-03          +2.8788E-05
Intersubject S2 Ref.                 -1.7784E-02                      -2.6983E-02          +8.4630E-05
Intrasubject S2 Ref.                 -4.6508E-03                      -7.0566E-03          +5.7879E-06
Sum (+)                                                               +1.0408E-02          +6.9500E-04

Eta2
Point estimate                       -5.9385E-02
95% Upper confidence bound           -3.3022E-02
Critical value                        0.0
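The arithmetic that links the rows of Table 9 to the final Eta2 values can be sketched as follows. This is an illustration under stated assumptions rather than the program's own code: it assumes that ThetaP is obtained as ((ln 1.25)² + Epsilon) divided by the specified constant variance, which reproduces the tabulated 1.7448, and that the constant-scaled upper bound is the point estimate plus the square root of the summed U terms. The names TERMS and constant_scaled_criterion are hypothetical.

```python
import math

# Rows of Table 9 as (label, 95% upper bound H, point estimate E).
TERMS = [
    ("Delta",                 1.2456e-02,  5.7135e-03),
    ("Intersubject S2 Test",  5.4444e-02,  3.1415e-02),
    ("Intrasubject S2 Test",  1.2685e-02,  7.3193e-03),
    ("Intersubject S2 Ref.", -1.7784e-02, -2.6983e-02),
    ("Intrasubject S2 Ref.", -4.6508e-03, -7.0566e-03),
]
EPSILON_P = 2.0e-02    # variance factor (Epsilon) from Table 9
SIGMA_T0_SQ = 4.0e-02  # specified constant variance from Table 9


def constant_scaled_criterion(terms, epsilon, sigma0_sq):
    """Return (theta, eta point estimate, 95% upper bound of eta)."""
    # Assumed form of the criterion limit: ((ln 1.25)^2 + epsilon) / sigma0^2.
    theta = (math.log(1.25) ** 2 + epsilon) / sigma0_sq
    e_sum = sum(e for _, _, e in terms)             # sum of point estimates E
    u_sum = sum((h - e) ** 2 for _, h, e in terms)  # sum of U = (H - E)^2
    eta = e_sum - theta * sigma0_sq
    return theta, eta, eta + math.sqrt(u_sum)


theta_p, eta2, h_eta2 = constant_scaled_criterion(TERMS, EPSILON_P, SIGMA_T0_SQ)
print(f"ThetaP = {theta_p:.4f}, Eta2 = {eta2:.4e}, upper bound = {h_eta2:.4e}")
print("PBE concluded" if h_eta2 <= 0 else "PBE not concluded")
```

The negative signs of the two Reference rows matter here: only with those terms subtracted do the column sums and the Eta2 values of Table 9 come out as tabulated.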

be compared. The user-defined procedure facilitates comparison of repeat runs in those cases where repeat runs have been given the same boundaries and class width.
The PC-program BIOEQV80, written and compiled with TURBOBASIC (Borland), occupies 137 kbytes when stored on disk and 388 kbytes of conventional internal memory when running. In accordance with FDA's final Guidance [7], the parameters εP and σ²T0 can be given different numerical values, depending on the information given by appropriate FDA staff to sponsors or applicants. A Pentium PC of ≥ 200 MHz is recommended to prevent long run times for bootstrapping.

5.2. Program BIOEQ2X2 for four-period studies

The procedures outlined in Section 4 have been implemented in the PC-program BIOEQ2X2 for the statistical analysis of two-treatment four-period crossover studies with replication for the assessment of average, population and IBE. The maximum number of subjects is 300. The minimum and maximum number of sequence groups are two and six, respectively. No missing values are allowed, since the analyses are based on the assumption that each subject has been treated twice with each pharmaceutical formulation. All entered values and calculated results are saved to a user-defined results file on disk.
The three types of bioequivalence analysis are only suitable for ln-transforms of characteristics such as AUC and Cmax. If necessary, non-transformed data can be ln-transformed after data entry.
As stated already in Section 4.2, the formulation means of ln-transforms (MT and MR), adjusted for unequal group sizes, are the only statistics that are the same for all three types of bioequivalence analysis. These statistics are calculated following the schemes of Appendices F and G of FDA's recent Guidance [7].
The main results of each analysis are dis-

Table 10
Analysis for IBE of the ln-transformed data of Table 7, according to Appendix G of the January 2001 Guidance of the US FDA, as obtained by the program BIOEQ2X2

Delta = Adjusted mean formulation difference       +7.5588E-02
DF for variances pooled across Sequences           24
Formula × Subject (Seq.) interaction variance      1.1524E-02
Intrasubject variance of Test (across Seq.)        1.4639E-02
Intrasubject variance of Reference (across Seq.)   1.4113E-02
Analysis is constant-scaled
This results in criterion ThetaI                   2.4948E+00
Specified constant intrasubject variance           4.0000E-02
Variance factor (= Epsilon)                        5.0000E-02

Parameter function (S2 = variance)   95% Upper confidence bound (H)   Point estimate (E)   U = (H - E)²

Delta                                +1.2456E-02                      +5.7135E-03          +4.5463E-05
Interaction S2                       +1.9971E-02                      +1.1524E-02          +7.1361E-05
Intrasubject S2 Test                 +1.2685E-02                      +7.3193E-03          +2.8788E-05
Intrasubject S2 Reference            -1.3952E-02                      -2.1170E-02          +5.2091E-05
Sum (+)                                                               +3.3868E-03          +1.9770E-04

Eta2
Point estimate                       -9.6406E-02
95% Upper confidence bound           -8.2345E-02
Critical value                        0.0
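As a check on Table 10, the final values can be retraced by hand from the tabulated rows. The worked arithmetic below is a sketch under the assumption that ThetaI equals ((ln 1.25)² + Epsilon) divided by the specified constant intrasubject variance, which reproduces the tabulated 2.4948; the symbols \varepsilon_I and \sigma_{W0}^2 are notation chosen here for the variance factor and the constant variance, not taken from the program output.

```latex
\begin{align*}
\theta_I &= \frac{(\ln 1.25)^2 + \varepsilon_I}{\sigma_{W0}^2}
          = \frac{0.049793 + 0.05}{0.04} = 2.4948 \\
\sum E &= 0.0057135 + 0.011524 + 0.0073193 - 0.021170 = 0.0033868 \\
\hat{\eta}_2 &= \sum E - \theta_I\,\sigma_{W0}^2
              = 0.0033868 - 0.099793 = -0.096406 \\
H_{\eta_2} &= \hat{\eta}_2 + \sqrt{\textstyle\sum U}
            = -0.096406 + \sqrt{1.9770 \times 10^{-4}}
            = -0.096406 + 0.014061 = -0.082345 \le 0
\end{align*}
```

Since the upper bound does not exceed the critical value of zero, the arithmetic is consistent with the conclusion of IBE for Cmax.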

played on screen; all necessary details can be found in the user-defined results file on disk.
After data entry an ANOVA as scheduled in Section 4.2 is always performed, followed by back-transformation by exponentiation of the parameters of interest. ABE is assessed by establishing the point estimate and the 90%-confidence interval for the median T/R formulation ratio.
The analyses for PBE and IBE are optional. These two analyses are not only fully independent of each other, but also independent of the preceding ANOVA. It should be noted that all variances used in the assessment of population or IBE are pooled across sequences. This is in contrast to the ANOVA, which, except for the Subject-by-Formulation interaction component, is based on overall variances within and between subjects, without distinguishing within-subject variances per formulation.
In accordance with FDA's final Guidance [7], the parameters necessary to obtain the criterion θP or θI can be given different numerical values, depending on the information given by appropriate FDA staff to sponsors or applicants.
For PBE as well as for IBE the analyses ultimately result in the 95% upper confidence bound Heta1 if the analysis is reference-scaled, or Heta2 if the analysis is constant-scaled. If the confidence bound is ≤ 0, PBE or IBE, respectively, is concluded.
The PC-program BIOEQ2X2, written and compiled with TURBOBASIC (Borland), occupies 111 kbytes when stored on disk and 341 kbytes of conventional internal memory when running.

6. Sample runs

6.1. Sample runs from two-period studies

Table 1 shows the theophylline data used as an example in a number of papers referenced earlier [8]. The ANOVA as presented by BIOEQV80 is given in Table 2, which shows that the formulations satisfy ABE requirements. These two tables have also been used in Section 3.4 to demonstrate that the results of bootstrapping are independent of (deliberately introduced) period effects.
The results of a bootstrapping run on the original data of Table 1 are given in Table 3. As can be seen from this table, the experimental value of q1(0.95) is far below the critical value of qP(0.95), so that the formulations satisfy the requirements of PBE, in addition to those of ABE.
Table 4 gives the AUC-values of erythromycin used by Chow and Liu [19] as an example. The ANOVA on log-transforms given in Table 5 shows a lack of ABE. The results of a multiple bootstrap run are given in Table 6. As appears from the large value of q1(0.95), which exceeds the critical value of 1.7448, the formulations are not population bioequivalent either.
Table 6 also shows how the precision of bootstrapping can be estimated from repeat runs. This may be of interest, since the technique of bootstrapping has often been criticised for a lack of information on the precision of its results.

6.2. Sample runs from four-period studies

Shumaker and Metzler [20] published a two-formulation four-period replicate bioequivalence study with phenytoin. They compared a single dose of 125 mg of a test phenytoin oral suspension formulation with a commercially available reference drug formulation (Dilantin-125) in two sequence groups (RTTR and TRRT), each with 13 fasted healthy adult male volunteers. This resulted in values of AUC and Cmax without missing data. The authors concluded ABE from the 90%-confidence intervals of the median T/R formulation ratios of either parameter, based on analyses of variance using the SAS package. In addition, they concluded IBE based on the close similarity of the formulation means and of the within-subject variances per formulation for both parameters. Shumaker and Metzler reasoned that the proposed IBE criteria were not computed since there was little difference between the two products, and the estimate of the interaction term was zero. This means that although they did state the original FDA criteria for IBE, they did not estimate them. It should, however, be noted that at the time of their publication (1998) no decision had been made by any regulatory body with respect to the best procedure of evaluating these criteria.

Table 7 shows the ln-transforms of Shumaker and Metzler's Cmax-values. The ANOVA using the program BIOEQ2X2 is given in Table 8. The resulting point estimate (107.85%) and the 90%-confidence interval of the median Test-to-Reference formulation ratio (103.66%, 112.22%) exactly match Shumaker and Metzler's results obtained with the SAS analysis.
The results of the analyses for PBE and IBE from the program BIOEQ2X2 are given in Tables 9 and 10, respectively. Since in both cases the intrasubject variance of the test formulation (across sequences) is smaller than 0.04, the analyses are constant-scaled. In both cases the 95% upper confidence bound for eta2 is below zero, which leads to the conclusion of PBE as well as of IBE for the parameter Cmax. The results of the statistical analyses for the parameter AUC (not shown here) are similar.
Since phenytoin is a drug that has already been on the market for a considerable period of time, the assessment of PBE would not have been relevant in actual practice. Its results are included in order to show the details of the statistical analysis.

7. Availability of programs

Free copies of the programs BIOEQV80 and BIOEQ2X2 can be obtained from the author by sending a request either by e-mail or alternatively by a letter with a formatted blank 90 mm (3.5 in.) 1.4-Mbyte diskette.

References

[1] US Food and Drug Administration, Center for Drug Evaluation and Research (CDER): Statistical procedures for bioequivalence studies using a standard two-treatment crossover design. Statement under 21 CFR 10.90, 29 June 1992.
[2] Commission on the European Communities, CPMP Working Party on Efficacy of Medicinal Products, Note for Guidance, Investigation of Bioavailability and Bioequivalence, December 1991, III/54/89-EN.
[3] US Food and Drug Administration, Center for Drug Evaluation and Research (CDER): Draft Guidance. In Vivo Bioequivalence Studies Based on Population and Individual Bioequivalence Approaches. October 1997.
[4] R. Schall, H.G. Luus, On population and individual bioequivalence, Stat. Med. 12 (1993) 1109-1124.
[5] US Food and Drug Administration, Center for Drug Evaluation and Research (CDER): Draft Guidance. Average, Population, and Individual Approaches to Establishing Bioequivalence. August 1999.
[6] US Food and Drug Administration, Center for Drug Evaluation and Research (CDER): Guidance: Bioavailability and Bioequivalence Studies for Orally Administered Drug Products - General Considerations. October 2000.
[7] US Food and Drug Administration, Center for Drug Evaluation and Research (CDER): Guidance for Industry. Statistical Approaches to Establishing Bioequivalence. January 2001.
[8] H.P. Wijnand, Updates of bioequivalence programs (including statistical power approximated by Student's t), Comput. Methods Programs Biomed. 42 (1994) 275-281.
[9] B. Efron, R.J. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, 1993.
[10] R. Schall, A unified view of individual, population and average bioequivalence, in: H.H. Blume, K.K. Midha (Eds.), Bio-International 2, International Conference of F.I.P. Bio-International 94 held in Munich, Germany, June 15-17, 1994, Medpharm Scientific Publishers, Stuttgart, 1995, pp. 91-106.
[11] T. Hyslop, F. Hsuan, D.J. Holder, A small sample confidence interval approach to assess individual bioequivalence, Stat. Med. 19 (2000) 2885-2897.
[12] A.L. Gould, A practical approach for evaluating population and individual bioequivalence, Stat. Med. 19 (2000) 2721-2740.
[13] W.J. Westlake, Use of confidence intervals in analysis of comparative bioavailability trials, J. Pharm. Sci. 61 (1972) 1340-1341.
[14] D. Mandallaz, J. Mau, Comparison of different methods for decision-making in bioequivalence assessment, Biometrics 33 (1981) 213-222.
[15] C.S. Locke, An exact confidence interval from untransformed data for the ratio of two formulation means, J. Pharmacokin. Biopharm. 12 (1984) 649-655.
[16] B.E. Rodda, R.L. Davis, Determining the probability of an important difference in bioavailability, Clin. Pharmacol. Ther. 28 (1980) 247-252.
[17] W.W. Hauck, S. Anderson, A new statistical procedure for testing equivalence in bioequivalence assessment, J. Pharmacokin. Biopharm. 12 (1984) 83-91 and 657.
[18] D.J. Schuirmann, A comparison of the two one-sided tests procedure and the power approach for testing equivalence in bioequivalence assessment, J. Pharmacokin. Biopharm. 15 (1987) 657-680.
[19] S.-C. Chow, J.-P. Liu, Design and Analysis of Bioavailability and Bioequivalence Studies, Marcel Dekker, 1989, pp. 180-182.
[20] R.C. Shumaker, C.M. Metzler, The phenytoin trial is a case study of individual bioequivalence, Drug Inf. J. 32 (1998) 1063-1072.
