
Sample Size Graphs for "Proving the Null Hypothesis"
William C. Blackwelder and Marie A. Chang
National Institute of Allergy and Infectious Diseases, Bethesda, Maryland

ABSTRACT: Sample size graphs are given for clinical trials designed to test whether an
experimental therapy is as effective as a standard therapy. We assume a dichotomous
outcome variable and a one-sided test of the hypothesis that the probability of success
with standard therapy is greater than the probability of success with experimental
therapy by at least some specified amount $\delta$. Graphs are given for significance level
$\alpha$ = 0.01, 0.025, 0.05; type II error $\beta$ = 0.10, 0.20; and $\delta$ = 0.10, 0.20.

KEY WORDS: design, hypothesis, sample size, equivalence

For clinical trials designed to show equivalence, i.e., that an experimental
therapy is as effective as a standard therapy but not necessarily more effective,
the usual null hypothesis of equal treatment effects is not appropriate
[1]. Rather, we test the hypothesis that standard therapy is actually more
effective than experimental therapy by at least some specified amount. To
emphasize this distinction from the usual situation and to aid planners of
such trials, we present graphs of the required sample size when the treatment
comparison is based on binomial proportions.
In this case we test
$H_0: \pi_s \ge \pi_e + \delta$
against
$H_1: \pi_s < \pi_e + \delta$,
where $\pi_s$ and $\pi_e$ are success probabilities with standard therapy and
experimental therapy, respectively, and $\delta$ is the minimum difference of practical
interest; that is, if $\pi_s - \pi_e < \delta$, we consider the experimental therapy as
effective as standard therapy.
For studies large enough to justify use of the normal approximation to the
binomial, we can test $H_0$ using the statistic
$z = (p_s - p_e - \delta)/s$,

Received February 22, 1983; revised September 30, 1983.


Address reprint requests to: Dr. William C. Blackwelder, National Institute of Allergy and Infectious
Diseases, Westwood Building, Room 739, Bethesda, MD 20205.
Controlled Clinical Trials 5:97-105 (1984)
© Elsevier Science Publishing Co., Inc. 1984 0197-2456/84/$3.00
52 Vanderbilt Ave., New York, New York 10017

where
$s = [p_s(1 - p_s)/n_s + p_e(1 - p_e)/n_e]^{1/2}$,
$p_s$ and $p_e$ are the observed success proportions with standard and experimental
therapy, respectively, and $n_s$ and $n_e$ are the corresponding numbers of patients.
We reject $H_0$ at significance level $\alpha$ if $z$ is less than the lower $100\alpha\%$
point of the standard normal distribution. If the two groups are of equal size,
the number of patients required in each group for the test to have power
$1 - \beta$ is given by
$n = (z_{1-\alpha} + z_{1-\beta})^2 [\pi_s(1 - \pi_s) + \pi_e(1 - \pi_e)]/(\pi_s - \pi_e - \delta)^2$, (1)
where $z_{1-\alpha}$ and $z_{1-\beta}$ are upper percentage points of the standard normal
distribution, $n$ is the number of patients in each group, and $\pi_s < \pi_e + \delta$.
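
The test statistic and rejection rule described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the patient counts in the usage lines are hypothetical and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_z(x_s, n_s, x_e, n_e, delta):
    """Return z = (p_s - p_e - delta) / s from observed success counts.

    x_s, x_e are numbers of successes; n_s, n_e are group sizes.
    """
    p_s = x_s / n_s
    p_e = x_e / n_e
    s = sqrt(p_s * (1 - p_s) / n_s + p_e * (1 - p_e) / n_e)
    if s == 0:
        # Both observed proportions are 0 or 1; the statistic is undefined here.
        raise ValueError("standard error is zero; z is undefined")
    return (p_s - p_e - delta) / s


def reject_h0(z, alpha):
    """Reject H0: pi_s >= pi_e + delta when z is below the lower 100*alpha% point."""
    return z < NormalDist().inv_cdf(alpha)


# Hypothetical data: 160/400 successes on standard, 164/400 on experimental.
z = noninferiority_z(160, 400, 164, 400, delta=0.10)
print(z, reject_h0(z, alpha=0.05))  # z is roughly -3.17, below about -1.645
```
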
Figures 1-12 give the sample size $n$ in each group, up to 1000, as a function
of $\pi_s$ for $\pi_e$ = 0.1 to 0.9 in increments of 0.1 and all combinations of $\alpha$ = 0.01,
0.025, 0.05; $\beta$ = 0.10, 0.20; and $\delta$ = 0.10, 0.20. Since the calculations are based
on the normal approximation, only sample sizes of at least 20 such that the
expected frequencies of both successes and failures in each treatment group
are at least 5 are shown. In most other cases, increasing the sample size
calculated from equation (1) to satisfy the above conditions should ensure
that the actual significance level is sufficiently close to the nominal or
theoretical level, as indicated in Table 1. For values of $\pi_e$ not graphed, it is simplest
to use equation (1) directly; otherwise, a rough approximation to $n$ can be
obtained from interpolation in the logarithmic scale between the sample sizes
from the curves for the two nearest values of $\pi_e$.
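
For readers who prefer to use equation (1) directly, the following sketch evaluates it and checks the two conditions the text applies before a point is graphed. The function names are illustrative assumptions, and the result is returned unrounded so the caller can decide how to round.

```python
from statistics import NormalDist


def sample_size_per_group(pi_s, pi_e, delta, alpha, beta):
    """Equation (1): unrounded per-group n for a one-sided test with power 1 - beta."""
    if not pi_s < pi_e + delta:
        raise ValueError("equation (1) requires pi_s < pi_e + delta")
    std_normal = NormalDist()
    # Upper percentage points z_{1-alpha} and z_{1-beta}.
    z_sum = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(1 - beta)
    variance = pi_s * (1 - pi_s) + pi_e * (1 - pi_e)
    return z_sum ** 2 * variance / (pi_s - pi_e - delta) ** 2


def meets_graphing_conditions(n, pi_s, pi_e):
    """True when n >= 20 and expected successes and failures are >= 5 in each group."""
    expected = [n * pi_s, n * (1 - pi_s), n * pi_e, n * (1 - pi_e)]
    return n >= 20 and min(expected) >= 5
```

When `meets_graphing_conditions` returns `False`, the text's suggestion is to increase n until both conditions hold rather than to use the smaller value from equation (1).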
It is important to emphasize that, as we have set up the problem, $\pi_s$ and
$\pi_e$ are success probabilities and $\delta > 0$. If the comparison is based on proportion
of failure rather than success, then we use equation (1) or the graphs in Figures
1-12 with $\pi_s$ replaced by $\pi_e^* = 1 - \pi_e$ and $\pi_e$ replaced by $\pi_s^* = 1 - \pi_s$, where
$\pi_s^*$ and $\pi_e^*$ are failure probabilities; $\delta$ will still be positive.

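A short check of why the failure-probability substitution leaves equation (1) unchanged may help. It assumes the notation $\pi_s^* = 1 - \pi_s$ and $\pi_e^* = 1 - \pi_e$ for the failure probabilities with standard and experimental therapy.

```latex
\begin{align*}
\pi(1 - \pi) &= (1 - \pi^*)\,\pi^*
  && \text{(each variance term is unchanged)} \\
\pi_s - \pi_e - \delta &= (1 - \pi_s^*) - (1 - \pi_e^*) - \delta
  = \pi_e^* - \pi_s^* - \delta
  && \text{(the denominator swaps roles)}
\end{align*}
```

So entering $\pi_e^*$ where equation (1) has $\pi_s$, and $\pi_s^*$ where it has $\pi_e$, gives the same $n$, and the null hypothesis in terms of failures reads $\pi_e^* \ge \pi_s^* + \delta$ with $\delta > 0$.
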
Examples
1. Suppose $\alpha$ = 0.05, $\beta$ = 0.10, $\delta$ = 0.10, and $\pi_s = \pi_e$ = 0.40. Then
from Figure 1 the sample size for each group is about 410; direct calculation
from equation (1) gives n = 411.
2. For $\alpha$, $\beta$, $\delta$, and $\pi_s$ as above and $\pi_e$ = 0.45, interpolation in the logarithmic
scale between the sample sizes for $\pi_e$ = 0.40 (410) and $\pi_e$ = 0.50 (105)
gives a sample size of about 210; this is an overestimate by about 13%, as
direct calculation from equation (1) yields n = 186.
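
The interpolation in Example 2 can be reproduced as follows. This is a sketch under the assumption that "interpolation in the logarithmic scale" means linear interpolation of log n between the two neighbouring curves; the helper names are hypothetical, and 410 and 105 are the graph readings quoted in the example.

```python
from math import exp, log
from statistics import NormalDist


def sample_size_per_group(pi_s, pi_e, delta, alpha, beta):
    """Equation (1), returned unrounded."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(1 - beta)
    variance = pi_s * (1 - pi_s) + pi_e * (1 - pi_e)
    return z_sum ** 2 * variance / (pi_s - pi_e - delta) ** 2


def log_interpolate(x, x0, n0, x1, n1):
    """Interpolate linearly in log(n) between the points (x0, n0) and (x1, n1)."""
    t = (x - x0) / (x1 - x0)
    return exp((1 - t) * log(n0) + t * log(n1))


# Graph readings at pi_e = 0.40 and 0.50, interpolated to pi_e = 0.45.
approx = log_interpolate(0.45, 0.40, 410, 0.50, 105)
exact = sample_size_per_group(0.40, 0.45, 0.10, 0.05, 0.10)
print(approx, exact, approx / exact - 1)
```

The interpolated value falls a little above 207, which is consistent with the "about 210" read from the graph, and the direct calculation rounds to 186.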

REFERENCE

1. Blackwelder WC: "Proving the null hypothesis" in clinical trials. Controlled Clin
Trials 3:345-353, 1982
Figure 1 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 5% significance level with 90% power.

Figure 2 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 5% significance level with 80% power.

Figure 3 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 5% significance level with 90% power.

Figure 4 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 5% significance level with 80% power.

Figure 5 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 2.5% significance level with 90% power.

Figure 6 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 2.5% significance level with 80% power.

Figure 7 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 2.5% significance level with 90% power.

Figure 8 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 2.5% significance level with 80% power.

Figure 9 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 1% significance level with 90% power.

Figure 10 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.10$ at 1% significance level with 80% power.

Figure 11 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 1% significance level with 90% power.

Figure 12 Number of patients per group, n, to test $H_0: \pi_s \ge \pi_e + 0.20$ at 1% significance level with 80% power.

Table 1 Actual Significance Level of One-Sided Test of $H_0: \pi_s \ge \pi_e + \delta$ at Nominal Significance Level $\alpha$

                                     Actual significance level^a
  n   $\pi_s$   $\pi_e$   $\delta$   $\alpha$ = 0.05   $\alpha$ = 0.025   $\alpha$ = 0.01
 20   0.6       0.5       0.1        0.050             0.028              0.011
 20   0.7       0.6       0.1        0.059             0.031              0.014
 25   0.8       0.7       0.1        0.056             0.030              0.014
 50   0.9       0.8       0.1        0.060             0.030              0.015
100   0.95      0.85      0.1        0.061             0.034              0.016
500   0.99      0.89      0.1        0.060             0.034              0.015
 20   0.6       0.4       0.2        0.041             0.021              0.009
 20   0.7       0.5       0.2        0.046             0.025              0.010
 25   0.8       0.6       0.2        0.054             0.026              0.013
 50   0.9       0.7       0.2        0.058             0.031              0.012
100   0.95      0.75      0.2        0.060             0.030              0.014
500   0.99      0.79      0.2        0.057             0.030              0.014

^a Calculated by adding the probabilities, given n, $\pi_s$, $\pi_e$, $\delta$, and $\alpha$, of all possible observed outcomes
$p_s$ and $p_e$ that yield a significant test statistic.
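
The enumeration described in the table footnote can be sketched as below. This is an illustration rather than the authors' program: the function names are hypothetical, and the handling of outcomes where the estimated standard error is zero is an assumption, since the paper does not say how those cases were treated.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def actual_significance_level(n, pi_s, pi_e, delta, alpha):
    """Sum the probabilities of all outcomes (x_s, x_e) that give a significant z.

    Evaluate at the boundary of H0 (pi_s = pi_e + delta) to mirror Table 1.
    """
    z_crit = NormalDist().inv_cdf(alpha)  # lower 100*alpha% point
    pmf_s = [binom_pmf(k, n, pi_s) for k in range(n + 1)]
    pmf_e = [binom_pmf(k, n, pi_e) for k in range(n + 1)]
    level = 0.0
    for x_s in range(n + 1):
        p_s = x_s / n
        for x_e in range(n + 1):
            p_e = x_e / n
            diff = p_s - p_e - delta
            s = sqrt(p_s * (1 - p_s) / n + p_e * (1 - p_e) / n)
            # Assumption: when s == 0, z is taken as -inf or +inf by the sign of diff.
            significant = diff < 0 if s == 0 else diff / s < z_crit
            if significant:
                level += pmf_s[x_s] * pmf_e[x_e]
    return level


# First row of Table 1: n = 20, pi_s = 0.6, pi_e = 0.5, delta = 0.1.
print(actual_significance_level(20, 0.6, 0.5, 0.1, 0.05))
```

The double loop visits (n + 1)^2 outcomes, so the n = 500 rows take noticeably longer than the n = 20 rows in pure Python.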
