
Analysis of matched data; plus, diagnostic testing
Correlated Observations
• Correlated data arise when pairs or clusters of observations are related and thus are more similar to each other than to other observations in the dataset.
• Ignoring correlations will:
– overestimate p-values for within-person or within-cluster comparisons
– underestimate p-values for between-person or between-cluster comparisons
Pair Matching: Why match?
• Pairing can control for extraneous sources of variability and increase the power of a statistical test.
• Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.
Example
• Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:

               Tonsillectomy   None
Hodgkin’s           41          44
Sib control         33          52

OR=1.47; chi-square=1.53 (NS)


From John A. Rice, “Mathematical Statistics and Data Analysis.”
Example
• But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data like this:

                      Control
Case             Tonsillectomy   None
Tonsillectomy         26          15
None                   7          37

OR=2.14*; chi-square=2.91 (p=.09)


From John A. Rice, “Mathematical Statistics and Data Analysis.”
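The two analyses of the tonsillectomy data can be checked numerically. The sketch below assumes the unpaired chi-square is the ordinary Pearson statistic without continuity correction and the paired statistic is McNemar's (b-c)^2/(b+c); the function names are illustrative, not from the source.

```python
def odds_ratio_2x2(a, b, c, d):
    """Cross-product odds ratio for an unpaired 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Unpaired layout: rows = Hodgkin's / sib control, columns = tonsillectomy / none
unpaired_or = odds_ratio_2x2(41, 44, 33, 52)
unpaired_chi2 = pearson_chi2_2x2(41, 44, 33, 52)

# Paired layout: only the discordant pairs matter
b, c = 15, 7  # case-only tonsillectomy, control-only tonsillectomy
paired_or = b / c
paired_chi2 = (b - c) ** 2 / (b + c)

print(f"ignoring pairing: OR = {unpaired_or:.2f}, chi-square = {unpaired_chi2:.2f}")
print(f"using pairing:    OR = {paired_or:.2f}, chi-square = {paired_chi2:.2f}")
```

The margins of the paired table (41 and 44 for cases, 33 and 52 for controls) reproduce the unpaired table, so both analyses use the same 85 pairs.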
Pair Matching: example
• Match each MI case to a control (a person without MI) based on age and gender.
• Ask about history of diabetes to find out if diabetes increases the risk of MI.
Pair Matching: example
Which cells are informative?

                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

Just the discordant cells are informative!


Pair Matching
                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

OR estimate comes only from discordant pairs!


The question is: among the discordant pairs, what
proportion are discordant in the direction of the
case vs. the direction of the control. If more
discordant pairs “favor” the case, this indicates
OR>1.
                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

P(“favors” case/discordant pair) =

$$\hat{p} = \frac{b}{b+c} = \frac{37}{37+16} = \frac{37}{53}$$
                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

odds(“favors” case/discordant pair) =

$$OR = \frac{b}{c} = \frac{37}{16}$$
                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

OR estimate comes only from discordant pairs!!


OR= 37/16 = 2.31
Makes Sense!
McNemar’s Test
                 MI controls
MI cases       Diabetes   No diabetes
Diabetes           9           37
No diabetes       16           82

Null hypothesis: P(“favors” case / discordant pair) = .5
(note: equivalent to OR=1.0 or cell b=cell c)

$$p\text{-value} = \binom{53}{37}(.5)^{37}(.5)^{16} + \binom{53}{38}(.5)^{38}(.5)^{15} + \binom{53}{39}(.5)^{39}(.5)^{14} + \dots$$
McNemar’s Test
                 MI controls
MI cases       Diabetes   No diabetes
Diabetes           9           37
No diabetes       16           82

Null hypothesis: P(“favors” case / discordant pair) = .5
(note: equivalent to OR=1.0 or cell b=cell c)

By normal approximation to binomial:

$$Z = \frac{37 - \frac{53}{2}}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01$$
McNemar’s Test: generally
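             controls
cases      exp    No exp
exp         a       b
No exp      c       d

By normal approximation to binomial:

$$Z = \frac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \frac{\frac{b-c}{2}}{\sqrt{\frac{b+c}{4}}} = \frac{b-c}{\sqrt{b+c}}$$

Equivalently:

$$\chi^2_1 = \left(\frac{b-c}{\sqrt{b+c}}\right)^2 = \frac{(b-c)^2}{b+c}$$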
McNemar’s Test
                 MI controls
MI cases       Diabetes   No diabetes
Diabetes           9           37
No diabetes       16           82

McNemar’s Test:

$$\chi^2_1 = \frac{(37-16)^2}{53} = \frac{21^2}{53} = 8.32 = 2.88^2; \quad p < .01$$
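The normal-approximation and chi-square forms of McNemar's test, plus the exact binomial tail sum, can be reproduced with the standard library. This is a minimal sketch; the function name `mcnemar` and its return layout are illustrative assumptions.

```python
import math


def mcnemar(b, c):
    """McNemar's test from the two discordant cell counts b and c."""
    n = b + c
    # Normal approximation to Binomial(n, 0.5)
    z = (b - n / 2) / math.sqrt(n * 0.5 * 0.5)
    # Equivalent 1-df chi-square form
    chi2 = (b - c) ** 2 / n
    # Exact upper tail P(X >= b) under the null, i.e. the binomial sum
    upper_tail = sum(math.comb(n, k) for k in range(b, n + 1)) * 0.5 ** n
    # Two-sided p-value from the standard normal distribution
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, chi2, upper_tail, p_two_sided


z, chi2, upper_tail, p_two_sided = mcnemar(b=37, c=16)
print(f"Z = {z:.2f}, chi-square = {chi2:.2f}")
print(f"exact upper tail = {upper_tail:.4f}, two-sided normal p = {p_two_sided:.4f}")
```

Z squared equals the chi-square statistic, which is why the two forms always give the same p-value.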
Example: McNemar’s EXACT test
• Split-face trial:
– Researchers assigned 56 subjects to apply SPF 85 sunscreen to one side of their faces and SPF 50 to the other prior to engaging in 5 hours of outdoor sports during mid-day. The outcome is sunburn (yes/no).
– Unit of observation = side of a face
– Are the observations correlated? Yes.

Russak JE et al. JAAD 2010; 62: 348-349.


Results ignoring correlation:

Table I -- Dermatologist grading of sunburn after an average of 5 hours of skiing/snowboarding (P = .03; Fisher’s exact test)

Sun protection factor   Sunburned   Not sunburned
85                          1            55
50                          8            48

Fisher’s exact test compares the following proportions: 1/56 versus 8/56. Note that individuals are being counted twice!
Correct analysis of data:
Table 1. Correct presentation of the data (P = .016; McNemar’s exact test).

                     SPF-50 side
SPF-85 side      Sunburned   Not sunburned
Sunburned            1             0
Not sunburned        7            48

McNemar’s exact test:
Null hypothesis: X~binomial (n=7, p=.5)

$$P(X=0) = \binom{7}{0}(.5)^{0}(.5)^{7} = .0078$$

$$P(X=7) = \binom{7}{7}(.5)^{7}(.5)^{0} = .0078$$

Two-sided p-value = .0156
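The exact test is just a two-sided binomial tail on the discordant pairs. A minimal sketch, assuming the two-sided p-value is taken as twice the smaller tail (capped at 1); the function name is illustrative.

```python
from math import comb


def mcnemar_exact_two_sided(b, c):
    """Exact McNemar p-value: discordant count ~ Binomial(b + c, 0.5) under the null."""
    n = b + c
    k = min(b, c)
    # Probability of a split at least as lopsided as observed, in one direction
    one_tail = sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
    return min(1.0, 2 * one_tail)


# Split-face trial: 0 pairs burned only on the SPF-85 side, 7 only on the SPF-50 side
p_exact = mcnemar_exact_two_sided(b=0, c=7)
print(f"two-sided exact p = {p_exact:.4f}")
```

Only the 7 discordant faces enter the calculation; the 49 concordant faces carry no information about which sunscreen is better.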
RECALL: 95% confidence interval for a difference in INDEPENDENT proportions
Standard error of a proportion can be estimated by:

$$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Standard error of the difference of two proportions =

$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

95% confidence interval for the difference between two proportions:

$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
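95% CI for difference in dependent proportions
Variance of the difference of two random variables is the sum of their variances minus 2*covariance:

$$Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) - 2Cov(\hat{p}_1, \hat{p}_2)$$

With n = number of cases = number of controls (the number of matched pairs):

$$Var(p_{E/D}) = \frac{p_{E/D}(1 - p_{E/D})}{n}$$

$$Var(p_{E/\sim D}) = \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n}$$

$$Cov(p_{E/\sim D}, p_{E/D}) = \frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n}$$

The four proportions in the covariance are the proportions of pairs in the four cells of the paired table: the product of the two concordant cells minus the product of the two discordant cells.

$$Var(p_{E/D} - p_{E/\sim D}) = \frac{p_{E/D}(1 - p_{E/D})}{n} + \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} - 2\left(\frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n}\right)$$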
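95% CI for difference in dependent proportions

                 MI controls
MI cases       Diabetes   No diabetes   Total
Diabetes           9           37         46
No diabetes       16           82         98
Total             25          119        144

$$p_{E/D} - p_{E/\sim D} = \frac{46}{144} - \frac{25}{144} = .32 - .17 = .15$$

$$Var(p_{E/D} - p_{E/\sim D}) = \frac{p_{E/D}(1 - p_{E/D})}{n} + \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} - 2\left(\frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n}\right)$$

$$= \frac{\left(\frac{46}{144}\right)\left(1 - \frac{46}{144}\right) + \left(\frac{25}{144}\right)\left(1 - \frac{25}{144}\right) - 2\left(\frac{9}{144} \cdot \frac{82}{144} - \frac{37}{144} \cdot \frac{16}{144}\right)}{144} = .0024$$

$$\therefore 95\%\ CI: 0.15 \pm 1.96\sqrt{.0024} = 0.05 \text{ to } 0.24$$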
The connection between McNemar and Cochran-Mantel-Haenszel Tests
View each pair as its own “age-gender” stratum
Example: concordant for exposure (cell “a” from before)

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0        x 9

               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1        x 37

               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0        x 16

               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1        x 82
Mantel-Haenszel for pair-matched data

We want to know the relationship between diabetes and MI controlling for age and gender (the matching variables).

Mantel-Haenszel methods apply.
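RECALL: The Mantel-Haenszel Summary Odds Ratio

$$OR_{MH} = \frac{\sum_{i=1}^{k} \frac{a_i d_i}{T_i}}{\sum_{i=1}^{k} \frac{b_i c_i}{T_i}}$$

where T_i is the total count in stratum i and each stratum is laid out as:

               Case   Control
Exposed          a       b
Not Exposed      c       d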
               Case (MI)   Control
Diabetes           1          1        ad/T = 0
No diabetes        0          0        bc/T = 0       x 9

               Case (MI)   Control
Diabetes           1          0        ad/T = 1/2
No diabetes        0          1        bc/T = 0       x 37

               Case (MI)   Control
Diabetes           0          1        ad/T = 0
No diabetes        1          0        bc/T = 1/2     x 16

               Case (MI)   Control
Diabetes           0          0        ad/T = 0
No diabetes        1          1        bc/T = 0       x 82
Mantel-Haenszel Summary OR

$$OR_{MH} = \frac{\sum_{i=1}^{144} \frac{a_i d_i}{2}}{\sum_{i=1}^{144} \frac{b_i c_i}{2}} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16}$$
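Treating every pair as its own stratum can be simulated by listing the 144 two-person tables explicitly. A minimal sketch; the `strata` list and function name are illustrative, with each tuple (a, b, c, d) laid out as case-exposed, control-exposed, case-unexposed, control-unexposed.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of 2x2 strata (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# One stratum per matched pair in the MI/diabetes example
strata = (
    [(1, 1, 0, 0)] * 9      # both have diabetes
    + [(1, 0, 0, 1)] * 37   # only the case has diabetes
    + [(0, 1, 1, 0)] * 16   # only the control has diabetes
    + [(0, 0, 1, 1)] * 82   # neither has diabetes
)

or_mh = mantel_haenszel_or(strata)
print(f"OR_MH = {or_mh:.4f}  (37/16 = {37 / 16:.4f})")
```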
Mantel-Haenszel Test Statistic (same as McNemar’s)

Summing over the strata k = 1, ..., K:

$$\frac{\left[\sum_{k=1}^{K} (a_k - E(a_k))\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1$$

recall:

$$E(a_k) = \frac{(a_k + b_k)(a_k + c_k)}{n_k}$$

$$Var(a_k) = \frac{(a_k + b_k)(c_k + d_k)(a_k + c_k)(b_k + d_k)}{n_k^2 (n_k - 1)}$$
Concordant cells contribute nothing to Mantel-Haenszel statistic (observed=expected)

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0

$$E(a_k) = \frac{(2)(1)}{2} = 1$$

$$a_k - E(a_k) = 1 - 1 = 0$$

$$Var(a_k) = \frac{(2)(0)(1)(1)}{2^2 (2-1)} = 0$$

               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1

$$E(a_k) = \frac{(0)(1)}{2} = 0$$

$$a_k - E(a_k) = 0 - 0 = 0$$

$$Var(a_k) = \frac{(0)(2)(1)(1)}{2^2 (2-1)} = 0$$

recall:

$$E(a_k) = \frac{(row1)(col1)}{n_k}$$

$$Var(a_k) = \frac{(row1)(row2)(col1)(col2)}{n_k^2 (n_k - 1)}$$
Discordant cells

               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1

$$E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2}$$

$$a_k - E(a_k) = 1 - \frac{1}{2} = \frac{1}{2}$$

$$Var(a_k) = \frac{(1)(1)(1)(1)}{2^2 (2-1)} = \frac{1}{4}$$

               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0

$$E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2}$$

$$a_k - E(a_k) = 0 - \frac{1}{2} = -\frac{1}{2}$$

$$Var(a_k) = \frac{(1)(1)(1)(1)}{2^2 (2-1)} = \frac{1}{4}$$

recall:

$$E(a_k) = \frac{(row1)(col1)}{n_k}$$

$$Var(a_k) = \frac{(row1)(row2)(col1)(col2)}{n_k^2 (n_k - 1)}$$
$$\chi^2_1 = \frac{\left[\sum_{k} (a_k - E(a_k))\right]^2}{\sum_{k} Var(a_k)} = \frac{[37(.5) - 16(.5)]^2}{(37+16)(.25)} = \frac{[.5(37-16)]^2}{(53)(.25)}$$

$$= \frac{.5^2 (37-16)^2}{.25(53)} = \frac{(37-16)^2}{53} = 8.32; \quad p < .01$$
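$$\chi^2_{CMH} = \frac{\left[\sum_{k} (a_k - E(a_k))\right]^2}{\sum_{k} Var(a_k)} = \frac{\left[\sum_{\text{case disc. cells}} .5 - \sum_{\text{control disc. cells}} .5\right]^2}{\sum_{\text{disc. cells}} .25} = \frac{[.5(b) - .5(c)]^2}{(b+c)(.25)}$$

$$= \frac{.5^2 (b-c)^2}{.25(b+c)} = \frac{(b-c)^2}{b+c} = \text{McNemar's} \sim \chi^2_1$$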
Example: Salmonella Outbreak in France, 1996

From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.
Epidemic Curve
Matched Case Control Study

Case = Salmonella gastroenteritis.

Community controls (1:1) matched for:
• age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years)
• gender
• city of residence
Results
In 2x2 table form: any goat’s cheese

                    Controls
Cases           Goat’s cheese   None   Total
Goat’s cheese        23          23      46
None                  6           7      13
Total                29          30      59

$$OR = \frac{b}{c} = \frac{23}{6} = 3.8$$
In 2x2 table form: Brand A goat’s cheese

                            Controls
Cases                   Brand A goat’s cheese   None   Total
Brand A goat’s cheese            8               24      32
None                             2               25      27
Total                           10               49      59

$$OR = \frac{b}{c} = \frac{24}{2} = 12.0$$
               Case (Salmonella)   Control
Brand A               1               1
None                  0               0        x 8

               Case (Salmonella)   Control
Brand A               1               0
None                  0               1        x 24

               Case (Salmonella)   Control
Brand A               0               1
None                  1               0        x 2

               Case (Salmonella)   Control
Brand A               0               0
None                  1               1        x 25
8 concordant exposed (using Agresti notation here!):

$$\mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\, n_{+1k}}{n_{++k}} = \frac{2 \cdot 1}{2} = 1$$

$$\text{Observed}(n_{11k}) - \mu_{11k} = 1 - 1 = 0$$

$$Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{2 \cdot 1 \cdot 0 \cdot 1}{4(2-1)} = 0$$

Summary: 8 concordant-exposed pairs (=strata) contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).
25 concordant unexposed:

$$\mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\, n_{+1k}}{n_{++k}} = \frac{0 \cdot 1}{2} = 0$$

$$\text{Observed}(n_{11k}) - \mu_{11k} = 0 - 0 = 0$$

$$Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{0 \cdot 1 \cdot 2 \cdot 1}{4(2-1)} = 0$$

Summary: 25 concordant-unexposed pairs contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).
2 discordant cells favor control:

$$\mu_{11k} = \frac{(1)(1)}{2} = \frac{1}{2}$$

$$\text{Observed}(n_{11k}) - \mu_{11k} = 0 - .5 = -.5$$

$$Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2-1)} = \frac{1}{4}$$

Summary: 2 discordant “control-exposed” pairs contribute -.5 each to the numerator (observed-expected= -.5) and .25 each to the denominator (variance= .25).
24 discordant cells favor case:

$$\mu_{11k} = \frac{(1)(1)}{2} = \frac{1}{2}$$

$$\text{Observed}(n_{11k}) - \mu_{11k} = 1 - .5 = .5$$

$$Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2-1)} = \frac{1}{4}$$

Summary: 24 discordant “case-exposed” pairs contribute +.5 each to the numerator (observed-expected= +.5) and .25 each to the denominator (variance= .25).
$$\chi^2_{CMH} = \frac{[8(0) + 25(0) + 24(.5) - 2(.5)]^2}{0 + 0 + 24(.25) + 2(.25)} = \frac{22^2 (.25)}{26(.25)} = \frac{22^2}{26} = \frac{(24-2)^2}{26} = \frac{(b-c)^2}{b+c}$$
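Because concordant pairs drop out, both cheese tables reduce to their discordant counts. The sketch below sums the per-pair contributions (+.5 or -.5 to the numerator, .25 to the variance); the helper name is hypothetical, and the chi-square values it prints are computed here rather than quoted from the source.

```python
def matched_pair_summary(b, c):
    """Matched-pair OR and CMH/McNemar chi-square from discordant counts.

    b = pairs where only the case was exposed, c = pairs where only the control was.
    """
    numerator = 0.5 * b - 0.5 * c   # sum of (observed - expected) over discordant pairs
    variance = 0.25 * (b + c)       # each discordant pair contributes 1/4
    return b / c, numerator ** 2 / variance


any_or, any_chi2 = matched_pair_summary(b=23, c=6)      # any goat's cheese
brand_or, brand_chi2 = matched_pair_summary(b=24, c=2)  # Brand A goat's cheese

print(f"any goat's cheese: OR = {any_or:.1f}, chi-square = {any_chi2:.2f}")
print(f"Brand A:           OR = {brand_or:.1f}, chi-square = {brand_chi2:.2f}")
```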
Diagnostic Testing and Screening Tests
Characteristics of a diagnostic test
• Sensitivity = Probability that, if you truly have the disease, the diagnostic test will catch it.
• Specificity = Probability that, if you truly do not have the disease, the test will register negative.
Calculating sensitivity and specificity from a 2x2 table

                        Screening Test
Truly have disease       +       -      Total
        +                a       b       a+b
        -                c       d       c+d

$$\text{Sensitivity} = \frac{a}{a+b}$$
Among those with true disease, how many test positive?

$$\text{Specificity} = \frac{d}{c+d}$$
Among those without the disease, how many test negative?
Hypothetical Example

                               Mammography
Breast cancer (on biopsy)       +       -      Total
        +                       9       1        10
        -                     109     881       990

Sensitivity=9/10=.90 (1 false negative out of 10 cases)

Specificity=881/990=.89 (109 false positives out of 990)
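The same arithmetic as a small helper, in case it is useful for other tables. A minimal sketch; the function and argument names (tp, fn, fp, tn for the cells a, b, c, d) are illustrative.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Row-wise accuracy measures from a 2x2 table of true status by test result."""
    sensitivity = tp / (tp + fn)   # among the truly diseased, fraction testing positive
    specificity = tn / (tn + fp)   # among the truly disease-free, fraction testing negative
    return sensitivity, specificity


# Hypothetical mammography table: a=9, b=1, c=109, d=881
sens, spec = sensitivity_specificity(tp=9, fn=1, fp=109, tn=881)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```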


What factors determine the effectiveness of screening?
• The prevalence (risk) of disease.
• The effectiveness of screening in preventing illness or death.
– Is the test any good at detecting disease/precursor (sensitivity of the test)?
– Is the test detecting a clinically relevant condition?
– Is there anything we can do if disease (or pre-disease) is detected (cures, treatments)?
– Does detecting and treating disease at an earlier stage really result in a better outcome?
• The risks of screening, such as false positives and radiation.
Positive predictive value
• The probability that if you test positive for the disease, you actually have the disease.
• Depends on the characteristics of the test (sensitivity, specificity) and the prevalence of disease.
Example: Mammography
• Mammography utilizes ionizing radiation to image breast tissue.
• The examination is performed by compressing the breast firmly between a plastic plate and an x-ray cassette that contains special x-ray film.
• Mammography can identify breast cancers too small to detect on physical examination.
• Early detection and treatment of breast cancer (before metastasis) can improve a woman’s chances of survival.
• Studies show that, among 50-69 year-old women, screening results in 20-35% reductions in mortality from breast cancer.
Mammography
• Controversy exists over the efficacy of mammography in reducing mortality from breast cancer in 40-49 year old women.
• Mammography has a high rate of false positive tests that cause anxiety and necessitate further costly diagnostic procedures.
• Mammography exposes a woman to some radiation, which may slightly increase the risk of mutations in breast tissue.
Example
• A 60-year old woman has an abnormal mammogram; what is the chance that she has breast cancer? I.e., what is the positive predictive value?
Calculating PPV and NPV from a 2x2 table

                        Screening Test
Truly have disease       +       -
        +                a       b
        -                c       d
      Total             a+c     b+d

$$PPV = \frac{a}{a+c}$$
Among those who test positive, how many truly have the disease?

$$NPV = \frac{d}{b+d}$$
Among those who test negative, how many truly do not have the disease?
Hypothetical Example

                               Mammography
Breast cancer (on biopsy)       +       -
        +                       9       1
        -                     109     881
      Total                   118     882

PPV=9/118=7.6%

NPV=881/882=99.9%

Prevalence of disease = 10/1000 =1%
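Predictive values read the table by column instead of by row. A minimal sketch with illustrative names (tp, fn, fp, tn for the cells a, b, c, d):

```python
def predictive_values(tp, fn, fp, tn):
    """Column-wise measures from a 2x2 table of true status by test result."""
    ppv = tp / (tp + fp)   # among positive tests, fraction truly diseased
    npv = tn / (tn + fn)   # among negative tests, fraction truly disease-free
    prevalence = (tp + fn) / (tp + fn + fp + tn)
    return ppv, npv, prevalence


ppv, npv, prevalence = predictive_values(tp=9, fn=1, fp=109, tn=881)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}, prevalence = {prevalence:.0%}")
```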


What if disease was twice as prevalent in the population?

                               Mammography
Breast cancer (on biopsy)       +       -      Total
        +                      18       2        20
        -                     108     872       980

sensitivity=18/20=.90

specificity=872/980=.89

Sensitivity and specificity are characteristics of the test, so they don’t change!
What if disease was more prevalent?

                               Mammography
Breast cancer (on biopsy)       +       -
        +                      18       2
        -                     108     872
      Total                   126     874

PPV=18/126=14.3%

NPV=872/874=99.8%

Prevalence of disease = 20/1000 =2%


Conclusions
• Positive predictive value increases with increasing prevalence of disease.
• It also increases if you change the diagnostic test to improve its accuracy.
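The dependence of PPV on prevalence follows from Bayes' rule with sensitivity and specificity held fixed. A minimal sketch using the hypothetical mammography values from the earlier table (sensitivity 9/10, specificity 881/990); the 5% and 10% prevalences are extra illustrative inputs, not from the slides.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SENSITIVITY = 9 / 10
SPECIFICITY = 881 / 990

ppv_by_prevalence = {
    prev: ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prev)
    for prev in (0.01, 0.02, 0.05, 0.10)
}
for prev, ppv in ppv_by_prevalence.items():
    print(f"prevalence {prev:.0%}: PPV = {ppv:.1%}")
```

At 2% prevalence this gives the same rounded 14.3% as the table, even though the table's counts were rounded to whole people.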
