# Assignment 1

## Fundamentals of Epidemiological Analysis I 3009372

Diego Alejandro Muñoz Gaviria
March 8, 2018

1. To compute the p-value in Fisher's test, one must calculate the probability of obtaining
the observed table while keeping the row and column totals fixed. Assuming the
following data (2×2 table):

| Factor? | Yes | No | Total |
|---|---|---|---|
| Yes | a | b | a+b |
| No | c | d | c+d |
| Total | a+c | b+d | a+b+c+d |

$$P(T_2 = a) = p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}, \qquad n = a+b+c+d$$

| | Column 1 | Column 2 | Total |
|---|---|---|---|
| Row 1 | x | r-x | r |
| Row 2 | C-x | N-r-C+x | N-r |
| Total | C | N-C | N |

Assuming $H_0: P_1 = P_2$ is true, where $P_1$ is the probability that an observation in row 1
is classified in column 1 and $P_2$ is the probability that an observation in row 2
is classified in column 1, then:

$$P(T_2 = x) = \frac{\binom{r}{x}\binom{N-r}{C-x}}{\binom{N}{C}}, \qquad x = 0, 1, 2, \dots, \min\{r, C\}$$

Expanding the binomial coefficients:

$$\frac{\binom{r}{x}\binom{N-r}{C-x}}{\binom{N}{C}} = \frac{\dfrac{r!}{x!\,(r-x)!}\cdot\dfrac{(N-r)!}{(C-x)!\,(N-r-C+x)!}}{\dfrac{N!}{C!\,(N-C)!}} = \frac{r!\,(N-r)!\,C!\,(N-C)!}{x!\,(r-x)!\,(C-x)!\,(N-r-C+x)!\,N!}$$

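Setting $a = x$, $b = r - x$, $c = C - x$, $d = N - r - C + x$ and $n = N$ in the given table
(so that $r = a + b$ and $C = a + c$), we have:

$$\frac{\binom{r}{x}\binom{N-r}{C-x}}{\binom{N}{C}} = \frac{(a+b)!\,(n-a-b)!\,(a+c)!\,(n-a-c)!}{a!\,b!\,c!\,d!\,n!} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}$$

since $n - a - b = c + d$ and $n - a - c = b + d$.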
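
As a numerical sanity check of the identity above, the following Python sketch evaluates the table probability both ways: from the hypergeometric form and from the closed factorial form. The function names are illustrative, and the example counts are those of the asthma table in exercise 3 (a = 8, b = 4, c = 12, d = 31).

```python
from math import comb, factorial


def table_prob_hypergeom(a, b, c, d):
    """P(T2 = a) from the hypergeometric form, with all margins held fixed."""
    r, C, N = a + b, a + c, a + b + c + d
    return comb(r, a) * comb(N - r, C - a) / comb(N, C)


def table_prob_factorial(a, b, c, d):
    """The same probability from the closed factorial form."""
    n = a + b + c + d
    num = factorial(a + b) * factorial(c + d) * factorial(a + c) * factorial(b + d)
    den = factorial(n) * factorial(a) * factorial(b) * factorial(c) * factorial(d)
    return num / den


# Illustrative counts: asthma table of exercise 3.
print(table_prob_hypergeom(8, 4, 12, 31))
print(table_prob_factorial(8, 4, 12, 31))
```

Both calls should agree up to floating-point error; for this table the value is about 0.0150, the table probability that SAS reports in exercise 3.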


2. The table below shows data from a random sample of middle-aged men taken in Kuopio,
Finland (Kauhanen et al., 1997). A beer binger is defined as someone who usually drinks six or
more bottles per drinking session. This was recorded at the outset of the study; mortality
was recorded from death certificates over an average of 7.7 years of follow-up.

| Beer binger? | Cardiovascular death: yes | Cardiovascular death: no | Total |
|---|---|---|---|
| Yes | 7 | 63 | 70 |
| No | 52 | 1519 | 1571 |
| Total | 59 | 1582 | 1641 |

(i) Estimate the risk of cardiovascular death for bingers and for non-bingers, together
with 95 % confidence intervals.

$r_1$: risk of cardiovascular death for bingers

$$r_1 = \frac{7}{70} = 0.1, \qquad \widehat{SE}(r_1) = \sqrt{\frac{0.1\,(1 - 0.1)}{70}} = 0.03586$$

A 95 % confidence interval for $r_1$: $0.1 \pm 1.96 \times 0.03586$: (0.0297, 0.1703)

$r_2$: risk of cardiovascular death for non-bingers

$$r_2 = \frac{52}{1571} = 0.0331, \qquad \widehat{SE}(r_2) = \sqrt{\frac{0.0331\,(1 - 0.0331)}{1571}} = 0.0045$$

A 95 % confidence interval for $r_2$: $0.0331 \pm 1.96 \times 0.0045$: (0.0243, 0.0419)
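
The same calculation can be scripted. A minimal Python sketch, assuming the normal-approximation (Wald) interval used above; `risk_with_ci` is an illustrative helper, not part of the original solution.

```python
from math import sqrt


def risk_with_ci(events, total, z=1.96):
    """Risk estimate, its standard error and a Wald confidence interval."""
    r = events / total
    se = sqrt(r * (1 - r) / total)
    return r, se, (r - z * se, r + z * se)


for label, events, total in [("bingers", 7, 70), ("non-bingers", 52, 1571)]:
    r, se, (lo, hi) = risk_with_ci(events, total)
    print(f"{label}: risk={r:.4f} SE={se:.4f} 95% CI=({lo:.4f}, {hi:.4f})")
```

The Wald interval is the simplest choice; with only 7 events among the bingers, an exact or score interval may behave better.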

(ii) Estimate the relative risk of cardiovascular death for bingers compared to non-bingers,
together with a 95 % confidence interval.

The relative risk for bingers is

$$\hat\lambda = \frac{r_1}{r_2} = \frac{7/70}{52/1571} = 3.0212$$

so the risk of cardiovascular death among bingers is about three times that among
non-bingers.
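
The question also asks for a 95 % confidence interval, which the answer above does not report. A sketch of one way to obtain it in Python, assuming the usual log-scale standard error $\sqrt{1/a - 1/(a+b) + 1/c - 1/(c+d)}$ (the same form used for the relative risk in exercise 4(i)); the function name and argument order are illustrative.

```python
from math import exp, log, sqrt


def relative_risk_ci(a, b, c, d, z=1.96):
    """RR of row 1 versus row 2 for the table [[a, b], [c, d]], with a log-scale CI."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = exp(log(rr) - z * se_log)
    upper = exp(log(rr) + z * se_log)
    return rr, se_log, (lower, upper)


# Binging table: 7 deaths among 70 bingers, 52 among 1571 non-bingers.
rr, se_log, (lo, hi) = relative_risk_ci(7, 63, 52, 1519)
print(f"RR={rr:.4f} SE[log RR]={se_log:.4f} 95% CI=({lo:.3f}, {hi:.3f})")
```

For the binging table this gives limits of roughly 1.42 and 6.41, so the interval excludes 1.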

(iii) Estimate the odds of cardiovascular death for bingers and for non-bingers.

Odds for bingers: $\dfrac{7}{63} = 0.111$

Odds for non-bingers: $\dfrac{52}{1519} = 0.034$

(iv) Estimate the odds ratio for cardiovascular death for bingers compared to non-bingers,
together with a 95 % confidence interval.

$$\hat\Psi = \frac{(7)(1519)}{(52)(63)} = 3.2457$$

with

$$\widehat{SE}[\log\hat\Psi] = \sqrt{\frac{1}{7} + \frac{1}{63} + \frac{1}{52} + \frac{1}{1519}} = 0.4226$$

A 95 % confidence interval for the odds ratio ($\Psi$): $e^{1.1773 \pm 1.96 \times 0.4226}$: (1.41765, 7.4305)
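
A short Python sketch of the odds ratio and its log-scale interval, assuming the standard error $\sqrt{1/a + 1/b + 1/c + 1/d}$ evaluated on the four cell counts 7, 63, 52 and 1519; `odds_ratio_ci` is an illustrative name.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for the table [[a, b], [c, d]] with a log-scale confidence interval."""
    psi = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(psi) - z * se_log)
    upper = exp(log(psi) + z * se_log)
    return psi, se_log, (lower, upper)


psi, se_log, (lo, hi) = odds_ratio_ci(7, 63, 52, 1519)
print(f"OR={psi:.4f} SE[log OR]={se_log:.4f} 95% CI=({lo:.3f}, {hi:.3f})")
```

Working with unrounded intermediate values moves the limits only in the third decimal place relative to the hand calculation.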

(v) Test the null hypothesis that beer binging has no relationship with cardiovascular
death.
$H_0$: beer binging and cardiovascular death are independent.
Using R:
> binging=matrix(c(7,52,63,1519),ncol=2)
> chisq.test(binging,correct=F)

Pearson's Chi-squared test

data: binging
X-squared = 8.6532, df = 1, p-value = 0.003265

Warning message:
In chisq.test(binging, correct = F) :
Chi-squared approximation may be incorrect

Using SAS:
data Binging;
input binger $ CD $ Count @@;
datalines;

si si 7 si no 63
no si 52 no no 1519
;
proc freq data=Binging;
weight Count;
exact pchi;
tables binger*CD /noprint ;
run;

The FREQ Procedure

Statistics for Table of binger by CD

| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 8.6532 | 0.0033 |
| Likelihood Ratio Chi-Square | 1 | 6.0382 | 0.0140 |
| Continuity Adj. Chi-Square | 1 | 6.8307 | 0.0090 |
| Mantel-Haenszel Chi-Square | 1 | 8.6479 | 0.0033 |
| Phi Coefficient | | 0.0726 | |
| Contingency Coefficient | | 0.0724 | |
| Cramer's V | | 0.0726 | |

WARNING: 25% of the cells have expected counts less than 5. (Asymptotic) Chi-Square may not be a valid test.

| Pearson Chi-Square Test | |
|---|---|
| Chi-Square | 8.6532 |
| DF | 1 |
| Asymptotic Pr > ChiSq | 0.0033 |
| Exact Pr >= ChiSq | 0.0110 |

| Fisher's Exact Test | |
|---|---|
| Cell (1,1) Frequency (F) | 1519 |
| Left-sided Pr <= F | 0.9972 |
| Right-sided Pr >= F | 0.0110 |
| Table Probability (P) | 0.0082 |
| Two-sided Pr <= P | 0.0110 |

Sample Size = 1641

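$H_0$ is rejected: there is sample evidence of dependence between beer binging and
cardiovascular death (asymptotic Pearson p = 0.0033; exact p = 0.0110, which is the safer
figure here because one expected count is below 5). (Beer kills!)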
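
As a cross-check of the uncorrected Pearson statistic reported by R and SAS, here is a Python sketch that computes it directly from the four cell counts. It assumes the 2×2 shortcut $X^2 = n(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]$ and uses the fact that, with 1 degree of freedom, the upper-tail probability equals `erfc(sqrt(X2 / 2))`; the function name is illustrative.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-squared test for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # With 1 df, P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    # Smallest expected count; below 5 the chi-squared approximation is doubtful.
    min_expected = min(r * k / n for r in rows for k in cols)
    return chi2, p_value, min_expected


chi2, p_value, min_expected = pearson_chi2_2x2(7, 63, 52, 1519)
print(f"X-squared={chi2:.4f} p-value={p_value:.6f} min expected={min_expected:.2f}")
```

The smallest expected count for this table is about 2.5, which is what triggers the warnings from both packages.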

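(vi) Estimate the attributable risk of cardiovascular death for beer binging, together with a
95 % confidence interval.

$$\hat\Theta = \frac{\dfrac{59}{1641} - \dfrac{52}{1571}}{\dfrac{59}{1641}} = 0.07937$$

An approximate 95 % CI for $\Theta$ is

$$\frac{\big((7 \times 1519) - (63 \times 52)\big)\,e^{\pm\mu}}{(1641 \times 52) + \big((7 \times 1519) - (63 \times 52)\big)\,e^{\pm\mu}}$$

with

$$\mu = \frac{1.96\,(59)(1571)}{(7 \times 1519) - (63 \times 52)} \sqrt{\frac{(7 \times 1519)(1589) + 63\,(52)^2}{(1641 \times 52)(59)(1571)}} = 1.1470$$

where $1589 = n - c = 1641 - 52$. The interval is: (0.0267, 0.2135)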
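
A sketch of the attributable risk and its approximate interval in Python. It assumes the logit-based limits shown above, with the term under the square root taken as $ad(n - c) + c^2 b$ over $nc(a+c)(c+d)$, which is what the delta method gives for $\log(1 - \hat\Theta)$ under multinomial sampling; the function and variable names are illustrative.

```python
from math import exp, sqrt


def attributable_risk_ci(a, b, c, d, z=1.96):
    """Attributable risk for the table [[a, b], [c, d]] (row 1 = exposed)."""
    n = a + b + c + d
    cross = a * d - b * c
    theta = cross / ((a + c) * (c + d))
    # Assumed form of the standard error term: ad(n - c) + c^2 b in the numerator.
    mu = (z * (a + c) * (c + d) / cross) * sqrt(
        (a * d * (n - c) + c ** 2 * b) / (n * c * (a + c) * (c + d))
    )

    def limit(sign):
        w = cross * exp(sign * mu)
        return w / (n * c + w)

    return theta, mu, (limit(-1), limit(+1))


theta, mu, (lo, hi) = attributable_risk_ci(7, 63, 52, 1519)
print(f"theta={theta:.4f} mu={mu:.4f} 95% CI=({lo:.4f}, {hi:.4f})")
```

Applied to the binging table this gives $\hat\Theta \approx 0.079$ with limits of roughly 0.027 and 0.21.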

3. From an investigation into asthma in seven primary schools in the South of England, Storr
et al. (1987) reported data on 55 pupils with asthma. Twenty of these pupils lost 10 days
or more of schooling over the previous year. Of these 20, eight had parents who provided
adequate medication. A further 35 pupils with asthma lost less than 10 days of schooling;
four of these had parents who provided adequate medication. Use Fisher's exact test to
see whether the time lost from school is unrelated to the provision of adequate medication.

Asthma in seven primary schools in the South of England: time lost from school by provision of adequate medication

| Adequate medication | ≥ 10 days | < 10 days | Total |
|---|---|---|---|
| Yes | 8 | 4 | 12 |
| No | 12 | 31 | 43 |
| Total | 20 | 35 | 55 |

Using R:

> asthma=matrix(c(8,12,4,31),ncol=2)  # counts from the problem statement
> fisher.test(asthma,alternative="less")

Fisher's Exact Test for Count Data

data: asthma
p-value = 0.9972
alternative hypothesis: true odds ratio is less than 1
95 percent confidence interval:
0.00000 21.02832
sample estimates:
odds ratio
4.993687

Using SAS:

data Asthma;
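input medication $ days $ Count @@;
/* counts are from the problem statement; the level labels below are illustrative */
datalines;
yes ge10 8 yes lt10 4
no ge10 12 no lt10 31
;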
proc freq data=Asthma order=data;
weight Count;
exact fisher;
tables medication*days /noprint ;
run;

Tuesday, March 6, 2018 11:30:15

The FREQ Procedure

Statistics for Table of medication by days

| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 6.0908 | 0.0136 |
| Likelihood Ratio Chi-Square | 1 | 5.9084 | 0.0151 |
| Continuity Adj. Chi-Square | 1 | 4.5310 | 0.0333 |
| Mantel-Haenszel Chi-Square | 1 | 5.9801 | 0.0145 |
| Phi Coefficient | | 0.3328 | |
| Contingency Coefficient | | 0.3158 | |
| Cramer's V | | 0.3328 | |

WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

| Fisher's Exact Test | |
|---|---|
| Cell (1,1) Frequency (F) | 8 |
| Left-sided Pr <= F | 0.9972 |
| Right-sided Pr >= F | 0.0178 |
| Table Probability (P) | 0.0150 |
| Two-sided Pr <= P | 0.0197 |

Sample Size = 55

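The two-sided Fisher exact p-value in the SAS output is 0.0197, so at the 5 % level the
hypothesis that time lost from school is unrelated to the provision of adequate medication
is rejected. Pupils whose parents provided adequate medication were more likely to lose
10 days or more (8 of 12, against 12 of 43 among the others). The one-sided p-value of
0.9972 from the R call with alternative="less" tests the opposite direction and is not the
relevant one here.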
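
The Fisher p-values can be reproduced by enumerating every table with the same margins. The Python sketch below assumes the usual two-sided definition (sum the probabilities of all tables that are no more probable than the observed one); the function name and the returned keys are illustrative.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Exact p-values for the table [[a, b], [c, d]] with all margins fixed."""
    r, C, N = a + b, a + c, a + b + c + d
    support = range(max(0, r + C - N), min(r, C) + 1)
    pmf = {x: comb(r, x) * comb(N - r, C - x) / comb(N, C) for x in support}
    p_obs = pmf[a]
    left = sum(p for x, p in pmf.items() if x <= a)
    right = sum(p for x, p in pmf.items() if x >= a)
    # Small relative tolerance so that ties with the observed table are counted.
    two_sided = sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-7))
    return {"table": p_obs, "left": left, "right": right, "two_sided": min(1.0, two_sided)}


# Asthma table: rows = adequate medication (yes, no), columns = (>= 10 days, < 10 days).
for name, value in fisher_exact_2x2(8, 4, 12, 31).items():
    print(f"{name}: {value:.4f}")
```

For the asthma table this should match the SAS figures to four decimals: table probability 0.0150, left-sided 0.9972, right-sided 0.0178 and two-sided 0.0197.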

4. In a Danish study of healthy mothers (Tetzschner et al., 1997), urinary incontinence and
pudendal nerve terminal motor latency (PNTML) were recorded 12 weeks after delivery.
PNTML was recorded as 'high' if it was in excess of the normal range for the relevant
laboratory; otherwise it was 'low'. Of the 17 women with high PNTML, 6 were incontinent;
of the women with low PNTML, 19 were incontinent and 110 were not.

Urinary incontinence by PNTML, recorded 12 weeks after delivery

| PNTML | Incontinence: yes | Incontinence: no | Total |
|---|---|---|---|
| High | 6 | 11 | 17 |
| Low | 19 | 110 | 129 |
| Total | 25 | 121 | 146 |

(i) Calculate the relative risk for incontinence comparing high against low PNTML,
together with a 95 % confidence interval.

The relative risk of urinary incontinence for high versus low PNTML is

$$\hat\lambda = \frac{6/17}{19/129} = 2.3963$$

with

$$\widehat{SE}[\log\hat\lambda] = \sqrt{\frac{1}{6} - \frac{1}{17} + \frac{1}{19} - \frac{1}{129}} = 0.3908$$

A 95 % confidence interval for the relative risk ($\lambda$): $e^{0.8739 \pm 1.96 \times 0.3908}$: (1.1139, 5.1545)

(ii) Calculate the odds ratio for incontinence comparing high against low PNTML,
together with a 95 % confidence interval.

$$\hat\Psi = \frac{(6)(110)}{(19)(11)} = 3.1579$$

with

$$\widehat{SE}[\log\hat\Psi] = \sqrt{\frac{1}{6} + \frac{1}{11} + \frac{1}{19} + \frac{1}{110}} = 0.5651$$

A 95 % confidence interval for the odds ratio ($\Psi$): $e^{1.1499 \pm 1.96 \times 0.5651}$: (1.0432, 9.5591)

(iii) Test the null hypothesis that PNTML has no effect on incontinence.
$H_0$: PNTML and incontinence are independent.
Using R:
> PNTML=matrix(c(6,19,11,110),ncol=2)
> chisq.test(PNTML,correct=F)

Pearson's Chi-squared test

data: PNTML
X-squared = 4.4765, df = 1, p-value = 0.03436

Warning message:
In chisq.test(PNTML, correct = F) :
Chi-squared approximation may be incorrect

Using SAS:

data PNTML;
input PNTML $ Incontinence $ Count @@;
datalines;
high yes 6 high no 11
low yes 19 low no 110
;
proc freq data=PNTML;
weight Count;
exact pchi;
tables PNTML*Incontinence /noprint ;
run;

Wednesday, March 7, 2018 19:37:40

The FREQ Procedure

Statistics for Table of PNTML by Incontinence

| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 4.4765 | 0.0344 |
| Likelihood Ratio Chi-Square | 1 | 3.7763 | 0.0520 |
| Continuity Adj. Chi-Square | 1 | 3.1447 | 0.0762 |
| Mantel-Haenszel Chi-Square | 1 | 4.4459 | 0.0350 |
| Phi Coefficient | | -0.1751 | |
| Contingency Coefficient | | 0.1725 | |
| Cramer's V | | -0.1751 | |

WARNING: 25% of the cells have expected counts less than 5. (Asymptotic) Chi-Square may not be a valid test.

| Pearson Chi-Square Test | |
|---|---|
| Chi-Square | 4.4765 |
| DF | 1 |
| Asymptotic Pr > ChiSq | 0.0344 |
| Exact Pr >= ChiSq | 0.0454 |

| Fisher's Exact Test | |
|---|---|
| Cell (1,1) Frequency (F) | 11 |
| Left-sided Pr <= F | 0.0454 |
| Right-sided Pr >= F | 0.9887 |
| Table Probability (P) | 0.0340 |
| Two-sided Pr <= P | 0.0786 |

Sample Size = 146

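$H_0$ is rejected at the 5 % level by the Pearson chi-squared test (asymptotic p = 0.0344,
exact p = 0.0454): there is sample evidence of dependence between high PNTML and
incontinence. The evidence is borderline, however, since the two-sided Fisher exact
p-value is 0.0786.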

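(iv) Calculate the attributable risk for incontinence that is ascribable to high PNTML,
together with a 95 % confidence interval.

$$\hat\Theta = \frac{\dfrac{25}{146} - \dfrac{19}{129}}{\dfrac{25}{146}} = 0.1398$$

An approximate 95 % CI for $\Theta$ is

$$\frac{\big((6 \times 110) - (11 \times 19)\big)\,e^{\pm\mu}}{(146 \times 19) + \big((6 \times 110) - (11 \times 19)\big)\,e^{\pm\mu}}$$

with

$$\mu = \frac{1.96\,(25)(129)}{(6 \times 110) - (11 \times 19)} \sqrt{\frac{(6 \times 110)(127) + 11\,(19)^2}{(146 \times 19)(25)(129)}} = 1.3884$$

where $127 = n - c = 146 - 19$. The interval is: (0.0390, 0.3946)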

5. Refer to the total columns of the Glasgow MONICA survey summary data
given in Table C.3.

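Glasgow MONICA survey totals

| Sex | Factor IX | CVD: yes | CVD: no | Total |
|---|---|---|---|---|
| Males | High | 122 | 234 | 356 |
| Males | Low | 111 | 243 | 354 |
| Males | Total | 233 | 477 | 710 |
| Females | High | 154 | 245 | 399 |
| Females | Low | 106 | 279 | 385 |
| Females | Total | 260 | 524 | 784 |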

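(i) Calculate the prevalence risk of cardiovascular disease (CVD) by factor IX status for
each sex.

Overall prevalence of CVD for each sex (pooled over factor IX status):

$$pR_m = \frac{233}{710} = 0.32817, \qquad pR_f = \frac{260}{784} = 0.33163$$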

(ii) Calculate the prevalence relative risk for high compared to low factor IX, together with
95 % confidence limits, for each sex.

Males:

$$pRR_m = \frac{122/356}{111/354} = 1.0929$$

with

$$\widehat{SE}[\log(pRR_m)] = \sqrt{\frac{1}{122} - \frac{1}{356} + \frac{1}{111} - \frac{1}{354}} = 0.1076$$

A 95 % confidence interval for the prevalence relative risk ($pRR_m$): $e^{0.0888 \pm 1.96 \times 0.1076}$: (0.8851, 1.3494)

Females:

$$pRR_f = \frac{154/399}{106/385} = 1.4019$$

with

$$\widehat{SE}[\log(pRR_f)] = \sqrt{\frac{1}{154} - \frac{1}{399} + \frac{1}{106} - \frac{1}{385}} = 0.1040$$

A 95 % confidence interval for the prevalence relative risk ($pRR_f$): $e^{0.3378 \pm 1.96 \times 0.1040}$: (1.143, 1.719)

(iii) Repeat (i), but for odds.

Overall odds of CVD for each sex (pooled over factor IX status):

Odds for males: $\dfrac{233}{477} = 0.48847$

Odds for females: $\dfrac{260}{524} = 0.49618$
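
Parts (i) and (iii) ask for the figures by factor IX status, whereas the answers above give the overall figures for each sex. A Python sketch of the stratum-specific prevalence risks and odds, using the counts from the Glasgow MONICA totals; the dictionary layout and the function name are illustrative.

```python
# (CVD yes, CVD no) counts by sex and factor IX status.
monica = {
    "Males": {"High": (122, 234), "Low": (111, 243)},
    "Females": {"High": (154, 245), "Low": (106, 279)},
}


def risk_and_odds(cvd_yes, cvd_no):
    """Prevalence risk yes/(yes+no) and odds yes/no for one stratum."""
    return cvd_yes / (cvd_yes + cvd_no), cvd_yes / cvd_no


for sex, strata in monica.items():
    for level, (yes, no) in strata.items():
        risk, odds = risk_and_odds(yes, no)
        print(f"{sex:8s} {level:5s} risk={risk:.4f} odds={odds:.4f}")
```

This gives prevalences of about 0.343 (high) and 0.314 (low) for males and 0.386 and 0.275 for females, with odds of about 0.521, 0.457, 0.629 and 0.380 respectively.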

(iv) Repeat (ii), but for the odds ratio.

Males:

$$\hat\Psi_m = \frac{(122)(243)}{(234)(111)} = 1.1414$$

with

$$\widehat{SE}[\log\hat\Psi_m] = \sqrt{\frac{1}{122} + \frac{1}{234} + \frac{1}{111} + \frac{1}{243}} = 0.15998$$

A 95 % confidence interval for the odds ratio ($\Psi_m$): $e^{0.13225 \pm 1.96 \times 0.15998}$: (0.83418, 1.56175)

Females:

$$\hat\Psi_f = \frac{(154)(279)}{(245)(106)} = 1.6544$$

with

$$\widehat{SE}[\log\hat\Psi_f] = \sqrt{\frac{1}{154} + \frac{1}{245} + \frac{1}{106} + \frac{1}{279}} = 0.1536$$

A 95 % confidence interval for the odds ratio ($\Psi_f$): $e^{0.50344 \pm 1.96 \times 0.1536}$: (1.22432, 2.23557)

(v) Test whether factor IX has an effect on CVD for men and women separately.
Using R:
> mMONICA=matrix(c(122,111,234,243),ncol=2)
> chisq.test(mMONICA,correct=F)

Pearson's Chi-squared test

data: mMONICA
X-squared = 0.6835, df = 1, p-value = 0.4084

> fMONICA=matrix(c(154,106,245,279),ncol=2)
> chisq.test(fMONICA,correct=F)

Pearson's Chi-squared test

data: fMONICA
X-squared = 10.821, df = 1, p-value = 0.001004

Using SAS:
data mMONICA;
input Factor $ CVD $ Count @@;
datalines;
high yes 122 high no 234
low yes 111 low no 243
;
proc freq data=mMONICA;
weight Count;
exact pchi;
tables Factor*CVD /noprint ;
run;

Wednesday, March 7, 2018 23:58:39

The FREQ Procedure

Statistics for Table of Factor by CVD

| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 0.6835 | 0.4084 |
| Likelihood Ratio Chi-Square | 1 | 0.6837 | 0.4083 |
| Continuity Adj. Chi-Square | 1 | 0.5577 | 0.4552 |
| Mantel-Haenszel Chi-Square | 1 | 0.6825 | 0.4087 |
| Phi Coefficient | | -0.0310 | |
| Contingency Coefficient | | 0.0310 | |
| Cramer's V | | -0.0310 | |

| Pearson Chi-Square Test | |
|---|---|
| Chi-Square | 0.6835 |
| DF | 1 |
| Asymptotic Pr > ChiSq | 0.4084 |
| Exact Pr >= ChiSq | 0.4248 |

| Fisher's Exact Test | |
|---|---|
| Cell (1,1) Frequency (F) | 234 |
| Left-sided Pr <= F | 0.2276 |
| Right-sided Pr >= F | 0.8177 |
| Table Probability (P) | 0.0453 |
| Two-sided Pr <= P | 0.4248 |

Sample Size = 710

data fMONICA;
input Factor $ CVD $ Count @@;
datalines;
high yes 154 high no 245
low yes 106 low no 279
;
proc freq data=fMONICA;
weight Count;
exact pchi;
tables Factor*CVD /noprint ;
run;

Wednesday, March 7, 2018 23:58:39

The FREQ Procedure

Statistics for Table of Factor by CVD

| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 10.8211 | 0.0010 |
| Likelihood Ratio Chi-Square | 1 | 10.8702 | 0.0010 |
| Continuity Adj. Chi-Square | 1 | 10.3277 | 0.0013 |
| Mantel-Haenszel Chi-Square | 1 | 10.8073 | 0.0010 |
| Phi Coefficient | | -0.1175 | |
| Contingency Coefficient | | 0.1167 | |
| Cramer's V | | -0.1175 | |

| Pearson Chi-Square Test | |
|---|---|
| Chi-Square | 10.8211 |
| DF | 1 |
| Asymptotic Pr > ChiSq | 0.0010 |
| Exact Pr >= ChiSq | 0.0011 |

| Fisher's Exact Test | |
|---|---|
| Cell (1,1) Frequency (F) | 106 |
| Left-sided Pr <= F | 0.0006 |
| Right-sided Pr >= F | 0.9996 |
| Table Probability (P) | 0.0003 |
| Two-sided Pr <= P | 0.0011 |

Sample Size = 784

With the datalines as listed, cell (1,1) is high/no = 245. The run shown here had the 106 and 245 counts interchanged, which is the transposed table, so every test statistic and p-value is the same and only the reported cell (1,1) frequency differs.

