
Test of the mean


The central concept is the standard error of the mean, or SEM. It describes how much the mean of n samples varies around their mean value, call it m; that is, $\mathrm{SEM} = \sigma/\sqrt{n}$. This means that if n samples are drawn from a population and their mean is computed, that mean follows a distribution that is narrower than the distribution of the original samples by a factor of $\sqrt{n}$.
The criteria for estimating the mean rest on the fact that sample means will fall, in a given proportion of cases such as 95%, within an interval centred on the true mean. The test statistic is $t = (\bar{x} - \mu_0)/\mathrm{SEM}$, so, "solving" for $\mu_0$, the mean lies within $\bar{x} \pm t_{\alpha/2}\,\mathrm{SEM}$.
In short, the test is based on the sample and on the SEM estimated from that same sample. The interval calculation uses both "statistics", the sample mean and the sample SEM.
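The interval described above can be checked by hand. The following is a minimal sketch, assuming a small made-up sample `x` (the values are hypothetical and not from these notes); it computes the sample SEM and the 95% interval and compares it with the one reported by `t.test()`.

```r
# Hypothetical sample, used only for illustration
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1)
n <- length(x)

xbar <- mean(x)
sem <- sd(x) / sqrt(n)          # sample SEM: s / sqrt(n)

# 97.5% quantile of Student's t with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# 95% confidence interval for the mean: xbar +/- t * SEM
ci_manual <- c(xbar - t_crit * sem, xbar + t_crit * sem)

# The same interval as reported by t.test()
ci_ttest <- as.numeric(t.test(x)$conf.int)

print(ci_manual)
print(ci_ttest)
```

The two printed intervals should agree, since `t.test()` builds its one-sample interval from the same two statistics.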
In R, the result of the test is reported in a line such as:
t = -5.4349, df = 21.982, p-value = 1.855e-05

that is, the value of the random variable t, the degrees of freedom, and the p-value. In general the p-value will be larger or smaller than the threshold that corresponds to the chosen significance level; the default confidence level is 95%, i.e. a threshold of 0.05. So:
the number of samples sets the degrees of freedom of Student's t; the hypothesis H0 (the mean assigned when calling the function) sets p.
A "small" value of p indicates that it is unlikely that, for the samples passed to the function, the mean takes the value given as the parameter (0 by default).
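To make the link between t, the degrees of freedom and p concrete, here is a minimal sketch for the one-sample, two-sided case. The data `x` and the hypothesised mean `mu0` are hypothetical values chosen for illustration.

```r
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1)   # hypothetical data
mu0 <- 5                                          # hypothesised mean (H0)

n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
df <- n - 1

# Two-sided p-value: probability of a |t| at least this large under H0
p_manual <- 2 * pt(-abs(t_stat), df = df)

res <- t.test(x, mu = mu0)
print(c(t = t_stat, df = df, p = p_manual))
print(res$p.value)
```

Changing `mu0` changes t and therefore p, while the degrees of freedom depend only on the number of samples.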
Syntax
alternative = "greater" or "less"; it can be abbreviated, e.g. alt = "g".
Command in R

t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95, ...)
Examples
require(graphics)

t.test(1:10, y = c(7:20)) # P = .00001855
t.test(1:10, y = c(7:20, 200)) # P = .1245 -- NOT significant anymore

## Classical example: Student's sleep data
plot(extra ~ group, data = sleep)
## Traditional interface
with(sleep, t.test(extra[group == 1], extra[group == 2]))
## Formula interface
t.test(extra ~ group, data = sleep)
Remember: in R, the arguments shown in the syntax display their default values.


sleep
   extra group ID
1    0.7     1  1
2   -1.6     1  2
3   -0.2     1  3
4   -1.2     1  4
5   -0.1     1  5
6    3.4     1  6
7    3.7     1  7
8    0.8     1  8
9    0.0     1  9
10   2.0     1 10
11   1.9     2  1
12   0.8     2  2
13   1.1     2  3
14   0.1     2  4
15  -0.1     2  5
16   4.4     2  6
17   5.5     2  7
18   1.6     2  8
19   4.6     2  9
20   3.4     2 10
Output
t.test> require(graphics)

t.test> t.test(1:10, y = c(7:20)) # P = .00001855

Welch Two Sample t-test

data: 1:10 and c(7:20)
t = -5.4349, df = 21.982, p-value = 1.855e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.052802 -4.947198
sample estimates:
mean of x mean of y
5.5 13.5


t.test> t.test(1:10, y = c(7:20, 200)) # P = .1245 -- NOT significant anymore

Welch Two Sample t-test

data: 1:10 and c(7:20, 200)
t = -1.6329, df = 14.165, p-value = 0.1245
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-47.242900 6.376233
sample estimates:
mean of x mean of y
5.50000 25.93333


t.test> ## Classical example: Student's sleep data
t.test> plot(extra ~ group, data = sleep)
Waiting to confirm page change...

t.test> ## Traditional interface
t.test> with(sleep, t.test(extra[group == 1], extra[group == 2]))

Welch Two Sample t-test

data: extra[group == 1] and extra[group == 2]
t = -1.8608, df = 17.776, p-value = 0.07939
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.3654832 0.2054832
sample estimates:
mean of x mean of y
0.75 2.33

t.test> ## Formula interface
t.test> t.test(extra ~ group, data = sleep)

Welch Two Sample t-test

data: extra by group
t = -1.8608, df = 17.776, p-value = 0.07939
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.3654832 0.2054832
sample estimates:
mean in group 1 mean in group 2
0.75 2.33

The resulting plot is worth a look: a box plot of extra for the two groups:

[Figure: box plot of extra (y-axis, from -1 to 5) by group (x-axis, groups 1 and 2)]


prop.test(x, n, p = NULL,
alternative = c("two.sided", "less", "greater"),
conf.level = 0.95, correct = TRUE)

Examples
heads <- rbinom(1, size = 100, prob = .5)
prop.test(heads, 100) # continuity correction TRUE by default
prop.test(heads, 100, correct = FALSE)

## Data from Fleiss (1981), p. 139.
## H0: The null hypothesis is that the four populations from which
## the patients were drawn have the same true proportion of smokers.
## A: The alternative is that this proportion is different in at
## least one of the populations.

smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.test(smokers, patients)
Output

prp.ts> heads <- rbinom(1, size = 100, prob = .5)

prp.ts> prop.test(heads, 100) # continuity correction TRUE by default

1-sample proportions test with continuity correction

data: heads out of 100, null probability 0.5
X-squared = 0.01, df = 1, p-value = 0.9203
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.3894281 0.5913488
sample estimates:
p
0.49


prp.ts> prop.test(heads, 100, correct = FALSE)

1-sample proportions test without continuity correction

data: heads out of 100, null probability 0.5
X-squared = 0.04, df = 1, p-value = 0.8415
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.3942200 0.5865199
sample estimates:
p
0.49


prp.ts> ## Data from Fleiss (1981), p. 139.
prp.ts> ## H0: The null hypothesis is that the four populations from which
prp.ts> ## the patients were drawn have the same true proportion of smokers.
prp.ts> ## A: The alternative is that this proportion is different in at

prp.ts> ## least one of the populations.
prp.ts>
prp.ts> smokers <- c( 83, 90, 129, 70 )

prp.ts> patients <- c( 86, 93, 136, 82 )

prp.ts> prop.test(smokers, patients)

4-sample test for equality of proportions without continuity
correction

data: smokers out of patients
X-squared = 12.6004, df = 3, p-value = 0.005585
alternative hypothesis: two.sided
sample estimates:
prop 1 prop 2 prop 3 prop 4
0.9651163 0.9677419 0.9485294 0.8536585



binom.test(x, n, p = 0.5,
alternative = c("two.sided", "less", "greater"),
conf.level = 0.95)

Examples
## Conover (1971), p. 97f.
## Under (the assumption of) simple Mendelian inheritance, a cross
## between plants of two particular genotypes produces progeny 1/4 of
## which are "dwarf" and 3/4 of which are "giant", respectively.
## In an experiment to determine if this assumption is reasonable, a
## cross results in progeny having 243 dwarf and 682 giant plants.
## If "giant" is taken as success, the null hypothesis is that p =
## 3/4 and the alternative that p != 3/4.
binom.test(c(682, 243), p = 3/4)
binom.test(682, 682 + 243, p = 3/4) # The same.
## => Data are in agreement with the null hypothesis.
Output
bnm.ts> ## Conover (1971), p. 97f.
bnm.ts> ## Under (the assumption of) simple Mendelian inheritance, a cross
bnm.ts> ## between plants of two particular genotypes produces progeny 1/4 of
bnm.ts> ## which are "dwarf" and 3/4 of which are "giant", respectively.
bnm.ts> ## In an experiment to determine if this assumption is reasonable, a
bnm.ts> ## cross results in progeny having 243 dwarf and 682 giant plants.
bnm.ts> ## If "giant" is taken as success, the null hypothesis is that p =
bnm.ts> ## 3/4 and the alternative that p != 3/4.
bnm.ts> binom.test(c(682, 243), p = 3/4)

Exact binomial test

data: c(682, 243)
number of successes = 682, number of trials = 925, p-value = 0.3825
alternative hypothesis: true probability of success is not equal to 0.75
95 percent confidence interval:
0.7076683 0.7654066
sample estimates:
probability of success

0.7372973


bnm.ts> binom.test(682, 682 + 243, p = 3/4) # The same.

Exact binomial test

data: 682 and 682 + 243
number of successes = 682, number of trials = 925, p-value = 0.3825
alternative hypothesis: true probability of success is not equal to 0.75
95 percent confidence interval:
0.7076683 0.7654066
sample estimates:
probability of success
0.7372973


bnm.ts> ## => Data are in agreement with the null hypothesis.



A non-parametric test, which does not start from the assumption that the distribution is normal, is the Wilcoxon test. Its output in R is more concise than that of the classical t test, because there are no parameters to estimate. The only assumption is that the distribution is symmetric.


wilcox.test(x, ...)

## Default S3 method:
wilcox.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
conf.int = FALSE, conf.level = 0.95, ...)
Examples
wlcx.t> require(graphics)


wlcx.t> ## One-sample test.
wlcx.t> ## Hollander & Wolfe (1973), 29f.
wlcx.t> ## Hamilton depression scale factor measurements in 9 patients with
wlcx.t> ## mixed anxiety and depression, taken at the first (x) and second
wlcx.t> ## (y) visit after initiation of a therapy (administration of a
wlcx.t> ## tranquilizer).
wlcx.t> x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)

wlcx.t> y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

wlcx.t> wilcox.test(x, y, paired = TRUE, alternative = "greater")

Wilcoxon signed rank test

data: x and y
V = 40, p-value = 0.01953
alternative hypothesis: true location shift is greater than 0


wlcx.t> wilcox.test(y - x, alternative = "less") # The same.

Wilcoxon signed rank test

data: y - x
V = 5, p-value = 0.01953
alternative hypothesis: true location is less than 0



wlcx.t> wilcox.test(y - x, alternative = "less",
wlcx.t+ exact = FALSE, correct = FALSE) # H&W large sample

Wilcoxon signed rank test

data: y - x
V = 5, p-value = 0.01908
alternative hypothesis: true location is less than 0


wlcx.t> # approximation
wlcx.t>
wlcx.t> ## Two-sample test.
wlcx.t> ## Hollander & Wolfe (1973), 69f.
wlcx.t> ## Permeability constants of the human chorioamnion (a placental
wlcx.t> ## membrane) at term (x) and between 12 to 26 weeks gestational
wlcx.t> ## age (y). The alternative of interest is greater permeability
wlcx.t> ## of the human chorioamnion for the term pregnancy.
wlcx.t> x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)

wlcx.t> y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

wlcx.t> wilcox.test(x, y, alternative = "g") # greater

Wilcoxon rank sum test

data: x and y
W = 35, p-value = 0.1272
alternative hypothesis: true location shift is greater than 0


wlcx.t> wilcox.test(x, y, alternative = "greater",
wlcx.t+ exact = FALSE, correct = FALSE) # H&W large sample

Wilcoxon rank sum test

data: x and y
W = 35, p-value = 0.1103
alternative hypothesis: true location shift is greater than 0


wlcx.t> # approximation
wlcx.t>
wlcx.t> wilcox.test(rnorm(10), rnorm(10, 2), conf.int = TRUE)

Wilcoxon rank sum test

data: rnorm(10) and rnorm(10, 2)
W = 6, p-value = 0.0003248
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-3.4575256 -0.9910623
sample estimates:
difference in location
-2.396906


wlcx.t> ## Formula interface.
wlcx.t> boxplot(Ozone ~ Month, data = airquality)
Waiting to confirm page change...

wlcx.t> wilcox.test(Ozone ~ Month, data = airquality,
wlcx.t+ subset = Month %in% c(5, 8))

Wilcoxon rank sum test with continuity correction

data: Ozone by Month
W = 127.5, p-value = 0.0001208

alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(x = c(41L, 36L, 12L, 18L, 28L, 23L, 19L, :
cannot compute exact p-value with ties


Two-sample test

The hypothesis is that the two samples come from distributions with the same mean.
Sample 1 ~ $N(\mu_1, \sigma_1)$
Sample 2 ~ $N(\mu_2, \sigma_2)$
H0: $\mu_1 = \mu_2$

We compute:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\mathrm{SEDM}}$, where $\mathrm{SEDM} = \sqrt{\mathrm{SEM}_1^2 + \mathrm{SEM}_2^2}$.

There are two ways to compute SEDM, depending on whether or not the standard deviation of the two samples is assumed to be the same.

"Classical" approach: the variance is the same; a "combined" or pooled s.d. is computed; in this case, when the null hypothesis holds, the distribution of t is a Student's t with n1+n2-2 degrees of freedom.

This must be specified: t.test(a~f, var.equal=T)
Alternative approach: the two s.d. are estimated separately, following Welch's method. In this case the distribution is NOT a Student's t, but it is approximated by one.

This must be specified: t.test(a~f)

The required format has the numeric variable in one vector and the factor that distinguishes the two groups in another; both must be in a data frame.

t.test(var_num ~ var_fact)
The result is similar to that of the t test; note that the confidence interval refers to the difference between the two means and should therefore contain 0 if the two are equal.
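The two ways of computing SEDM can be sketched by hand on the built-in `sleep` data used earlier. This is an illustrative check rather than part of the original notes; the variable names (`g1`, `g2`, `sedm_welch`, `sedm_pooled`) are chosen here for readability.

```r
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]
n1 <- length(g1); n2 <- length(g2)

# Welch: the two standard deviations are estimated separately
sem1 <- sd(g1) / sqrt(n1)
sem2 <- sd(g2) / sqrt(n2)
sedm_welch <- sqrt(sem1^2 + sem2^2)
t_welch <- (mean(g1) - mean(g2)) / sedm_welch

# Classical: one pooled standard deviation, n1 + n2 - 2 degrees of freedom
s_pooled <- sqrt(((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2))
sedm_pooled <- s_pooled * sqrt(1 / n1 + 1 / n2)
t_pooled <- (mean(g1) - mean(g2)) / sedm_pooled

welch <- t.test(g1, g2)
pooled <- t.test(g1, g2, var.equal = TRUE)
print(c(t_welch = t_welch, t_pooled = t_pooled))
```

With equal group sizes the two t values coincide; the approaches then differ only in the degrees of freedom used for the reference distribution.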


[Figure: box plots for groups 5 to 9 on the x-axis; y-axis from 0 to 150]


ANOVA

Basic command:

lm(x ~ fatt)

where
x is the variable
fatt is the grouping variable, of type factor

anova(lm( ))

summary(lm( ))


Example

> anova(lm(folate~ventilation))
Analysis of Variance Table

Response: folate
Df Sum Sq Mean Sq F value Pr(>F)
ventilation 2 15516 7757.9 3.7113 0.04359 *
Residuals 19 39716 2090.3
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


The first row expresses the part due to the difference BETWEEN the groups; Sum Sq is the sum of the squared differences between each group mean and the overall mean, counted once per observation.

The second row gives the Residuals, i.e. the part NOT explained by the difference between the means: the sum of the squared errors between each observation and the mean of its own group.

The test examines the ratio between the two Mean Sq values under the hypothesis that all group means are equal.
In that case the ratio tends to 1, because both would be due to random error alone; otherwise the ratio exceeds the critical value of F at the chosen significance level, i.e. the 95% quantile of the F distribution with k-1 and N-k degrees of freedom.
Naturally, a small value of F means a small variance BETWEEN the groups, i.e. means that differ little from one another.
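The decomposition behind the ANOVA table can be reproduced by hand. The sketch below uses `PlantGrowth`, a dataset that ships with R, as a stand-in, because the folate data are not part of base R; the variable names are chosen here for illustration.

```r
# PlantGrowth ships with base R; it stands in for the folate data here
y <- PlantGrowth$weight
g <- PlantGrowth$group

N <- length(y)            # total number of observations
k <- nlevels(g)           # number of groups
grand_mean <- mean(y)
group_mean <- tapply(y, g, mean)
group_n <- tapply(y, g, length)

# Between groups: squared distance of each group mean from the grand mean,
# weighted by group size
ss_between <- sum(group_n * (group_mean - grand_mean)^2)

# Residuals: squared distance of each observation from its own group mean
ss_within <- sum((y - ave(y, g))^2)

ms_between <- ss_between / (k - 1)
ms_within <- ss_within / (N - k)
f_manual <- ms_between / ms_within

# Critical value: 95% quantile of F with k - 1 and N - k degrees of freedom
f_crit <- qf(0.95, df1 = k - 1, df2 = N - k)

tab <- anova(lm(weight ~ group, data = PlantGrowth))
print(c(F = f_manual, F_crit = f_crit))
```

The hand-computed sums of squares and F should match the `Sum Sq` and `F value` columns of `tab`; the test is significant at the 5% level when `f_manual` exceeds `f_crit`.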

> aov(lm(folate~ventilation))
Call:
aov(formula = lm(folate ~ ventilation))

Terms:
ventilation Residuals
Sum of Squares 15515.77 39716.10
Deg. of Freedom 2 19

Residual standard error: 45.72003
Estimated effects may be unbalanced

Having shown that the test is significant, we look for where the difference lies, i.e. between WHICH groups.

> summary(lm(folate~ventilation))

Call:
lm(formula = folate ~ ventilation)

Residuals:
Min 1Q Median 3Q Max
-73.625 -35.361 -4.444 35.625 75.375


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 316.62 16.16 19.588 4.65e-14 ***
ventilationN2O+O2,op -60.18 22.22 -2.709 0.0139 *
ventilationO2,24h -38.62 26.06 -1.482 0.1548
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 45.72 on 19 degrees of freedom
Multiple R-squared: 0.2809, Adjusted R-squared: 0.2052
F-statistic: 3.711 on 2 and 19 DF, p-value: 0.04359


"(Intercept)" is the mean of the first group; the other rows are the differences between the means of the other two groups and the mean of the first group. The t tests refer to the differences 1st vs 2nd and 1st vs 3rd. This report lacks the comparison between the 2nd and 3rd groups.

An alternative that uses the Bonferroni method (more conservative) is:

> pairwise.t.test(folate,ventilation,p.adj="bonferroni")

Pairwise comparisons using t tests with pooled SD

data: folate and ventilation

N2O+O2,24h N2O+O2,op
N2O+O2,op 0.042 -
O2,24h 0.464 1.000

P value adjustment method: bonferroni

This produces a table of the t tests. The p-values are adjusted by multiplying them by the number of comparisons (capped at 1).
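The Bonferroni adjustment itself is simple enough to verify directly. A minimal sketch, assuming three made-up unadjusted p-values (they are hypothetical and not taken from the folate example):

```r
# Hypothetical unadjusted p-values for three pairwise comparisons
p_raw <- c(0.014, 0.155, 0.400)
m <- length(p_raw)

# Bonferroni: multiply by the number of comparisons, capped at 1
p_manual <- pmin(1, p_raw * m)
p_builtin <- p.adjust(p_raw, method = "bonferroni")

print(p_manual)
print(p_builtin)
```

The cap at 1 explains why an adjusted value can be reported as exactly 1.000.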


NOTE
To relax the assumption of equal variances, use:

> oneway.test(folate~ventilation)

One-way analysis of means (not assuming equal variances)

data: folate and ventilation
F = 2.9704, num df = 2.000, denom df = 11.065, p-value = 0.09277

In this case the test appears to become NOT significant, perhaps because the groups that differ most have the larger variance.

The following checks whether the variances differ:

> bartlett.test(folate~ventilation)

Bartlett test of homogeneity of variances

data: folate by ventilation
Bartlett's K-squared = 2.0951, df = 2, p-value = 0.3508

A plot:

(from Dalgaard, p. 134)

# Group means, standard deviations and sizes
xbar <- tapply(folate, ventilation, mean)
s <- tapply(folate, ventilation, sd)
n <- tapply(folate, ventilation, length)

# Standard error of the mean for each group
sem <- s/sqrt(n)
stripchart(folate~ventilation, method="jitter",
           jitter=0.05, pch=16, vert=TRUE)
# Error bars of one SEM around each group mean, then join the means
arrows(1:3,xbar+sem,1:3,xbar-sem,angle=90,code=3,length=.1)
lines(1:3,xbar,pch=4,type="b",cex=2)

Remember that stripchart with a formula argument x ~ fatt means that the plot is drawn for the variable x, split by each level of the factor fatt.



[Figure: strip chart of folate (y-axis, 200 to 350) by ventilation group (N2O+O2,24h; N2O+O2,op; O2,24h), with group means and SEM error bars]
