Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
distributions:
- Comparing two means
IPS chapter 7.2
Two-sample z distribution
Two independent samples t-distribution
Two sample t-test
Two-sample t-confidence interval
Robustness
Details of the two sample t procedures
Comparing two samples
(A)
Population 1 Population 2
Sample 2
Sample 1
Which
is it? We often compare two
maybe from two distinct populations with (µ1,σ1) and (µ2,σ2). We use x1
x
and 2 to estimate the unknown µ1 and µ2.
x x
When both populations are normal, the sampling2 distribution of ( 1− 2 )
2
σ1 σ 2
+
is also normal, with standard deviation : n1 n2
( x1 − x2 ) − ( µ1 − µ 2 )
Then the two-sample z statistic z=
σ 12 σ 22
has the standard normal N(0, 1) +
n1 n2
sampling distribution.
Two independent samples t
distribution
We have two independent SRSs (simple random samples) coming
maybe from two distinct populations with (µ1,σ1) and (µ2,σ2) unknown.
x x
We use ( 1,s1) and ( 2,s2) to estimate (µ1,σ1) and (µ2,σ2) respectively.
have similar shapes and that the sample data contain no strong outliers.
The two-sample t statistic follows approximately the t distribution with a
standard error SE (spread) reflecting
s12 s22
variation from both samples: SE = +
n1 n 2
µ1-µ2 x1 −x 2
Two-sample t-test
The null hypothesis is that both population means µ1 and µ2 are equal,
thus their difference is equal to zero.
H 0: µ 1 = µ 2 <=> µ 1 − µ 2 = 0
We find how many standard errors (SE) away (x1 −x 2 ) −(µ1 −µ2 )
x x t=
from (µ1 − µ2) is ( 1− 2) by standardizing with t: SE
With df = smallest(n1 − 1, n2 − 1)
n1 n 2
Does smoking damage the lungs of children exposed
to parental smoking?
Practical use of t: t*
C is the area between −t* and t*.
n1 n 2
for df = smallest (n1−1; n2−1) and
the column for confidence level C. C
The margin of error m is:
m m
s12 s22
m =t* + = t * SE −t* t*
n1 n2
Common mistake !!!
s12 s22
CI : ( x1 − x2 ) ± m; m = t * + = 2.086 * 4.31 ≈ 8.99
n1 n2
With 95% confidence, (µ1 − µ2), falls within 9.96 ± 8.99 or 1.0 to 18.9.
Robustness
The two-sample t procedures are more robust than the one-sample t
procedures. They are the most robust when both sample sizes are
equal and both sample distributions are similar. But even when we
deviate from this, two-sample tests tend to remain quite robust.
s12 s22 2
+
n1 n 2
df =
1 s1 2 2
1 s2 2 2
+
n1 −1 n1 n 2 −1 n 2
95% confidence interval for the reading ability study using the more precise
degrees of freedom:
Table C
=TTEST(array1,array2,tails,type)
There are two versions of the two-sample t-test: one assuming equal
variance (“pooled 2-sample test”) and one not assuming equal
variance (“unequal” variance, as we have studied) for the two
populations. They have slightly different formulas and degrees of
freedom.
The pooled (equal variance) two-
sample t-test was often used before
computers because it has exactly
the t distribution for degrees of
freedom n1 + n2 − 2.
The sampling distribution for (x1 − x2) has exactly the t distribution with
(n1 + n2 − 2) degrees of freedom.