
CHAPTER 10

Estimation and Hypothesis Testing: Two Populations

LEARNING OBJECTIVES
Outline and explain the procedure for a test of
significance between two sample means
Determine when to use an independent t test and
when to use a paired t test
Calculate and interpret the results of an independent
and a paired t test
Compute a confidence interval from a set of data for
the difference between two population means

Inferences about the Difference between Two Population Means for Large and Independent Samples

Independent vs. Dependent Samples

Two samples drawn from two populations are independent if the selection of one
sample from one population does not affect the selection of the second sample
from the second population. Otherwise, the samples are dependent.

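Inferences about the Difference between Two Population Means for Large and Independent Samples

The Mean, Standard Deviation, and Sampling Distribution of $\bar{x}_1 - \bar{x}_2$

For two large, independent samples taken from two different populations, the
sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal (Fig. 1),
with its mean and standard deviation as follows:

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \quad\text{and}\quad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

If $\sigma_1$ and $\sigma_2$ are unknown, $\sigma_{\bar{x}_1 - \bar{x}_2}$ is estimated by

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where
$\mu_1$ = the mean of population 1; $\mu_2$ = the mean of population 2;
$\sigma_1$ = the standard deviation of population 1; $\sigma_2$ = the standard deviation of population 2;
$n_1$ = the size of the sample drawn from population 1; $n_2$ = the size of the sample drawn from population 2;
$\bar{x}_1$ = the mean of the sample drawn from population 1; $\bar{x}_2$ = the mean of the sample drawn from population 2;
$s_1$, $s_2$ = the standard deviations of the samples drawn from populations 1 and 2.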
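As a quick numerical check of the standard deviation formula for $\bar{x}_1 - \bar{x}_2$, here is a minimal Python sketch. The function name `se_diff_large` is illustrative and does not come from the slides; the inputs are the values used in the weekly earnings example of this chapter.

```python
import math


def se_diff_large(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Standard deviation of x1_bar - x2_bar for two independent samples.

    Pass population SDs if they are known, or sample SDs as estimates
    when both samples are large.
    """
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)


# Inputs taken from the weekly earnings example (SDs of 66 and 60 dollars,
# sample sizes 500 and 700).
se = se_diff_large(66, 500, 60, 700)
print(round(se, 4))  # 3.7222
```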

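Interval Estimation of $\mu_1 - \mu_2$

Confidence interval for $\mu_1 - \mu_2$
The $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:

$$(\bar{x}_1 - \bar{x}_2) \pm z\,\sigma_{\bar{x}_1 - \bar{x}_2} \quad\text{if } \sigma_1 \text{ and } \sigma_2 \text{ are known}$$

$$(\bar{x}_1 - \bar{x}_2) \pm z\,s_{\bar{x}_1 - \bar{x}_2} \quad\text{if } \sigma_1 \text{ and } \sigma_2 \text{ are not known}$$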

Example: Assume that executive males earned an average of $538 per week and
executive females earned an average of $470 per week. Assume that these
means have been calculated for samples of 500 and 700 workers taken from
the two populations, respectively. Further assume that the standard
deviations of weekly earnings of the two populations are $66 and $60,
respectively.
a) What is the point estimate of $\mu_1 - \mu_2$?
   $\bar{x}_1 - \bar{x}_2 = \$538 - \$470 = \$68$
b) Construct a 95% confidence interval for the difference between the
   mean weekly earnings of the two populations.
   $68 \pm 1.96(3.7222) = \$60.70 \text{ to } \$75.30$
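The interval in part (b) can be reproduced with a short Python sketch. It assumes the two population standard deviations are known, and it takes the z value from the standard library's `statistics.NormalDist` instead of a printed table. The function name `z_confidence_interval` is a hypothetical helper, not something defined in the slides.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x1_bar, x2_bar, sd1, sd2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 from two large independent samples."""
    point_estimate = x1_bar - x2_bar
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * se
    return point_estimate - margin, point_estimate + margin


low, high = z_confidence_interval(538, 470, 66, 60, 500, 700)
print(f"{low:.2f} to {high:.2f}")  # 60.70 to 75.30
```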

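Hypothesis Testing about $\mu_1 - \mu_2$

Three situations:
1. $H_1: \mu_1 \neq \mu_2$; $\mu_1 - \mu_2 \neq 0$
2. $H_1: \mu_1 > \mu_2$; $\mu_1 - \mu_2 > 0$
3. $H_1: \mu_1 < \mu_2$; $\mu_1 - \mu_2 < 0$

Test Statistic z for $\bar{x}_1 - \bar{x}_2$
The value of the test statistic z for $\bar{x}_1 - \bar{x}_2$ is computed as

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} \quad\text{if } \sigma_1 \text{ and } \sigma_2 \text{ are known}$$

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} \quad\text{if } \sigma_1 \text{ and } \sigma_2 \text{ are not known}$$

The value of $\mu_1 - \mu_2$ is substituted from $H_0$.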

Hypothesis Testing about $\mu_1 - \mu_2$

Example: Test at the 1% significance level if the mean weekly earnings of the
two groups of executives are different.

$n_1 = 500$; $\bar{x}_1 = \$538$; $\sigma_1 = \$66$
$n_2 = 700$; $\bar{x}_2 = \$470$; $\sigma_2 = \$60$
$\mu_1$: mean weekly earnings of male executives
$\mu_2$: mean weekly earnings of female executives

Step 1: State the null and alternative hypotheses
$H_0: \mu_1 = \mu_2$; $\mu_1 - \mu_2 = 0$ (the mean weekly earnings are not different)
$H_1: \mu_1 \neq \mu_2$; $\mu_1 - \mu_2 \neq 0$ (the mean weekly earnings are different)

Step 2: Select the distribution to use: $n_1 > 30$ and $n_2 > 30$, so both
sample sizes are large. The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is
approximately normal, so use the normal distribution to make the
hypothesis test.

Step 3: Determine the rejection and nonrejection regions:
$\alpha = 0.01$. The $\neq$ sign in $H_1$ indicates that the test is two-tailed,
so the area in each tail is $\alpha/2 = 0.005$ (Fig. 2).
The critical values of z are 2.58 and -2.58.
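The critical values in Step 3 are read from a normal table. As a cross-check, the sketch below derives them in Python with the standard library's `statistics.NormalDist`. The helper names `two_tailed_critical_z` and `in_rejection_region` are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Positive critical value that leaves alpha / 2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_rejection_region(z: float, alpha: float) -> bool:
    """True when z lies beyond either critical value of a two-tailed test."""
    return abs(z) > two_tailed_critical_z(alpha)


z_crit = two_tailed_critical_z(0.01)
print(round(z_crit, 2), round(-z_crit, 2))  # 2.58 -2.58
```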

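Hypothesis Testing about $\mu_1 - \mu_2$

Step 4: Calculate the value of the test statistic

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(66)^2}{500} + \frac{(60)^2}{700}} = 3.7222$$

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(538 - 470) - 0}{3.7222} = 18.27$$

Step 5: Make a decision: z = 18.27 falls in the rejection region, so we
reject $H_0$. Conclusion: the mean weekly earnings of the two groups of
executives are different.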
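Steps 4 and 5 can be sketched in a few lines of Python. This assumes known population standard deviations and uses the tabled critical value 2.58 from Step 3; `two_sample_z` is a hypothetical function name chosen for this illustration.

```python
import math


def two_sample_z(x1_bar, x2_bar, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """z statistic for x1_bar - x2_bar; hypothesized_diff comes from H0."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((x1_bar - x2_bar) - hypothesized_diff) / se


z = two_sample_z(538, 470, 66, 60, 500, 700)
z_crit = 2.58  # two-tailed critical value at the 1% level, from Step 3
reject_h0 = abs(z) > z_crit

print(round(z, 2), reject_h0)  # 18.27 True
```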

Inferences about the Difference between Two Population Means for Small and Independent Samples: Equal Standard Deviations

When to use the t distribution to make inferences about $\mu_1 - \mu_2$
The t distribution is used to make inferences about $\mu_1 - \mu_2$
when the following assumptions hold true:
1. The two populations from which the two samples are drawn
   are approximately normally distributed.
2. The samples are small ($n_1 < 30$ and $n_2 < 30$) and independent.
3. The standard deviations $\sigma_1$ and $\sigma_2$ of the two populations are
   unknown and they are equal, that is, $\sigma_1 = \sigma_2$.

Since the common standard deviation $\sigma$ is unknown, we replace it by its
point estimator $s_p$ (pooled sample standard deviation):

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

$n_1$ and $n_2$: sizes of the two samples
$s_1^2$ and $s_2^2$: variances of the two samples
$n_1 - 1$: degrees of freedom for sample 1
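The pooled standard deviation is a weighted combination of the two sample variances, with the degrees of freedom as weights. A minimal Python sketch follows; `pooled_sd` is an illustrative name, and the check with equal sample standard deviations is an assumption added here to show the weighting, not an example from the slides.

```python
import math


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled sample standard deviation for two independent samples."""
    weighted_var = (n1 - 1) * s1**2 + (n2 - 1) * s2**2
    return math.sqrt(weighted_var / (n1 + n2 - 2))


# Sanity check: when both samples have the same SD, pooling returns that SD.
print(pooled_sd(5.0, 15, 5.0, 12))  # 5.0
```

When the two sample standard deviations differ, the pooled value lies between them and sits closer to the one from the larger sample.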

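Inferences about the Difference between Two Population Means for Small and Independent Samples: Equal Standard Deviations

Estimator of the standard deviation of $\bar{x}_1 - \bar{x}_2$

$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Interval Estimation of $\mu_1 - \mu_2$

Confidence interval for $\mu_1 - \mu_2$
The $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:

$$(\bar{x}_1 - \bar{x}_2) \pm t\,s_{\bar{x}_1 - \bar{x}_2}$$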

Example: Assume that a consumer agency wanted to estimate the
difference in the mean amounts of caffeine in two brands of
coffee. The agency took a sample of 15 one-pound jars of
brand I coffee that showed the mean amount of caffeine in
these jars to be 80 milligrams per jar with a standard deviation
of 5 milligrams. Another sample of 12 one-pound jars of brand II
coffee gave a mean amount of caffeine equal to 77 milligrams
per jar with a standard deviation of 6 milligrams.
a) Construct a 95% confidence interval for the difference between the
   mean amounts of caffeine in one-pound jars of these two
   brands of coffee.

Interval Estimation of $\mu_1 - \mu_2$

Confidence interval for $\mu_1 - \mu_2$
Assume that the two populations are normally distributed
and that the standard deviations of the two populations
are equal.
$\mu_1$: mean amount of caffeine per jar in all one-pound jars of brand I
$\mu_2$: mean amount of caffeine per jar in all one-pound jars of brand II
$\bar{x}_1$: mean amount of caffeine in the sample of 15 one-pound jars of brand I coffee = 80 mg
$\bar{x}_2$: mean amount of caffeine in the sample of 12 one-pound jars of brand II coffee = 77 mg

$n_1 = 15$, $n_2 = 12$; $s_1 = 5$ mg, $s_2 = 6$ mg

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(15 - 1)(5)^2 + (12 - 1)(6)^2}{15 + 12 - 2}} = 5.4626$$

$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (5.4626)\sqrt{\frac{1}{15} + \frac{1}{12}} = 2.1157$$

Area in each tail: $\alpha/2 = 0.05/2 = 0.025$
Degrees of freedom: $n_1 + n_2 - 2 = 15 + 12 - 2 = 25$
t value = 2.060

Interval Estimation of $\mu_1 - \mu_2$

95% CI for $\mu_1 - \mu_2$:

$$(\bar{x}_1 - \bar{x}_2) \pm t\,s_{\bar{x}_1 - \bar{x}_2} = (80 - 77) \pm 2.060(2.1157) = -1.36 \text{ to } 7.36$$

Conclusion: With 95% confidence we can state that, based on these two
sample results, the difference in the mean amounts of caffeine in
one-pound jars of these two brands of coffee lies between -1.36 and 7.36 mg.
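The caffeine interval can be reproduced with the following Python sketch. It assumes equal population standard deviations, as the example does, and takes the t value 2.060 as read from the t table (25 degrees of freedom), because the Python standard library has no t distribution. The name `pooled_t_interval` is hypothetical.

```python
import math


def pooled_t_interval(x1_bar, s1, n1, x2_bar, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled standard deviation.

    t_crit must be looked up for n1 + n2 - 2 degrees of freedom.
    """
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    point_estimate = x1_bar - x2_bar
    return point_estimate - t_crit * se, point_estimate + t_crit * se


low, high = pooled_t_interval(80, 5, 15, 77, 6, 12, t_crit=2.060)
print(f"{low:.2f} to {high:.2f}")  # -1.36 to 7.36
```

Because the interval contains zero, these two samples do not rule out equal mean caffeine content at the 95% confidence level.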

Hypothesis Testing about $\mu_1 - \mu_2$

Test Statistic t for $\bar{x}_1 - \bar{x}_2$
The value of the test statistic t for $\bar{x}_1 - \bar{x}_2$ is computed as

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$

The value of $\mu_1 - \mu_2$ is substituted from $H_0$.

Example: The management at a supermarket wanted to
investigate whether or not a promotional campaign
increases the sales of a product. A sample of 28 days
during the promotional campaign showed that an
average of 316 units of this product were sold per day with
an SD of 18 units. A sample of 24 days before the
promotional campaign showed that an average of 282
units of this product were sold per day with an SD of 13
units. Assume that the number of units sold per day has a
normal distribution in both periods and that the standard
deviations of the two populations are equal.

Hypothesis Testing about $\mu_1 - \mu_2$

Example: At the 5% significance level, test if the promotional campaign
increases the mean number of units sold per day.

$n_1 = 28$; $\bar{x}_1 = 316$; $s_1 = 18$
$n_2 = 24$; $\bar{x}_2 = 282$; $s_2 = 13$
$\mu_1$: mean number of units sold per day during the promotional campaign
$\mu_2$: mean number of units sold per day before the promotional campaign

Step 1: State the null and alternative hypotheses
$H_0: \mu_1 = \mu_2$; $\mu_1 - \mu_2 = 0$ (the promotional campaign does not increase the mean daily sales)
$H_1: \mu_1 > \mu_2$; $\mu_1 - \mu_2 > 0$ (the promotional campaign does increase the mean daily sales)

Step 2: Select the distribution to use: the two populations are normally
distributed, $n_1 < 30$ and $n_2 < 30$, so both samples are small, and they are
independent. The standard deviations of the two populations are unknown but
equal, so use the t distribution.

Step 3: Determine the rejection and nonrejection regions:
$\alpha = 0.05$. The > sign in $H_1$ indicates that the test is right-tailed,
so the area in the right tail is $\alpha = 0.05$.
Degrees of freedom: $n_1 + n_2 - 2 = 28 + 24 - 2 = 50$.
The critical value of t is 1.676 (Fig. 3).

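Hypothesis Testing about $\mu_1 - \mu_2$

Step 4: Calculate the value of the test statistic

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(28 - 1)(18)^2 + (24 - 1)(13)^2}{28 + 24 - 2}} = 15.8965$$

$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (15.8965)\sqrt{\frac{1}{28} + \frac{1}{24}} = 4.4220$$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(316 - 282) - 0}{4.4220} = 7.689$$

Step 5: Make a decision: t = 7.689 falls in the rejection region, so we
reject $H_0$. Conclusion: the promotional campaign does increase the mean
number of units sold per day.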
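The calculation in Steps 4 and 5 can be checked with a short Python sketch. It assumes the equal-variance (pooled) t test described in this section and uses the tabled critical value 1.676 for 50 degrees of freedom; `pooled_t_statistic` is an illustrative name, not one used in the slides.

```python
import math


def pooled_t_statistic(x1_bar, s1, n1, x2_bar, s2, n2, hypothesized_diff=0.0):
    """t statistic for x1_bar - x2_bar under equal population SDs."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return ((x1_bar - x2_bar) - hypothesized_diff) / se


t = pooled_t_statistic(316, 18, 28, 282, 13, 24)
t_crit = 1.676  # right-tailed critical value from the t table, df = 50
reject_h0 = t > t_crit  # right-tailed test: reject for large positive t

print(round(t, 3), reject_h0)  # 7.689 True
```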

Inferences about the Difference between Two Population Means for Paired Samples

Paired or matched samples
Two samples are said to be paired or matched samples
when, for each data value collected from one sample,
there is a corresponding data value collected from the
second sample, and both of these data values are
collected from the same source.

$d$: paired difference, the difference between the two
data values for each element of the two samples
$n$: the number of paired difference values
$\mu_d$: the population mean of the paired differences
$\sigma_d$: the population standard deviation of the paired differences
$\bar{d}$: the sample mean of the paired differences
$s_d$: the sample standard deviation of the paired differences

Inferences about the Difference between Two Population Means for Paired Samples

Mean and standard deviation of the paired differences for samples

$$\bar{d} = \frac{\sum d}{n} \qquad s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$$

Sampling distribution, mean, and standard deviation of $\bar{d}$
If the number of paired values is large ($n \geq 30$), then because of the
central limit theorem the sampling distribution of $\bar{d}$ is approximately
normal with its mean and standard deviation as

$$\mu_{\bar{d}} = \mu_d \quad\text{and}\quad \sigma_{\bar{d}} = \frac{\sigma_d}{\sqrt{n}}$$

Estimate of the standard deviation of paired differences. If:
1. $n < 30$
2. $\sigma_d$ is not known
3. The population of paired differences is approximately normally distributed
then the t distribution is used to make inferences about $\mu_d$. The standard
deviation $\sigma_{\bar{d}}$ of $\bar{d}$ is estimated by $s_{\bar{d}}$, which is
calculated as

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$
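The three quantities above ($\bar{d}$, $s_d$, and $s_{\bar{d}}$) can be computed as in the sketch below. The differences used are hypothetical values chosen only for illustration, and `paired_summary` is an assumed helper name. The sketch also compares the shortcut formula for $s_d$ with the standard library's `statistics.stdev`, which computes the same sample standard deviation.

```python
import math
from statistics import stdev


def paired_summary(differences):
    """Return (d_bar, s_d, s_d_bar) for a list of paired differences."""
    n = len(differences)
    total = sum(differences)
    total_sq = sum(d * d for d in differences)
    d_bar = total / n
    # Shortcut formula: sqrt((sum d^2 - (sum d)^2 / n) / (n - 1))
    s_d = math.sqrt((total_sq - total**2 / n) / (n - 1))
    return d_bar, s_d, s_d / math.sqrt(n)


diffs = [2, -1, 3, 0, 1]  # hypothetical paired differences
d_bar, s_d, se = paired_summary(diffs)

print(d_bar)                              # 1.0
print(math.isclose(s_d, stdev(diffs)))    # True
```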

Interval Estimation of $\mu_d$

Confidence interval for $\mu_d$
The $(1-\alpha)100\%$ confidence interval for $\mu_d$ is:

$$\bar{d} \pm t\,s_{\bar{d}}$$

where the value of t is obtained from the t distribution table for
the given confidence level and $n - 1$ degrees of freedom.

Example:
A hospital is considering adopting a new procedure to decrease the
waiting time incurred by patients admitted through the ER. The
hospital randomly selected seven admission staff and gathered
information on the times taken by them to admit patients using
the old procedure. Then the same employees were asked to
admit patients using the new procedure. The following table
gives the admission times (in minutes) for these seven staff.
Let $\mu_d$ be the mean of the differences between the admission
times for the two populations. Construct a 95% CI for $\mu_d$. Assume
that the population of paired differences is approximately normally
distributed.

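Interval Estimation of $\mu_d$

Confidence interval for $\mu_d$

Employee   Old procedure    New procedure    Difference   d^2
           (time in min)    (time in min)    (d)
1          64               60                4            16
2          71               66                5            25
3          68               66                2             4
4          66               69               -3             9
5          73               63               10           100
6          62               57                5            25
7          70               62                8            64

$\sum d = 31$; $\sum d^2 = 243$

$$\bar{d} = \frac{\sum d}{n} = \frac{31}{7} = 4.43$$

$$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{243 - \frac{(31)^2}{7}}{7 - 1}} = 4.1975$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.1975}{\sqrt{7}} = 1.5865$$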

Interval Estimation of $\mu_d$

For a 95% CI, the area in each tail of the t distribution is:
$\alpha/2 = 0.025$; df $= n - 1 = 7 - 1 = 6$
$t_{6,\,0.025} = 2.447$

$$\bar{d} \pm t\,s_{\bar{d}} = 4.43 \pm 2.447(1.5865) = 0.55 \text{ to } 8.31$$

Conclusion: We can state with 95% confidence that the
mean difference between the admission times for
the two procedures is between 0.55 and 8.31 minutes. In
other words, the old procedure takes, on average, 0.55 minutes
to 8.31 minutes longer than the new procedure to
admit a patient through the ER.
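The paired interval can be reproduced from the raw admission times with the Python sketch below. The times are the ones in the table, with each difference taken as old minus new. The t value 2.447 is passed in as read from the t table for 6 degrees of freedom, and `paired_t_interval` is a hypothetical helper name.

```python
import math
from statistics import mean, stdev


def paired_t_interval(first, second, t_crit):
    """Confidence interval for mu_d, where d = first - second per pair.

    t_crit must be looked up for len(first) - 1 degrees of freedom.
    """
    diffs = [a - b for a, b in zip(first, second)]
    d_bar = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))
    return diffs, d_bar - t_crit * se, d_bar + t_crit * se


old = [64, 71, 68, 66, 73, 62, 70]  # old procedure, minutes
new = [60, 66, 66, 69, 63, 57, 62]  # new procedure, minutes

diffs, low, high = paired_t_interval(old, new, t_crit=2.447)
print(diffs)                        # [4, 5, 2, -3, 10, 5, 8]
print(f"{low:.2f} to {high:.2f}")   # 0.55 to 8.31
```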

Hypothesis Testing about $\mu_d$

Test Statistic t for $\bar{d}$
The value of the test statistic t for $\bar{d}$ is computed as

$$t = \frac{\bar{d} - \mu_d}{s_{\bar{d}}}$$

The value of $\mu_d$ is substituted from $H_0$.

Example: An understaffed hospital wants to know if
attending a course on how to perform a surgical
procedure can increase the average number of
procedures performed per week. The hospital sent six of
its surgeons to attend this course. The following table
gives the one-week number of procedures done by these
surgeons before and after they attended this course.

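Hypothesis Testing about $\mu_d$

# of procedures    # of procedures    Difference   d^2
(before course)    (after course)     (d)
12                 18                 -6           36
18                 24                 -6           36
25                 24                  1            1
 9                 14                 -5           25
14                 19                 -5           25
16                 20                 -4           16

$\sum d = -25$; $\sum d^2 = 139$

$$\bar{d} = \frac{\sum d}{n} = \frac{-25}{6} = -4.17$$

$$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{139 - \frac{(-25)^2}{6}}{6 - 1}} = 2.6394$$

$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{2.6394}{\sqrt{6}} = 1.0775$$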

Hypothesis Testing about $\mu_d$

Example: At the 1% significance level, can you conclude that
the mean weekly number of procedures for all surgeons increases as a
result of attending the course?
$\mu_1$: mean weekly number of procedures for all surgeons before the course
$\mu_2$: mean weekly number of procedures for all surgeons after the course
Step 1: State the null and alternative hypotheses
$H_0: \mu_d = 0$; $\mu_1 - \mu_2 = 0$ (the mean weekly number of procedures does not increase)
$H_1: \mu_d < 0$; $\mu_1 - \mu_2 < 0$ (the mean weekly number of procedures does increase)
Step 2: Select the distribution to use: the population of paired
differences is normally distributed, $n < 30$, and the standard
deviation $\sigma_d$ is unknown, so use the t distribution.
Step 3: Determine the rejection and nonrejection regions:
$\alpha = 0.01$. The < sign in $H_1$ indicates that the test is left-tailed,
so the area in the left tail is $\alpha = 0.01$.
Degrees of freedom: $n - 1 = 6 - 1 = 5$.
The critical value of t is -3.365 (Fig. 4).

Hypothesis Testing about $\mu_d$

Step 4: Calculate the value of the test statistic

$$t = \frac{\bar{d} - \mu_d}{s_{\bar{d}}} = \frac{-4.17 - 0}{1.0775} = -3.870$$

Step 5: Make a decision: t = -3.870 falls in the rejection
region, so we reject $H_0$. Conclusion: the mean weekly number of
procedures for all surgeons increases as a result of attending this course.
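The paired test can be sketched in Python from the six differences (before minus after) listed in the table. The critical value -3.365 is taken from the t table for 5 degrees of freedom, and `paired_t_statistic` is an illustrative name. Note that the code keeps full precision for $\bar{d}$, so it gives about -3.867, whereas the hand calculation rounds $\bar{d}$ to -4.17 first and gets -3.870. The decision is the same either way.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(differences, hypothesized_mean=0.0):
    """t statistic for the mean of paired differences (df = n - 1)."""
    se = stdev(differences) / math.sqrt(len(differences))
    return (mean(differences) - hypothesized_mean) / se


diffs = [-6, -6, 1, -5, -5, -4]  # before minus after, from the table
t = paired_t_statistic(diffs)
t_crit = -3.365  # left-tailed critical value at the 1% level, df = 5
reject_h0 = t < t_crit  # left-tailed test: reject for large negative t

print(round(t, 2), reject_h0)  # -3.87 True
```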

CONCLUSION
To solve a two-sample hypothesis problem, it is first
necessary to determine whether the samples are
independent or paired and whether the test is one- or
two-tailed. Next, choose a significance level and
calculate the t statistic. Then determine whether
your results are significant. Finally, calculate and
interpret the confidence intervals. Remember that
two-sample confidence intervals represent the
difference between the means.
