
Important Formulas used in this Chapter:

1. Difference between two Sample Means with known population Variances and Normal
Distributions: Confidence Interval and Test of Hypothesis:

$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$

2. Sampling Distribution of Difference between two Sample Means When Population
Variances are Not Known but are Assumed to be Equal: Confidence Interval and Test of
Hypothesis

Pooled sample variance: $S_p^2 = \{(n_1-1)S_1^2 + (n_2-1)S_2^2\}/(n_1+n_2-2)$, and

$s_{\bar{x}_1-\bar{x}_2} = \sqrt{S_p^2\,(1/n_1 + 1/n_2)}$.

3. Welch's Adj df $= (S_1^2/n_1 + S_2^2/n_2)^2 / \{(S_1^2/n_1)^2/(n_1-1) + (S_2^2/n_2)^2/(n_2-1)\}$
The adjusted degrees of freedom are rounded down to the next lower integer to be conservative, and

$s_{\bar{x}_1-\bar{x}_2} = \sqrt{S_1^2/n_1 + S_2^2/n_2}$

4. Paired Difference Experiments: The Case of Matched Pairs

The mean difference $\bar{d}$, the variance of the differences $s_d^2$ and the standard deviation $s_d$ for the
sample of n differences are calculated using the usual formulas for mean, sample variance and standard
deviation. Since the population variance of the difference is usually unknown, we perform a t-test
using n-1 degrees of freedom for hypotheses about the mean difference $\mu_d$. The standard error of
the sample mean difference is $s_d/\sqrt{n}$.

5. Comparing Two Population Proportions Using Large, Independent Samples

Pooled sample proportion: $p_0 = (X_1 + X_2)/(n_1 + n_2)$, or equivalently $(n_1 p_1 + n_2 p_2)/(n_1 + n_2)$
$\sigma_{p_1-p_2} = \sqrt{p_0(1-p_0)(1/n_1 + 1/n_2)}$

6. Test for a Specified Difference between two Proportions
$\sigma_{p_1-p_2} = \sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$

Solved Examples/Formulas for Chapter 10


Example 1: Independent random samples of Quantitative Techniques (including Statistics) and
Accounting professors were asked to provide the number of hours they spend in preparation for
each class. The sample of 321 Quantitative professors had a mean time of 3.09 preparation hours
and the sample of 94 Accounting professors had a mean time of 2.88 hours. From similar past
studies the population standard deviation for the Quantitative professors is assumed to be 1.09,
and the standard deviation for the Accounting professors is 1.01. Find the 95% confidence
interval for the difference in the mean hours of preparation for the populations of the two groups
of professors. Also test the hypothesis that the quantitative professors spend more time in
preparation compared to Accounting professors.
Answer: $n_1 = 321$, $\bar{x}_1 = 3.09$, $\sigma_1 = 1.09$; $n_2 = 94$, $\bar{x}_2 = 2.88$, $\sigma_2 = 1.01$.

$Z_{\alpha/2} = Z_{.025} = 1.96$ for the 95% confidence interval. The confidence interval is:
$(3.09-2.88) \pm 1.96\,\sigma_{\bar{x}_1-\bar{x}_2} = (3.09-2.88) \pm 1.96\sqrt{1.09^2/321 + 1.01^2/94} = 0.21 \pm 1.96(0.1206) = 0.21 \pm 0.24$,
or between -0.03 and 0.45.
Now let us test (as the question asks) the hypothesis that Quantitative professors spend more
time in preparation for the class. To test the assumption that Quantitative professors spend more
time we form the following hypotheses:
H0: $\mu_1 - \mu_2 \le 0$ (do not spend more time on average)
H1: $\mu_1 - \mu_2 > 0$ (spend more time): right-tailed test.
In a one-tailed test the critical Z value is smaller than the corresponding two-tailed value (because
we do not halve $\alpha$). It is 1.28 for a 10% level test, 1.645 for a 5% level test (instead of 1.96)
and 2.33 for a 1% level test. The calculated Z value from the results shown above is 0.21/0.1206 = 1.74.
The calculated Z value exceeds the critical Z values at the 5% and 10% levels but not at the 1% level.
Therefore, we can reject the Null hypothesis at the 5% level of significance and can say with 95%
confidence that Quantitative professors spend more time compared to Accounting professors
based on the sample results. But we cannot have 99% confidence in this statement. There is some
evidence but not strong evidence.
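The hand calculation for Example 1 can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `two_sample_z` and its argument names are illustrative and do not come from MegaStat or the textbook. With the Example 1 inputs it should give a standard error near 0.1206 and a Z value near 1.74.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2, d0=0.0, conf=0.95):
    """Confidence interval and upper-tailed z test for mu1 - mu2 (known sigmas)."""
    diff = mean1 - mean2
    # Standard error of the difference between the two sample means
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Two-tailed critical value, e.g. about 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # Test statistic and right-tail p-value for H1: mu1 - mu2 > d0
    z = (diff - d0) / se
    p_upper = 1 - NormalDist().cdf(z)
    return se, ci, z, p_upper


se, ci, z, p_upper = two_sample_z(3.09, 1.09, 321, 2.88, 1.01, 94)
print(f"standard error = {se:.5f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z:.2f}, one-tailed p-value = {p_upper:.4f}")
```

The p-value falls between 0.01 and 0.05, which is the same conclusion as comparing Z with the critical values 1.645 and 2.33.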
Use of MegaStat: First, type two columns in Excel with headings Sample1 and Sample2
(arbitrary names) followed by the respective means, standard deviations and sample sizes like
this:
sample1   sample2
3.09      2.88
1.09      1.01
321       94
From Excel go to Add-Ins, then to MegaStat, then Hypothesis test - compare two independent
groups, then select summary input in the dialogue box. By clicking and dragging the mouse,
indicate the ranges of inputs for the two samples, select z-test, and indicate the direction of
the hypotheses (you can also put any hypothesized value for the difference of the means in place
of 0, if the question specifies one).
The one tailed test results look like this:

Hypothesis Test: Independent Groups (z-test)

sample1   sample2
3.09      2.88      mean
1.09      1.01      std. dev.
321       94        n

          0.21000   difference (sample1 - sample2)
          0.12064   standard error of difference
          0         hypothesized difference (D0)
          1.74      z
          .0409     p-value (one-tailed, upper)

If the question asked us to test whether Quantitative Professors spent more than 0.10 hours more
than Accounting Professors, then D0 = 0.10, which has to be subtracted from the observed
difference to calculate the test statistic. In that case the hypotheses would look like this:
H0: $\mu_1 - \mu_2 \le 0.10$ (do not spend more than 0.10 hours more time than Accounting)
H1: $\mu_1 - \mu_2 > 0.10$ (spend more than 0.10 hours of additional time): a right-tailed test.
The computer printout would be like this:

Hypothesis Test: Independent Groups (z-test)

sample1   sample2
3.09      2.88      mean
1.09      1.01      std. dev.
321       94        n

          0.21000   difference (sample1 - sample2)
          0.12064   standard error of difference
          0.1       hypothesized difference (D0)
          0.91      z
          .1809     p-value (one-tailed, upper)

Because of the hypothesized difference (0.10 instead of 0) the Z-value dropped to 0.91 from 1.74
and the p-value increased to 0.1809, which implies that the Null cannot be rejected even at the 10%
significance level. The Quantitative Professors may be spending a little more time, but the sample
does not show that they spend more than 0.10 hours more than the Accounting Professors, on average.
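The effect of the hypothesized difference D0 on the test statistic can be seen by evaluating Z = (observed difference - D0) / standard error for both values of D0. The short sketch below uses the Example 1 summary figures; the helper name `upper_tail_test` is hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures from Example 1
diff = 3.09 - 2.88
se = sqrt(1.09**2 / 321 + 1.01**2 / 94)


def upper_tail_test(d0):
    """Return (z, p-value) for H1: mu1 - mu2 > d0."""
    z = (diff - d0) / se
    return z, 1 - NormalDist().cdf(z)


# Compare the test with D0 = 0 against the test with D0 = 0.10
for d0 in (0.0, 0.10):
    z, p = upper_tail_test(d0)
    print(f"D0 = {d0:.2f}: z = {z:.2f}, p-value = {p:.4f}")
```

Only the numerator changes with D0; the standard error stays the same, so a larger D0 always gives a smaller Z and a larger right-tail p-value.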
Example 2: For a random sample of 10 bulbs the mean bulb life is 4,250 hrs with standard
deviation of 250 hr. For another brand of bulbs the mean bulb life and standard deviation
calculated from a sample of 8 bulbs are 4000 hrs and 200 hr, respectively. The bulb life for both
brands is assumed to be normally distributed with equal variance. Construct 90% and 95%
confidence intervals and also test the hypothesis that the two brands have the same average lives.
Answer: $n_1 = 10$, $n_2 = 8$, $\bar{x}_1 = 4250$, $\bar{x}_2 = 4000$, $s_1 = 250$, $s_2 = 200$.


This is obviously a case with unknown but equal variances requiring pooled sample variance and
t-distribution with degree of freedom (df) = 10 + 8 -2 = 16
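We have $\bar{x}_1 - \bar{x}_2 = 4250 - 4000 = 250$;
$S_p^2 = (9 \times 250^2 + 7 \times 200^2)/(10 + 8 - 2) = 52656.25$.
The standard error is
$s_{\bar{x}_1-\bar{x}_2} = \sqrt{S_p^2\,(1/n_1 + 1/n_2)} = \sqrt{52656.25\,(1/10 + 1/8)} = 108.85$
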

The two-tailed t-values for 16 df are 1.746 for the 90% confidence interval and 2.120 for the 95%
confidence interval. Therefore, the 90% confidence interval is $250 \pm 1.746(108.85)$, or about 60 to
440, and the 95% confidence interval is $250 \pm 2.120(108.85)$, or about 19.2 to 480.8.
We see that even the 95% interval does not contain the zero value. So there is some evidence that
the two average lives of bulbs are not equal. But how strong is the evidence that the two brands
have different average lives?
H0: $\mu_1 - \mu_2 = 0$ (the two brands have the same average lives)
H1: $\mu_1 - \mu_2 \ne 0$: two-tailed test
For the two-tailed test the critical t-values for 16 df are 2.120 at the 5% level and 2.921 at the
1% level. The calculated value of the test statistic is 250/108.85 = 2.297. Thus the Null hypothesis
is rejected at the 5% level but not at the 1% level, implying that we have 95% confidence in saying
that the two average lives are different but not 99% confidence.
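The pooled-variance steps for Example 2 can be sketched in Python as follows. The function name `pooled_t` is illustrative, and the critical t-values are simply typed in from the t-table as quoted in the text (the standard library has no t distribution), so this is a check on the arithmetic rather than a full test routine.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled variance, standard error and t statistic (equal variances assumed)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean1 - mean2 - d0) / se
    return df, sp2, se, t


df, sp2, se, t = pooled_t(4250, 250, 10, 4000, 200, 8)
print(f"df = {df}, pooled variance = {sp2:.2f}, standard error = {se:.3f}, t = {t:.3f}")

# Two-tailed critical values for 16 df, copied from the t-table (not computed here)
diff = 4250 - 4000
for label, t_crit in (("90%", 1.746), ("95%", 2.120)):
    margin = t_crit * se
    print(f"{label} CI: {diff - margin:.1f} to {diff + margin:.1f}")
```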
MegaStat Results for two-tailed test:

Hypothesis Test: Independent Groups (t-test, pooled variance)

sample1   sample2
4250      4000         mean
250       200          std. dev.
10        8            n

          16           df
          250.000      difference (sample1 - sample2)
          52,656.250   pooled variance
          229.469      pooled std. dev.
          108.847      standard error of difference
          0            hypothesized difference
          2.30         t
          .0355        p-value (two-tailed)
          19.255       confidence interval 95.% lower
          480.745      confidence interval 95.% upper
          230.745      margin of error

If I had asked you to test whether the first brand has more average life than the second then the
hypotheses would be:
H0: $\mu_1 - \mu_2 \le 0$ (the first brand does not have more life than the second)
H1: $\mu_1 - \mu_2 > 0$ (the first brand has more life): right-tailed test
In this case the critical t-values for 16 df would be 1.746 at the 5% level and 2.583 at the 1% level.

Example 3: Suppose data on retail prices of a cholesterol reducing drug were collected from
randomly selected Pharmacies of two states (16 pharmacies from State1 and 13 from State 2)
with the following values:
State 1   State2
125.05    145.32
137.56    131.19
142.5     151.65
145.95    141.55
117.49    125.99
142.75    126.29
121.99    139.19
117.49    156
141.64    137.56
128.69    154.1
130.29    126.41
142.39    114
121.99    144.99
141.3
153.43
133.39

Given the two samples can we conclude that the average retail prices do not significantly differ
by State? Give conclusion for two cases (a) population variances are equal; and (b) population
variances are unequal.
This time I have given raw data instead of calculated sample means and sample variances.
Therefore, you have to first find the sample means and sample variances, then pooled sample
variance if equal variance is assumed, then find the standard errors using the formulas given
above, and in case of unequal variances also find the adjusted degrees of freedom using
Welch's Adjusted df formula. This can take a long time.
With MegaStat your life becomes much easier. First enter the raw data in Excel as shown above.
Then go to MegaStat - hypothesis test - compare two independent groups, select t-pooled or
t-unequal variance, select data input instead of summary input, and specify the direction of
the hypothesis; the rest is done by MegaStat. The results are given below:

Hypothesis Test: Independent Groups (t-test, pooled variance)

State 1    State2
133.9938   138.0185    mean
11.0150    12.6631     std. dev.
16         13          n

           27          df
           -4.02471    difference (State 1 - State2)
           138.67374   pooled variance
           11.77598    pooled std. dev.
           4.39708     standard error of difference
           0           hypothesized difference
           -0.92       t
           .3681       p-value (two-tailed)
           -13.04678   confidence interval 95.% lower
           4.99735     confidence interval 95.% upper
           9.02206     margin of error

Hypothesis Test: Independent Groups (t-test, unequal variance)

State 1    State2
133.9938   138.0185    mean
11.0150    12.6631     std. dev.
16         13          n

           24          df
           -4.02471    difference (State 1 - State2)
           4.46296     standard error of difference
           0           hypothesized difference
           -0.90       t
           .3761       p-value (two-tailed)
           -13.23581   confidence interval 95.% lower
           5.18639     confidence interval 95.% upper
           9.21110     margin of error

Since these p-values are (much) larger than 0.10 the Null hypothesis of equality of the two
means cannot be rejected at any reasonable level of significance. Although the computer saves a
lot of time and energy by doing all the calculations for us, we still have to know what formulas
or theories were involved to be able to properly interpret and explain the results. Therefore,
learning to use both the calculator with formulas and the computer is the best way to master this
subject.
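For readers who want to see the intermediate steps behind the two printouts of Example 3, the sketch below works from the raw prices using only the Python standard library. The variable names are illustrative. It computes the pooled and the unequal-variance standard errors, the t statistics and Welch's adjusted df (rounded down); the p-values still require a t-table or statistical software.

```python
from math import floor, sqrt
from statistics import mean, variance

state1 = [125.05, 137.56, 142.5, 145.95, 117.49, 142.75, 121.99, 117.49,
          141.64, 128.69, 130.29, 142.39, 121.99, 141.3, 153.43, 133.39]
state2 = [145.32, 131.19, 151.65, 141.55, 125.99, 126.29, 139.19, 156,
          137.56, 154.1, 126.41, 114, 144.99]

n1, n2 = len(state1), len(state2)
m1, m2 = mean(state1), mean(state2)
v1, v2 = variance(state1), variance(state2)  # sample variances (n - 1 divisor)

# (a) Equal variances: pooled variance with n1 + n2 - 2 degrees of freedom
df_pooled = n1 + n2 - 2
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_pooled
se_pooled = sqrt(sp2 * (1 / n1 + 1 / n2))
t_pooled = (m1 - m2) / se_pooled

# (b) Unequal variances: Welch's adjusted df, rounded down to be conservative
a, b = v1 / n1, v2 / n2
se_welch = sqrt(a + b)
df_welch = floor((a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1)))
t_welch = (m1 - m2) / se_welch

print(f"means: {m1:.4f}, {m2:.4f}")
print(f"pooled:  df = {df_pooled}, se = {se_pooled:.5f}, t = {t_pooled:.2f}")
print(f"unequal: df = {df_welch}, se = {se_welch:.5f}, t = {t_welch:.2f}")
```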
Example 4: Suppose a new weight loss program was sponsored by the local health and fitness
club in a small town. The program claims to make people lose more than five pounds in a month.
Twelve participants were selected and their weights were recorded. After a month of the
program their weights were recorded again. It was found that many participants lost some
weight. But there was variation. Some lost more and some lost less, and some did not lose any
weight at all. The measurements were as follows:

Obs   before   after
1     217      202.5
2     188      178
3     225      210
4     168      157
5     178      169
6     182      180
7     174.5    163.5
8     161.5    153
9     177.5    178
10    358.5    336
11    181      174
12    210      197.5

Test the claim of the program sponsors at reasonable levels of significance.


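We can use a calculator to find the paired differences, then their mean and standard deviation,
then the standard error of the sample mean difference, and then perform the t-test using the t-table.
The hypotheses are H0: $\mu_d \le 5$ and H1: $\mu_d > 5$ (right-tailed test). The MegaStat results are: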
Hypothesis Test: Paired Observations
5.0000 hypothesized value
201.7500 mean before
191.5417 mean after
10.2083 mean difference (before - after)
5.9979 std. dev.
1.7315 std. error
12 n
11 df
3.01 t
.0060 p-value (one-tailed, upper)

The p-value shows that the Null hypothesis that the mean weight loss is less than or equal to five
pounds is rejected even at the 1% significance level. Therefore, the claim of this
hypothetical weight loss program has strong support from this sample.
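The calculator route for Example 4 (differences, their mean and standard deviation, the standard error, then the t statistic against the hypothesized 5 pounds) can be sketched in Python with the standard library. Variable names are illustrative; the critical value or p-value for 11 df still comes from a t-table or software.

```python
from math import sqrt
from statistics import mean, stdev

before = [217, 188, 225, 168, 178, 182, 174.5, 161.5, 177.5, 358.5, 181, 210]
after = [202.5, 178, 210, 157, 169, 180, 163.5, 153, 178, 336, 174, 197.5]

# Paired differences (before - after): positive values are weight lost
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
d_bar = mean(diffs)          # mean difference
s_d = stdev(diffs)           # sample standard deviation of the differences
se = s_d / sqrt(n)           # standard error of the mean difference
t = (d_bar - 5.0) / se       # test statistic for H0: mu_d <= 5

print(f"n = {n}, df = {n - 1}")
print(f"mean difference = {d_bar:.4f}, std. dev. = {s_d:.4f}, std. error = {se:.4f}")
print(f"t = {t:.2f}")
```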
Example 5: A study was conducted to determine if there was a difference in humor content in
British and American trade magazine advertisements. An independent random sample of 203
British trade magazine advertisements contained 54 humorous ads while the other independent
sample of 270 American trade magazine advertisements contained 56 humorous ads. Does this data
set provide evidence that humorous content is more frequent in British than in American
trade magazine ads? The MegaStat results are given below.

Hypothesis test for two independent proportions

p1       p2       pc
0.266    0.207    0.2323    p (as decimal)
54/203   56/270   110/473   p (as fraction)
54       56       110       X
203      270      473       n

         0.059    difference
         0.       hypothesized difference
         0.0392   std. error
         1.50     z
         .0663    p-value (one-tailed, upper)
         -0.0187  confidence interval 95.% lower
         0.1367   confidence interval 95.% upper
         0.0777   margin of error

The p-value shows that we cannot reject the Null of no difference in a 5% level test
but can reject it in a 10% level test. Thus there is some evidence, but not very strong evidence,
that British trade ads contain relatively more humor.
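The pooled-proportion test for Example 5 can be sketched directly from the counts. Variable names are illustrative. Working from the exact counts gives Z of about 1.49 rather than the 1.50 in the printout; the printout's pooled proportion of 0.2323 suggests it may have been run on the rounded proportions 0.266 and 0.207, but that is an inference, not something the source states. The conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 54, 203   # British ads: humorous count, sample size
x2, n2 = 56, 270   # American ads: humorous count, sample size

p1, p2 = x1 / n1, x2 / n2
p0 = (x1 + x2) / (n1 + n2)                       # pooled sample proportion
se = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))     # standard error under H0: p1 = p2
z = (p1 - p2) / se
p_upper = 1 - NormalDist().cdf(z)                # right-tail p-value

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, pooled p0 = {p0:.4f}")
print(f"std. error = {se:.4f}, z = {z:.2f}, one-tailed p-value = {p_upper:.4f}")
```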
Suppose we want to test whether the proportion of humorous British magazine ads exceeds that of
the corresponding American ads by more than 2 percentage points. Then we use the standard error
formula for a specified difference (formula 6), which does not pool the two proportions.
The MegaStat results would be:

Hypothesis test for two independent proportions

p1       p2
0.266    0.2074    p (as decimal)
54/203   56/270    p (as fraction)
54       56        X
203      270       n

         0.0586    difference
         0.02      hypothesized difference
         0.0396    std. error
         0.97      z
         .1650     p-value (one-tailed, upper)
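The test for a specified difference of 2 percentage points can be sketched the same way, this time with the unpooled standard error. Variable names are illustrative, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 54, 203   # British ads
x2, n2 = 56, 270   # American ads
d0 = 0.02          # hypothesized difference (2 percentage points)

p1, p2 = x1 / n1, x2 / n2
# Unpooled standard error, used when the hypothesized difference is not zero
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2 - d0) / se
p_upper = 1 - NormalDist().cdf(z)   # right-tail p-value for H1: p1 - p2 > d0

print(f"difference = {p1 - p2:.4f}, std. error = {se:.4f}")
print(f"z = {z:.2f}, one-tailed p-value = {p_upper:.4f}")
```

With a p-value of about 0.165, the Null hypothesis that the difference is at most 2 percentage points cannot be rejected even at the 10% level.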
