
Two sample and Paired t-tests

The t Test for Differences in Population Means
• Each of the two populations is normally
distributed.
• The two samples are independent.
• At least one of the samples is small,
n < 30.
• The values of the population variances are
unknown.
• The variances of the two populations are equal,
σ₁² = σ₂².
t Formula to Test the Difference in Means Assuming σ₁² = σ₂²

$$
t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \; \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
$$
ABC Manufacturing Company (Part 2)

Training Method A        Training Method B
56  51  45               59  57  53
47  52  43               52  56  65
42  53  52               53  55  53
50  42  48               54  64  57
47  44  44

n₁ = 15                  n₂ = 12
X̄₁ = 47.73               X̄₂ = 56.5
S₁² = 19.495             S₂² = 18.273
ABC Manufacturing Company (Part 1)

Ho: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0

α = .05, so α/2 = .025 in each tail (two-tailed test).
df = n₁ + n₂ − 2 = 15 + 12 − 2 = 25
Critical values: ±t(.025, 25) = ±2.060

[Figure: t distribution with a rejection region of area .025 in each tail, beyond the critical values −2.060 and 2.060, and the nonrejection region between them.]

If t < −2.060 or t > 2.060, reject Ho.

If −2.060 ≤ t ≤ 2.060, do not reject Ho.


ABC Manufacturing Company (Part 3)

$$
t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \; \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
  = \frac{(47.73 - 56.50) - 0}
         {\sqrt{\dfrac{19.495(14) + 18.273(11)}{15 + 12 - 2}} \; \sqrt{\dfrac{1}{15} + \dfrac{1}{12}}}
  = -5.20
$$

If t < −2.060 or t > 2.060, reject Ho.

If −2.060 ≤ t ≤ 2.060, do not reject Ho.

Since t = −5.20 < −2.060, reject Ho.
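The ABC Manufacturing calculation can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function and variable names (`pooled_t_statistic`, `method_a`, `method_b`) are illustrative and not from the slides, and the critical value 2.060 is the table value quoted in the slides rather than something the code computes.

```python
import math
import statistics

# Scores from the ABC Manufacturing slide (Training Method A and B).
method_a = [56, 51, 45, 47, 52, 43, 42, 53, 52, 50, 42, 48, 47, 44, 44]
method_b = [59, 57, 53, 52, 56, 65, 53, 55, 53, 54, 64, 57]


def pooled_t_statistic(x, y, hypothesized_diff=0.0):
    """Return (t, df) for the equal-variance two-sample t test."""
    n1, n2 = len(x), len(y)
    s1_sq = statistics.variance(x)  # sample variance, divisor n - 1
    s2_sq = statistics.variance(y)
    pooled_var = (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(x) - statistics.mean(y) - hypothesized_diff) / std_err
    return t, n1 + n2 - 2


t_stat, df = pooled_t_statistic(method_a, method_b)
T_CRITICAL = 2.060  # t(.025, 25), table value quoted in the slides
reject_h0 = abs(t_stat) > T_CRITICAL  # two-tailed decision rule
print(f"t = {t_stat:.2f}, df = {df}, reject Ho: {reject_h0}")
```

The pooled variance is computed first so that the code mirrors the two square-root factors in the formula term by term.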


Dependent Samples

• Before and After Measurements on the same individual.
• t = d̄ √n / S_d ~ t with (n − 1) d.f.
• n = sample size (number of pairs)

Individual   Before   After   D (difference)
1            32       39      −7
2            11       15      −4
3            21       35      −14
4            17       13      4
5            30       41      −11
6            38       39      −1

$$
\bar{d} = \frac{\sum d}{n}
$$

$$
S_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}
    = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}}
$$
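As an illustration, the paired statistic for the six individuals in the Before/After table can be computed as follows. This is a sketch with the standard library only; the variable names are illustrative, and the differences are taken as Before minus After to match the D column.

```python
import math
import statistics

# Before/After measurements for the six individuals in the table.
before = [32, 11, 21, 17, 30, 38]
after = [39, 15, 35, 13, 41, 39]

# D = Before - After, matching the sign convention of the table.
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = statistics.mean(d)   # mean difference
s_d = statistics.stdev(d)    # sample standard deviation, divisor n - 1
t_stat = d_bar * math.sqrt(n) / s_d

print(f"d-bar = {d_bar}, S_d = {s_d:.3f}, t = {t_stat:.3f}, df = {n - 1}")
```

The result is compared with a t table value for n − 1 = 5 degrees of freedom at the chosen significance level.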
t-test contd…
t-test for testing the observed correlation coefficient:
t = r √(n − 2) / √(1 − r²) ~ t with (n − 2) d.f.
Problem 1
1) Below are given the gains in weight (in kg) of plus 2 students on two diets, A and B.
Diet A: 25, 32, 30, 34, 24, 14, 32, 24, 30, 31, 35, 25
Diet B: 44, 34, 22, 10, 47, 31, 40, 30, 32, 35, 18, 21, 35, 29, 22
Test whether the two diets differ significantly as regards their effect on increase in weight.

2) Samples of two types of electric light bulbs were tested for length of life, and the following data were obtained:
Type I: n1 = 8, mean1 = 1,234 hrs, s1 = 36 hrs.
Type II: n2 = 7, mean2 = 1,036 hrs, s2 = 40 hrs.
Is the difference in the means sufficient to warrant the conclusion that Type I is superior to Type II regarding length of life?
Problems contd…
1) A certain stimulus administered to each of 12 patients resulted in the following changes in blood pressure:
5, 2, 8, −1, 3, 0, −2, 1, 5, 0, 4 and 6
Can it be concluded that the stimulus will, in general, be accompanied by an increase in blood pressure?
2) In a certain experiment to compare two types of animal food, A and B, the following increases in weight were observed in animals:

Animal                        1   2   3   4   5   6   7   8   Total
Increase in weight, Food A   49  53  51  52  47  50  52  53   407
Increase in weight, Food B   52  55  52  53  50  54  54  53   423

a) Assuming that the two samples of animals are independent, can we conclude that food B is better than food A?
b) Also examine the case when the same set of eight animals was used for both foods.
Inferences About Population Variances
Importance of Variance

Variance is an important part of the decision-making process for the data.

• The average fill is 1 litre, as claimed by the bottling plant, but what about the variance?

• The average strength in the production of 1 mg Amaryl (a diabetes drug) is 1 mg, but what about the variance of the drug weight?

• Risk measurement is a primary focus in the finance industry, and the most common metric is the variance.

• The call center was erratic in its average handling time (AHT): what is the variation in the AHT?
Chi-Square Distribution χ²
• Constructing a chi-square (χ²) distribution: Σ zᵢ²
• Suppose the population is normally distributed
• A sample of size n is taken
• The squared z scores are computed
• The sum is from a χ² distribution
• The sampling distribution of (n − 1)s²/σ² has a χ² distribution with (n − 1) degrees of freedom, whenever a simple random sample of size n is selected from a normal population.
χ² distribution: (n − 1)s²/σ²

χ²_α: the χ² value for which the area to the right is α

Reading the χ² Table

[Figure: χ² density with area α/2 in each tail and area (1 − α) between the lower value χ²_{1−α/2} and the upper value χ²_{α/2}.]

df    α     χ²_{1−α/2}   χ²_{α/2}
3     .10   0.352        7.815
10    .05   3.247        20.483
Hypothesis Testing About a Population Variance
Testing whether the variance of X is =, ≤, or ≥ a hypothesized value σ₀²

Sampling Distribution: (n − 1)s²/σ₀² ~ χ² with (n − 1) degrees of freedom

Test Statistic: (n − 1)s²/σ₀²

The test is very sensitive to the Normality condition on X.


Hypothesis Testing – Example 1
•The bottle-filling machine dispenses the soft drink into every bottle passing through it. Due
to mechanical imperfections and steady wear of various moving parts, the amount of fluid
dispensed into each bottle is a random variable, and over time this quantity becomes more
and more variable due to wear on the moving parts. The decision to shut the machine down
for major servicing is based on a hypothesis test of the variance of this volume dispensed.
The rule is that if it can be established that the variance of the amount of fluid dispensed
per bottle is greater than 250 ml² at a level of significance of 0.01 based on a sample of 15
randomly selected filled bottles, then the machine is to be shut down for servicing. On a
particular day, such a random sample of 15 bottles yields a mean of 357.2 ml and a standard
deviation of 16.9 ml. Should the machine be shut down for major servicing? (You may
assume that long-term records indicate that the distribution of the amount of fluid
dispensed by this machine into each bottle is consistent with the theoretical normal
distribution.)
Solution: The p-value approach
1. The Hypotheses: H0: σ² ≤ 250 ml² vs Ha: σ² > 250 ml²
2. Data: n = 15, s² = 16.9² ml² = 285.61 ml², α = 1%.
3. Right-Tail Test
4. Sampling distribution: (n − 1)s²/σ₀² = 14s²/250 ~ χ² with 14 degrees of freedom
5. Test Statistic: 14 × 285.61/250 = 15.99
6. p-value = P(χ² > 15.99) > .01 = P(χ² > 29.141)
7. Since p-value > α, do not reject H0. The evidence does not justify shutting the machine down.
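The arithmetic of this right-tail test can be checked with a few lines of Python. This is a sketch only: the constant name `CHI2_UPPER_01_DF14` is illustrative, and its value 29.141 is the table value quoted in the solution. The standard library has no χ² distribution function, so the code compares the statistic with the critical value instead of computing an exact p-value.

```python
# Bottle-filling example: is the variance greater than 250 ml^2?
n = 15             # sample size
s = 16.9           # sample standard deviation in ml
sigma0_sq = 250.0  # hypothesized variance in ml^2

# Upper 1% point of chi-square with 14 d.f., as quoted in the solution.
CHI2_UPPER_01_DF14 = 29.141

chi2_stat = (n - 1) * s ** 2 / sigma0_sq

# Right-tail rule: reject H0 (and shut down) only if the statistic
# exceeds the critical value, i.e. only if the p-value is below .01.
shut_down = chi2_stat > CHI2_UPPER_01_DF14

print(f"chi-square = {chi2_stat:.2f}, shut down: {shut_down}")
```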
Hypothesis Testing – Example 2
•The bottle-filling machine was shut down for servicing. After servicing, a test run has to be performed with a
sample of 100 bottles. It will be assessed as ready for production only if the sample provides evidence that the
variance of the volumes dispensed is less than 150 ml² at a level of significance of 0.01. Suppose the sample
variance was found to be 100 ml². Should it be put back into production?

1. The Hypotheses: H0: σ² ≥ 150 ml² vs Ha: σ² < 150 ml²
2. Data: n = 100, s² = 100 ml², α = 1%.
3. Left-Tail Test
4. Sampling distribution: (n − 1)s²/σ₀² = 99s²/150 ~ χ² with 99 degrees of freedom
5. Test Statistic: 99 × 100/150 = 66
6. p-value = P(χ² < 66) < .01 = P(χ² < 70.065)
7. Since p-value < α, reject H0.
The machine may be put back into production.
Times Example – Under Normality Condition
The State Transport started focusing on punctuality of its buses. First it wanted to establish consistent schedules and decided that the arrival times at any bus stop should have a variance of 4 min². On the first day, it collected 24 arrivals at its main terminus. The following is an R Output. Comment.

Times:
15.7  16.9  12.8  15.6  14    16.2  14.8  19.8
13.2  15.3  14.7  10.2  16.7  13.6  14    18.2
15.7  14.2  11.1  16.8  11.8  16    14.2  12.7

        Shapiro-Wilk normality test
data:  mydata$Times
W = 0.99037, p-value = 0.997

        One sample Chi-squared test for variance
data:  mydata$Times
X-squared = 28.18, df = 23, p-value = 0.4181
alternative hypothesis: true variance is not equal to 4
95 percent confidence interval:
 2.960380 9.643481
sample estimates:
var of mydata$Times
           4.900797
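The X-squared statistic and the confidence interval in the R output can be reproduced by hand. The sketch below uses only the Python standard library. The two χ² table values for 23 d.f. (11.689 and 38.076) do not appear in the slides; they are standard table values introduced here as an assumption, and the constant names are illustrative.

```python
import statistics

# The 24 arrival times from the slide.
times = [15.7, 16.9, 12.8, 15.6, 14, 16.2, 14.8, 19.8,
         13.2, 15.3, 14.7, 10.2, 16.7, 13.6, 14, 18.2,
         15.7, 14.2, 11.1, 16.8, 11.8, 16, 14.2, 12.7]

sigma0_sq = 4.0  # hypothesized variance
n = len(times)
s_sq = statistics.variance(times)  # sample variance, divisor n - 1

# Test statistic (n - 1) s^2 / sigma0^2 with n - 1 = 23 d.f.
chi2_stat = (n - 1) * s_sq / sigma0_sq

# Assumed table values for 23 d.f. (not quoted in the slides).
CHI2_975_DF23 = 11.689  # area .975 to the right
CHI2_025_DF23 = 38.076  # area .025 to the right

# 95% confidence interval for the population variance.
ci_low = (n - 1) * s_sq / CHI2_025_DF23
ci_high = (n - 1) * s_sq / CHI2_975_DF23

print(f"s^2 = {s_sq:.4f}, chi-square = {chi2_stat:.2f}")
print(f"95% CI for the variance: ({ci_low:.2f}, {ci_high:.2f})")
```

Because the hypothesized value 4 lies inside the interval, the two-sided test does not reject it at the 5% level, which agrees with the reported p-value of 0.4181.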
Inferences About Two Population Variances

• We may want to compare the variances in:


• AHT(Average Handling Time) from two different call centers
• Temperatures for two heating devices
• Assembly times for two assembly methods
• We collect data from two independent random samples, one
from population 1 and another from population 2.
• The two sample variances will be the basis for making
inferences about the two population variances.
Two Population Variances
X1 and X2 are Normal.
Sampling Distribution: when σ₁² = σ₂², s₁²/s₂² ~ F with n1 − 1 (numerator) and n2 − 1 (denominator) d.f.
H0: σ₁² = σ₂² (two-tailed test) OR
H0: σ₁² ≤ σ₂² (right-tailed test)

In both cases, the population providing the larger sample variance is population 1.
Test Statistic: F = s₁²/s₂²
The F-Distribution
• Assume we repeatedly select a random sample of size n1 from one normal
population and another random sample of size n2 from another normal population.
• Each time, we compute s₁²/s₂².
• If we do this, we arrive at the distribution of the ratio of two variances:
F = s₁²/s₂².
• The distribution formed in this manner approximates an F distribution with the
following degrees of freedom:
v1 = n1 − 1 and v2 = n2 − 1
P(F(6,9) > 3.37) = 0.05
P(F(6,9) < 3.37) = 0.95

P(F(6,12) > f) = 0.05 ⇒ f = 3

P(F(12,6) > f) = 0.05 ⇒ f = 4

P(F(6,12) > 0.25) = 0.95, since 0.25 = 1/4 is the reciprocal of the upper 5% point of F(12,6)


Hypothesis Testing on Variances – Example
• The firm wants to reduce the variability in the amount of impurities present in a batch of chemicals. A new process has been proposed for this reason.
• Two samples of size 25 were drawn, one from each process. The sample variances were 1.04 and 0.50 for the old and new processes respectively.
• Should the new process be accepted?

1. The hypotheses: H0: σ²old ≤ σ²new vs Ha: σ²old > σ²new
   or H0: σ₁² ≤ σ₂² vs Ha: σ₁² > σ₂²
2. Data: n1 = n2 = 25, s₁² = 1.04, s₂² = 0.50, α = 5%
3. Right-Tail Test
4. Sampling Distribution: s₁²/s₂² ~ F(24, 24)
5. Test Statistic: s₁²/s₂² = 1.04/0.50 = 2.08
6. p-value: P(F(24, 24) > 2.08) < 0.05 = P(F(24, 24) > 1.98)
7. Since p-value < α = .05, reject H0.
The new process may be deployed.
Hypothesis Testing on Variances – Example
• The firm needed to choose between two vendors for placing a large order for thermostats for their incubators.
• Vendors were asked to send samples. These were placed in a test room maintained at 37°C and the temperature readings were noted. The processed data is shown below.

        Vendor A   Vendor B
n       13         8
s²      1.8        0.7

• Conduct a hypothesis test with α = 2% to see if the variances are equal.

1. The hypotheses: H0: σ²A = σ²B vs Ha: σ²A ≠ σ²B
   or H0: σ₁² = σ₂² vs Ha: σ₁² ≠ σ₂²
2. Data: n1 = 13, n2 = 8, s₁² = 1.8, s₂² = 0.7, α = 0.02
3. Two-Tail Test.
4. Sampling Distribution: s₁²/s₂² ~ F(12, 7)
5. Test Statistic: s₁²/s₂² = 1.8/0.7 = 2.57.
6. p-value: P(F(12, 7) > 2.57) > 0.01 = P(F(12, 7) > 6.47)
   ⇒ p-value = 2 × P(F(12, 7) > 2.57) > .02
7. Do not reject H0.
APPENDIX

1. Interval Estimate for χ2


2. Examples
95% Confidence Interval for σ² and σ

P(χ²_{.975} ≤ χ² ≤ χ²_{.025}) = 0.95
P(χ²_{.975} ≤ (n − 1)s²/σ² ≤ χ²_{.025}) = 95%

⇒ If we sample ad nauseam and for each sample compute (n − 1)s²/σ² (as if σ were known), then 95% of the time
χ²_{.975} ≤ (n − 1)s²/σ² ≤ χ²_{.025}

[Figure: χ² density with area .025 in each tail and 95% between χ²_{.975} and χ²_{.025}.]

⇒ 95% of the time

$$
\frac{(n-1)s^2}{\chi^2_{.025}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{.975}}
$$

⇒ 95% of the time σ will lie in the interval

$$
\sqrt{\frac{(n-1)s^2}{\chi^2_{.025}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{.975}}}
$$
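Interval Estimation for χ²

P(χ²_{1−α/2} ≤ χ² ≤ χ²_{α/2}) = 1 − α
P(χ²_{1−α/2} ≤ (n − 1)s²/σ² ≤ χ²_{α/2}) = 1 − α

⇒ 100(1 − α)% of all values of (n − 1)s²/σ² will lie in the interval (χ²_{1−α/2}, χ²_{α/2})

[Figure: χ² density with area α/2 in each tail and area (1 − α) between χ²_{1−α/2} and χ²_{α/2}.]

⇒ 100(1 − α)% of the time σ² will lie in the interval

$$
\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}
$$

⇒ 100(1 − α)% of the time σ will lie in the interval

$$
\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}
$$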
Interval Estimation of σ²
Buyer’s Digest rates thermostats manufactured for home temperature control. In a recent
test, 10 thermostats manufactured by ThermoRite were selected and placed in a test room
that was maintained at a temperature of 68°F. The temperature readings of the ten
thermostats are shown below.

• We will use the 10 readings below to develop a 95% confidence interval estimate of
the population variance.

• Verify that the sample variance is 0.7.

Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2
Interval Estimation of σ²
Selected Values from the Chi-Square Distribution Table

Rows: degrees of freedom. Columns: area in the upper tail.

d.f. .99 .975 .95 .90 .10 .05 .025 .01
5 0.554 0.831 1.145 1.610 9.236 11.070 12.832 15.086
6 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812
7 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475
8 1.647 2.180 2.733 3.490 13.362 15.507 17.535 20.090
9 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666

10 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209

For n − 1 = 10 − 1 = 9 d.f. and α = .05:

$$
\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}
$$

⇒ 9 × 0.7/19.023 ≤ σ² ≤ 6.3/2.700 ⇒ 0.33 ≤ σ² ≤ 2.33 ⇒ 0.58 ≤ σ ≤ 1.53
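The ThermoRite numbers can be verified with a short script: first the sample variance of 0.7, then the 95% interval. This is a sketch using the standard library; the two χ² values are read from the 9 d.f. row of the table in the slides, and the constant names are illustrative.

```python
import math
import statistics

# Temperature readings of the ten ThermoRite thermostats.
readings = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]

n = len(readings)
s_sq = statistics.variance(readings)  # sample variance, divisor n - 1

# Chi-square table values for 9 d.f. (from the table in the slides).
CHI2_025_DF9 = 19.023  # area .025 in the upper tail
CHI2_975_DF9 = 2.700   # area .975 in the upper tail

# 95% interval for the variance, then square roots for the s.d.
var_low = (n - 1) * s_sq / CHI2_025_DF9
var_high = (n - 1) * s_sq / CHI2_975_DF9
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"s^2 = {s_sq:.2f}")
print(f"variance interval: ({var_low:.2f}, {var_high:.2f})")
print(f"std. dev. interval: ({sd_low:.2f}, {sd_high:.2f})")
```

Note that the larger table value produces the lower limit, because the χ² values sit in the denominators.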
Hypothesis Testing of σ² – Example
Buyer’s Digest rates thermostats manufactured for home temperature control. In a recent
test, 10 thermostats manufactured by ThermoRite were selected and placed in a test room
that was maintained at a temperature of 68°F. The temperature readings of the ten
thermostats are shown below.
• Buyer’s Digest gives an “acceptable” rating to a thermostat with a temperature variance of
0.5 or less.
• We will conduct a hypothesis test (with α = .10) to determine whether the
ThermoRite thermostat’s temperature variance is “acceptable”.

Thermostat    1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2
Solution: The p-value approach

1. The Hypotheses: H0: σ² ≤ 0.5 vs Ha: σ² > 0.5
2. Data: n = 10, s² = 0.7, α = 10%.
3. Right-Tail Test
4. Sampling distribution: (n − 1)s²/σ₀² = 9s²/0.5 ~ χ² with 9 degrees of freedom
5. Test Statistic: 9 × 0.7/0.5 = 12.6
6. p-value = P(χ² > 12.6) > .10 = P(χ² > 14.684)
7. Since p-value > α, we do not reject H0.
Hypothesis Testing about the Difference in Two Population Variances

F Test for Two Population Variances

$$
F = \frac{S_1^2}{S_2^2}
$$

$$
df_{\text{numerator}} = \nu_1 = n_1 - 1
$$

$$
df_{\text{denominator}} = \nu_2 = n_2 - 1
$$
Example: An F Distribution for ν₁ = 10 and ν₂ = 8

[Figure: density curve of the F distribution, vertical axis from 0 to 0.8, horizontal axis from 0.00 to 6.00.]
A Portion of the F Distribution Table for α = 0.025

F(.025, 9, 11) is the entry in column 9, row 11: 3.59.

Columns: numerator degrees of freedom. Rows: denominator degrees of freedom.

d.f.      1       2       3       4       5       6       7       8       9
1    647.79  799.48  864.15  899.60  921.83  937.11  948.20  956.64  963.28
2     38.51   39.00   39.17   39.25   39.30   39.33   39.36   39.37   39.39
3     17.44   16.04   15.44   15.10   14.88   14.73   14.62   14.54   14.47
4     12.22   10.65    9.98    9.60    9.36    9.20    9.07    8.98    8.90
5     10.01    8.43    7.76    7.39    7.15    6.98    6.85    6.76    6.68
6      8.81    7.26    6.60    6.23    5.99    5.82    5.70    5.60    5.52
7      8.07    6.54    5.89    5.52    5.29    5.12    4.99    4.90    4.82
8      7.57    6.06    5.42    5.05    4.82    4.65    4.53    4.43    4.36
9      7.21    5.71    5.08    4.72    4.48    4.32    4.20    4.10    4.03
10     6.94    5.46    4.83    4.47    4.24    4.07    3.95    3.85    3.78
11     6.72    5.26    4.63    4.28    4.04    3.88    3.76    3.66    3.59
12     6.55    5.10    4.47    4.12    3.89    3.73    3.61    3.51    3.44
Hypothesis Test for Equality of Two Population Variances:
Sheet Metal Example (Part 1)

Ho: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

α = 0.05, n₁ = 10, n₂ = 12

$$
F = \frac{S_1^2}{S_2^2}, \qquad
df_{\text{numerator}} = \nu_1 = n_1 - 1 = 9, \qquad
df_{\text{denominator}} = \nu_2 = n_2 - 1 = 11
$$

$$
F_{.025,9,11} = 3.59
$$

$$
F_{.975,11,9} = \frac{1}{F_{.025,9,11}} = \frac{1}{3.59} = 0.28
$$

If F < 0.28 or F > 3.59, reject Ho.
If 0.28 ≤ F ≤ 3.59, do not reject Ho.
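Sheet Metal Example (Part 2)

[Figure: F distribution with rejection regions below the lower critical value F(.975, 11, 9) = 0.28 and above the upper critical value F(.025, 9, 11) = 3.59, and the nonrejection region between them.]

If F < 0.28 or F > 3.59, reject Ho.

If 0.28 ≤ F ≤ 3.59, do not reject Ho.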
Sheet Metal Example:

Machine 1                Machine 2
22.3  21.8  22.2         22.0  22.2  22.0
21.8  21.9  21.6         22.1  22.0  22.1
22.3  22.4               21.8  21.7  21.9
21.6  22.5               21.9  21.9  22.1

n₁ = 10                  n₂ = 12
S₁² = 0.1138             S₂² = 0.0202

$$
F = \frac{S_1^2}{S_2^2} = \frac{0.1138}{0.0202} = 5.63
$$

Since F = 5.63 > Fc = 3.59, reject Ho.
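The sheet metal F ratio can be recomputed directly from the two machines' measurements. This is a sketch with the standard library; the critical values 0.28 and 3.59 are the ones quoted in the slides, and the variable names are illustrative.

```python
import statistics

# Sheet metal measurements from the slide.
machine_1 = [22.3, 21.8, 22.2, 21.8, 21.9, 21.6, 22.3, 22.4, 21.6, 22.5]
machine_2 = [22.0, 22.2, 22.0, 22.1, 22.0, 22.1,
             21.8, 21.7, 21.9, 21.9, 21.9, 22.1]

s1_sq = statistics.variance(machine_1)  # sample variance, divisor n - 1
s2_sq = statistics.variance(machine_2)
f_stat = s1_sq / s2_sq

# Critical values quoted in the slides for alpha = 0.05 (two-tailed).
F_LOWER = 0.28
F_UPPER = 3.59
reject_h0 = f_stat < F_LOWER or f_stat > F_UPPER

print(f"S1^2 = {s1_sq:.4f}, S2^2 = {s2_sq:.4f}")
print(f"F = {f_stat:.3f}, reject Ho: {reject_h0}")
```

With unrounded variances the ratio is about 5.625; the slide's 5.63 comes from dividing the rounded values 0.1138 and 0.0202. The decision is the same either way.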


Hypothesis Testing on Variances – Not Normal

• The firm needs to examine whether there is a significant difference in the variances in the bag
weights for the two machines at α = 5%.

MC1 (25 bags):
2.95  3.45  3.50  3.75  3.48  3.26  3.33  3.20  3.16  3.20
3.22  3.38  3.90  3.36  3.25  3.28  3.20  3.22  2.98  3.45
3.70  3.34  3.18  3.35  3.12

MC2 (22 bags):
3.22  3.30  3.34  3.28  3.29  3.25  3.30  3.27  3.38  3.34
3.35  3.19  3.35  3.05  3.36  3.28  3.30  3.28  3.30  3.20
3.16  3.33

        F test to compare two variances
data:  x and y
F = 8.2844, num df = 24, denom df = 21, p-value = 7.222e-06
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  3.499201 19.144689
sample estimates:
ratio of variances
          8.284448
Hypothesis Testing on Variances – Not Normal

• The firm needs to examine whether there is a significant difference in the variances in the bag
weights for the two machines at α = 5%. (Same MC1 and MC2 bag-weight data as on the previous slide.)

Quantiles
    2.5%      97.5%
2.889561  27.753650
Times Example – Under Non-Normality Condition
The State Transport started focusing on punctuality of its buses. First it wanted to establish consistent schedules and decided that the arrival times at any bus stop should have a variance of 4 min². On the first day, it collected 24 arrivals at its main terminus. The following is the R output. Comment.

Times:
15.7  16.9  12.8  15.6  14    16.2  14.8  19.8
13.2  15.3  14.7  10.2  16.7  13.6  14    18.2
15.7  14.2  11.1  16.8  11.8  16    14.2  12.7

95% Confidence Interval using Bootstrapping
(Assuming X is not near Normal)

    2.5%    97.5%
 2.25258  7.67069
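A percentile bootstrap interval of this kind can be sketched as follows. The slides do not say how the R interval was produced, so the resampling scheme, the number of resamples, the seed and the function name `bootstrap_variance_ci` are all assumptions made for illustration; the limits will not match the R output digit for digit.

```python
import random
import statistics

# The 24 arrival times from the slide.
times = [15.7, 16.9, 12.8, 15.6, 14, 16.2, 14.8, 19.8,
         13.2, 15.3, 14.7, 10.2, 16.7, 13.6, 14, 18.2,
         15.7, 14.2, 11.1, 16.8, 11.8, 16, 14.2, 12.7]


def bootstrap_variance_ci(data, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the variance (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    # Resample with replacement and record each resample's variance.
    variances = sorted(
        statistics.variance(rng.choices(data, k=n))
        for _ in range(n_resamples)
    )
    # Take the empirical 2.5% and 97.5% points of those variances.
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return variances[lo_idx], variances[hi_idx]


boot_low, boot_high = bootstrap_variance_ci(times)
print(f"bootstrap 95% interval for the variance: "
      f"({boot_low:.2f}, {boot_high:.2f})")
```

The approach needs no normality assumption, which is why it is offered as the alternative when X is not near normal. As with the χ² interval, the question is whether the target variance of 4 falls inside the interval.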
Problems
1) It is believed that the precision (as measured by the variance) of an instrument is no more than
0.16. Write down the null and alternative hypotheses for testing this belief. Carry out the test
at the 1% level given 11 measurements of the subject on the instrument:
2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, 2.5
2) Test the hypothesis that the s.d. σ = 10, given that s = 15 for a random sample of size 50 from a
normal population.
3) In one sample of 8 observations the sum of the squares of deviations of the sample values
from the sample mean was 84.4, and in another sample of 10 observations it was 102.6. Test
whether this difference is significant at the 5% level, given that the 5 percent point of F for n1 = 7 and
n2 = 9 d.f. is 3.29.

19-09-2019
Problem on Single Variance:

• 1) An aeroplane part must be machined to close tolerances to be acceptable to
customers. Production specifications call for a maximum variance in the lengths of
the parts of 0.0004. Suppose the sample variance for 30 parts turns out to be s²
= 0.0005. Use α = 0.05 to test whether the population variance specification
is being violated.
