STATISTICAL TESTING OF HYPOTHESIS
What is a Hypothesis?
Chapter 2 Statistical Hypothesis Testing
_________________________________________________________________
_
This procedure always starts by stating two mutually exclusive and exhaustive
hypotheses: the null hypothesis, denoted by H0, and the alternative hypothesis,
denoted by H1. The null hypothesis is assumed to be true before the test is
performed. It is sometimes referred to as the status quo. In practice, the
null hypothesis is expressed as a statement about the value of a population
parameter (say, the mean). The alternative hypothesis, on the other hand,
describes what you will conclude if you reject the null hypothesis. It is often
called the research hypothesis since it is the alternative that researchers hope
to support.
After establishing the null and alternative hypotheses, the next step is to
choose the level of significance, denoted by α. To understand the notion of a
level of significance, note that a researcher may commit two possible errors in
testing a hypothesis. If the researcher rejects the null hypothesis when it is
true, then he or she has committed a Type I error. On the other hand, if the
researcher does not reject the null hypothesis when it is false, then he or she
has committed a Type II error. The level of significance is the probability of
committing a Type I error. The table below summarizes the decisions the
researcher could make and the possible consequences.
Researcher's decision    H0 is true          H0 is false
Reject H0                Type I error        Correct decision
Do not reject H0         Correct decision    Type II error
that rejects the null hypothesis at the 0.05 level of significance is called
significant, while a test that rejects the null hypothesis at the 0.01 level of
significance is called highly significant.
Step 3. Select the Appropriate Test Statistic
significance. In this case, we reject the null hypothesis. Thus, given a specified
value of α, we reject H0 if and only if p-value ≤ α. Otherwise, we do not reject H0.
This method of testing hypotheses is very popular nowadays since almost
all statistical packages report the p-value for a specific test.
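The p-value rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the test statistic is standard normal and the usual convention of rejecting H0 when the p-value is at most α. The function names and the value z = 2.10 are hypothetical and do not come from the text.

```python
from statistics import NormalDist


def p_value_z(z, tail):
    """p-value of a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_h0(p_value, alpha):
    """Decision rule: reject H0 if and only if p-value <= alpha."""
    return p_value <= alpha


# Hypothetical illustration: z = 2.10 in a two-tailed test.
p = p_value_z(2.10, "two")
print(round(p, 4), reject_h0(p, 0.05), reject_h0(p, 0.01))
```

The same p-value leads to rejection at α = 0.05 but not at α = 0.01, which is why the level of significance has to be fixed before the test is run.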
In theory, the test statistic has no fixed value. That is, it is assumed to be
random. The probability distribution of a test statistic is called its
sampling distribution. In hypothesis testing, we partition the set of all values
of the test statistic into a rejection region and an acceptance region. If the
value of the test statistic falls in the rejection region, then we reject the null
hypothesis. Otherwise, we do not reject it. If rejection regions are located at
both tails of the distribution of the test statistic, then we have a two-tailed test.
[Figure: two-tailed test, with a rejection region in each tail and the acceptance region between them]
Otherwise, we have a one-tailed test. The figure below depicts two
one-tailed tests using the z statistic.
[Figure: (Left) one-tailed test, with the rejection region in the left tail; (Right) one-tailed test, with the rejection region in the right tail]
This test statistic has a standard normal distribution, and the test is often
called the z-test. For a specified level of significance α, the rejection region
for each alternative hypothesis is given below.
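[Figure: standard normal curve with area 1 – α to the left of the critical value z_α and area α in the right tail]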
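Critical values for the z-test come from the inverse of the standard normal distribution function. The sketch below uses only Python's standard library; the helper name `z_critical` is an illustrative assumption, not something defined in the text.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical value of a z-test at significance level alpha.

    'right': reject when Z > value
    'left':  reject when Z < value (the value is negative)
    'two':   reject when |Z| > value
    """
    std_normal = NormalDist()
    if tail == "right":
        return std_normal.inv_cdf(1.0 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        return std_normal.inv_cdf(1.0 - alpha / 2.0)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for alpha in (0.05, 0.01):
    print(alpha,
          round(z_critical(alpha, "left"), 3),
          round(z_critical(alpha, "right"), 3),
          round(z_critical(alpha, "two"), 3))
```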
2. The problem specifies a 0.05 level of significance.
3. Since we are testing the mean lifetime and the population standard
deviation is known (σ/√n = 0.2/√50 = 0.02828), we use the z-test,
computed here with MegaStat. In the normal distribution box, enter 4
for the mean, 0.02828 for the standard deviation, and 4.05 for x.
Click Preview for the result. The result is z = 1.77.
NOTE: Since this is a right-tailed test, use the positive critical value. Thus, the rejection region is Z > z_0.05 = 1.645.
[Figure: rejection region for the example, Z > 1.645, with the computed z = 1.77 inside it]
5. Based on the rejection region (critical value 1.645), the z value
is inside the rejection region (1.77 is greater than 1.645). Thus,
we reject the null hypothesis and conclude that the mean battery
life exceeds 4 hours.
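The arithmetic behind the MegaStat result for the battery-life example can be checked directly. The sketch below takes the figures quoted in the solution steps (hypothesized mean 4, σ = 0.2, n = 50, sample mean 4.05, right-tailed test at α = 0.05); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 4.0      # hypothesized mean battery life, in hours
sigma = 0.2    # known population standard deviation
n = 50         # sample size
xbar = 4.05    # sample mean
alpha = 0.05   # level of significance

standard_error = sigma / sqrt(n)               # the 0.02828 entered in MegaStat
z = (xbar - mu0) / standard_error              # test statistic
z_crit = NormalDist().inv_cdf(1.0 - alpha)     # right-tailed critical value
reject = z > z_crit

print(round(standard_error, 5), round(z, 2), round(z_crit, 3), reject)
```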
Example 2 A random sample of 100 deaths in the Philippines last year showed
an average life span of 69.3 years. Assuming a population standard
deviation of 7.8 years, does this seem to indicate that the mean life
span today is less than 70 years? Use a 0.01 level of significance.
2. Use α = 0.01.
3. Since we are testing the mean life span and the population standard
deviation is known (σ/√n = 7.8/√100 = 0.78), we use the z-test. Using
the normal distribution option in MegaStat, z = –0.90.
[Figure: rejection region Z < –2.33, with the computed z = –0.90 outside it]
5. Based on the rejection region (critical value –2.33), the z value
is outside the rejection region (–0.90 is greater than –2.33). Thus,
we do not reject the null hypothesis: there is not enough evidence
to conclude that the mean life span is less than 70 years.
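The left-tailed case of Example 2 can be verified the same way, this time also reporting the p-value. The sketch assumes the figures given in the example (hypothesized mean 70, σ = 7.8, n = 100, sample mean 69.3, α = 0.01); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 70.0     # hypothesized mean life span, in years
sigma = 7.8    # assumed population standard deviation
n = 100        # sample size
xbar = 69.3    # sample mean
alpha = 0.01   # level of significance

std_normal = NormalDist()
z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit = std_normal.inv_cdf(alpha)     # left-tailed critical value (negative)
p_value = std_normal.cdf(z)            # area to the left of z
reject = z < z_crit

print(round(z, 2), round(z_crit, 2), round(p_value, 3), reject)
```

Both routes agree: z does not fall below the critical value, and the p-value is well above 0.01.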
is defined by
$$ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}. $$
2. the population is not known to be normal and the sample size is large
enough. In this case, we apply the central limit theorem. Also, if the
population standard deviation is not known, use s.
[Figure: the z distribution and the t distribution]
Note particularly that the t distribution is flatter, more spread out, than the
standard normal distribution. This is because t distributions have larger standard
deviations than the standard normal.
The following characteristics of the t distribution are based on the
assumption that the population of interest is normal, or nearly normal.
1. It is, like the z distribution, bell-shaped and symmetrical about zero
(0).
2. There is not one t distribution, but rather a “family” of t distributions.
All t distributions have mean 0, but their standard deviations differ
according to the sample size n. As the sample size increases, the t
distribution approaches the z distribution. That is, for large n, t and z
are almost identical.
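How quickly t approaches z can be seen from its standard deviation. The sketch below relies on two standard facts that the text does not state: a one-sample t statistic has n – 1 degrees of freedom, and for more than 2 degrees of freedom the standard deviation of the t distribution is √(df/(df – 2)). The function name is illustrative.

```python
from math import sqrt


def t_standard_deviation(df):
    """Standard deviation of a t distribution with df degrees of freedom."""
    if df <= 2:
        raise ValueError("the standard deviation is finite only for df > 2")
    return sqrt(df / (df - 2))


# The value shrinks toward 1, the standard deviation of z, as n grows.
for n in (5, 10, 30, 100, 1000):
    df = n - 1
    print(n, df, round(t_standard_deviation(df), 4))
```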
the sample size is small (usually smaller than 30) and the population standard
deviation is not known (in practice, this is usually the case). Suppose we want to
test H0: μ = μ0 versus some alternative hypothesis. For this testing problem, we
use the t statistic defined by
$$ T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $$
where S is the sample standard deviation. The following provides the rejection
regions for this testing problem.
Example 3 A recent survey stated that cell phone owners received an average
of 50 texts daily. To test the claim, a researcher surveyed 25 cell
phone owners and found that the average number of texts received
was 46. The standard deviation of the sample was 7. At the 0.03
level of significance, is there enough evidence to reject the
survey’s claim?
Solution: 1. H0: μ = 50
H1: μ ≠ 50
2. Use α = 0.03.
[Figure: rejection regions T < –2.31 and T > 2.31, with the computed t = –2.86 inside the left rejection region]
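The t statistic for Example 3 follows from the summary figures alone. The sketch below assumes a two-tailed test and takes the critical value 2.31 from the figure rather than computing it, since Python's standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

mu0 = 50.0     # claimed mean number of texts per day
n = 25         # sample size
xbar = 46.0    # sample mean
s = 7.0        # sample standard deviation
t_crit = 2.31  # two-tailed critical value for alpha = 0.03, df = 24 (from the figure)

t = (xbar - mu0) / (s / sqrt(n))   # test statistic
df = n - 1                         # degrees of freedom
reject = abs(t) > t_crit

print(round(t, 2), df, reject)
```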
Example 4 A company claims that the mean weight per banana it ships is 150
grams. A quality control supervisor takes a random sample of 11
bananas and weighs each one. The results are reported below in
grams.
152 149 157 155 152 148 147 149 150 152 156
2. Use α = 0.05.
interval is equal to (1 – α) × 100%. Choose, of course, the t-test.
For clarity, look at the figure below.
[Figure: rejection regions T < –2.23 and T > 2.23, with the computed t = 1.54 in the acceptance region]
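With raw data, as in Example 4, the sample mean and standard deviation have to be computed first. The sketch below does this for the eleven banana weights, assuming a two-tailed test of the 150-gram claim and taking the critical value 2.23 from the figure; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [152, 149, 157, 155, 152, 148, 147, 149, 150, 152, 156]  # grams
mu0 = 150.0    # claimed mean weight
t_crit = 2.23  # two-tailed critical value for alpha = 0.05, df = 10 (from the figure)

n = len(weights)
xbar = mean(weights)               # sample mean
s = stdev(weights)                 # sample standard deviation (divisor n - 1)
t = (xbar - mu0) / (s / sqrt(n))   # test statistic
reject = abs(t) > t_crit

print(n, round(xbar, 2), round(s, 2), round(t, 2), reject)
```

Here `statistics.stdev` uses the n – 1 divisor, which matches the usual definition of the sample standard deviation.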
Suppose now that we obtain two independent normal samples. That is, let
X1, …, Xn be normally distributed with mean μ1 and standard deviation σ1, and
Y1, …, Ym be normally distributed with mean μ2 and standard deviation σ2. In this
case, we want to test H0: μ1 = μ2 versus some alternative hypothesis H1. If the
population standard deviations are both known, then we use the Z statistic defined
by
$$ Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}}} $$
Alternative hypothesis    Rejection region
H1: μ1 < μ2               Z < –z_α
H1: μ1 > μ2               Z > z_α
H1: μ1 ≠ μ2               Z < –z_{α/2} or Z > z_{α/2}
Note that the above test can also be used when the population standard
deviations are not known, provided that both n and m are large. Just replace σ1 by
s1 and σ2 by s2.
29.5 and standard deviation of 3.4 calories. Test at the 1% significance
level whether the sample results indicate that the bottled soft drinks have
equal calorie content.
3. Compute the z statistic. Using Excel, enter the data into
two columns. In this case, enter the following data in
column A: test 1, 32.4, 2, 40; and in column B: test 2,
29.5, 3.4, 50. Click the MegaStat menu and select Hypothesis
Tests. In the Hypothesis Tests option, select Compare Two
Independent Groups. At the dialog box, select summary
input. Highlight A1 to A4 for Group 1 and B1 to B4 for
Group 2. Select the z-test option and change the confidence
level to 99%. Look at the figure on the next page.
After pressing OK, MegaStat will provide an output sheet where all the necessary
computations and statistics are displayed.
[Figure: rejection regions Z < –2.58 and Z > 2.58, with the computed z = 5.04 inside the right rejection region]
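The two-sample z statistic for the soft-drink example can be reproduced from the summary input. The sketch below reads each column as label, mean, standard deviation, and sample size, which is an assumption about the column layout, and takes the two-tailed critical value 2.58 from the figure; the variable names are illustrative.

```python
from math import sqrt

# Summary input as entered in the two columns: mean, standard deviation, n.
mean1, sd1, n1 = 32.4, 2.0, 40
mean2, sd2, n2 = 29.5, 3.4, 50
z_crit = 2.58  # two-tailed critical value at the 1% level (from the figure)

standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
z = (mean1 - mean2) / standard_error   # test statistic under H0: equal means
reject = abs(z) > z_crit

print(round(standard_error, 4), round(z, 2), reject)
```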
Two-Sample t-Test
$$ S_P = \sqrt{\frac{(n - 1) S_1^2 + (m - 1) S_2^2}{n + m - 2}} $$
kilograms with standard deviation of 4.46 kilograms. Test the
manufacturer’s claim using a 0.01 level of significance.
2. Use α = 0.01.
3. Compute the t statistic. Using Excel, enter the data into
two columns. In this case, enter the following data in
column A: thread X, 85.7, 5.67, 17; and in column B:
thread Y, 75.3, 4.46, 17. Click the MegaStat menu and select
Hypothesis Tests. In the Hypothesis Tests option, select
Compare Two Independent Groups. At the dialog box,
select summary input. Highlight A1 to A4 for Group 1 and
B1 to B4 for Group 2. Enter 10 as the hypothesized
difference. Select the greater than option for the Alternative.
Select the t-test (pooled variance) option and change the
confidence level to 99%.
The result is t = 0.23.
[Figure: rejection region T > 2.45, with the computed t = 0.23 outside it]
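The pooled-variance t statistic for the thread example can be recomputed from the summary input. The sketch below reads each column as label, mean, standard deviation, and sample size, which is an assumption about the column layout, uses the hypothesized difference of 10, and takes the right-tailed critical value 2.45 from the figure; the variable names are illustrative.

```python
from math import sqrt

# Summary input as entered in the two columns: mean, standard deviation, n.
mean_x, sd_x, n = 85.7, 5.67, 17   # thread X
mean_y, sd_y, m = 75.3, 4.46, 17   # thread Y
d0 = 10.0      # hypothesized difference between the means
t_crit = 2.45  # right-tailed critical value for alpha = 0.01 (from the figure)

# Pooled standard deviation and degrees of freedom.
df = n + m - 2
s_pooled = sqrt(((n - 1) * sd_x ** 2 + (m - 1) * sd_y ** 2) / df)

standard_error = s_pooled * sqrt(1.0 / n + 1.0 / m)
t = ((mean_x - mean_y) - d0) / standard_error
reject = t > t_crit

print(df, round(s_pooled, 3), round(t, 2), reject)
```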
Example 7 A politician claims that he will garner 90% of the votes in his
bailiwick province. Would you agree with his claim if, on a given day,
a researcher asked 1000 qualified voters and it turned out that only 872
were in favor of the politician? Use α = 0.05.
2) Let α = 0.05.
3) Computation:
±z_{0.05/2} = ±1.96
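The computation step for Example 7 did not survive in the text. A minimal sketch is given below, assuming the usual one-proportion z statistic, z = (p̂ – p0)/√(p0(1 – p0)/n), and a two-tailed test with the critical value 1.96 shown above. The formula and the variable names are assumptions, and the resulting value is only what this formula gives.

```python
from math import sqrt

p0 = 0.90       # claimed proportion of votes
n = 1000        # voters asked
x = 872         # voters in favor
z_crit = 1.96   # two-tailed critical value at alpha = 0.05

p_hat = x / n                                 # sample proportion
standard_error = sqrt(p0 * (1.0 - p0) / n)    # computed under H0
z = (p_hat - p0) / standard_error
reject = abs(z) > z_crit

print(p_hat, round(z, 2), reject)
```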
Example 8 In a study to estimate the proportion of residents in a certain city
and its suburbs who favor the construction of a thermal power
plant, it is found that 73 out of 130 urban residents favor the
construction while only 60 of 150 suburban residents are in favor.
Is there a significant difference between the proportions of urban and
suburban residents who favor construction of the thermal plant?
Use a 0.01 level of significance.
Solution: 1) H0: p1 = p2
H1: p1 ≠ p2
2) Let α = 0.01.
3) Computation:
4) Critical Region
±z_{0.01/2} = ±2.58
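The computation for Example 8 is also missing from the text. The sketch below assumes the common two-proportion z statistic with a pooled estimate of the proportion under H0: p1 = p2, and uses the critical value 2.58 shown above. The formula and the variable names are assumptions, so the result is illustrative rather than the author's own figure.

```python
from math import sqrt

x1, n1 = 73, 130   # urban residents in favor, urban sample size
x2, n2 = 60, 150   # suburban residents in favor, suburban sample size
z_crit = 2.58      # two-tailed critical value at alpha = 0.01

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)   # common proportion estimated under H0

standard_error = sqrt(p_pooled * (1.0 - p_pooled) * (1.0 / n1 + 1.0 / n2))
z = (p1_hat - p2_hat) / standard_error
reject = abs(z) > z_crit

print(round(p1_hat, 4), round(p2_hat, 4), round(p_pooled, 4), round(z, 2), reject)
```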
$$ \chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} $$
Example 9 A manufacturer of cell phone batteries claims that the life of his
batteries is approximately normally distributed with a standard
deviation equal to 1.05 years. If a random sample of 15 of these
batteries has a standard deviation of 1.3 years, do you think that
σ > 1.05 years? Use a 0.01 level of significance.
2) Let α = 0.01.
χ² = 21.46, P = 0.0904
4) Critical Region:
χ² > 29.14.
[Figure: rejection region χ² > 29.14, with the computed χ² = 21.46 outside it]
5) Do not reject H0. The χ² statistic is not significant at the 0.01
level. However, the P-value of 0.09 gives some evidence that
σ > 1.05, though not enough to reject H0 at the 0.01 level.
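The chi-square statistic in Example 9 is a one-line calculation. The sketch below uses the figures from the example (n = 15, s = 1.3, σ0 = 1.05) and takes the critical value 29.14 from the solution rather than computing it, since Python's standard library has no chi-square distribution; the variable names are illustrative.

```python
n = 15             # sample size
s = 1.3            # sample standard deviation, in years
sigma0 = 1.05      # claimed population standard deviation, in years
chi2_crit = 29.14  # right-tailed critical value for alpha = 0.01, df = 14 (from the solution)

df = n - 1
chi2 = df * s ** 2 / sigma0 ** 2   # test statistic (n - 1) s^2 / sigma0^2
reject = chi2 > chi2_crit

print(df, round(chi2, 2), reject)
```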
Goodness-of-Fit Test
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$
Example 10 There are three gates at Juan G. Macaraeg National High School.
The principal would like to know if the gates are equally utilized.
As an experiment, 2700 students are observed as they enter the
school. The number of students entering through the Canarvacanan
gate was 1005, through the Sta. Fe gate 985, and through the
Sto. Niño gate 710. At the 0.05 significance level, can we conclude
that there is no difference in the use of the three gates?
Solution:
1. H0: Students have no gate preference.
H1: Students show a gate preference.
2. Let α = 0.05.
χ² = 60.39
4. Critical Region
χ²_{0.05, 2} = 5.99
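The goodness-of-fit statistic for Example 10 can be reproduced from the observed counts. The sketch below assumes that under H0 the 2700 students are spread equally over the three gates, so each expected count is 900, and takes the critical value 5.99 from the solution; the variable names are illustrative.

```python
observed = {"Canarvacanan": 1005, "Sta. Fe": 985, "Sto. Niño": 710}
chi2_crit = 5.99   # critical value for alpha = 0.05, df = 2 (from the solution)

total = sum(observed.values())
expected = total / len(observed)   # equal use of the gates under H0

# Sum of (O - E)^2 / E over the three gates.
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1
reject = chi2 > chi2_crit

print(total, expected, df, round(chi2, 2), reject)
```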
The chi-square test procedure can also be used to test the hypothesis of
independence of two variables of classification. Suppose we wish to determine if
each person’s blood type and eye color are related in any way. To find whether
two observed characteristics of a member of a population are independent, we
use the test of independence.
Suppose we pick a sample of size n and classify the data in a two-way table
on the basis of two variables. Such a table, used to determine whether the
distribution according to one variable is contingent on the distribution of the
other, is called a contingency table. A contingency table with r rows and c
columns is referred to as an r × c table (“r × c” is read “r by c”).
The formula for the test of independence is
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$
where the summation extends over all rc cells in the r × c contingency
table. If χ² > χ²_α with ν = (r – 1)(c – 1) degrees of freedom, reject the null
hypothesis of independence at the α level of significance; otherwise, do not reject
the null hypothesis.
                  Milk Consumption
Age               Low    Moderate    High
16 – 25            10          15      16
26 – 35            11          13      27
36 – 45            16          28       5
45 and above        8           9       7
2) Let α = 0.05.
3) Compute χ².
χ² = 22.37
4) Critical Region:
χ²_{0.05, 6} = 12.59
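The statistic for the milk-consumption table can be recomputed cell by cell. The sketch below assumes the usual expected count for each cell, (row total × column total) / grand total, and takes the critical value 12.59 from the solution; the variable names are illustrative.

```python
# Observed counts: rows are age groups, columns are Low, Moderate, High.
observed = [
    [10, 15, 16],   # 16 - 25
    [11, 13, 27],   # 26 - 35
    [16, 28, 5],    # 36 - 45
    [8, 9, 7],      # 45 and above
]
chi2_crit = 12.59   # critical value for alpha = 0.05, df = 6 (from the solution)

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
reject = chi2 > chi2_crit

print(grand_total, df, round(chi2, 2), reject)
```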
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.1
1. The hospital record shows that the mean weight of a newborn baby is
8.3 lbs, with a standard deviation of 0.6 lbs. A researcher takes a sample
of 100 newborn babies and finds a mean weight of 7.8 lbs. Test the
claim at the 0.01 level of significance.
calories consumed was 1997. The standard deviation of the sample was 56
calories. At α = 0.05, can it be concluded that there is no difference
between the number of calories consumed by the women over age 60?
5. A company claims that the mean weight per banana it ships is 180 grams
with a standard deviation of 10 grams. Data from a sample of 70
bananas randomly selected from a shipment indicated a mean weight of
193.5 grams per banana. Is there sufficient evidence to reject the
company’s claim? Use α = 0.01.
6. The treasurer of a certain university claims that the mean monthly salary
of its college professors is P 37,750 with a standard deviation of P 3000.
A researcher takes a random sample of 100 college professors, who were
found to have a mean monthly salary of P 34,375. Do the 100 college
professors have higher salaries than the rest? Test the claim at the
α = 0.02 level of significance.
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.2
1. Past experience indicates that the time for high school juniors to complete
a standardized test is a normal random variable with a mean of 60 minutes.
If a random sample of 15 high school juniors took an average of 66
minutes to complete this test with a standard deviation of 5 minutes, test
the hypothesis at 0.01 level of significance that µ = 60 minutes against the
alternative that µ < 60 minutes.
3. Test the hypothesis that the average content of a particular soft drink is 1
liter if the contents of a random sample of 10 bottles are 1.04, 0.97, 1.01,
1.05, 0.97, 0.98, 1.02, 1.05, 0.97, and 0.97 liters. Use a 0.01 level of
significance and assume that the distribution of contents is normal.
4. The president of a certain tricycle operators and drivers claims that the
average mileage of tricycles is less than 80000. A sample of 16 tricycles
has an average mileage of 90000, with standard deviation of 8000. At α =
0.02, is there enough evidence to reject the president’s claim?
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.3
Test the hypothesis that there is no difference in the 2 brands of tires. Use
a 0.03 level of significance.
4. The following data, recorded in kilometers per liter, represent the fuel
consumption of two vehicles in steady-speed tests at 90 kilometers per
hour:
Test the hypothesis that minibuses, on average, exceed similarly
equipped jeepneys by 5 kilometers per liter. Use a 0.01 level of
significance.
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.4
1. To find out whether a new drug will cure diabetes, 15 mice with an
advanced stage of the disease are selected. Survival times, in years, from
the time the experiment commenced are as follows:
STUDENT                 1    2    3    4    5    6    7    8    9
Score Before Review    15   19   19   17   24   29   26   11   18
Score After Review     17   29   41   20   32   28   39   30   27
WOMAN              1      2      3      4      5      6      7      8
Weight before   57.6   60.5   89.8   67.9   57.6   54.2   68.3   69.3
Weight after    50.6   60.3   62.4   60.2   51.6   55     59.4   60.3
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.5
4. Suppose that, in the past, 40% of all adults favored the death penalty. Do
we have reason to believe that the proportion of adults favoring the death
penalty today has increased if, in a random sample of 40 adults, 19 favor
the death penalty? Use a 0.02 level of significance.
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.6
2. A study was made to determine whether more Filipinos than Italians prefer
white champagne to pink champagne at weddings. Of the 1000 Filipinos
selected at random, 178 preferred white champagne, and of the 3000
Italians selected, 80 preferred white champagne. Can we conclude that a
higher proportion of Filipinos than Italians prefer white champagne at
weddings? Use a 0.02 level of significance.
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.7
2. A company claims that the variance of the sugar content of its ice cream is
equal to 50. A sample of 200 servings is selected and the sugar contents
are measured; the sample variance is found to be 40. At α = 0.05, is
there sufficient evidence to believe the claim?
       Men    Women
n       10       23
sd     1.2      3.7
Test the hypothesis against the alternative hypothesis that the variance for
men is greater than that for women. Use a 0.01 level of significance.
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.8
Name:_______________________________________ Score:______________
Section:______________________________________ Date: ______________
Exercise 2.9
                            RESPONSE
GROUP                 More    Less    Same
Professional            59      29      19
Blue collar             16      12      20
Unskilled laborers      17      57       5
3. In a study of car accidents and drivers who use cellular phones, the
following sample data are obtained. At α = 0.01, test the claim that the
occurrence of accidents is independent of the use of cellular phones.
4. The supermarket sells red and white eggs in sizes small, medium, large,
and extra large. The table shows the number of cartons sold for the
various sizes and colors during a one-month period.