9 (A) Hypothesis

Hypothesis Testing
Statistical decisions
We study the sample data and then make decisions about the
population from which the sample is drawn. Such decisions
are called statistical decisions.
Statistical hypotheses
They are statements or assumptions which may or may not be
true concerning one or more populations. On the basis of
sample information, these hypotheses will be tested. Normally,
the hypothesis to be tested is formulated for the sole purpose of
being rejected or nullified. This hypothesis is called the null
hypothesis, denoted by H0. We must also formulate another
hypothesis which differs from the null hypothesis; it is
usually called the alternative hypothesis, denoted by H1 or Ha.
Tests of hypotheses and significance

Usually the sample result differs from the value specified by H0.
Even if the null hypothesis H0 is true, such an observed difference
may be the result of pure chance. If the observed difference is
too large to be reasonably attributed to chance, we say that it is
significant, and the decision is to reject H0. Procedures which enable us to
decide whether to reject or retain the null hypothesis, that is, to
determine whether the observed sample result differs significantly
from the result expected under H0, are called tests of
hypotheses, tests of significance or rules of decision.
Type I and Type II errors
When H0 is tested, we may commit 2 types of errors:

(1) Rejecting H0 when it is in fact true: type I error
P [ committing a type I error ] = P [ reject H0 | H0 is true ]
= level of significance = α
(2) Accepting H0 when it is in fact false: type II error
P [ committing a type II error ] = P [ accept H0 | H0 is false ] = β

                  Decision: Accept H0        Decision: Reject H0
If H0 is true     Correct decision.          Type I error.
                  Probability = 1 - α        Probability = α
                  ('confidence level')       ('significance level')
If H0 is false    Type II error.             Correct decision.
                  Probability = β            Probability = 1 - β
                                             ('power')
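The probabilities in the table can be made concrete with a small calculation. The sketch below assumes a left-tailed Z-test on a normal population with known σ; the function name and the numerical values (including the "true" mean) are illustrative assumptions, not part of the notes.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_error_rates(mu0, mu_true, sigma, n, alpha):
    """Return (alpha, beta, power) for a left-tailed Z-test of H0: mu >= mu0.

    mu_true is a hypothetical true mean under which H0 is false.
    """
    se = sigma / sqrt(n)                        # standard error of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-alpha point of N(0, 1)
    cutoff = mu0 - z_alpha * se                 # reject H0 when the sample mean is below this
    power = NormalDist(mu_true, se).cdf(cutoff) # P[reject H0 | mean is mu_true]
    beta = 1 - power                            # P[accept H0 | mean is mu_true]
    return alpha, beta, power

# Illustrative values only.
alpha, beta, power = left_tailed_error_rates(mu0=80, mu_true=77, sigma=7, n=25, alpha=0.05)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Note that β depends on which alternative value of the mean is actually true, whereas α is fixed by the analyst in advance.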
Critical region and critical point

The set of all possible sample outcomes which lead to the
acceptance of H0 is said to
constitute the acceptance region (AR) of H0, while the set
of all sample outcomes which lead to the
rejection of H0 is said to constitute the rejection region or
critical region (CR) of H0. A critical value is a value used in the
test criterion to separate the critical region of H0 from the
acceptance region.
[Figures: critical regions for the right-tailed test, the left-tailed test and the 2-tailed test]
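The critical values that bound the three kinds of critical region can be computed from the standard normal distribution. This is a minimal sketch using Python's standard library; the function name `critical_values` and the tail labels are assumptions made for illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the critical value(s) of a Z-test as a tuple."""
    z = NormalDist()  # standard normal N(0, 1)
    if tail == "left":
        return (z.inv_cdf(alpha),)        # CR: z < -z_alpha
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)    # CR: z > z_alpha
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)      # CR: |z| > z_{alpha/2}
        return (-c, c)
    raise ValueError("tail must be 'left', 'right' or 'two'")

for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

For a 2-tailed test the significance level α is split equally between the two tails, which is why the critical value is z at α/2 rather than at α.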
Procedures for testing statistical hypotheses
(1) State the assumptions or known facts about:
(i) the population of interest;
(ii) the nature of the samples and the sample sizes.
Then state H0 and H1.
(2) Select a test statistic whose sampling distribution is
known if H0 is true and the other assumptions are satisfied.
(3) Choose the significance level α of the test and thus
determine an appropriate critical region of fixed size.
(Usually α = 0.05 or 0.01 is used.)
(4) Compute the realised value of the test statistic from the
sample results and other known quantities.
(5) Make the decision. If the test statistic falls in the CR, we
reject H0; otherwise we accept it. Then draw the conclusion.
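The five steps can be sketched as a single function for the simplest case, a Z-test on one mean with known σ. The function name `z_test`, its parameters and the sample figures are hypothetical and serve only to show how the steps fit together.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, tail):
    """Follow the five steps for a one-sample Z-test; return (z, reject_H0)."""
    # Steps 1-2: the caller supplies the assumptions (known sigma, sample size n),
    # the hypotheses (through mu0 and tail) and the choice of Z as test statistic.
    std = NormalDist()
    # Step 3: the significance level alpha fixes the critical region.
    # Step 4: realised value of the test statistic.
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 5: decision.
    if tail == "left":
        reject = z < std.inv_cdf(alpha)
    elif tail == "right":
        reject = z > std.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, reject

# Hypothetical data: H0: mu <= 50 against H1: mu > 50.
z, reject = z_test(xbar=52, mu0=50, sigma=6, n=36, alpha=0.05, tail="right")
print(f"z = {z:.2f}, reject H0: {reject}")
```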
Tests concerning means
Test of hypotheses concerning the mean μ of a single population

H0 : μ ≥ μ0        or   H0 : μ ≤ μ0         or   H0 : μ = μ0
H1 : μ < μ0             H1 : μ > μ0              H1 : μ ≠ μ0
(left-tailed test)      (right-tailed test)      (2-tailed test)
where μ0 is a predetermined constant.
Test A (Z-test)
Model: Population is normal with known standard deviation σ.
OR
Population is not normal with known standard
deviation σ and sample size n ≥ 30.
The test statistic is $Z = \dfrac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$, where $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.

(a) Left-tailed test
To test H0 : μ ≥ μ0 against H1 : μ < μ0
At α significance level, the CR = {z / z < - zα }

(b) Right-tailed test
To test H0 : μ ≤ μ0 against H1 : μ > μ0
At α significance level, the CR = {z / z > zα }

(c) Two-tailed test
To test H0 : μ = μ0 against H1 : μ ≠ μ0
At α significance level, the CR = {z / |z| > zα/2 }


E.g. A standard intelligence examination has been given to
students for several years, and it is assumed that the
scores are normally distributed with an average of 80 and
a standard deviation of 7. A group of 25 students
obtained a mean grade of 77 in the examination this year.
Are this year's students inferior in intelligence to the past
years' students, using (i) the 5% (ii) the 1% level of significance?
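A short numerical check of this example, as a sketch: it assumes the left-tailed Test A with H0 : μ ≥ 80 against H1 : μ < 80, and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 80, 7, 25, 77

# Test statistic Z = (xbar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

decisions = {}
for alpha in (0.05, 0.01):
    critical = NormalDist().inv_cdf(alpha)   # -z_alpha for the left tail
    decisions[alpha] = z < critical          # True means z lies in the CR
    print(f"alpha = {alpha}: z = {z:.3f}, critical value = {critical:.3f}, "
          f"reject H0: {decisions[alpha]}")
```

Under these assumptions H0 is rejected at the 5% level but not at the 1% level, which shows how the conclusion depends on the chosen α.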
E.g. A chemical company obtains an average of 1800 lbs of
finished product per batch processed with a standard
deviation of 100 lbs. By a new processing technique, it is
claimed that the yield can be increased. To test this
hypothesis, 49 sample batches are processed using the
new technique and it is found that the average yield is
1850 lbs. Can we conclude that the new technique
improves the yield at 1% significance level ?
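As a worked check of this example, the right-tailed Test A can be applied with H0 : μ ≤ 1800 against H1 : μ > 1800. The working below is a sketch that assumes the standard deviation of 100 lbs still applies under the new technique and uses the table value z0.01 = 2.326.

```latex
\[
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
  = \frac{1850 - 1800}{100 / \sqrt{49}}
  = \frac{50}{100/7}
  = 3.5 > z_{0.01} = 2.326
\]
```

Since z falls in the CR, H0 is rejected under these assumptions, which supports the claim that the new technique improves the yield.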
Test B (large sample Z-test)
Model: (i) Population is normal or not normal, with
unknown standard deviation estimated by the
sample standard deviation s;
(ii) The sample size is large, n ≥ 30.
The test statistic and parts (a), (b) and (c) of Test A remain valid, with σ replaced by s.
E.g. A manufacturer of batteries believes that one particular
type of battery has a useful life of 1000 hours. A simple
random sample of 100 of the batteries is taken and the
mean life is found to be 950 hours with standard
deviation of 270 hours. Does this indicate that the mean
life of this type of batteries is not 1000 hours at 5% level
of significance ?
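The battery example can be checked numerically with a 2-tailed Test B. This is a minimal sketch, assuming H0 : μ = 1000 against H1 : μ ≠ 1000, with the sample standard deviation s used in place of σ; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, xbar, s = 1000, 100, 950, 270
alpha = 0.05

# Large-sample statistic: sigma is estimated by the sample standard deviation s.
z = (xbar - mu0) / (s / sqrt(n))

critical = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
reject = abs(z) > critical                        # 2-tailed critical region

print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Here |z| is smaller than the critical value, so under these assumptions the sample does not give significant evidence that the mean life differs from 1000 hours.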
Test C (Small sample t-test)

Model: (i) Population is normally distributed with
unknown standard deviation estimated by the
sample standard deviation s.
(ii) The sample size is small (n < 30).
The test statistic is $t = \dfrac{\bar{X} - \mu_0}{S_{\bar{X}}}$, where $S_{\bar{X}} = \dfrac{s}{\sqrt{n}}$,
which follows a t-distribution with (n - 1) degrees of freedom.

(a) Left-tailed test
To test H0 : μ ≥ μ0 against H1 : μ < μ0
At α significance level, the CR = {t / t < - tα, n-1 }

(b) Right-tailed test
To test H0 : μ ≤ μ0 against H1 : μ > μ0
At α significance level, the CR = {t / t > tα, n-1 }

(c) Two-tailed test
To test H0 : μ = μ0 against H1 : μ ≠ μ0
At α significance level, the CR = {t / |t| > tα/2, n-1 }


E.g. The expected life time of electric light bulbs produced by a
given process was 1500 hours. To test a new batch, a sample
of 10 was taken which showed a mean life time of 1400 hours
and a standard deviation of 90 hours. Is there any evidence of a
significant change in the life time of the bulbs?
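A worked sketch of this example using the 2-tailed Test C, with H0 : μ = 1500 against H1 : μ ≠ 1500. The example does not state a significance level, so α = 0.05 is assumed here; the critical value t0.025, 9 = 2.262 is taken from a standard t table.

```latex
\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{1400 - 1500}{90 / \sqrt{10}}
  = \frac{-100}{28.46}
  \approx -3.51,
\qquad
|t| = 3.51 > t_{0.025,\,9} = 2.262
\]
```

Under the assumed 5% level, t falls in the CR, so H0 is rejected and there is evidence of a change in the mean life time.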
Tests of hypotheses concerning the difference of means of
two populations
H0 : μ1 - μ2 ≥ d0    or   H0 : μ1 - μ2 ≤ d0    or   H0 : μ1 - μ2 = d0
H1 : μ1 - μ2 < d0         H1 : μ1 - μ2 > d0         H1 : μ1 - μ2 ≠ d0
(Left-tailed test)        (Right-tailed test)       (2-tailed test)
where d0 is a predetermined constant.
Test D (Z-test)
Model: The two populations are normal with known standard
deviations σ1 and σ2.
The test statistic is $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sigma_{\bar{x}_1 - \bar{x}_2}}$, where $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.

(a) Left-tailed test
To test H0 : μ1 - μ2 ≥ d0 against H1 : μ1 - μ2 < d0
At α significance level, the CR = {z / z < - zα }

(b) Right-tailed test
To test H0 : μ1 - μ2 ≤ d0 against H1 : μ1 - μ2 > d0
At α significance level, the CR = {z / z > zα }

(c) Two-tailed test
To test H0 : μ1 - μ2 = d0 against H1 : μ1 - μ2 ≠ d0
At α significance level, the CR = {z / |z| > zα/2 }

Note: Test D is seldom used in practice; it serves as a basis for Test E.
Test E (Large sample Z-test)
Model: (i) σ1 and σ2 are unknown and estimated by s1 and s2.
(ii) The sample sizes n1, n2 are large (≥ 30).
The test statistic is $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_{\bar{x}_1 - \bar{x}_2}}$, where $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
Parts (a), (b) and (c) of Test D remain valid here.
E.g. To ascertain whether a new fertiliser is more efficient than
the old fertiliser in rice production, a piece of land was divided
into 100 squares of equal areas, all of the same quality. The
new fertiliser was applied to 50 squares and the old fertiliser to
the other 50. The mean no. of kg. of rice harvested per square
of land using the new fertiliser was 25.5 with a variance of 22.
The corresponding mean and variance for the squares using
the old fertiliser were 24.6 and 19 respectively. Is the new
fertiliser more efficient than the old one at 1% sig. level?
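The fertiliser example can be checked with the right-tailed Test E. The sketch below assumes population 1 is the new fertiliser, population 2 the old one, and d0 = 0, i.e. H0 : μ1 - μ2 ≤ 0 against H1 : μ1 - μ2 > 0; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, mean1, var1 = 50, 25.5, 22   # new fertiliser (variance given, not std. deviation)
n2, mean2, var2 = 50, 24.6, 19   # old fertiliser
d0, alpha = 0, 0.01

# Estimated standard error of the difference of sample means.
se = sqrt(var1 / n1 + var2 / n2)
z = ((mean1 - mean2) - d0) / se

critical = NormalDist().inv_cdf(1 - alpha)   # z_alpha for the right tail
reject = z > critical

print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Because the example quotes variances, they enter the standard error directly without squaring. Under these assumptions z is well below the critical value, so the data do not show the new fertiliser to be more efficient at the 1% level.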
Test F (small sample t-test)

Model: (i) The 2 populations are normal with unknown but
common standard deviation σ1 = σ2 = σ.
(ii) The 2 samples are independent random samples
with small sample sizes n1 and n2 (< 30).
The test statistic is $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_{\bar{x}_1 - \bar{x}_2}}$,
where $s_p$ = pooled sample standard deviation = $\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
and $s_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$; t follows a t-distribution with (n1 + n2 - 2)
degrees of freedom.

(a) Left-tailed test
To test H0 : μ1 - μ2 ≥ d0 against H1 : μ1 - μ2 < d0
At α significance level, the CR = {t / t < - tα, n1+n2-2 }

(b) Right-tailed test
To test H0 : μ1 - μ2 ≤ d0 against H1 : μ1 - μ2 > d0
At α significance level, the CR = {t / t > tα, n1+n2-2 }

(c) Two-tailed test
To test H0 : μ1 - μ2 = d0 against H1 : μ1 - μ2 ≠ d0
At α significance level, the CR = {t / |t| > tα/2, n1+n2-2 }

E.g. Two salesmen A and B are working in a certain district.
From a sample survey conducted by the Head office, the
following results on sales were obtained:

                             Salesman A    Salesman B
Sample size                  n1 = 20       n2 = 18
Sample mean                  x̄1 = 170      x̄2 = 205
Sample standard deviation    s1 = 20       s2 = 25

State whether there is a significant difference in the mean
sales between the two salesmen at 5% sig. level.
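A numerical sketch of the salesmen example using the 2-tailed Test F with d0 = 0. It assumes the two populations are normal with a common standard deviation, as the model requires. The critical value t0.025, 36 = 2.028 is an assumption taken from a standard t table, since Python's standard library has no t-distribution.

```python
from math import sqrt

n1, mean1, s1 = 20, 170, 20   # salesman A
n2, mean2, s2 = 18, 205, 25   # salesman B
d0 = 0

# Pooled sample standard deviation.
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Standard error of the difference of means and the t statistic.
se = sp * sqrt(1 / n1 + 1 / n2)
t = ((mean1 - mean2) - d0) / se

df = n1 + n2 - 2
t_critical = 2.028            # t_{0.025, 36}, read from a t table (assumed value)
reject = abs(t) > t_critical

print(f"sp = {sp:.2f}, t = {t:.3f}, df = {df}, reject H0: {reject}")
```

Under these assumptions |t| exceeds the table value, so the difference in mean sales is significant at the 5% level.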
Tests concerning proportions
Test G (Tests of hypotheses concerning the proportion p
of a single population)
H0 : p ≥ p0          or   H0 : p ≤ p0           or   H0 : p = p0
H1 : p < p0               H1 : p > p0                H1 : p ≠ p0
(Left-tailed test)        (Right-tailed test)        (2-tailed test)
where p0 is a predetermined constant.
Model: (i) Population is large, i.e. the no. of elements N in
the population is large;
(ii) Although the sample is large (n ≥ 30), it is small
relative to the size of the population, i.e.
sampling fraction = n/N < 0.05;
(iii) p is not near to 0 or 1.
The test statistic is $Z = \dfrac{\hat{p} - p_0}{\sigma_{\hat{p}}}$, where $\hat{p}$ is the sample proportion and $\sigma_{\hat{p}} = \sqrt{\dfrac{p_0(1 - p_0)}{n}}$.

(a) Left-tailed test
To test H0 : p ≥ p0 against H1 : p < p0
At α significance level, the CR = { z / z < - zα }

(b) Right-tailed test
To test H0 : p ≤ p0 against H1 : p > p0
At α significance level, the CR = { z / z > zα }

(c) Two-tailed test
To test H0 : p = p0 against H1 : p ≠ p0
At α significance level, the CR = { z / |z| > zα/2 }

E.g. A bus company trains drivers in groups of 25. Normally, 3
out of each group fail to pass the final test. A new method
of instruction is being tried, in which a group of 100
were trained together. There were 9 failures. Test whether
the new training method is better, using the 5% significance
level.
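The bus-company example can be sketched as a left-tailed Test G on the failure proportion. The sketch assumes p0 = 3/25 is the usual failure rate, that "better" means a lower failure rate (H0 : p ≥ 0.12 against H1 : p < 0.12), and that the conditions of the model hold; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 3 / 25          # usual failure proportion
n, failures = 100, 9
alpha = 0.05

p_hat = failures / n                       # sample proportion of failures
se = sqrt(p0 * (1 - p0) / n)               # standard error under H0
z = (p_hat - p0) / se

critical = NormalDist().inv_cdf(alpha)     # -z_alpha for the left tail
reject = z < critical

print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, critical value = {critical:.3f}, "
      f"reject H0: {reject}")
```

Under these assumptions z does not fall in the CR, so the sample does not provide significant evidence at the 5% level that the new method is better.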
Test H (Test concerning differences between proportions)

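Consider 2 independent random samples from 2 binomial
populations consisting of n1 and n2 trials, in which the nos. of successes
are x1 and x2 respectively.
Then $\hat{p}_1 = \dfrac{x_1}{n_1} \sim N\!\left(p_1, \dfrac{p_1(1 - p_1)}{n_1}\right)$ and
$\hat{p}_2 = \dfrac{x_2}{n_2} \sim N\!\left(p_2, \dfrac{p_2(1 - p_2)}{n_2}\right)$ approximately when n1, n2 are large.
Consider the difference between the 2 proportions p1 - p2, whose
estimator is $\hat{p}_1 - \hat{p}_2$.
Then we have $E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$ and
$V(\hat{p}_1 - \hat{p}_2) = V(\hat{p}_1) + V(\hat{p}_2) = \dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}$, where $q_i = 1 - p_i$.
The standard error of $\hat{p}_1 - \hat{p}_2$ is $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$.
(i) If p1 = p2 = p is known, the standard error of $\hat{p}_1 - \hat{p}_2$ is
$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{pq}{n_1} + \dfrac{pq}{n_2}} = \sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$.
(ii) If p1 = p2 = p is unknown, the standard error of $\hat{p}_1 - \hat{p}_2$ is
estimated by $S_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$.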
The null hypothesis is
H0 : p1 - p2 = d0   or   H0 : p1 - p2 ≥ d0   or   H0 : p1 - p2 ≤ d0,
tested against the corresponding alternative H1 given in (a), (b) and (c) below.
The test statistic is $Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - d_0}{S_{\hat{p}_1 - \hat{p}_2}}$, where
$S_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$.

(a) Left-tailed test
H0 : p1 - p2 ≥ d0 against H1 : p1 - p2 < d0
At α sig. level, the C.R. = {z / z < - zα }

(b) Right-tailed test
H0 : p1 - p2 ≤ d0 against H1 : p1 - p2 > d0
At α sig. level, the C.R. = {z / z > zα }

(c) Two-tailed test
H0 : p1 - p2 = d0 against H1 : p1 - p2 ≠ d0
At α sig. level, the C.R. = {z / |z| > zα/2 }

E.g. A political party X believes it has increased its share of
the vote by 5 percentage points over the previous 12 months. A survey of
500 electors 1 year ago showed that 100 voted for X. In a recent
survey, it received 96 votes from 400 electors. Would you accept
the view of a 5 percentage point increase using the 1% sig. level?
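A numerical sketch of this example using Test H. Several choices here are assumptions rather than part of the notes: population 1 is taken as the recent survey and population 2 as the survey a year ago, d0 = 0.05, and the alternative is taken as left-tailed (H1 : p1 - p2 < 0.05, i.e. the increase falls short of the claim). The standard error follows the pooled formula given for the test statistic above.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 96, 400     # recent survey
x2, n2 = 100, 500    # survey one year ago
d0, alpha = 0.05, 0.01

p1_hat, p2_hat = x1 / n1, x2 / n2

# Pooled proportion and estimated standard error, as in the notes' formula.
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled
se = sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))

z = ((p1_hat - p2_hat) - d0) / se

critical = NormalDist().inv_cdf(alpha)   # -z_alpha for the left tail
reject = z < critical

print(f"p1_hat = {p1_hat:.2f}, p2_hat = {p2_hat:.2f}, z = {z:.3f}, "
      f"reject H0: {reject}")
```

Under these assumptions z is far from the CR, so the claim of a 5 percentage point increase is not rejected at the 1% level. A 2-tailed alternative would lead to the same decision here, because |z| is also well below z0.005.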
Test I (Paired comparison t – test)
We have n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn).
Assume that X and Y are normally distributed.
Let D = X - Y. Then
$\bar{D} = \dfrac{\sum D}{n} = \dfrac{\sum (X - Y)}{n} = \bar{X} - \bar{Y}$
and $S_D = \sqrt{\dfrac{\sum (D - \bar{D})^2}{n - 1}} = \sqrt{\dfrac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1}}$.
The distribution of $\bar{D}$ has mean $\mu_{\bar{D}} = \mu_X - \mu_Y$ and standard deviation
$\sigma_{\bar{D}} = \dfrac{\sigma_D}{\sqrt{n}}$, which is estimated by $S_{\bar{D}} = \dfrac{S_D}{\sqrt{n}}$.
The test statistic is $t = \dfrac{\bar{D} - (\mu_X - \mu_Y)}{S_{\bar{D}}} = \dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{S_{\bar{D}}}$, which
follows a Student's t-distribution with (n - 1) degrees of freedom.
(a) Left-tailed test H0 : μX - μY ≥ 0 against H1 : μX - μY < 0
At α sig. level, the C.R. = { t / t < - tα, n-1 }
(b) Right-tailed test H0 : μX - μY ≤ 0 against H1 : μX - μY > 0
At α sig. level, the C.R. = { t / t > tα, n-1 }
(c) Two-tailed test H0 : μX - μY = 0 against H1 : μX - μY ≠ 0
At α sig. level, the C.R. = { t / |t| > tα/2, n-1 }
E.g. A new product was introduced into the market in January 1997. After a poor year for sales,
the manufacturer initiated an intensive advertising campaign during January 1998. The table below
records the sales, in thousand dollars, for a one-month period before and a one-month period after the
advertising campaign, for each of eleven regions.

Region         A    B    C    D    E    F    G    H    I    J    K
Sales Before   2.4  2.6  3.9  2.0  3.2  2.2  3.3  2.1  3.1  2.2  2.8
Sales After    3.0  2.5  4.0  4.1  4.8  2.0  3.4  4.0  3.3  4.2  3.9

The sales may be assumed to follow a normal distribution.
Determine, at the 5% sig. level, whether an increase in sales has
occurred.
Solution:

Region         A    B    C    D    E    F    G    H    I    J    K
Before (X)     2.4  2.6  3.9  2.0  3.2  2.2  3.3  2.1  3.1  2.2  2.8
After (Y)      3.0  2.5  4.0  4.1  4.8  2.0  3.4  4.0  3.3  4.2  3.9
D = Y - X

ΣD =        ;  ΣD² =
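The quantities needed for the paired t-test can be computed as follows. This is a sketch, not the notes' own solution: it takes D = Y - X as in the solution table, so an increase in sales corresponds to the right-tailed alternative H1 : μD > 0, and it uses the table value t0.05, 10 = 1.812, which is an assumption taken from a standard t table.

```python
from math import sqrt

before = [2.4, 2.6, 3.9, 2.0, 3.2, 2.2, 3.3, 2.1, 3.1, 2.2, 2.8]   # X
after = [3.0, 2.5, 4.0, 4.1, 4.8, 2.0, 3.4, 4.0, 3.3, 4.2, 3.9]    # Y

d = [y - x for x, y in zip(before, after)]   # D = Y - X for each region
n = len(d)

sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

d_bar = sum_d / n
s_d = sqrt((sum_d2 - sum_d**2 / n) / (n - 1))   # sample std. deviation of D
se = s_d / sqrt(n)                              # estimated std. error of D-bar

t = (d_bar - 0) / se                            # H0: mean difference <= 0
t_critical = 1.812                              # t_{0.05, 10} from a t table (assumed)
reject = t > t_critical

print(f"sum D = {sum_d:.2f}, sum D^2 = {sum_d2:.2f}, "
      f"D-bar = {d_bar:.4f}, S_D = {s_d:.4f}, t = {t:.3f}, reject H0: {reject}")
```

Under these assumptions t exceeds the critical value, which indicates a significant increase in sales at the 5% level.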
