(Inference)
By
Abebe Megerso
(BSc in PH., MPH in Epid., Asst. Prof.)
1
Session Objectives
• Define hypothesis & describe types of
hypothesis,
• Describe steps in hypothesis testing,
• Discuss rules for stating statistical hypotheses,
• Explain hypothesis testing process,
• Describe types of errors in hypothesis tests,
• Test hypotheses about one & two populations.
2
Definition of Hypothesis
• Is a claim (assumption) about a population
parameter.
3
Examples of Hypotheses:
Population Mean:
• Etc.
4
Examples …
Population Proportion:
• Etc.
5
Types of Hypothesis
1. The Null Hypothesis, Ho:
6
Types of Hypothesis…
H0 is a statement of agreement (or no difference),
7
Types of Hypothesis…
2. The Alternative Hypothesis, HA
8
Types of Hypothesis…
• Is a statement that disagrees with (opposes)
Ho.
9
Hypothesis Testing
• The majority of statistical analyses involve
comparison, (e.g. between treatments or
procedures or between groups of subjects).
10
Hypothesis Testing…
• Hypothesis Testing (HT) provides an objective
framework for making decisions using
probabilistic methods.
11
Hypothesis Testing…
• Begin with the assumption that the Ho is
true:
12
Steps in Hypothesis Testing
1. Formulate the appropriate statistical hypotheses
clearly.
• Specify Ho & HA
H0: μ = μ0      H0: μ ≤ μ0      H0: μ ≥ μ0
H1: μ ≠ μ0      H1: μ > μ0      H1: μ < μ0
two-tailed      one-tailed      one-tailed
Or
14
Steps …
5. Specify the desired level of significance for
the statistical test (α = 0.05, 0.01, etc.).
6. Determine the critical value.
– A value the test statistic must attain to be
declared significant (e.g. at α = 0.05).
16
Rules for Stating Statistical
Hypotheses
1. One population:
• Indication of equality (either =, ≤ or ≥) must appear in
Ho.
Ho: μ = μo, HA: μ ≠ μo
• Can we conclude that a certain population mean is
– not 30?
Ho: μ = 30 & HA: μ ≠ 30 OR
– greater than 50?
Ho: μ ≤ 50 & HA: μ > 50
17
Rules …
Population Proportions:
Ho: P = Po, HA: P ≠ Po
E.g. Can we conclude that the proportion of
patients with leukemia who survive more than
six years is not 60%?
Ho: P = 0.6 & HA: P ≠ 0.6
18
Rules …
2. Two populations:
Mean Difference:
Proportion Difference:
19
In summary:
1. What you hope to conclude as a researcher
should be placed in the HA.
20
Hypothesis Testing Process
• Now think about how the hypothesis test should
be carried out.
• We draw a random sample of size n from the
underlying population & calculate its sample
mean.
• We compare the sample mean to the postulated
mean μ0.
• Is the difference between sample mean & μ0 too
large to be attributed to chance alone?
21
Process …
22
Decision Rule:
• Results used for decision are computed from
the data of the sample.
23
Decision Rule …
• An example of a test statistic is the
quantity obtained from:
24
Rejection & Non-Rejection Regions
• The values of the test statistic assume the points on
the horizontal axis of the normal distribution & are
divided into two groups:
Rejection region, &
Non-rejection region.
• The values of the test statistic forming the rejection region are
less likely to occur if the Ho is true.
• The values making the acceptance (non-rejection) region are
more likely to occur if the Ho is true.
25
Example: Two-sided test at α = 5%
[Figure: standard normal curve. Rejection region: Z ≤ -1.96; non-rejection region: -1.96 < Z < 1.96; rejection region: Z ≥ 1.96]
26
Statistical Decision
• Reject Ho if the value of the test statistic that
we compute from our sample is one of the
values in the rejection region.
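The decision rule can be sketched in a few lines of Python. This is only an illustration using the standard library; the helper names `z_critical` and `reject_two_sided` are hypothetical and not part of the original slides.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Return the positive critical Z value for a given level of significance."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    tail_area = alpha / 2 if tails == 2 else alpha
    return std_normal.inv_cdf(1 - tail_area)

def reject_two_sided(z, alpha=0.05):
    """Reject Ho when the test statistic falls in either rejection region."""
    return abs(z) >= z_critical(alpha, tails=2)

print(round(z_critical(0.05, tails=2), 3))  # two-sided critical value
print(round(z_critical(0.05, tails=1), 3))  # one-sided critical value
print(reject_two_sided(-2.12))              # decision for Z = -2.12
```

For a one-sided test, only one tail forms the rejection region, so the whole of α is placed in that tail.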
27
Level of Significance, α
• Is the probability of rejecting a true Ho.
29
Level of Significance
& the Rejection Region
Example:
30
Level of Significance …
31
Another way to state conclusion
• Reject Ho if P-value < α
• Fail to reject Ho if P-value ≥ α
P-value is the probability of obtaining a test statistic
as extreme as or more extreme than the actual test
statistic obtained if the Ho is true.
The larger the absolute value of the test statistic, the smaller the P-value.
OR, the smaller the P-value the stronger the evidence
against the Ho.
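As a minimal sketch, the P-value for a Z test statistic can be computed from the standard normal distribution. The function name `p_value_from_z` and the labels "two-sided", "less" and "greater" are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """P-value of a Z statistic under Ho for the chosen alternative."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Area in both tails beyond |z|
        return 2 * std_normal.cdf(-abs(z))
    if alternative == "less":
        # Area in the lower tail, to the left of z
        return std_normal.cdf(z)
    if alternative == "greater":
        # Area in the upper tail, to the right of z
        return 1 - std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

alpha = 0.05
p = p_value_from_z(-2.12)
print(round(p, 4), "reject Ho" if p < alpha else "fail to reject Ho")
```

A test statistic further from zero gives a smaller P-value, which is the relationship stated above.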
32
Types of Errors in Hypothesis
Tests
• Whenever we reject or fail to reject the Ho,
we risk committing an error.
• Two types of errors can be committed:
Type I Error,
Type II Error,
33
Type I Error
• The error committed when a true Ho is rejected.
• Considered a serious type of error.
• The probability of a type I error is the
probability of rejecting the Ho when it is true.
• The probability of type I error is α.
• Called level of significance of the test.
• Set by researcher in advance.
34
Type II Error
• The error committed when a false Ho is
not rejected (fail to reject false Ho).
35
Power
• The probability of rejecting the Ho when it is
false.
36
Action (Conclusion)    Reality: Ho True    Reality: Ho False
Do not reject Ho       Correct decision    Type II error
Reject Ho              Type I error        Correct decision
37
Type I & II Error Relationship
38
Factors Affecting Type II Error
39
Factors affecting the Power of the
Test
The power depends on the following:
1. As n↑, power ↑
2. As |µ1-µo|↑, power ↑
3. As σ↑, power ↓
4. As α↓, power ↓
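These four relationships can be checked numerically for a two-sided Z test with known σ. The sketch below is an illustration only: the function name `power_two_sided_z` is hypothetical, and the numbers (µo = 30, true mean 27, σ² = 20, n = 10) are assumed values chosen for the demonstration.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided Z test of Ho: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true and hypothesized means, in standard-error units
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # Probability that Z lands in either rejection region under the true mean
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

base = power_two_sided_z(30, 27, sqrt(20), 10)
print("baseline       :", round(base, 3))
print("larger n       :", round(power_two_sided_z(30, 27, sqrt(20), 20), 3))
print("larger |µ1-µo| :", round(power_two_sided_z(30, 26, sqrt(20), 10), 3))
print("larger σ       :", round(power_two_sided_z(30, 27, sqrt(40), 10), 3))
print("smaller α      :", round(power_two_sided_z(30, 27, sqrt(20), 10, alpha=0.01), 3))
```

When the true mean equals µo, the same formula returns α, the probability of a type I error.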
40
Hypothesis Testing approach
Hypothesis Test for One Sample
• Test for single mean,
• Test for single proportion,
Hypothesis Test for Two Samples
• Test for the difference between two population
means,
• Test for the difference between two population
proportions,
41
1. Hypothesis Testing of a Single Mean
(Normally Distributed)
42
1.1 Known Variance
43
Example: Two-Tailed Test
1. A simple random sample of 10 people from a certain
population has a mean age of 27. Can we conclude that
the mean age of the population is not 30? The variance
is known to be 20; let the level of significance be α = 0.05.
A. Hypothesis
Ho: µ = 30
HA: µ ≠ 30
B. Assumptions
• Simple random sample,
• Normally distributed population,
44
Example …
C. Data:
n = 10, sample mean = 27, σ² = 20, α = 0.05
D. Test statistic:
As the population variance is known, we use Z as
the test statistic.
45
Example …
E. Decision Rule:
• Reject Ho if the Z value falls in the rejection
region.
• Don’t reject Ho if the Z value falls in the non-
rejection region.
• Because of the structure of Ho it is a two tail test.
• Therefore, reject Ho if Z ≤ -1.96 or Z ≥ 1.96.
46
Example …
F. Calculation of test statistic:
Z = (sample mean − µo) / (σ/√n) = (27 − 30) / (√20/√10) = −3/1.4142 = −2.12
G. Statistical decision:
We reject the Ho because Z = -2.12 is in the rejection
region; the value is significant at α = 5%.
H. Conclusion:
We conclude that µ is not 30; P-value = 0.0340 < α.
A Z value of -2.12 corresponds to an area of 0.0170.
Since there are two parts to the rejection region in a
two tail test, the P-value is twice this which is .0340.
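The whole example can be reproduced with a short script. This is a sketch using only the Python standard library; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example
n, sample_mean, variance, mu0, alpha = 10, 27, 20, 30, 0.05

std_normal = NormalDist()
standard_error = sqrt(variance / n)           # σ/√n
z = (sample_mean - mu0) / standard_error      # test statistic
z_crit = std_normal.inv_cdf(1 - alpha / 2)    # two-sided critical value
p_value = 2 * std_normal.cdf(-abs(z))         # two-sided P-value
reject = abs(z) >= z_crit

print(f"Z = {z:.2f}, critical value = ±{z_crit:.2f}, P-value = {p_value:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

The script uses the unrounded Z value, so its P-value can differ in the fourth decimal place from a table lookup based on Z = -2.12.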
47
Hypothesis test using confidence interval
• Confidence interval:
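A two-sided test at level α can also be carried out by checking whether the postulated mean lies inside the (1 − α) confidence interval. The sketch below applies this to the earlier data (n = 10, sample mean = 27, σ² = 20) and assumes the usual known-variance interval, sample mean ± Z(α/2)·σ/√n.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, variance, mu0, alpha = 10, 27, 20, 30, 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)
margin = z_crit * sqrt(variance / n)   # half-width of the interval
lower, upper = sample_mean - margin, sample_mean + margin

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
# Reject Ho: µ = µo when µo falls outside the interval
print("Reject Ho" if not (lower <= mu0 <= upper) else "Fail to reject Ho")
```

Because 30 lies outside the interval, the conclusion agrees with the two-tailed Z test.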
48
Example: One-Tailed Test
• A simple random sample of 10 people from a certain
population has a mean age of 27.
• Can we conclude that the mean age of the population is
less than 30? The variance is known to be 20; let α =
0.05.
• Data:
n = 10, sample mean = 27, σ² = 20, α = 0.05
• Hypotheses:
Ho: µ ≥ 30, HA: µ < 30
49
Example …
• Test statistic:
• Rejection Region:
• Conclusion:
We conclude that µ < 30.
p = .0170; this time because it is only a
one tail test & not a two tail test.
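A minimal sketch of the one-tailed calculation, using only the Python standard library. For HA: µ < 30 the whole of α sits in the lower tail, so the rejection region is Z ≤ −Z(α).

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, variance, mu0, alpha = 10, 27, 20, 30, 0.05

std_normal = NormalDist()
z = (sample_mean - mu0) / sqrt(variance / n)  # same statistic as the two-tailed test
z_crit = std_normal.inv_cdf(alpha)            # lower-tail critical value (negative)
p_value = std_normal.cdf(z)                   # area to the left of z
reject = z <= z_crit

print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, P-value = {p_value:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```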
51
Example …
• Suppose that the Ho & Ha take the form
Ho: µ = µo, Ha: µ > µo
• In this case, Ho would be rejected for large
values of test statistic (critical values >0).
• The P-value would correspond to the area in the
upper tail of the SND, to the right of the value of
the test statistic.
52
1.2 Unknown Variance
• In most practical applications the standard
deviation of the underlying population is not
known.
• In this case, σ can be estimated by the sample
standard deviation s.
• If the underlying population is normally
distributed, then the test statistic is:
t = (sample mean − µo) / (s/√n), with n − 1 degrees of freedom
53
Example: Two-Tailed Test
• A simple random sample of 14 people from a certain
population gives a sample mean body mass index (BMI)
of 30.5 & s of 10.64.
• Can we conclude that the BMI is not 35 at α = 5%?
• Ho: µ = 35, Ha: µ ≠35
• Test statistic
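The test statistic for this example can be computed as sketched below. The standard library has no t distribution, so the two-sided critical value for 13 degrees of freedom (2.1604) is hard-coded from a t table; this is an assumption of the sketch, not a value given in the slides.

```python
from math import sqrt

n, sample_mean, s, mu0 = 14, 30.5, 10.64, 35

df = n - 1
t = (sample_mean - mu0) / (s / sqrt(n))   # one-sample t statistic

# Assumed table value: upper 2.5% point of the t distribution with 13 df
T_CRIT_13_DF = 2.1604
reject = abs(t) >= T_CRIT_13_DF

print(f"t = {t:.2f} with {df} df, critical value = ±{T_CRIT_13_DF}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

With these data the statistic falls inside the non-rejection region, so there is not enough evidence to conclude that the mean BMI differs from 35.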
56
Two Population Means, Independent
Samples
57
Two Sample Means,
Independent Samples
Two Population Means …
58
2.1 Known Variances
(Independent Samples)
• When two independent samples are drawn from
normally distributed populations with known
variances, the test statistic for testing the Ho of
equal population means is:
Z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)
59
Example:
• Researchers wish to know whether mean serum uric
acid (SUA) levels differ between normal individuals & those
with Down’s syndrome.
• The mean SUA levels of 12 individuals with Down’s
syndrome & 15 normal individuals are 4.5 & 3.4 mg/100
ml, respectively, with known variances σ1² = 1 & σ2² = 1.5,
respectively.
• Is there a difference between the means of both groups at
α = 5%?
• Hypotheses:
Ho: µ1- µ2 = 0 or Ho: µ1 = µ2
HA: µ1 - µ2 ≠ 0 or HA: µ1 ≠ µ2
60
Example …
• With α = 0.05, the critical values of Z are -1.96 &
+1.96. We reject Ho if Z < -1.96 or Z > +1.96.
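The calculation for this example can be sketched as follows, using only the Python standard library. Group 1 is taken to be the individuals with Down's syndrome and group 2 the normal individuals, following the order in which the data are listed.

```python
from math import sqrt
from statistics import NormalDist

# Group 1: Down's syndrome; group 2: normal individuals
n1, mean1, var1 = 12, 4.5, 1.0
n2, mean2, var2 = 15, 3.4, 1.5
alpha = 0.05

std_normal = NormalDist()
standard_error = sqrt(var1 / n1 + var2 / n2)
z = (mean1 - mean2) / standard_error          # under Ho, µ1 - µ2 = 0
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * std_normal.cdf(-abs(z))
reject = abs(z) > z_crit

print(f"Z = {z:.2f}, critical value = ±{z_crit:.2f}, P-value = {p_value:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

The statistic lies above +1.96, so Ho of equal means is rejected at α = 5%.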
62
Example:
• We wish to know if we may conclude, at the 95%
confidence level, that smokers, in general, have
more damaged lung cells than do non-smokers.
63
Example …
• Hypotheses:
Ho: µ1 ≤ µ2, HA: µ1 > µ2
• With α = 0.05 & df = 23, the critical value of t is
1.7139; we reject Ho if t > 1.7139.
• Test statistic:
65
Unequal variances …
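• Where the degrees of freedom (d’) are given by:
d’ = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)],
rounded down to the nearest integer.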
66
Unequal variances …
67
Example:
• Suppose we want to compare the characteristics of
tuberculosis meningitis for patients infected with HIV
& those not infected with HIV.
• In particular, we are interested in comparing age at
diagnosis.
• A random sample of n1 = 37 HIV infected patients has
mean age at diagnosis x̄1 = 27.9 years & s1 = 5.6 years.
• A sample of n2 = 19 uninfected patients has mean age
at diagnosis x̄2 = 38.8 years & s2 = 21.7 years.
68
Example …
• The test statistic is:
69
Example …
• Note that:
• And
70
Example …
• For a t distribution with 19 df, the area to the left
of −2.15 is between 0.01 & 0.025.
• Therefore, 0.02 < p < 0.05
• For a test conducted at α= 0.05, H0 is rejected.
• We conclude that among patients diagnosed with
tuberculosis meningitis, those who are infected
with HIV tend to be younger than those who are
not.
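The figures in this example can be reproduced with the sketch below. It assumes the unequal-variance (Welch) t statistic with the Satterthwaite approximation for the degrees of freedom, rounded down, which matches the 19 df quoted above. No P-value is computed because the standard library has no t distribution.

```python
from math import sqrt, floor

# Group 1: HIV infected; group 2: not infected
n1, mean1, s1 = 37, 27.9, 5.6
n2, mean2, s2 = 19, 38.8, 21.7

v1, v2 = s1**2 / n1, s2**2 / n2           # squared standard errors of each mean
t = (mean1 - mean2) / sqrt(v1 + v2)       # unequal-variance t statistic

# Satterthwaite approximation to the degrees of freedom
df_exact = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
df = floor(df_exact)

print(f"t = {t:.2f}, df = {df_exact:.2f} (rounded down to {df})")
```

The P-value bounds then come from a t table with 19 df, as in the slide.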
71
Hypothesis Testing for
Paired Samples
• Two samples are paired when each data point of
the first sample is matched & is related to a
unique data point of the second sample.
• Tests means of two related populations:
– Paired or matched samples,
– Repeated measures (before/after),
– Longitudinal or follow-up study.
72
Paired Samples …
• Assumptions:
73
The Paired t Test
75
Paired t Test …
76
Example:
• The following data show the SBP levels (mm Hg) in 10 women
while not using (baseline) & while using (follow-up) oral
contraceptives (OC).
• Can we conclude that there is a difference between mean
baseline & follow-up SBP at α = 5%? di = baseline – follow-up,
79
Proportions …
80
3. Hypothesis Testing about a Single
Population Proportion
81
Single Population Proportion…
82
Example
• We are interested in the probability of developing
asthma over a given one-year period for children 0 to 4
years of age whose mothers smoke in the home.
• In the general population of 0 to 4-year-olds, the annual
incidence of asthma is 1.4%.
• If 10 cases of asthma are observed over a single year in
a sample of 500 children whose mothers smoke, can we
conclude that this is different from the underlying
probability of p0 = 0.014? α = 5%
H0 : p = 0.014
HA: p ≠ 0.014
83
Example …
• The test statistic is given by:
84
Example …
• The critical value of Zα/2 at α=5% is ±1.96.
• Don’t reject Ho since Z =1.14 is in the non-
rejection region between ±1.96.
• P-value = 2 × 0.1271 = 0.2542
• We do not have sufficient evidence to conclude
that the probability of developing asthma for
children whose mothers smoke in the home is
different from the probability in the general
population.
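A short sketch of this calculation, assuming the usual large-sample Z statistic for a single proportion, Z = (p̂ − p0) / √(p0(1 − p0)/n). Only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

cases, n, p0, alpha = 10, 500, 0.014, 0.05

std_normal = NormalDist()
p_hat = cases / n                              # observed proportion
standard_error = sqrt(p0 * (1 - p0) / n)       # computed under Ho
z = (p_hat - p0) / standard_error
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * std_normal.cdf(-abs(z))
reject = abs(z) >= z_crit

print(f"p_hat = {p_hat:.3f}, Z = {z:.2f}, P-value = {p_value:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

The script works with the unrounded Z, so its P-value may differ slightly from one read off a table at Z = 1.14.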
85
4. Hypothesis Tests about the Difference
Between Two Population Proportions
86
Two Population Proportions…
87
Two Population Proportions…
88
Example
• A study was conducted to investigate the possible cause
of gastroenteritis outbreak following a lunch served in a
high school cafeteria.
• Among the 225 students who ate the sandwiches, 109
became ill; while, among the 38 students who did not
eat the sandwiches, 4 became ill.
• Is there a significant difference between the two groups
at α = 5%?
• We wish to test:
Ho: p1 = p2 against the alternative
HA: p1 ≠ p2
89
Example …
90
Example …
• Assume that the sample sizes are large enough,
& the normal approximation to the binomial
distribution is valid.
• If the Ho is true, then p1 = p2 = p
91
The area under the standard normal curve to the right of
4.36 is less than 0.0001, so the two-sided P-value is p < 0.0002.
We reject H0 at the 0.05 level.
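The gastroenteritis example can be reproduced with the sketch below. It assumes the pooled-proportion Z statistic, in which the common p under Ho is estimated from both groups combined. Only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

# Group 1: ate the sandwiches; group 2: did not
ill1, n1 = 109, 225
ill2, n2 = 4, 38
alpha = 0.05

p1_hat, p2_hat = ill1 / n1, ill2 / n2
p_pooled = (ill1 + ill2) / (n1 + n2)          # estimate of the common p under Ho
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / standard_error

std_normal = NormalDist()
p_value = 2 * std_normal.cdf(-abs(z))         # two-sided P-value
reject = p_value < alpha

print(f"p1 = {p1_hat:.3f}, p2 = {p2_hat:.3f}, Z = {z:.2f}, P-value = {p_value:.6f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```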
93