ENGINEERING DATA ANALYSIS


HYPOTHESIS TESTING FOR SINGLE SAMPLE

Hypothesis Testing
Hypothesis testing and estimation are used to reach conclusions about a population by examining a sample of
that population.

A hypothesis is a statement about one or more populations. There are research hypotheses and statistical
hypotheses.

• Research Hypotheses
A research hypothesis is the supposition or conjecture that motivates the research. It may be proposed after
numerous repeated observations. Research hypotheses lead directly to statistical hypotheses.

• Statistical Hypotheses
A statistical hypothesis is a hypothesis that is testable on the basis of observing a process that is modelled via a set of
random variables. Testing such hypotheses is sometimes called confirmatory data analysis.

Statistical hypotheses are stated in such a way that they may be evaluated by appropriate statistical techniques.

There are two types of Statistical Hypothesis:

• Null Hypothesis (Ho)

- It is the hypothesis to be tested, which one hopes to reject.
- It states equality, or no significant difference or relationship, between the variables.

• Alternative Hypothesis (Ha)

- It generally represents the idea which the researcher wants to prove.

TWO TYPES OF HYPOTHESIS TESTING:

1. One-Tailed Test - It is a directional test with the region of rejection lying on either the left or the right tail of the normal curve.
a. Right-Directional Test
(Ha uses comparatives such as greater than, more than, higher than, better than, superior to, exceeds, etc.)
b. Left-Directional Test
(Ha uses comparatives such as smaller than, less than, lower than, inferior to, below, etc.)

2. Two-Tailed Test - It is a non-directional test with the region of rejection lying on both tails of the normal curve.
(Ha uses words such as not equal to, significantly different, etc.)
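
The three kinds of rejection region can be sketched in a few lines of Python. This is only an illustration: the function name `rejection_region` and the tail labels "right", "left" and "two" are made up for this sketch, and the critical values come from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
from statistics import NormalDist


def rejection_region(alpha, tail):
    """Describe the rejection region of a z-test for a given alpha and tail.

    `tail` is a hypothetical label chosen for this sketch:
    "right", "left" or "two".
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        z = std_normal.inv_cdf(1 - alpha)      # all of alpha in the right tail
        return f"Z > {z:.3f}"
    if tail == "left":
        z = std_normal.inv_cdf(alpha)          # all of alpha in the left tail
        return f"Z < {z:.3f}"
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha split evenly over both tails
        return f"Z < {-z:.3f} or Z > {z:.3f}"
    raise ValueError("tail must be 'right', 'left' or 'two'")


print(rejection_region(0.05, "right"))  # Z > 1.645
print(rejection_region(0.05, "left"))   # Z < -1.645
print(rejection_region(0.05, "two"))    # Z < -1.960 or Z > 1.960
```

Note that the two-tailed test splits the rejection probability between both tails, which is why its critical value is larger in magnitude than the one-tailed value at the same level.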

LEVEL OF SIGNIFICANCE

The level of significance, 𝛼, is the probability of rejecting a true null hypothesis. For example, a 95% confidence level corresponds
to 𝛼 = 0.05: when the null hypothesis is true, there is a 5% chance that the test statistic falls in the rejection region. Rejecting in
that case is an error and leads to a false conclusion.

EDA: Hypothesis Testing for Single Sample


WHAT IS TYPE I ERROR AND WHAT IS TYPE II ERROR?


When doing hypothesis testing, two types of mistakes may be made and we call them Type I error and Type II error.

Decision | Reality: 𝐻0 is true | Reality: 𝐻0 is false
Reject 𝐻0 | Type I error | Correct
Accept 𝐻0 | Correct | Type II error

If we reject 𝐻0 when 𝐻0 is true, we commit a Type I error. The probability of Type I error is denoted by: α.
If we accept 𝐻0 when 𝐻0 is false, we commit a Type II error. The probability of Type II error is denoted by: β.
Our convention is to set up the hypotheses so that Type I error is the more serious error.
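
A small simulation can make the meaning of α concrete. The sketch below repeatedly draws samples from a population for which 𝐻0 is true and counts how often a right-tailed z-test (the test on a mean with known variance, covered later in this handout) rejects 𝐻0 anyway. The population values (mean 100, standard deviation 15), the sample size, the number of trials and the seed are all made-up illustration values, not data from this handout.

```python
import random
from math import sqrt


def simulate_type_i_rate(trials=20000, n=10, mu0=100.0, sigma=15.0,
                         z_crit=1.645, seed=1):
    """Estimate how often a right-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) really holds.
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if z > z_crit:  # test statistic lands in the rejection region
            rejections += 1  # this rejection is a Type I error
    return rejections / trials


rate = simulate_type_i_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With the critical value 1.645, the estimated rate should come out close to 0.05, which is the chosen level of significance.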

STEPS IN CONDUCTING THE HYPOTHESIS TESTING

Every time we perform a hypothesis test, this is the basic procedure that we will follow:

Step 1. Check the conditions necessary to run the selected test and state the hypotheses for that test.
Step 2. Decide on the significance level, 𝛼.
Step 3. Compute the value of the test statistic.
Step 4. Find the appropriate critical values for the test using tables.
Step 5. Check to see if the value of the test statistic falls in the rejection region. If it does, then reject 𝐻0 (and
conclude 𝐻𝑎 ). If it does not fall in the rejection region, do not reject 𝐻0 .
Step 6. State the conclusion in words.

Test about Proportion (Z)

We'll start our exploration of hypothesis tests by focusing on population proportions. Specifically, we'll derive the methods used for
testing whether a single population proportion p equals a particular value, p0.

Condition for the one-proportion Z-test: n𝑝0 ≥ 5 and n(1−𝑝0 ) ≥ 5.

Test | Null hypothesis | Alternative hypothesis
Two-tailed | 𝐻0 : p = 𝑝0 | 𝐻𝑎 : p ≠ 𝑝0
Right-tailed | 𝐻0 : p = 𝑝0 | 𝐻𝑎 : p > 𝑝0
Left-tailed | 𝐻0 : p = 𝑝0 | 𝐻𝑎 : p < 𝑝0
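
For reference, the test statistic usually paired with these hypotheses can be written as follows. The symbols x (the number of sample members with the characteristic of interest) and p̂ (the sample proportion) are introduced here for illustration; n is the sample size and p0 is the hypothesized proportion.

```latex
\hat{p} = \frac{x}{n},
\qquad
Z^{*} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}
```

The denominator uses p0 rather than p̂ because the standard error is computed under the assumption that the null hypothesis is true.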

Z-TEST EXAMPLE: Penn State Students from Pennsylvania


Continuing with our one-proportion example at the beginning of this lesson, say we take a random sample of 500 Penn State students
and find that 278 are from Pennsylvania. Can we conclude that the proportion is larger than 0.5 at a 5% level of significance?

Step 1. Can we use the one-proportion z-test?

The answer is yes since the hypothesized value 𝑝0 is 0.5 and we can check that:

𝑛𝑝0 = 500 × 0.5 = 250 ≥ 5 and n(1−𝑝0 ) = 500 × (1−0.5) = 250 ≥ 5

Set up the hypotheses. Since the research hypothesis is to check whether the proportion is greater than 0.5 we set it up as a one
(right)-tailed test:

𝐻0 : p = 0.5

𝐻𝑎 : p > 0.5

Step 2. Decide on the significance level, 𝛼.

According to the question, 𝛼 = 0.05.

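Step 3. Compute the value of the test statistic:

With sample proportion $\hat{p} = 278/500 = 0.556$,

\[ Z^{*} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}} = \frac{0.556 - 0.5}{\sqrt{\dfrac{0.5 (1 - 0.5)}{500}}} = 2.504 \]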

Step 4. Find the appropriate critical values for the test using the z-table. Write down clearly the rejection region for the problem. We
can use the standard normal table to find the value of $z_{0.05}$.

From the table, $z_{0.05}$ is found to be 1.645 and thus the critical value is 1.645. The rejection region for the right-tailed test is given by:

Z* > 1.645

Step 5. Check whether the value of the test statistic falls in the rejection region. If it does, then reject 𝐻0 (and conclude 𝐻𝑎 ). If it does
not fall in the rejection region, do not reject 𝐻0 .

The observed Z-value is 2.504 - this is our test statistic. Since Z* falls within the rejection region, we reject 𝐻0 .

Step 6. State the conclusion in words.

With a test statistic of 2.504 and critical value of 1.645 at a 5% level of significance, we have enough statistical evidence to reject the
null hypothesis. We conclude that a majority of the students are from Pennsylvania.
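
The six steps of this example can be retraced with a short Python sketch. The function name `one_proportion_z_test` is made up for illustration, and the critical value is computed with `statistics.NormalDist` instead of being read from a z-table; the sketch covers only the right-tailed case used here.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Right-tailed one-proportion z-test using the critical-value approach."""
    # Step 1: check the conditions for the z-test.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")
    # Step 3: compute the test statistic from the sample proportion.
    p_hat = successes / n
    z_star = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Step 4: critical value for a right-tailed test at level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Step 5: reject H0 if the statistic falls in the rejection region.
    return z_star, z_crit, z_star > z_crit


# 278 of 500 sampled students are from Pennsylvania; H0: p = 0.5, Ha: p > 0.5.
z_star, z_crit, reject = one_proportion_z_test(278, 500, 0.5)
print(f"Z* = {z_star:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
# Z* = 2.504, critical value = 1.645, reject H0: True
```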

Test on Mean when Variance is Known (Z)

When a population is normal, the sampling distribution of the sample mean is also normal, regardless of sample size. On the other
hand, if the population is skewed, the sample mean is approximately normal when the sample size is quite large (𝑛 ≥ 30). In both of
these cases, when the population variance is known, the hypothesis is best tested with the z-test.

Assumption (all three cases): the variable of interest follows the normal distribution with known population variance ($\sigma^2$).

Test statistic (all three cases):

\[ Z_C = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]

Null Hypothesis (Ho) | Alternative Hypothesis (Ha) | Decision Rule | Rejection Region (standard normal distribution)
µ = µ0 | µ ≠ µ0 | Reject Ho if |ZC| > Zα/2. Otherwise, we fail to reject Ho. | Both tails: below -Zα/2 and above Zα/2
µ = µ0 | µ > µ0 | Reject Ho if ZC > Zα. Otherwise, we fail to reject Ho. | Right tail: above Zα
µ = µ0 | µ < µ0 | Reject Ho if ZC < -Zα. Otherwise, we fail to reject Ho. | Left tail: below -Zα

Z-Test Example:

1. The Graduate Record Exam (GRE) is a standardized test required to be admitted to many graduate schools in the United
States. A high score in the GRE makes admission more likely. According to the Educational Testing Service, the mean score
for takers of GRE who do not have training courses is 555 with a standard deviation of 139. Brain Philippines (BP) offers
expensive GRE training courses, claiming that their graduates score better than those who have not taken any training
courses. To test the company’s claim, a statistician randomly selected 30 graduates of BP and asked for their GRE scores
and got an average of 560.

Solution:
Step 1: Formulate the appropriate null and alternative hypotheses.
(Answer: Ho: Graduates of BP courses score 555 on average, or in symbols, µ = 555, while
Ha: Graduates of BP courses score better than 555 on average, or in symbols, µ > 555.)

Step 2: By convention or rule of thumb, let us use 𝛼 = 0.05.


Step 3: Using a simple random sample of observations, compute the value of the test statistic.
(Answer: The computed test statistic is

\[ Z_C = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{560 - 555}{139 / \sqrt{30}} = 0.197 \]
)

Step 4: Find the appropriate critical values for the test and state the decision rule and specify the rejection region.
(Answer: With a 5% level of significance, the decision rule is 'Reject the null hypothesis (Ho) if ZC > Z0.05 = 1.645. Otherwise, we fail to
reject Ho.' The rejection region is found on the right tail of the standard normal distribution.)

[Figure: standard normal curve with the rejection region to the right of Z = 1.645]

Step 5: Make a decision whether to reject or fail to reject Ho.


(Answer: With the computed test statistic equal to 0.197, which is less than the critical value of 1.645, we fail to reject the null hypothesis.)

Step 6: State the conclusion.


(Answer: There is not enough evidence to conclude that graduates of Brain Philippines score better than those who did not undergo training.)
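
The arithmetic of this example can be checked with a few lines of Python. This is a sketch for the right-tailed case only; the function name `z_test_mean_right_tailed` is invented for illustration, and the critical value is computed with `statistics.NormalDist` rather than looked up in a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean_right_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Right-tailed z-test on a mean when the population sigma is known."""
    z_c = (sample_mean - mu0) / (sigma / sqrt(n))  # computed test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha)       # critical value Z_alpha
    return z_c, z_crit, z_c > z_crit               # True means "reject Ho"


# GRE example: sample mean 560, mu0 = 555, sigma = 139, n = 30.
z_c, z_crit, reject = z_test_mean_right_tailed(560, 555, 139, 30)
print(f"ZC = {z_c:.3f}, critical value = {z_crit:.3f}, reject Ho: {reject}")
# ZC = 0.197, critical value = 1.645, reject Ho: False
```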

Test on Mean when Variance is Unknown

When the sample size is small (𝑛 < 30) and the sample is approximately normal, a t-test based on the t-distribution is more
appropriate for testing a hypothesis about the mean.

In most cases of hypothesis testing, the variance of the population is unknown. In this case, the standard error is estimated
using the sample standard deviation s. Associated with this is the divisor n − 1, which we call the degrees of freedom:
df = n − 1. The test statistic is

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

T-test Example:

This example is taken from Understanding Statistics in the Behavioral Sciences (3rd Ed), by Robert R. Pagano as cited by B. Weaver
(2011). Ten years ago, the average height of young adult women living in a certain city was 63 inches. The standard deviation is
unknown. A researcher wants to determine whether the height of young adult women differs significantly from 63 inches. She randomly
samples eight young adult women currently residing in her city and measures their heights. The following data are obtained: [64, 66,
68, 60, 62, 65, 66, 63].

Solution:
Solving for the mean and variance of the sample, we have $\bar{x} = 64.25$ and $s^2 = 6.5$. Following the steps in hypothesis testing, we
have the following results.

1. 𝐻0 : 𝜇 = 63 inches
𝐻𝑎 : 𝜇 ≠ 63 inches (two tailed)
2. For this particular problem we can set the level of significance to be at 0.05.
3. Solving for the computed t value, with $s = \sqrt{6.5} \approx 2.55$, we have

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{64.25 - 63}{2.55 / \sqrt{8}} = 1.39 \]
4. Critical value: Since 𝑛 = 8, we have 𝑑𝑓 = 7. Referring to the table of critical values for the t-distribution (two-tailed, 𝛼 = 0.05),
the critical value is 𝑡 = ±2.365; the acceptance region is −2.365 < 𝑡 < 2.365.

5. Decision: The computed t-value lies within the acceptance region thus we cannot reject the null hypothesis.
6. Conclusion: Therefore the height of young adult women does not differ significantly from 63 inches.
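
The same computation can be reproduced in Python with the standard library. This is a minimal sketch: the critical value 2.365 is simply copied from the t-table as in the solution above (the standard library has no t-distribution), and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, variance

heights = [64, 66, 68, 60, 62, 65, 66, 63]  # sample data from the example
mu0 = 63          # hypothesized mean height in inches
t_crit = 2.365    # table value for df = 7, two-tailed, alpha = 0.05

n = len(heights)
x_bar = mean(heights)    # sample mean
s2 = variance(heights)   # sample variance (divisor n - 1)
t = (x_bar - mu0) / (sqrt(s2) / sqrt(n))

# Two-tailed test: reject H0 only if t lies outside (-t_crit, t_crit).
reject = abs(t) > t_crit

print(f"mean = {x_bar}, variance = {s2}, df = {n - 1}")
print(f"t = {t:.2f}, reject H0: {reject}")
# mean = 64.25, variance = 6.5, df = 7
# t = 1.39, reject H0: False
```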
