Hypothesis Testing Sept 2016

Hypothesis Testing
© U Dinesh Kumar, IIM Bangalore

Hypothesis Testing
A hypothesis test is a statistical test that is used to check

whether there is enough evidence in a sample of data to infer
that a certain statement/claim/condition is true for the entire
population.
Blackout Babies
On November 9, 1965 power went out in New York city and much
of North East for more than 10 hours.
On Wednesday August 10 , 1966 New York Times published an

article, which claimed that the “Births Increased 9 months after
the Blackout”.

Claim made by
Complan
How to check whether

this is this true or not?
If you drink Complan, you grow taller

Hypothesis testing is a systematic
way to test claims or ideas about a
group or population.

Interesting Hypotheses
• Good looking couples are more likely to have girl
child(ren)!
• Married people are more happier than singles!!!
• Vegetarians miss fewer flights.
• Black cars have more chance of involving in an accident
than white cars in moon light.
• Women use camera phone more than men.
• Left handed men earn more money!
• Smokers are better sales people.
• Those who whistle at workplace are more efficient.
Government of West Bengal approaches Mission Hospital and
requests to provide cardiac related treatments for 1,50,000 for
all the patients supported by a government scheme.
Should Mission Hospital accept the offer?

What do we know?
• The average cost of treatment based on the
sample estimate = 198723
• Sample size = 248

What is Hypothesis?
– A claim or statement about a property of a

population.
– The goal in Hypothesis Testing is to analyze a sample

to find evidence for the claim or statement
(hypothesis).
Null Hypothesis
• Hypothesis testing has two hypotheses, null and
alternative.
• Null hypothesis is a statement about a population

parameter, such as population mean or proportion.
• In most cases, we believe that the null hypothesis is

wrong. Decisions are made about the null
hypothesis.
Alternative Hypothesis
• Alternative hypothesis is complement of null hypothesis.
• Alternative hypothesis is also known as research hypothesis.
• Usually, we believe that the alternative hypothesis is true.

Hypothesis Test - Comparison to court trial
• A criminal trial is an example of hypothesis testing without the

statistics.
• In a trial the judge must decide between two hypotheses.
• The null hypothesis is

H0: The defendant is innocent
• The alternative hypothesis or research hypothesis is

H1: The defendant is guilty
11.13
Why is it called Null
• A null hypothesis is a statement of the status quo, one of no

difference or no effect. If the null hypothesis is not rejected,
no changes will be made.
• An alternative hypothesis is one in which some difference or

effect is expected.
14
Hypothesis Testing Objective
1. There are two hypotheses, the null and the alternative
hypotheses.
2. The procedure begins with the assumption that the null
hypothesis is true.
3. The goal is to determine whether there is enough evidence to
infer that the alternative hypothesis is true, or the null is not
likely to be true.
4. There are two possible decisions:
5. Conclude that there is enough evidence to support the
alternative hypothesis. Reject the null.
6. Conclude that there is not enough evidence to support the
alternative hypothesis. Fail to reject the null.
15
Steps in Hypothesis Testing
1) Describe in words the population characteristic about which
hypotheses are to be tested.
2) State the null hypothesis, Ho and State the alternative

hypothesis, H1.
3) Calculate the test statistic to be used
5) Identify the reject region (one tailed or two tailed)
6) Compute the p-value and take the final decision.
16
Significance (Rejection Criteria)
• Level significance () is a decision criteria used for rejection of

null hypothesis.
• We collect evidence to show that the null hypothesis is not

true.
• In most cases, we set the null hypothesis to be 5%.
• Probability of obtaining a sample mean is less than 5% if the

null hypothesis were true.
Test Statistic
The test statistic is a mathematical expression that allows

researchers to determine the likelihood of obtaining sample
outcomes if the null hypothesis were true.
The value of the test statistic is used to make a decision

regarding the null hypothesis.
Hypothesis Testing - Decision
• Reject the null hypothesis. The sample mean is associated
with a low probability of occurrence when the null hypothesis
is true.
• Retain the null hypothesis. The sample mean is associated

with a high probability of occurrence when the null
hypothesis is true.
P-value
A p value is the probability of obtaining a sample outcome, given

that the value stated in the null hypothesis is true. The p value
for obtaining a sample outcome is compared to the level of
significance.
When p-value is less than significance (), we reject the null

hypothesis
Critical Value
A critical value is a cutoff value that defines the boundaries

beyond which less than 5% of sample means can be obtained if
the null hypothesis is true. Sample means obtained beyond a
critical value will result in a decision to reject the null hypothesis.
Critical Values
Lower one tailed H1: o < h
Two tailed test H1: o = h
Upper one tailed H1: o > h
22
Type I and Type II Error
• A Type I error occurs when we reject a true null
hypothesis. As an analogy, a Type I error occurs when the
jury convicts an innocent person.
• P(Type I error) =  [usually 0.05 or 0.01]
• A Type II error occurs when we don’t reject a false null

hypothesis [accept the null hypothesis]. That occurs
when a guilty defendant is acquitted.
• P(Type II error) = 
23
Type I and Type II Errors
Decision
Retain the Null Reject the Null
Truth in the
Population True Correct Type I Error
(1 - ) 
False Type II Error Correct

 (1-)
Power
Terms in Hypothesis Test
Term Explanation
 Conditional probability of incorrectly rejecting H0, when it is

actually true (Type I Error)
 Conditional probability of failing to reject H0, when it is false (Type

II Error)
1 -  (Power) Conditional probability of correctly rejecting H0, when H1 is true
p-value Evidence against the null hypothesis in favour of alternative

hypothesis.
Smaller p-value indicates stronger evidence against null

hypothesis
Hypothesis Testing and Sampling Distribution
• The sample mean is an unbiased estimator of the

population mean.
• Regardless of the distribution in the population, the

sampling distribution of the sample mean is normally
distributed.
Z - Test

One independent sample Z-test
The one–independent sample z test is a

statistical procedure used to test hypotheses
concerning the mean in a single population with
a known variance.
Test Statistic – Z test
• The z statistic is an inferential statistic used to determine the
number of standard deviations in a standard normal
distribution that a sample mean deviates from the population
mean stated in the null hypothesis.

X
Z
/ n
Test Statistic – Z test

X
Z
/ n
Estimated value of mean - Hypothesis value of mean

Z
Standard deviation of sampling distribution
Time Spent on Texting…
• Magazine women’s health published an article in
which they claimed that female students spend 94.6
minutes everyday “texting” with a standard
deviation of 26.8 minutes.
• http://www.womenshealthmag.com/life/hours-you-
spend-on-your-phone
Checking the hypothesis
• Data was collected from 57 female students from
City college of Business and the average amount
spent by them on texting was 89.2 minutes.
• Check the hypothesis that the time spent by female

students is less than 94.8 minutes.
Null and Alternative Hypothesis
• H0: Time spend by female students on texting is less than or
equal to 94.8 minutes.
• HA: Time spend by female students on texting is greater than

94.8 minutes
H0:   94.8
HA:  > 94.8

Z - Statistic

X   89.2  94.8
Z   1.57
 / n 26.8 / 57
p  value  0.057
0.06
0.05
0.04
0.03
0.02
Area =
0.01
0.057
0
80 82 84 86 88 90 92 94 96 98 100 102 104 106 108 110
Since p-value is greater than 0.05 (significance value), we
retain the null hypothesis (or fail to reject the null hypothesis).

When one can use Z-test
• Population is normal distribution & population

standard deviation is known
Treatment cost of Female patients
H 0 :   150000
H1 :   150000 150000 is the cost offered by
the government

X  198723
  150000
S  122587
n  248
(198723  150000)
Z  6.25
122587 248
37
Central Limit Theorem for
Proportions
• Let p be the probability of success and q is the probability of
failure. The sampling distribution for samples of size n is
approximately normal.
• Mean = p
pq
• Standard deviation =
n
Z-test for Proportions
• An E-commerce company believes that 10% of all their
customers return the products (jewelry) after using them.
• In a sample of 220 customers, 45 customers are estimated to

have returned the product after using it.
• Check the hypothesis that the proportion of customers who

return the product after using it is other than 10%.
Two-Sample Z test
• Two sample z-test is used for comparing two populations.
• For example, we would like to check whether the time spent

on using mobile phone is different for men and women.
Two Sample Z test with Equal variance
• Null Hypothesis (H0): 1 = 2
• Alternative hypothesis (H1): 1  2
 
X1  X 2
Z
1 1
 
n1 n2
Mobile Usage between Men and
Women
• A sample of 78 men revealed that they spent 4.6 hours on
average in a day on using their mobile phones.
• A sample 56 women revealed that they spent 6.2 hours on

average in a day on using their mobile phones.
• Assume that the population standard deviation is 1.1 hours.
• Check the hypothesis that there is no difference between men

and women on usage of mobile phones.
T Test

T-Distribution
• T-distribution is a continuous probability

distribution that arises when estimating
the mean of a normally distributed
population in situations where the sample
size is small and population standard
deviation is unknown
t-distribution
• t-distribution is a family of continuous probability distribution
that arises when estimating the mean of sample from a
population of normal distribution in which sample size is small
and population standard deviation is unknown.
45
Z test Vs T-test
• Z test is used when the population is normal
(or large sample) and  is known.
• T-test is used when population is normal and

 is estimated from sample.
46
More Hypothesis Testing
• Usually, the null hypothesis is contrary to what

we believe to be true about the population.
• Usually, the alternative hypothesis describes
the situation we believe to be true.
• If the obtained statistic is unlikely to be true,
we reject the null hypothesis.
48
Binary Decision
• Remember: The value being tested is the value in the null
hypothesis.
• Remember: The value being tested is a parameter, a

population value
• The decision maker either rejects the null hypothesis or fails

to reject the null hypothesis.
49
Two-tailed and One tailed tests
• If the alternative hypothesis uses an equal sign, this
indicates a two tailed test(nondirectional).
• In this case, the region of rejection is located in both
tails.
• If the alternative hypothesis uses a greater or less
than sign (<>), this is a directional test.
• In this case, the region of rejection is located is one
tail of the sampling distribution
50
Z-test for Proportion

p p
Z
p (1  p ) / n
The CEO of the hospital believes that the probability that the
treatment cost exceeds 150000 in at least 25% of the cases.
Validate the claim

Z-test for Proportion
H0: P <= 0.25
H1: P > 0.25

p p 0.52  0.25
Z   9.8
p (1  p ) / n 0.25  0.75 / 248
Degrees of Freedom
• Degrees of freedom is the difference between the number of
observations and the number of restrictions.
• In general, a degree of freedom is used whenever a

parameter is estimated.
• In general, degrees of freedom are additive.

Hypothesis Testing Sept 2016

Caricato da

Informazioni sul documento

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Hypothesis Testing Sept 2016

Caricato da

Copyright:

Formati disponibili

Hypothesis Testing

© U Dinesh Kumar, IIM Bangalore

A hypothesis test is a statistical test that is used to check

On Wednesday August 10 , 1966 New York Times published an

© U Dinesh Kumar, IIM Bangalore

How to check whether

If you drink Complan, you grow taller

© U Dinesh Kumar, IIM Bangalore

© U Dinesh Kumar, IIM Bangalore

Should Mission Hospital accept the offer?

© U Dinesh Kumar, IIM Bangalore

• Sample size = 248

– A claim or statement about a property of a

– The goal in Hypothesis Testing is to analyze a sample

• Null hypothesis is a statement about a population

• In most cases, we believe that the null hypothesis is

• Alternative hypothesis is also known as research hypothesis.

• Usually, we believe that the alternative hypothesis is true.

• A criminal trial is an example of hypothesis testing without the

• The null hypothesis is

• The alternative hypothesis or research hypothesis is

• A null hypothesis is a statement of the status quo, one of no

• An alternative hypothesis is one in which some difference or

2) State the null hypothesis, Ho and State the alternative

3) Calculate the test statistic to be used

5) Identify the reject region (one tailed or two tailed)

6) Compute the p-value and take the final decision.

• Level significance () is a decision criteria used for rejection of

• We collect evidence to show that the null hypothesis is not

• In most cases, we set the null hypothesis to be 5%.

• Probability of obtaining a sample mean is less than 5% if the

The test statistic is a mathematical expression that allows

The value of the test statistic is used to make a decision

• Retain the null hypothesis. The sample mean is associated

A p value is the probability of obtaining a sample outcome, given

When p-value is less than significance (), we reject the null

A critical value is a cutoff value that defines the boundaries

Lower one tailed H1: o < h

Two tailed test H1: o = h

Upper one tailed H1: o > h

• P(Type I error) =  [usually 0.05 or 0.01]

• A Type II error occurs when we don’t reject a false null

False Type II Error Correct

 Conditional probability of incorrectly rejecting H0, when it is

 Conditional probability of failing to reject H0, when it is false (Type

p-value Evidence against the null hypothesis in favour of alternative

Smaller p-value indicates stronger evidence against null

• The sample mean is an unbiased estimator of the

• Regardless of the distribution in the population, the

© U Dinesh Kumar, IIM Bangalore

The one–independent sample z test is a

Estimated value of mean - Hypothesis value of mean

• Check the hypothesis that the time spent by female

• HA: Time spend by female students on texting is greater than

HA:  > 94.8

© U Dinesh Kumar, IIM Bangalore

• Population is normal distribution & population

• In a sample of 220 customers, 45 customers are estimated to

• Check the hypothesis that the proportion of customers who

• For example, we would like to check whether the time spent

• Null Hypothesis (H0): 1 = 2

• Alternative hypothesis (H1): 1  2

• A sample 56 women revealed that they spent 6.2 hours on

• Assume that the population standard deviation is 1.1 hours.