
Module-10

Hypothesis Testing

Copyright 2012 BSI. All rights reserved.

Objectives of this module

To introduce a range of hypothesis tests suitable for testing for differences between
averages.
Specifically:

1 Sample t-test

2 Sample t-test

Paired t-test

ANOVA


Tools for Verifying the Causes

Hypothesis Tests

ANOVA

Design of Experiments


Verify the Causes


[Diagram: tools arranged by increasing complexity of relationship -
Plots (Scatter plot, Scatter matrix, Boxplot, Dotplot): identify possible relationships
Correlation: determine the strength of relationships
Regression: quantify relationships
Hypothesis Tests, ANOVA, Design of Experiments: identify and quantify complex relationships]

Hypothesis Tests for Average

Before a hypothesis test for averages can be selected, there are two preliminary
questions:
1) Are the samples Normally distributed?
Each of the samples that are to be included in the test must contain Normally
distributed data in order for the results of the test to be statistically valid.

2) How many samples do you want to compare?


The number of samples refers to the number of different samples (subgroups) that are to be compared, not the amount of data in each of those samples (that is the sample size).


To compare the Averages of samples of data


[Flowchart: Are the samples Normally distributed? If NO, transform the data using Box-Cox; if the data still cannot be treated as Normal, compare median values instead. If YES, determine the number of samples: 1 sample - 1 Sample t-Test; 2 samples - 2 Sample t-Test; 2 matched samples - Paired t-Test; 3 or more samples - One-way ANOVA.]

To compare the Medians of samples of data


[Flowchart: if the samples are not Normally distributed and cannot be transformed using Box-Cox, compare median values. Does the data have outliers? If YES, use the Mood's Median test; if NO, use the Kruskal-Wallis test.]

An Expensive Situation

You manage a warranty claims department. A customer claims loss of earnings of $1,245 for an item for which claims are usually about $1,000.
You examine 250 previous claims for the same item to make a comparison and find the average to indeed be $997, with a standard deviation of $88.
You want to know whether the customer is over-claiming or whether the claim is reasonable.


Innocent or Guilty?

If the customer is not over-claiming (it is a legitimate claim), then we would expect the claim to fit with the pattern of data represented by the previous 250 claims.

If the customer is over-claiming, then we would expect the claim not to fit the pattern of data from the previous 250 claims.


Does the Claim fit the Pattern?


You remember from the previous module that if the data is Normally distributed you can apply Normal theory.

[Figure: Normal curve of previous claims with mean $997 and s = $88; the claim of $1,245 lies in the upper tail, beyond the marks at $997, $1,085, $1,173 and $1,261.]

Z = (X - X̄) / s = (1245 - 997) / 88 = 2.82

By using standard Normal tables we can calculate the probability of a claim of $1,245 based on the historical data. If the probability is low, then we can assume an illegitimate claim.

From standard Normal tables the P-value, the probability of being equal to or greater than $1,245, is 0.0024 or 0.24%. In other words, we would expect such a claim to occur about once in every 417 claims.
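As a cross-check (a minimal sketch, not part of the original slides; it uses Python's scipy.stats in place of Normal tables), the Z value and tail probability can be computed directly:

from scipy.stats import norm

claim = 1245.0            # value of interest
mean, s = 997.0, 88.0     # average and standard deviation of the 250 previous claims

z = (claim - mean) / s    # standardised distance from the mean
p = norm.sf(z)            # upper-tail probability P(Z >= z)

print(f"Z = {z:.2f}, p = {p:.4f}")
# Z = 2.82, p = 0.0024 - matching the 1-in-417 figure quoted above.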


P-values are Probabilities of Interest

P-value = Tail area
= Area under the curve beyond the value of interest
= Probability of being at the value of interest or beyond

[Figures: standard Normal curves with the tail area beyond the value of interest shaded, for values of interest in either tail.]

Making the Decision

The Normal theory is telling us that, based on the previous 250 claims, we should expect a claim of $1,245 or greater once every 417 claims.
Hence, either:
This claim is legitimate - it just happens to be that 1 in 417, or
This claim is not legitimate - it does not fit the previous data.

Experience shows that a small p-value (0 to 0.05) means the probability is small that the value of interest comes from that distribution by chance - therefore something else is going on.
Since our p-value of 0.0024 is less than 0.05, we can conclude that the claim is not legitimate.


What Have we done?

In the previous example we used the properties of the Normal distribution to test whether the occurrence of an event could have happened by chance (the data fits the expected pattern) or whether there is a real difference (the data does not fit the expected pattern).

This type of situation occurs frequently during Six Sigma improvement projects, either in the:
Analyze phase, when we are looking for differences to identify potential root causes
Improve and Control phases, when we are aiming to demonstrate that a real change has been made - that we have made a difference


Analyze: Is there a Difference?

A common question during the Analyze phase is "is there really a difference?"


For example: We suspect the output of a process depends upon the supplier
[Diagram: material from Supplier A and Supplier B is run through the same process. Each gives an output, YA and YB, with variation in the process output; samples give averages ȳA, ȳB and standard deviations sA, sB. Is there a difference?]

Improve: Have We Made a Difference?

A common question having improved a process is "have we really made a difference?"
[Diagram: the old process, with output Yold, is changed to a new process with output Ynew; both show variation in the process output. Samples give ȳold, sold and ȳnew, snew. Is there a difference?]

Is there a difference?
Looking at the histograms may give us the idea that there is a difference

[Figure: frequency histograms of Yold and Ynew, plotted on the same scale of roughly 10 to 20.]

But can we be sure?



A Difference?
It is possible that we have taken two samples from the same distribution

[Figure: a single distribution from which both the "Old sample" and the "New sample" have been drawn.]

While on face value they look different, statistically they are not! The difference is due to chance (sampling).


Hypothesis Testing

In the first example we calculated probabilities to help us decide. This situation is common, where we are interested in making a decision about differences between two or more sets of data that relate to real-world situations.
Hypothesis Testing allows us to decide whether an observed difference is real or has happened by chance.
In simple terms, we expect there to be a difference between expected and observed outcomes due to random (chance/common cause variation) factors. Hypothesis testing is concerned with whether these differences are so large that they cannot be explained by random (chance) effects.
Hypothesis tests are often based on sample data, in which case we use sampling distributions rather than the distribution of individual values.


Hypothesis Testing Concept

In order to show whether an apparent difference is real or due to chance, we start by assuming that there is no difference (the Null Hypothesis, H0).
If the observed difference is within that expected by chance, then the Null Hypothesis of no difference is correct.
If the observed difference is greater than that expected by chance, then the Null Hypothesis of no difference is not correct.


Greater Than Expected!

[Figure: distribution of sample means centred on the mean x̄, with 68.26% of values within ±1σx̄, 95.45% within ±2σx̄ and 99.73% within ±3σx̄. Theoretically the limits are ±∞.]

We cannot be 100% certain that an actual sample mean falls outside the distribution.
There is always an element of risk
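For reference (a small illustrative check, not part of the original slides, using scipy rather than Normal tables), these coverage figures can be reproduced from the standard Normal distribution:

from scipy.stats import norm

# Probability of a sample mean falling within +/- k standard errors of the mean
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within +/-{k} sigma: {coverage:.2%}")
# within +/-1 sigma: 68.27%
# within +/-2 sigma: 95.45%
# within +/-3 sigma: 99.73%

# The 95% decision boundary used on the next slide:
print(f"95% of sample means lie within +/-{norm.ppf(0.975):.2f} sigma of the mean")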


95% Confidence

Experience has shown that setting the decision boundaries at 95% provides a good balance between being able to make a decision and the risk of making the wrong decision.

[Figure: distribution of sample means with 95% of values lying between -1.96σx̄ and +1.96σx̄ around the mean x̄.]

Risk

A point outside the 95% boundaries could be due to chance, or there could be a real difference. If we decide they are different when they are not, we have made a Type I error. The risk of making this error is called the α-risk.

A point inside the 95% boundaries could also be due to chance, or there could be a real difference. If we decide they are the same when they are not, we have made a Type II error. The risk of making this error is called the β-risk.

[Figure: distribution of sample means with the central 95% region marked around the mean x̄.]

Type I and II errors

Type I - deciding they are different when they are not.
Type II - not detecting a difference when there is one.

P-value = the actual probability of making a Type I error.
The probability of a Type II error can be calculated given an assumed true difference.

Both are important:
Guarding too heavily against one increases the risk of the other.
Increasing the sample size:
reduces Type II errors
allows you to detect smaller differences

More Later

Hypothesis Tests

Hypothesis tests can be used in a number of situations:


A sample is taken and found to have a mean x̄ - is this value equal to a target value T?
Samples are taken before and after a change; the two sample means are different - but are they really?
Samples are taken from a process for different stratification factors; the two sample means are different - but are they really?
Samples are taken before and after a change; the standard deviations are different - but are they really?
Samples are taken from a process for different stratification factors; the two sample standard deviations are different - but are they really?
The proportion defective appears to vary with different factors - but does it really?


Steps in Hypothesis Testing

Hypothesis tests follow this pattern:
1. determine the null hypothesis, H0
2. determine the alternative hypothesis, H1 or Ha
3. determine the statistical test
4. state the level of risk
5. establish the sample size
6. collect data
7. analyse data
8. accept or reject the null hypothesis
9. determine the conclusion


1. The Null Hypothesis

The null hypothesis is always stated so as to specify that there is no (null) difference

When comparing means or variances the null hypothesis is:


H0: μ = Target
H0: μ1 = μ2
H0: σ1 = σ2


2. The Alternative Hypothesis

The alternative hypothesis opposes the null hypothesis and can therefore take several forms.
For example:
H1: μ1 < μ2
or
H1: μ1 > μ2
or
H1: μ1 ≠ μ2


The choice depends upon the problem. Minitab gives the option to select these.


3. Determine Tests

Hypothesis Test - Can be used to:

t-test - Compare a group average against a target, or compare two group averages.
Paired t-test - Compare two group averages when the data is matched.
ANOVA (Analysis Of Variance) - Compare two or more group averages.
Tests for equal variances (F-test, Bartlett's test, Levene's test) - Compare two or more group variances.
Chi-square test - Compare two or more group proportions.


3. Determine Test
Hypothesis tests are used when the input or process (X) variable is discrete.
If the X data are continuous, use Regression analysis to judge whether they are
related to the output (Y) variable.

[Decision grid - choice of test by data type:
Y continuous, X discrete (groups): t-test, Paired t-test, ANOVA
Y continuous, X continuous: Regression
Y discrete (proportions), X discrete (groups): Chi-Square
Y discrete (proportions), X continuous: Logistic regression]

4. State level of Risk

It is very important to specify, before conducting the test, the level of risk that the decision will be based on.
The risk means that there is a chance that we will make the wrong decision.
In assessing the risk level we need to consider the situation and the business - it is a question of significance vs. importance:
Significant means there is a real difference, as proven by a hypothesis test.
Important means that the difference observed is important to the business.
Typically the α-risk is set at 5%, which provides 95% confidence.


"Important" vs. "Significant" Differences

Significant but not important differences


Sometimes you will detect a statistically significant difference, yet it will be too small to be of practical importance to your business.
Example: a project to reduce machine setup times.
The average time for the old setup is 45 minutes. The new average setup time is 30 minutes.
The observed difference of 15 minutes is statistically significant.
However, to justify the cost of implementation, a reduction of 25 minutes is necessary.


"Important" vs. "Significant" Differences,

Important but not significant differences


Sometimes you cannot claim a difference is statistically significant, yet the observed difference is of importance to the business.
Example: production volumes.
An increase of 1,000 items produced per day is observed during the pilot.
An increase of 1,000 is important to the business.
However, the difference is not statistically significant.
Either the observed difference is due to random variation and no true difference exists, or the variation is too large (or the sample size too small) to detect the difference.
You (and your business leaders) need to decide if it is worth the risk to go ahead and implement the new process.


5. Establish Sample Size

The ability to detect differences is related to the sample size:
A small sample means we cannot detect small differences, but the cost of data collection is low.
A large sample will enable the detection of small differences, but the cost of data collection could be high.
Selecting the right sample size depends upon the Power = (1 - β).
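As an illustration only (a hedged sketch, not part of the original course material; it assumes the statsmodels package, and the effect size, α and power values are invented for the example), the sample size needed for a 2-sample t-test can be estimated from the desired power:

from statsmodels.stats.power import TTestIndPower

# Hypothetical planning values: detect a difference of 0.5 standard deviations
# (effect size) with alpha = 0.05 and power = 0.90 (beta = 0.10)
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.90,
                                   alternative='two-sided')
print(f"Samples needed per group: about {n_per_group:.0f}")
# Roughly 85-86 observations per group for these planning values.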


6. Collect Data

Remember all the rules and guidelines about collecting data


Sample size
Good operational definition
Clear measurement procedure
Un-biased samples
Good Gauge R&R


7. Analyse Data

All hypothesis tests can be conducted by hand calculation, BUT...
Minitab Rules!


8. Acceptance or Rejection

Method 1:
if the p-value is greater than or equal to α - fail to reject H0
if the p-value is less than α - reject H0 and accept H1

Method 2:
if 0 falls within the confidence interval - fail to reject H0
if 0 falls outside the confidence interval - reject H0 and accept H1

In practice we use both methods, from the information provided by Minitab.


t-tests


To compare the Averages of samples of data


[Roadmap flowchart repeated from earlier: based on normality and the number of samples, choose the 1 Sample t-Test, 2 Sample t-Test, Paired t-Test or One-way ANOVA.]

1 Sample t-test (1)

1 Sample t-tests are used to compare the average of one sample of data against a known average.

The known average may be a historical average or an industry benchmark.

For the purpose of the test, the known average is assumed to be exact.

Open data file: One-Sample-t.MPJ


1 Sample t-test (2)

A project team has implemented a new invoicing process, and wants to compare it
against the historical average of 16.5 days.

[Figure: Individual Value Plot of New Process, values roughly 10 to 22 days, with a reference line at the historical average of 16.5.]

Hypothesis: the Individual Value Plot suggests that the average invoice time for the new process is lower (quicker) than the historical average of 16.5.


1 Sample t-test (3)

A 1 Sample t-test can be used to test if the difference between the averages is
statistically significant.

The hypotheses for the test would be:
Ho: There is no difference between the average invoice time of the new process and 16.5.
Ha: The average invoice time of the new process is lower than 16.5.

Because our theory is that there is a difference, we expect to reject the Null hypothesis.


1 Sample t-test (4)

MINITAB: Stat > Basic Statistics > 1-Sample t


There are two options for entering data:
1st Option: enter the column that contains the data here.
2nd Option: if you know the sample size, mean and standard deviation of the data, it can be entered here.
In both cases, the known average that the data is to be compared against should be entered here.

1 Sample t-tests (5)

MINITAB: Stat > Basic Statistics > 1-Sample t

Leave the confidence level at 95%.
Alternative: less than.
Select the graph most appropriate for the sample sizes involved. If in doubt, check all three.


1 Sample t-tests (6)

Session Window output for the 1 Sample t-test:

One-Sample T: New Process

Test of mu = 16.5 vs < 16.5

                                          95% Upper
Variable      N     Mean   StDev  SE Mean     Bound      T      P
New Process  15  14.9818  3.1218   0.8060   16.4015  -1.88  0.040

The p-value for the 1 sample t-test is found in the session window output. In this case, the p-value is 0.040.


Since the p-value for this test is less than 0.05, we can be
95% confident that there is a difference between the
average invoicing time of the new process versus the
historical average. We reject the Null Hypothesis
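For readers without Minitab, the same result can be reproduced from the summary statistics in the session window (a hedged sketch, not part of the original course; with the raw column of data one could instead call scipy.stats.ttest_1samp with alternative='less'):

from math import sqrt
from scipy.stats import t

# Summary statistics reported by Minitab for the New Process sample
n, mean, stdev = 15, 14.9818, 3.1218
target = 16.5                        # historical average being tested against

se = stdev / sqrt(n)                 # standard error of the mean (0.8060)
t_stat = (mean - target) / se        # t statistic
p_value = t.cdf(t_stat, df=n - 1)    # one-sided (less than) p-value

print(f"T = {t_stat:.2f}, p = {p_value:.3f}")   # T = -1.88, p = 0.040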


1 Sample t-tests (7)


[Figures: Histogram of New Process and Individual Value Plot of New Process, each shown with Ho and the 95% t-confidence interval for the mean.]

MINITAB adds a blue line to the graphs that represents the confidence interval of the sample average, and also indicates the position of Ho (the known average).


In this case, Ho (the known average) is outside the confidence interval of the sample average and therefore we can prove a difference between the two.


1 Sample t-test Whiskey or water?

Problem: A cocktail bar suspects that one of its suppliers of whiskey is adding water to increase profits. The freezing temperature of whiskey has a Normal distribution with a mean of μ = -0.545 °C. Adding water raises the freezing temperature. The cocktail bar wants to know if its suspicions are correct. Is the supplier adding water to its whiskey?
Procedure: A sample from each of ten bottles is obtained from this supplier.
Experimental Unit: A bottle.
Measurement: Freezing temperature of the sample.
Significance Level: α = 0.05
Data Set: Whiskey.MPJ
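One possible analysis outside Minitab (a hedged sketch, not part of the original exercise; the ten temperatures below are invented placeholders, not the values stored in Whiskey.MPJ) is a one-sided 1-sample t-test, since added water raises the freezing temperature:

from scipy.stats import ttest_1samp

# Hypothetical freezing temperatures (deg C) for the ten bottles -
# replace these placeholders with the data from Whiskey.MPJ
temps = [-0.52, -0.53, -0.49, -0.51, -0.54, -0.50, -0.52, -0.48, -0.53, -0.51]

# Ho: mu = -0.545   Ha: mu > -0.545 (watered-down whiskey freezes at a higher temperature)
# (the one-sided 'alternative' argument needs a reasonably recent version of scipy)
t_stat, p_value = ttest_1samp(temps, popmean=-0.545, alternative='greater')
print(f"T = {t_stat:.2f}, p = {p_value:.3f}")
# If p < alpha = 0.05, reject Ho and conclude the supplier is adding water.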


To compare the Averages of samples of data

[Roadmap flowchart repeated from earlier; the next test covered is the 2 Sample t-Test.]

The concept of Two Sample t-tests


You take two samples of data from the same process, but at different times. Has the process mean moved?
Would you think the process mean has moved if the results had been..
What did you look at to make your decisions?


2 Sample t-tests (1)

2 Sample t-tests are used to compare the averages of two samples of data. The two
samples may represent:

two different suppliers

two different processes

two different teams

two different products etc.

The two samples can be of different sample sizes, since the 2 sample t-test takes account
of both sample sizes.
The two samples can have different levels of variation (standard deviation) since the two
sample t-test takes account of this as well.
Open data file: Two-Sample-T-Appeal.MPJ


2 Sample t-tests (2)

A project team has implemented a new appeals process, and the box plot below shows
the cycle time of the new and old processes.

[Figure: Boxplot of Time (days) vs Process for the New and Old processes, on a scale of roughly 25 to 42.5 days.]

Hypothesis: the Box Plot suggests that the average cycle time for the new appeals process is lower (quicker) than the old process.


2 Sample t-tests (3)


A 2 Sample t-test can be used to test if the difference between the average cycle
times is statistically significant.
The hypotheses for the test would be:
Ho: There is no difference between the average cycle times for the new and old processes.
Ha: The average cycle time for the new process is lower than that for the old process.

Because our theory is that the new is lower, we expect to reject the Null
hypothesis.


2 Sample t-tests (4)

MINITAB: Stat > Basic Statistics > 2-Sample t


There are several options for entering data:
1st Option: if the data is in a single column, with a second column containing the subgroup data.
2nd Option: if the two data samples are in two separate columns.
3rd Option: if you know the sample size, mean and standard deviation of both samples, this information can be entered here.


2 Sample t-tests (5)

MINITAB: Stat > Basic Statistics > 2-Sample t

Set the options as shown above:
a confidence level of 95%
a test difference of 0.0
Alternative: less than


2 Sample t-tests (6)


Session Window output for the 2 Sample t-test:

Two-Sample T-Test and CI: Time (days), Process

Two-sample T for Time (days)

Process   N   Mean  StDev  SE Mean
New      20  32.12   3.46     0.77
Old      25  34.35   3.12     0.62

Difference = mu (New) - mu (Old)
Estimate for difference: -2.23700
95% upper bound for difference: -0.56083
T-Test of difference = 0 (vs <): T-Value = -2.25  P-Value = 0.015  DF = 38

The p-value for the 2 sample t-test is found in the session window output. In this case, the p-value is 0.015.


Since the p-value for this test is less than 0.05, we can be over 95% confident that there is a difference between the average cycle times.
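Outside Minitab, the same Welch-style comparison can be reproduced from the summary statistics above (a hedged sketch, not part of the original course; small differences from Minitab's figures come from using the rounded means and standard deviations):

from scipy.stats import ttest_ind_from_stats

# Summary statistics from the session window (New vs Old process)
t_stat, p_two_sided = ttest_ind_from_stats(
    mean1=32.12, std1=3.46, nobs1=20,
    mean2=34.35, std2=3.12, nobs2=25,
    equal_var=False)                 # unequal variances, as Minitab assumed (DF = 38)

p_one_sided = p_two_sided / 2        # Ha: new mean < old mean, and t_stat is negative
print(f"T = {t_stat:.2f}, one-sided p = {p_one_sided:.3f}")
# Close to Minitab's T = -2.25, P = 0.015.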


2 Sample t-tests (7)


[Figures: Individual Value Plot of Time (days) vs Process and Boxplot of Time (days) by Process, for the New and Old processes on a scale of roughly 25 to 42.5 days.]

The graphs provided by the 2 sample t-test are the same as those provided by using
the graph menu.
By default, MINITAB adds a symbol to show the average of each subgroup.


2 Sample t-tests (8)

Exercise: Using the data file: Two-Sample-T-Appeal.MPJ

Repeat the test using the samples in different columns option.

Repeat the test using the summarised data option.


To compare the Averages of samples of data


[Roadmap flowchart repeated from earlier: 1 Sample t-Test, 2 Sample t-Test, Paired t-Test, One-way ANOVA.]
