
MEL761: Statistics for Decision Making

Dr S G Deshmukh
Mechanical Department
Indian Institute of Technology
Goodness of Fit Test

Learning Objectives
Understand the χ² goodness-of-fit test and how to use it.
Analyze data using the χ² test of independence.
χ² Goodness-of-Fit Test
The χ² goodness-of-fit test compares
expected (theoretical) frequencies
of categories from a population distribution
to the observed (actual) frequencies
from a distribution to determine whether
there is a difference between what was
expected and what was observed.
χ² Goodness-of-Fit Test

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}, \qquad df = k - 1 - c$$

where:
$f_o$ = frequency of observed values
$f_e$ = frequency of expected values
$k$ = number of categories
$c$ = number of parameters estimated from the sample data
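The statistic and its degrees of freedom translate almost directly into code. The following is a minimal sketch, not part of the original slides: the function name `chi_square_gof`, its parameter names, and the three-category counts are all hypothetical and chosen only for illustration.

```python
def chi_square_gof(observed, expected, estimated_params=0):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    # Sum of (f_o - f_e)^2 / f_e over all k categories
    statistic = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    # df = k - 1 - c, where c is the number of parameters estimated from the sample
    df = len(observed) - 1 - estimated_params
    return statistic, df


# Hypothetical counts, invented only to exercise the function
stat, df = chi_square_gof([18, 22, 20], [20, 20, 20])
print(round(stat, 2), df)  # 0.4 2
```

Passing `estimated_params=1` would cover a case where one parameter (for example a Poisson mean) is estimated from the sample before the expected frequencies are computed.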

Milk Sales Data

Month         Litres
January        1,610
February       1,585
March          1,649
April          1,590
May            1,540
June           1,397
July           1,410
August         1,350
September      1,495
October        1,564
November       1,602
December       1,655
Total         18,447
Hypotheses and Decision Rules
for Demonstration Problem 1

$H_o$: The monthly figures for milk sales are uniformly distributed
$H_a$: The monthly figures for milk sales are not uniformly distributed

$$\alpha = .01$$
$$df = k - 1 - c = 12 - 1 - 0 = 11$$
$$\chi^2_{.01,\,11} = 24.725$$

If $\chi^2_{Cal} > 24.725$, reject $H_o$.
If $\chi^2_{Cal} \le 24.725$, do not reject $H_o$.
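The critical value 24.725 is normally read from a χ² table. As a rough cross-check, it can also be approximated numerically. The sketch below is an illustration under stated assumptions, not the method used in the slides: `chi2_cdf` and `chi2_critical` are hypothetical helper names, the CDF uses the standard power series for the regularized lower incomplete gamma function, and the quantile is located by plain bisection.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10_000):
        term *= half_x / (a + n)
        total += term
        if term < 1e-16 * total:  # remaining terms no longer change the sum
            break
    return total * math.exp(a * math.log(half_x) - half_x - math.lgamma(a))


def chi2_critical(alpha, df):
    """Value with upper-tail area alpha, found by bisection on the CDF."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:  # grow the bracket until it contains the answer
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = chi2_critical(0.01, 11)
print(round(critical, 3))
```

For alpha = .01 and df = 11 the result should land very close to the tabulated 24.725. A statistics library would be the usual choice in practice; the hand-rolled version is shown only to keep the example free of dependencies.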
Calculations
for Demonstration Problem 1

Month           f_o         f_e    (f_o - f_e)²/f_e
January       1,610    1,537.25         3.44
February      1,585    1,537.25         1.48
March         1,649    1,537.25         8.12
April         1,590    1,537.25         1.81
May           1,540    1,537.25         0.00
June          1,397    1,537.25        12.80
July          1,410    1,537.25        10.53
August        1,350    1,537.25        22.81
September     1,495    1,537.25         1.16
October       1,564    1,537.25         0.47
November      1,602    1,537.25         2.73
December      1,655    1,537.25         9.02
Total        18,447   18,447.00        74.38

$$f_e = \frac{18447}{12} = 1537.25$$
$$\chi^2_{Cal} = 74.37$$
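The milk sales calculation can be reproduced in a few lines. This is a sketch for checking the arithmetic; the variable names are illustrative and not from the slides. Carried at full precision the sum comes to about 74.38, which matches the table total; adding the twelve rounded contributions gives 74.37, the figure quoted for the calculated χ².

```python
# Monthly milk sales in litres, January through December (from the slide's table)
litres = [1610, 1585, 1649, 1590, 1540, 1397, 1410, 1350, 1495, 1564, 1602, 1655]

# Under H_o (uniform sales) every month has the same expected frequency
f_e = sum(litres) / len(litres)

# Per-month contributions (f_o - f_e)^2 / f_e, then their sum
contributions = [(f_o - f_e) ** 2 / f_e for f_o in litres]
chi2_cal = sum(contributions)

print(f_e)                 # 1537.25
print(round(chi2_cal, 2))
```

August is the single largest contributor, which is consistent with its contribution of 22.81 in the table.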
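Demonstration Problem 1
Conclusion

[Figure: χ² distribution with df = 11, showing the non-rejection region below the critical value 24.725 and a rejection region of area 0.01 above it]

$$\chi^2_{Cal} = 74.37 > 24.725, \text{ reject } H_o.$$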
Bank Customer Arrival Data
for Demonstration Problem 2

Number of Arrivals    Observed Frequencies
0                      7
1                     18
2                     25
3                     17
4                     12
≥5                     5
Hypotheses and Decision Rules
for Demonstration Problem 2

$H_o$: The frequency distribution is Poisson
$H_a$: The frequency distribution is not Poisson

$$\alpha = .05$$
$$df = k - 1 - c = 6 - 1 - 1 = 4$$
$$\chi^2_{.05,\,4} = 9.488$$

If $\chi^2_{Cal} > 9.488$, reject $H_o$.
If $\chi^2_{Cal} \le 9.488$, do not reject $H_o$.
Calculations
for Demonstration Problem 2:
Estimating the Mean Arrival Rate

Number of Arrivals (X)    Observed Frequencies (f)    f·X
0                          7                            0
1                         18                           18
2                         25                           50
3                         17                           51
4                         12                           48
≥5                         5                           25
Total                     84                          192

Mean arrival rate:
$$\lambda = \frac{\sum fX}{\sum f} = \frac{192}{84} = 2.3 \text{ customers per minute}$$
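The mean arrival rate is a frequency-weighted average, which the short sketch below reproduces. The variable names are illustrative. One assumption is worth flagging: the open-ended top class is scored as exactly 5 arrivals, which is what the 25 in the slide's f·X column implies.

```python
# Arrivals per interval and observed frequencies (from the slide's table).
# The open-ended top class is scored as exactly 5, as the slide's f*X column does.
arrivals = [0, 1, 2, 3, 4, 5]
observed = [7, 18, 25, 17, 12, 5]

total_f = sum(observed)                                     # sum of f
total_fx = sum(f * x for f, x in zip(observed, arrivals))   # sum of f*X
lam = total_fx / total_f                                    # estimated mean arrival rate

print(total_f, total_fx, round(lam, 1))  # 84 192 2.3
```

Because this one parameter is estimated from the sample, c = 1 in the degrees-of-freedom formula df = k - 1 - c.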
Calculations for Demonstration Problem 2:
Poisson Probabilities for λ = 2.3

Number of Arrivals (X)    Expected Probabilities P(X)    Expected Frequencies n·P(X)
0                         0.1003                          8.42
1                         0.2306                         19.37
2                         0.2652                         22.28
3                         0.2033                         17.08
4                         0.1169                          9.82
≥5                        0.0838                          7.04

$$n = \sum f = 84$$
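The expected probabilities and frequencies for λ = 2.3 can be regenerated from the Poisson formula P(X = x) = e^(-λ) λ^x / x!. The sketch below assumes the last class is "5 or more", so its probability is taken as one minus the sum of the others; `poisson_pmf` is a hypothetical helper name.

```python
import math

lam = 2.3   # mean arrival rate estimated from the sample
n = 84      # total number of observed intervals


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Probabilities for 0..4 arrivals; the last class collects everything from 5 upward
probs = [poisson_pmf(x, lam) for x in range(5)]
probs.append(1.0 - sum(probs))

# Expected frequency of each class is n * P(X)
expected = [n * p for p in probs]

for label, p, fe in zip(["0", "1", "2", "3", "4", "5+"], probs, expected):
    print(f"{label:>2}  {p:.4f}  {fe:6.2f}")
```

Lumping the upper tail into one class keeps the probabilities summing to 1, so the expected frequencies sum to n.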

χ² Calculations
for Demonstration Problem 2

Number of Arrivals (X)    Observed Frequencies (f)    Expected Frequencies n·P(X)    (f_o - f_e)²/f_e
0                          7                           8.42                          0.24
1                         18                          19.37                          0.10
2                         25                          22.28                          0.33
3                         17                          17.08                          0.00
4                         12                           9.82                          0.48
≥5                         5                           7.04                          0.59
Total                     84                          84.00                          1.74

$$\chi^2_{Cal} = 1.74$$
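Written out term by term with the observed and expected frequencies for the six arrival classes, the calculated statistic for Demonstration Problem 2 is:

```latex
\begin{aligned}
\chi^2_{Cal} &= \frac{(7-8.42)^2}{8.42} + \frac{(18-19.37)^2}{19.37} + \frac{(25-22.28)^2}{22.28}
              + \frac{(17-17.08)^2}{17.08} + \frac{(12-9.82)^2}{9.82} + \frac{(5-7.04)^2}{7.04} \\
             &\approx 0.24 + 0.10 + 0.33 + 0.00 + 0.48 + 0.59 \\
             &= 1.74
\end{aligned}
```

No single class dominates the sum, which is what one would expect when the fitted distribution describes the data reasonably well.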
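Demonstration Problem 2: Conclusion

[Figure: χ² distribution with df = 4, showing the non-rejection region below the critical value 9.488 and a rejection region of area 0.05 above it]

$$\chi^2_{Cal} = 1.74 \le 9.488, \text{ do not reject } H_o.$$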
Using a χ² Goodness-of-Fit Test
to Test a Population Proportion

$H_o$: $P = .08$
$H_a$: $P \ne .08$

$$\alpha = .05$$
$$df = k - 1 - c = 2 - 1 - 0 = 1$$
$$\chi^2_{.05,\,1} = 3.841$$

If $\chi^2_{Cal} > 3.841$, reject $H_o$.
If $\chi^2_{Cal} \le 3.841$, do not reject $H_o$.
Using a χ² Goodness-of-Fit Test to Test
a Population Proportion: Calculations

               f_o    f_e
Defects         33     16
Nondefects     167    184
n =            200    200

$$f_e(\text{Defects}) = n \cdot P = (200)(.08) = 16$$
$$f_e(\text{Nondefects}) = n \cdot (1 - P) = (200)(.92) = 184$$

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(33 - 16)^2}{16} + \frac{(167 - 184)^2}{184} = 18.0625 + 1.5707 = 19.6332$$
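The proportion test uses only two categories, so the whole calculation fits in a few lines. The sketch below recomputes the expected frequencies from n = 200 and P = .08 and then the statistic. As a side check that is not in the slides, it also squares the usual two-sided z statistic for a proportion, which for a two-category test with df = 1 equals the χ² value. Variable names are illustrative.

```python
import math

n = 200           # sample size
p_null = 0.08     # hypothesized proportion of defects under H_o
defects = 33      # observed number of defects

observed = [defects, n - defects]
expected = [n * p_null, n * (1 - p_null)]   # 16 defects, 184 nondefects

chi2_cal = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Two-sided z statistic for the same hypothesis, for comparison (not from the slides)
p_hat = defects / n
z = (p_hat - p_null) / math.sqrt(p_null * (1 - p_null) / n)

print(round(chi2_cal, 4))  # 19.6332
print(round(z ** 2, 4))    # 19.6332
```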
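Using a χ² Goodness-of-Fit Test
to Test a Population Proportion: Conclusion

[Figure: χ² distribution with df = 1, showing the non-rejection region below the critical value 3.841 and a rejection region of area 0.05 above it]

$$\chi^2_{Cal} = 19.6332 > 3.841, \text{ reject } H_o.$$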
χ² Test of Independence
Used to analyze the frequencies of two
variables with multiple categories to
determine whether the two variables
are independent.

Qualitative Variables
Nominal Data
χ² Test of Independence: Investment Example
In which region of the country do you reside?
A. Northeast B. Midwest C. South D. West
Which type of financial investment are you most likely to
make today?
E. Stocks F. Bonds G. Treasury bills
Contingency Table

                          Type of Financial Investment
                          E        F        G
Geographic Region    A                      O_13     n_A
                     B                               n_B
                     C                               n_C
                     D                               n_D
                          n_E      n_F      n_G      N
χ² Test of Independence: Investment Example

Contingency Table

                          Type of Financial Investment
                          E        F        G
Geographic Region    A             e_12              n_A
                     B                               n_B
                     C                               n_C
                     D                               n_D
                          n_E      n_F      n_G      N

If A and F are independent,
$$P(A \cap F) = P(A) \cdot P(F)$$
$$P(A) = \frac{n_A}{N}, \qquad P(F) = \frac{n_F}{N}, \qquad P(A \cap F) = \frac{n_A}{N} \cdot \frac{n_F}{N}$$
$$e_{AF} = N \cdot P(A \cap F) = N \left( \frac{n_A}{N} \cdot \frac{n_F}{N} \right) = \frac{n_A \, n_F}{N}$$
χ² Test of Independence: Formulas

Expected Frequencies
$$e_{ij} = \frac{(n_i)(n_j)}{N}$$
where:
i = the row
j = the column
$n_i$ = the total of row i
$n_j$ = the total of column j
N = the total of all frequencies

Calculated χ² (Observed χ²)
$$\chi^2 = \sum \sum \frac{(f_o - f_e)^2}{f_e}$$
where:
df = (r - 1)(c - 1)
r = the number of rows
c = the number of columns
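Both formulas (expected cell count from row and column totals, then the double sum over all cells) can be wrapped in two small functions. This is a minimal sketch under stated assumptions: the function names and the 2 x 2 table are hypothetical, and the table is assumed to be a rectangular list of rows with no zero row or column totals.

```python
def expected_frequencies(observed):
    """Expected cell counts e_ij = (row total i)(column total j) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    expected = expected_frequencies(observed)
    statistic = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    # df = (r - 1)(c - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table, invented only to exercise the functions
table = [[10, 20], [30, 40]]
print(expected_frequencies(table))  # [[12.0, 18.0], [28.0, 42.0]]
stat, df = chi_square_independence(table)
print(round(stat, 4), df)           # 0.7937 1
```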
χ² Test of Independence: Gasoline
Preference Versus Income Category

                        Type of Gasoline
Income                  Regular    Premium    Extra Premium
Less than $30,000
$30,000 to $49,999
$50,000 to $99,000
At least $100,000

r = 4
c = 3

$H_o$: Type of gasoline is independent of income
$H_a$: Type of gasoline is not independent of income

$$\alpha = .01$$
$$df = (r - 1)(c - 1) = (4 - 1)(3 - 1) = 6$$
$$\chi^2_{.01,\,6} = 16.812$$

If $\chi^2_{Cal} > 16.812$, reject $H_o$.
If $\chi^2_{Cal} \le 16.812$, do not reject $H_o$.
Gasoline Preference Versus Income
Category: Observed Frequencies

                        Type of Gasoline
Income                  Regular    Premium    Extra Premium    Total
Less than $30,000            85         16                6      107
$30,000 to $49,999          102         27               13      142
$50,000 to $99,000           36         22               15       73
At least $100,000            15         23               25       63
Total                       238         88               59      385
Gasoline Preference Versus Income
Category: Expected Frequencies

Observed counts, with expected frequencies in parentheses:

                        Type of Gasoline
Income                  Regular         Premium        Extra Premium    Total
Less than $30,000        85 (66.15)     16 (24.46)      6 (16.40)        107
$30,000 to $49,999      102 (87.78)     27 (32.46)     13 (21.76)        142
$50,000 to $99,000       36 (45.13)     22 (16.69)     15 (11.19)         73
At least $100,000        15 (38.95)     23 (14.40)     25 (9.65)          63
Total                   238             88             59                385

$$e_{ij} = \frac{(n_i)(n_j)}{N}$$
$$e_{11} = \frac{(107)(238)}{385} = 66.15$$
$$e_{12} = \frac{(107)(88)}{385} = 24.46$$
$$e_{13} = \frac{(107)(59)}{385} = 16.40$$
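The same row-total times column-total rule fills in all twelve expected frequencies for the gasoline data, not just the three in the first income row. The sketch below recomputes the whole grid from the observed counts; variable names are illustrative.

```python
# Observed counts: rows are income classes, columns are Regular, Premium, Extra Premium
observed = [
    [85, 16, 6],     # Less than $30,000
    [102, 27, 13],   # $30,000 to $49,999
    [36, 22, 15],    # $50,000 to $99,000
    [15, 23, 25],    # At least $100,000
]

row_totals = [sum(row) for row in observed]          # 107, 142, 73, 63
col_totals = [sum(col) for col in zip(*observed)]    # 238, 88, 59
N = sum(row_totals)                                  # 385

# e_ij = (row total i)(column total j) / N
expected = [[r * c / N for c in col_totals] for r in row_totals]

for row in expected:
    print("  ".join(f"{e:6.2f}" for e in row))
```

Rounded to two decimals, the printed grid should agree with the parenthesized values in the expected-frequency table.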
Gasoline Preference Versus Income
Category: χ² Calculation

$$
\begin{aligned}
\chi^2 &= \sum \sum \frac{(f_o - f_e)^2}{f_e} \\
       &= \frac{(85 - 66.15)^2}{66.15} + \frac{(16 - 24.46)^2}{24.46} + \frac{(6 - 16.40)^2}{16.40} \\
       &\quad + \frac{(102 - 87.78)^2}{87.78} + \frac{(27 - 32.46)^2}{32.46} + \frac{(13 - 21.76)^2}{21.76} \\
       &\quad + \frac{(36 - 45.13)^2}{45.13} + \frac{(22 - 16.69)^2}{16.69} + \frac{(15 - 11.19)^2}{11.19} \\
       &\quad + \frac{(15 - 38.95)^2}{38.95} + \frac{(23 - 14.40)^2}{14.40} + \frac{(25 - 9.65)^2}{9.65} \\
       &= 70.78
\end{aligned}
$$
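To see which cells drive the result, the twelve contributions can be computed individually. The sketch below uses unrounded expected frequencies, so its total comes out near 70.7 rather than exactly the 70.78 shown on the slide; the small difference does not affect the decision against the critical value 16.812. Variable names are illustrative, and the critical value is simply the table figure quoted in the slides.

```python
observed = [
    [85, 16, 6],     # Less than $30,000
    [102, 27, 13],   # $30,000 to $49,999
    [36, 22, 15],    # $50,000 to $99,000
    [15, 23, 25],    # At least $100,000
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

# Contribution of each cell (i, j) to the chi-square statistic
contributions = {}
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / N
        contributions[(i, j)] = (f_o - f_e) ** 2 / f_e

chi2_cal = sum(contributions.values())
critical = 16.812  # table value quoted on the slides for alpha = .01, df = 6

largest = max(contributions, key=contributions.get)
print(round(chi2_cal, 2), chi2_cal > critical)
print(largest, round(contributions[largest], 2))
```

The largest single contribution comes from the highest income class choosing extra premium (row index 3, column index 2), where 25 purchases were observed against fewer than 10 expected.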
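Gasoline Preference Versus Income
Category: Conclusion

[Figure: χ² distribution with df = 6, showing the non-rejection region below the critical value 16.812 and a rejection region of area 0.01 above it]

$$\chi^2_{Cal} = 70.78 > 16.812, \text{ reject } H_o.$$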
