
F distribution and ANOVA

BITS Pilani Dr. Udayan Chanda, Department of Management, BITS Pilani.


Pilani|Dubai|Goa|Hyderabad
Testing for the Ratio of Two Population Variances

Tests for two population variances use the F test statistic.

Hypotheses:
Two-tail test: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \neq \sigma_2^2$
Upper-tail test: $H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_1: \sigma_1^2 > \sigma_2^2$

F test statistic:
$F_{STAT} = \dfrac{S_1^2}{S_2^2}$

Where:
$S_1^2$ = Variance of sample 1
$n_1$ = sample size of sample 1
$S_2^2$ = Variance of sample 2
$n_2$ = sample size of sample 2
$n_1 - 1$ = numerator degrees of freedom
$n_2 - 1$ = denominator degrees of freedom
03-Oct-18 F distribution and ANNOVA 2
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
The F Distribution
• The F critical value is found from the F table
• There are two degrees of freedom required: numerator and
denominator

• When $F_{STAT} = \dfrac{S_1^2}{S_2^2}$: df1 = n1 – 1 ; df2 = n2 – 1

• In the F table,
– numerator degrees of freedom determine the column
– denominator degrees of freedom determine the row
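Finding the Rejection Region

Two-tail test: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \neq \sigma_2^2$
[Figure: F distribution with area α/2 in each tail; reject H0 in either tail, do not reject H0 in between]
Critical values (the first subscript is the upper-tail area):
Lower: $F_{1-\alpha/2,\, n_1-1,\, n_2-1} = 1 / F_{\alpha/2,\, n_2-1,\, n_1-1}$
Upper: $F_{\alpha/2,\, n_1-1,\, n_2-1}$
Rejection region:
Reject H0 if $F_{STAT} < F_{1-\alpha/2,\, n_1-1,\, n_2-1}$ or $F_{STAT} > F_{\alpha/2,\, n_1-1,\, n_2-1}$

Upper-tail test: $H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_1: \sigma_1^2 > \sigma_2^2$
[Figure: F distribution with area α in the upper tail; reject H0 above the critical value]
Rejection region:
Reject H0 if $F_{STAT} > F_{\alpha,\, n_1-1,\, n_2-1}$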
F Test: An Example
You are a financial analyst for a brokerage firm. You want to
compare dividend yields between stocks listed on the NYSE &
NASDAQ. You collect the following data:
          NYSE   NASDAQ
Number    21     25
Mean      3.27   2.53
Std dev   1.30   1.16

Is there a difference in the variances between the NYSE
& NASDAQ at the α = 0.05 level?
F Test: Example Solution
• Form the hypothesis test:
$H_0: \sigma_1^2 = \sigma_2^2$ (there is no difference between variances)
$H_1: \sigma_1^2 \neq \sigma_2^2$ (there is a difference between variances)
• Find the F critical values for α = 0.05:
• Numerator d.f. = n1 – 1 = 21 – 1 = 20
• Denominator d.f. = n2 – 1 = 25 – 1 = 24
• Upper critical value: $F_{\alpha/2} = F_{0.025,\,20,\,24} = 2.33$
• Lower critical value: $F_{1-\alpha/2} = F_{0.975,\,20,\,24} = 1 / F_{0.025,\,24,\,20} = 1/2.41 \approx 0.415$
F Test: Example Solution
(continued)

• The test statistic is:
$F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{1.30^2}{1.16^2} = 1.256$

[Figure: F distribution for $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \neq \sigma_2^2$ with α/2 = .025 in each tail; reject H0 below $F_{0.975,\,20,\,24} = 1/F_{0.025,\,24,\,20} \approx 0.415$ or above $F_{0.025,\,20,\,24} = 2.33$]

• $F_{STAT}$ = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is insufficient evidence of a difference in variances at α = .05
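The two-tail decision for the NYSE/NASDAQ example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the critical values are hard-coded from an F table rather than computed. The lower value uses the reciprocal rule with the table entry F(0.025; 24, 20) = 2.41, which is an assumption not printed on the slide.

```python
# Two-sample F test for equality of variances, from summary statistics.
# Critical values are read from an F table (assumed inputs), not computed.

def f_statistic(s1: float, s2: float) -> float:
    """Return S1^2 / S2^2 for two sample standard deviations."""
    return s1 ** 2 / s2 ** 2


def two_tail_decision(f_stat: float, f_lower: float, f_upper: float) -> str:
    """Reject H0 when the statistic falls in either tail."""
    if f_stat < f_lower or f_stat > f_upper:
        return "reject H0"
    return "do not reject H0"


# NYSE (n1 = 21, s1 = 1.30) versus NASDAQ (n2 = 25, s2 = 1.16)
f_stat = f_statistic(1.30, 1.16)

f_upper = 2.33       # table value F(0.025; 20, 24)
f_lower = 1 / 2.41   # reciprocal of table value F(0.025; 24, 20) (assumed)

decision = two_tail_decision(f_stat, f_lower, f_upper)
print(round(f_stat, 3), decision)
```

For an upper-tail test the same statistic is compared with a single table value, rejecting H0 only when the statistic exceeds it.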
F test-Excel
[Figure: Excel screenshots of the F test, four slides; images not recoverable]
Example
The variability in the amount of impurities present in a batch of
chemicals used for a particular process depends on the length of
time that the process is in operation.

Suppose a sample of size 25 is drawn from the normal process,
which is to be compared to a sample of a new process that has
been developed to reduce the variability of impurities.

        Sample 1   Sample 2
n       25         25
s²      1.04       0.51
(Assume α = 0.05)

Example: continued
$H_0: \sigma_1^2 = \sigma_2^2$
$H_a: \sigma_1^2 > \sigma_2^2$

$F_{(24,24)} = s_1^2 / s_2^2 = 1.04 / 0.51 = 2.04$

Assuming α = 0.05, $F_{tab} = F_{0.05,\,24,\,24} = 1.98 < 2.04$

Thus, reject H0 and conclude that the variability in the new
process (Sample 2) is less than the variability in the
original process.

One-Way Analysis Of Variance
(ANOVA) Setting
• Want to examine differences among more than two groups
• The groups involved are classified according to levels of a
factor of interest (numerical or categorical)
– Different levels produce different groups
– Think of each group as a sample from a different
population
• Observe effects on the dependent variable
– Are the groups the same?
• When there is only 1 factor the design is called a completely
randomized design
One-Way Analysis of Variance
• Evaluate the difference among the means of three or
more groups
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

• Assumptions
– Populations are normally distributed
– Populations have equal variances
– Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA

• $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
– All population means are equal
– i.e., no factor effect (no variation in means among groups)
• $H_1$: Not all of the population means are the same
– At least one population mean is different
– i.e., there is a factor effect
– Does not mean that all population means are different (some pairs may be the same)
One-Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: Not all $\mu_j$ are the same
The Null Hypothesis is True
All Means are the same:
(No Factor Effect)
[Figure: three coinciding distributions with $\mu_1 = \mu_2 = \mu_3$]
One-Way ANOVA
(continued)
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: Not all $\mu_j$ are the same
The Null Hypothesis is NOT true
At least one of the means is different
(Factor Effect is present)
[Figure: two cases, $\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$]
Partitioning the Variation

• Total variation can be split into two parts:

SST = SSA + SSW

SST = Total Sum of Squares (Total variation)
SSA = Sum of Squares Among Groups (Among-group variation)
SSW = Sum of Squares Within Groups (Within-group variation)

Partitioning the Variation
(continued)

SST = SSA + SSW

Total Variation = the aggregate variation of the individual
data values across the various factor levels (SST)

Among-Group Variation = variation among the factor
sample means (SSA)

Within-Group Variation = variation that exists among
the data values within a particular factor level (SSW)
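
The identity SST = SSA + SSW can be checked algebraically. The short derivation below is a sketch that is not on the original slides; it assumes the notation $\bar{X}_j$ for the mean of group j and $\bar{\bar{X}}$ for the grand mean. The cross term vanishes because deviations from a group's own mean sum to zero.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} \bigl(X_{ij} - \bar{\bar{X}}\bigr)^2
     = \sum_{j=1}^{c}\sum_{i=1}^{n_j}
       \bigl[(X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{\bar{X}})\bigr]^2 \\
    &= \underbrace{\sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2}_{SSW}
     + \underbrace{\sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2}_{SSA}
     + 2\sum_{j=1}^{c} (\bar{X}_j - \bar{\bar{X}})
       \underbrace{\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)}_{=\,0} \\
    &= SSW + SSA
\end{aligned}
```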

Partition of Total Variation
Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Error (SSW)

Total Sum of Squares
SST = SSA + SSW

$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$

Where:
SST = Total sum of squares
c = number of groups or levels
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)
Total Variation
(continued)
[Figure: dot plot of the response X for Group 1, Group 2 and Group 3]

Among-Group Variation
SST = SSA + SSW

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Where:
SSA = Sum of squares among groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)
Among-Group Variation
(continued)
$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Variation Due to Differences Among Groups

Mean Square Among = SSA/degrees of freedom:
$MSA = \dfrac{SSA}{c - 1}$
Among-Group Variation
(continued)

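$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$

[Figure: response X for Group 1, Group 2 and Group 3, marking the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$]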


Within-Group Variation
SST = SSA + SSW

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Where:
SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j
Within-Group Variation
(continued)

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Summing the variation within each group and then adding over all groups

Mean Square Within = SSW/degrees of freedom:
$MSW = \dfrac{SSW}{n - c}$
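Within-Group Variation
(continued)
[Figure: response X for Group 1, Group 2 and Group 3, showing the spread of the observations around each group mean $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]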

Obtaining the Mean Squares
The Mean Squares are obtained by dividing the various
sums of squares by their associated degrees of freedom

Mean Square Among (d.f. = c-1): $MSA = \dfrac{SSA}{c - 1}$
Mean Square Within (d.f. = n-c): $MSW = \dfrac{SSW}{n - c}$
Mean Square Total (d.f. = n-1): $MST = \dfrac{SST}{n - 1}$
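
The sums of squares and mean squares defined above can be computed directly from grouped data. The following is a minimal sketch using only the Python standard library; the function names and the small data set are hypothetical and chosen with unequal group sizes so that the $n_j$ weighting in SSA is visible.

```python
from statistics import fmean


def sums_of_squares(groups):
    """Return (SSA, SSW, SST) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    # Among groups: squared distance of each group mean from the grand mean,
    # weighted by the group size n_j.
    ssa = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: squared distance of each value from its own group mean.
    ssw = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    # Total: squared distance of each value from the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    return ssa, ssw, sst


def mean_squares(groups):
    """Return (MSA, MSW, MST): each sum of squares over its degrees of freedom."""
    c = len(groups)
    n = sum(len(g) for g in groups)
    ssa, ssw, sst = sums_of_squares(groups)
    return ssa / (c - 1), ssw / (n - c), sst / (n - 1)


# Hypothetical data, not from the slides: three groups of unequal size.
groups = [[4.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0, 6.0]]
ssa, ssw, sst = sums_of_squares(groups)
msa, msw, mst = mean_squares(groups)
print(f"SSA={ssa:.4f} SSW={ssw:.4f} SST={sst:.4f}")
print(f"MSA={msa:.4f} MSW={msw:.4f} MST={mst:.4f}")
```

Note that the mean squares are not additive: MSA + MSW does not equal MST, only the sums of squares and the degrees of freedom add up.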
One-Way ANOVA Table

Source of        Degrees of   Sum Of     Mean Square          F
Variation        Freedom      Squares    (Variance)
Among Groups     c - 1        SSA        MSA = SSA/(c - 1)    FSTAT = MSA/MSW
Within Groups    n - c        SSW        MSW = SSW/(n - c)
Total            n - 1        SST

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
One-Way ANOVA
F Test Statistic
H0: μ1= μ2 = … = μc
H1: At least two population means are different

• Test statistic
$F_{STAT} = \dfrac{MSA}{MSW}$
MSA is mean squares among groups
MSW is mean squares within groups

• Degrees of freedom
– df1 = c – 1 (c = number of groups)
– df2 = n – c (n = sum of sample sizes from all populations)
Interpreting One-Way ANOVA
F Statistic
• The F statistic is the ratio of the among
estimate of variance and the within
estimate of variance
– The ratio must always be positive
– df1 = c -1 will typically be small
– df2 = n - c will typically be large

Decision Rule:
• Reject H0 if $F_{STAT} > F_\alpha$, otherwise do not reject H0
[Figure: F distribution with area α in the upper tail; do not reject H0 below $F_\alpha$, reject H0 above it]
One-Way ANOVA
F Test Example
You want to see if, when three different golf clubs are used,
they hit the ball different distances. You randomly select
five measurements from trials on an automated driving
machine for each club. At the 0.05 significance level, is there a
difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

One-Way ANOVA Example:
Scatter Plot
[Figure: scatter plot of distance (190 to 270) against club (1, 2, 3), marking the three group means and the grand mean]
$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{\bar{X}} = 227.0$
One-Way ANOVA Example
Computations
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

$\bar{X}_1 = 249.2$, $n_1 = 5$
$\bar{X}_2 = 226.0$, $n_2 = 5$
$\bar{X}_3 = 205.8$, $n_3 = 5$
$\bar{\bar{X}} = 227.0$, n = 15, c = 3

$SSA = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4{,}716.4$
$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1{,}119.6$

$MSA = 4{,}716.4 / (3 - 1) = 2{,}358.2$
$MSW = 1{,}119.6 / (15 - 3) = 93.3$
$F_{STAT} = \dfrac{2{,}358.2}{93.3} = 25.275$
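
The golf-club computations can be reproduced with a short script. This is a sketch using only the Python standard library; the variable names are illustrative, and the critical value 3.89 is taken from an F table for (2, 12) degrees of freedom at the 0.05 level rather than computed.

```python
from statistics import fmean

# Distances for the three clubs, as given in the example.
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}
F_CRIT = 3.89  # table value for alpha = 0.05 with df1 = 2, df2 = 12

c = len(clubs)                                  # number of groups
n = sum(len(d) for d in clubs.values())         # total sample size
grand_mean = fmean([x for d in clubs.values() for x in d])

# Among-group and within-group sums of squares.
ssa = sum(len(d) * (fmean(d) - grand_mean) ** 2 for d in clubs.values())
ssw = sum((x - fmean(d)) ** 2 for d in clubs.values() for x in d)

msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw
reject_h0 = f_stat > F_CRIT

print(f"SSA={ssa:.1f} SSW={ssw:.1f} MSA={msa:.1f} MSW={msw:.1f}")
print(f"FSTAT={f_stat:.3f} reject H0: {reject_h0}")
```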
One-Way ANOVA Example
Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_j$ not all equal
α = 0.05
df1 = 2, df2 = 12

Critical Value: $F_\alpha = 3.89$

Test Statistic:
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{2358.2}{93.3} = 25.275$

Decision: Reject H0 at α = 0.05

Conclusion: There is evidence that at least one $\mu_j$ differs from the rest

[Figure: F distribution with α = .05 in the upper tail; $F_{STAT} = 25.275$ lies in the rejection region beyond $F_\alpha = 3.89$]
The Tukey-Kramer Procedure

• Tells which population means are significantly different
– e.g.: $\mu_1 = \mu_2 \neq \mu_3$
– Done after rejection of equal means in ANOVA
• Allows paired comparisons
– Compare absolute mean differences with critical range
[Figure: distributions along the x axis with $\mu_1 = \mu_2$ and a separate $\mu_3$]


Tukey-Kramer Critical Range

$\text{Critical Range} = Q_\alpha \sqrt{\dfrac{MSW}{2}\left(\dfrac{1}{n_j} + \dfrac{1}{n_{j'}}\right)}$

where:
Qα = Upper Tail Critical Value from Studentized
Range Distribution with c and n - c degrees
of freedom
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’



The Tukey-Kramer Procedure:
Example
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

1. Compute absolute mean differences:
$|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$
$|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$
$|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$

2. Find the $Q_\alpha$ value from the table in appendix E.10 with
c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom:

$Q_\alpha = 3.77$
The Tukey-Kramer Procedure:
Example
(continued)
3. Compute Critical Range:
$\text{Critical Range} = Q_\alpha \sqrt{\dfrac{MSW}{2}\left(\dfrac{1}{n_j} + \dfrac{1}{n_{j'}}\right)} = 3.77 \sqrt{\dfrac{93.3}{2}\left(\dfrac{1}{5} + \dfrac{1}{5}\right)} = 16.285$

4. Compare:
$|\bar{x}_1 - \bar{x}_2| = 23.2$
$|\bar{x}_1 - \bar{x}_3| = 43.4$
$|\bar{x}_2 - \bar{x}_3| = 20.2$

5. All of the absolute mean differences are greater than the critical
range. Therefore there is a significant difference between each pair of
means at the 5% level of significance.
Thus, with 95% confidence we can conclude
that the mean distance for club 1 is greater
than for clubs 2 and 3, and that the mean distance
for club 2 is greater than for club 3.
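
The pairwise comparison can be sketched as follows. The helper name `critical_range` and the dictionary layout are illustrative choices, not part of the slides; $Q_\alpha$ = 3.77 and MSW = 93.3 are taken from the example rather than computed.

```python
from itertools import combinations
from math import sqrt


def critical_range(q_alpha: float, msw: float, n_j: int, n_k: int) -> float:
    """Tukey-Kramer critical range for one pair of groups."""
    return q_alpha * sqrt((msw / 2) * (1 / n_j + 1 / n_k))


means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
Q_ALPHA = 3.77   # studentized range table value for c = 3, n - c = 12
MSW = 93.3       # mean square within from the ANOVA table

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(Q_ALPHA, MSW, sizes[a], sizes[b])
    # A pair differs significantly when its absolute difference exceeds the range.
    results[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {cr:.3f}, "
          f"significant = {diff > cr}")
```

Because the function takes both sample sizes, the same sketch applies when the groups are of unequal size, in which case each pair gets its own critical range.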
ANOVA Assumptions

• Randomness and Independence


– Select random samples from the c groups (or
randomly assign the levels)
• Normality
– The sample values for each group are from a
normal population
• Homogeneity of Variance
– All populations sampled from have the same
variance
Example
• Vishal Food Ltd is a leading manufacturer of biscuits. The
company has launched a new brand in the four metros: Delhi,
Mumbai, Kolkata and Chennai. After one month, the company
realizes that there is a difference in the retail price per pack of
biscuits across cities. Before the launch, the company had
promised its employees and newly appointed retailers that the
biscuits would be sold at a uniform price across the country. The
difference in price can tarnish the image of the company. In
order to make a quick inference, the company collected price
data from six randomly selected stores in each of the
four cities. Based on the sample information, the price per pack
of the biscuits (in Rs.) is given in the following table:

Example
Table: Price per pack of the Biscuits (in Rs)
Delhi Mumbai Kolkata Chennai
22 19 18 21
22.5 19.5 17 20
21.5 19 18.5 21.5
22 20 17 20
22.5 19 18.5 21
21.5 21 17 20

Use one-way ANOVA to test whether the mean prices differ
significantly across the four cities. Use the 5% level of
significance (95% confidence level).
Solution
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: $\mu_j$ not all equal; α = 0.05

        Delhi   Mumbai    Kolkata   Chennai
        22      19        18        21
        22.5    19.5      17        20
        21.5    19        18.5      21.5
        22      20        17        20
        22.5    19        18.5      21
        21.5    21        17        20
Mean    22      19.5833   17.6667   20.5833

$n_1 = n_2 = n_3 = n_4 = 6$
$\bar{X}_1 = 22$, $\bar{X}_2 = 19.5833$, $\bar{X}_3 = 17.6667$, $\bar{X}_4 = 20.5833$
Grand mean: $\bar{\bar{X}} = 19.95833$

$SSA = 6(22 - 19.95833)^2 + 6(19.5833 - 19.95833)^2 + 6(17.6667 - 19.95833)^2 + 6(20.5833 - 19.95833)^2 = 59.7083$

Solution
Group means: Delhi 22, Mumbai 19.5833, Kolkata 17.6667, Chennai 20.5833
Grand mean: 19.95833

$SSW = (22 - 22)^2 + (22.5 - 22)^2 + (21.5 - 22)^2 + (22 - 22)^2 + (22.5 - 22)^2 + (21.5 - 22)^2$
$+ (19 - 19.5833)^2 + (19.5 - 19.5833)^2 + (19 - 19.5833)^2 + (20 - 19.5833)^2 + (19 - 19.5833)^2 + (21 - 19.5833)^2$
$+ (18 - 17.6667)^2 + (17 - 17.6667)^2 + (18.5 - 17.6667)^2 + (17 - 17.6667)^2 + (18.5 - 17.6667)^2 + (17 - 17.6667)^2$
$+ (21 - 20.5833)^2 + (20 - 20.5833)^2 + (21.5 - 20.5833)^2 + (20 - 20.5833)^2 + (21 - 20.5833)^2 + (20 - 20.5833)^2 = 9.25$
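
The within-group sum of squares can be broken down by city, which makes the long expression easier to check. A minimal sketch, using the price table from the example and illustrative variable names:

```python
from statistics import fmean

# Price per pack (Rs) from six stores in each city, as given in the table.
prices = {
    "Delhi":   [22, 22.5, 21.5, 22, 22.5, 21.5],
    "Mumbai":  [19, 19.5, 19, 20, 19, 21],
    "Kolkata": [18, 17, 18.5, 17, 18.5, 17],
    "Chennai": [21, 20, 21.5, 20, 21, 20],
}

# Each city's contribution: squared deviations from that city's own mean.
within = {
    city: sum((x - fmean(values)) ** 2 for x in values)
    for city, values in prices.items()
}
ssw = sum(within.values())

for city, contribution in within.items():
    print(f"{city}: {contribution:.4f}")
print(f"SSW = {ssw:.4f}")
```

Each term uses the observation's own city mean, so the Mumbai terms are deviations of the Mumbai prices from 19.5833, not of the Delhi prices.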
Solution: ANOVA Table

Source of        Degrees of             Sum Of           Mean Square      F
Variation        Freedom                Squares          (Variance)
Among Groups     c – 1 = 4 – 1 = 3      SSA = 59.7083    MSA = 19.9027    FSTAT = MSA/MSW = 43.03
Within Groups    n – c = 24 – 4 = 20    SSW = 9.25       MSW = 0.4625
Total            n – 1 = 24 – 1 = 23    SST = 68.9583

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
One-Way ANOVA Example
Solution
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: $\mu_j$ not all equal
α = 0.05
df1 = 3, df2 = 20

Critical Value: $F_\alpha = 3.10$

Test Statistic:
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{19.9027}{0.4625} = 43.03$

Decision: Reject H0 at α = 0.05

Conclusion: There is evidence that at least one $\mu_j$ differs from the rest

[Figure: F distribution with α = .05 in the upper tail; $F_{STAT} = 43.03$ lies in the rejection region beyond $F_\alpha = 3.10$]
ANOVA: EXCEL
[Figure: Excel screenshots of the one-way ANOVA, four slides; images not recoverable]
Lecture Summary
• Compared two population variances from two independent
populations via the F test
• Described one-way analysis of variance
– The logic of ANOVA
– ANOVA assumptions
– F test for difference in c means
