
Analysis of Variance

Dr. Lloyd C. Bautista


Background
Ordinarily we compare two groups: male versus female, urban versus rural, liberal versus conservative.
But in reality, groups cannot always be divided into just two.
Example: upper class, middle class, working class, lower class, etc.
With t-tests, comparing these four classes would require six pairwise tests:
o Upper vs middle
o Upper vs working
o Upper vs lower
o Middle vs working
o Middle vs lower
o Working vs lower
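The count of six pairs above can be checked with a short sketch. It assumes the four classes named in the list; the variable names are illustrative only.

```python
from itertools import combinations

# The four social classes that appear in the pairwise list above.
classes = ["upper", "middle", "working", "lower"]

# Every unordered pair of classes corresponds to one separate t-test.
pairs = list(combinations(classes, 2))

for first, second in pairs:
    print(f"{first} vs {second}")

print(len(pairs), "pairwise comparisons")
```

With k groups the number of pairs is k(k − 1)/2, so the number of t-tests grows quickly as groups are added, which is one reason to prefer a single overall test.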
ANOVA (Analysis of Variance)
The purpose is to be able to analyse and compare the DIFFERENCES OF
MEANS among more than 2 groups.
We treat the total variation in a set of data as being divisible into two
components:
• Variation within groups – distance or deviation of the raw scores from their group
mean
• Variation between groups – distance or deviation of group means from one another
• REMEMBER THIS…two kinds “within” and “between”.
At the heart of the analysis is the SUM OF SQUARES, $\Sigma (x - \bar{x})^2$, which measures the
variation between and within groups. There are three:
• total sum of squares
• between-groups sum of squares
• within-groups sum of squares
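As a sketch, the three sums of squares can be written out in deviation form. The indices are assumptions introduced here for illustration: $k$ is the number of groups, $n_j$ the size of group $j$, $x_{ij}$ the $i$-th score in group $j$, $\bar{x}_j$ the mean of group $j$, and $\bar{x}_{total}$ the mean of all scores.

```latex
\begin{align*}
SS_{total}   &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_{total}\right)^2 \\
SS_{between} &= \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}_{total}\right)^2 \\
SS_{within}  &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2 \\
SS_{total}   &= SS_{between} + SS_{within}
\end{align*}
```

The last line is the partition of variation: the total variation splits exactly into the "between" and "within" parts.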
Still difficult to understand? Here is a good example I found on the internet.

 A study looked at the average annual per capita consumption of beer across three regions: Asia, Europe, and America.
 Based on the data, Asians drink 30 L, Europeans drink 75 L, and Americans drink 65 L.
 Are the differences statistically significant (real) or merely sampling error?
 The logic of ANOVA is:
o If the BETWEEN-GROUP variation is about the same as the WITHIN-GROUP variation, then the group means are likely to differ only because of random error.
o If the BETWEEN-GROUP variation is large relative to the WITHIN-GROUP variation, then the group means are likely to be STATISTICALLY different.
 In ANOVA, we are actually comparing the BETWEEN-GROUP variation to the WITHIN-GROUP variation.
Case: Clients of three government agencies (BOC, LTO, and BIR) were asked to rate how graft-ridden each agency is, from 1 = minimal graft and corruption to 10 = extreme graft and corruption. The scores are shown in the table below.

        BOC            LTO            BIR
        X1     X1²     X2     X2²     X3     X3²
        6      36      2      4       3      9
        7      49      5      25      2      4
        8      64      4      16      4      16
        6      36      3      9       4      16
        4      16      5      25      3      9
Σ       31     201     19     79      16     54
Mean    6.2    40.2    3.8    15.8    3.2    10.8

Null hypothesis (H0): (μ1 = μ2 = μ3) all the population means are equal.
Research hypothesis (Ha): (not all μ are equal) at least one population mean is different from the rest.
Find the SUM OF SCORES

$\Sigma X_{total} = \Sigma X_1 + \Sigma X_2 + \Sigma X_3$
$\Sigma X_{total} = 31 + 19 + 16 = 66$

Find the SUM OF SQUARED SCORES

$\Sigma X^2_{total} = \Sigma X^2_1 + \Sigma X^2_2 + \Sigma X^2_3$
$\Sigma X^2_{total} = 201 + 79 + 54 = 334$

Find the NUMBER OF SUBJECTS

$N_{total} = N_1 + N_2 + N_3$
$N_{total} = 5 + 5 + 5 = 15$

Find the MEAN FOR ALL GROUPS

$\bar{x}_{total} = \Sigma X_{total} / N_{total}$
$\bar{x}_{total} = 66 / 15$
$\bar{x}_{total} = 4.4$

Sets     N     Sum of Scores    Sum of Squared Scores
BOC      5     ΣX1 = 31         ΣX1² = 201
LTO      5     ΣX2 = 19         ΣX2² = 79
BIR      5     ΣX3 = 16         ΣX3² = 54
Total    15    66               334
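The totals above can be reproduced from the raw scores in the case table. This is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
# Raw scores per agency, taken from the case table.
scores = {
    "BOC": [6, 7, 8, 6, 4],
    "LTO": [2, 5, 4, 3, 5],
    "BIR": [3, 2, 4, 4, 3],
}

# Per-group sum of scores, sum of squared scores, and group size.
sum_x = {g: sum(xs) for g, xs in scores.items()}
sum_x2 = {g: sum(x * x for x in xs) for g, xs in scores.items()}
n = {g: len(xs) for g, xs in scores.items()}

# Totals across all groups.
sum_x_total = sum(sum_x.values())
sum_x2_total = sum(sum_x2.values())
n_total = sum(n.values())
grand_mean = sum_x_total / n_total

print("Sum of scores:", sum_x, "total =", sum_x_total)
print("Sum of squared scores:", sum_x2, "total =", sum_x2_total)
print("N total =", n_total, "grand mean =", round(grand_mean, 2))
```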
Find the TOTAL SUM OF SQUARES

$SS_{total} = \Sigma X^2_{total} - N_{total} \cdot \bar{x}^2_{total}$
$SS_{total} = 334 - (15)(4.4)^2$
$SS_{total} = 43.6$

Sets     N     Sum of Scores    Sum of Squared Scores    Mean
BOC      5     ΣX1 = 31         ΣX1² = 201               X̅1 = 6.2
LTO      5     ΣX2 = 19         ΣX2² = 79                X̅2 = 3.8
BIR      5     ΣX3 = 16         ΣX3² = 54                X̅3 = 3.2
Total    15    66               334                      X̅total = 4.4
Find the ‘BETWEEN-GROUPS’ SUM OF SQUARES

$SS_{between} = \Sigma (N_{group} \cdot \bar{x}^2_{group}) - N_{total} \cdot \bar{x}^2_{total}$
$SS_{between} = [(5)(6.2)^2 + (5)(3.8)^2 + (5)(3.2)^2] - (15)(4.4)^2$
$SS_{between} = 25.2$
Find the ‘WITHIN-GROUPS’ SUM OF SQUARES

$SS_{within} = \Sigma X^2_{total} - \Sigma (N_{group} \cdot \bar{x}^2_{group})$
$SS_{within} = 334 - [(5)(6.2)^2 + (5)(3.8)^2 + (5)(3.2)^2]$
$SS_{within} = 18.4$
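The three sums of squares can be checked together with the computational formulas used above. A minimal sketch, assuming the group summaries from the case (N, ΣX, ΣX² per agency); the names `correction` and `group_term` are labels chosen here, not terms from the slides.

```python
import math

# Group summaries for BOC, LTO, BIR: size, sum of scores, sum of squared scores.
n = [5, 5, 5]
sum_x = [31, 19, 16]
sum_x2 = [201, 79, 54]

n_total = sum(n)
grand_mean = sum(sum_x) / n_total

# N_total * grand_mean^2, the term subtracted in SS_total and SS_between.
correction = n_total * grand_mean ** 2

# Sum over groups of N_group * group_mean^2.
group_term = sum(ni * (sx / ni) ** 2 for ni, sx in zip(n, sum_x))

ss_total = sum(sum_x2) - correction
ss_between = group_term - correction
ss_within = sum(sum_x2) - group_term

print("SS total   =", round(ss_total, 1))
print("SS between =", round(ss_between, 1))
print("SS within  =", round(ss_within, 1))

# The between and within parts should add up to the total.
print("Partition holds:", math.isclose(ss_total, ss_between + ss_within))
```

The final check reflects the idea that the total variation is divisible into the "between" and "within" components: 25.2 + 18.4 = 43.6.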
Find the ‘BETWEEN-GROUPS’ DEGREES OF FREEDOM (with $k = 3$ groups and $N_{total} = 15$)

$df_{between} = k - 1$
$df_{between} = 3 - 1$
$df_{between} = 2$

Find the ‘WITHIN-GROUPS’ DEGREES OF FREEDOM

$df_{within} = N_{total} - k$
$df_{within} = 15 - 3$
$df_{within} = 12$
Find the ‘BETWEEN-GROUPS’ MEAN SQUARE (with $SS_{between} = 25.2$ and $df_{between} = 2$)

$MS_{between} = SS_{between} / df_{between}$
$MS_{between} = 25.2 / 2$
$MS_{between} = 12.6$

Find the ‘WITHIN-GROUPS’ MEAN SQUARE (with $SS_{within} = 18.4$ and $df_{within} = 12$)

$MS_{within} = SS_{within} / df_{within}$
$MS_{within} = 18.4 / 12$
$MS_{within} = 1.53$
Find the F ratio by dividing $MS_{between}$ by $MS_{within}$ (with $MS_{between} = 12.6$ and $MS_{within} = 1.53$)

$F = MS_{between} / MS_{within}$
$F = 12.6 / 1.53$
$F = 8.24$
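The steps from sums of squares to the F ratio can be sketched in a few lines. This assumes the SS values computed earlier. Note that 8.24 comes from dividing by the rounded mean square 1.53; carrying full precision gives about 8.22, which leads to the same conclusion.

```python
# Sums of squares from the earlier steps.
ss_between = 25.2
ss_within = 18.4

k = 3          # number of groups
n_total = 15   # total number of subjects

df_between = k - 1
df_within = n_total - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio at full precision, and with MS_within rounded to two decimals
# as in the hand calculation.
f_exact = ms_between / ms_within
f_hand = ms_between / round(ms_within, 2)

print("df =", df_between, "and", df_within)
print("MS between =", round(ms_between, 2), "MS within =", round(ms_within, 2))
print("F (full precision) =", round(f_exact, 2))
print("F (rounded MS within) =", round(f_hand, 2))
```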
 The calculated F ratio is 8.24, with 2 and 12 degrees of freedom, tested at the α = .05 significance level.
 Since it is greater than the critical (table) F value of 3.88, we REJECT the NULL HYPOTHESIS and conclude that there are differences between groups.
 See the summary table below.

Source of variation    SS      df    MS      F
Between groups         25.2    2     12.6    8.24
Within groups          18.4    12    1.53
Total                  43.6    14

TABLE D. Critical Values of F at the .05 and .01 Significance Levels

Significance level    df for the numerator    df for the denominator    Critical F
α = .05               2                       12                        3.88
α = .01               2                       12                        6.93
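The decision rule can be written as a simple comparison against the table values. A minimal sketch: the critical values are copied from Table D for 2 and 12 degrees of freedom, and the helper name `decide` is hypothetical.

```python
# Critical F values from Table D for df = (2, 12).
critical_f = {0.05: 3.88, 0.01: 6.93}

# Calculated F ratio from the worked example.
f_ratio = 8.24


def decide(f_calculated, f_critical):
    """Reject H0 when the calculated F exceeds the critical F."""
    return "reject H0" if f_calculated > f_critical else "fail to reject H0"


decisions = {alpha: decide(f_ratio, crit) for alpha, crit in critical_f.items()}

for alpha, decision in decisions.items():
    print(f"alpha = {alpha}: F = {f_ratio} vs critical {critical_f[alpha]} -> {decision}")
```

In this example the calculated F exceeds both table values, so the null hypothesis is rejected at the .05 level and also at the stricter .01 level.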
What if we examine the influence of two independent variables on a dependent variable?
 We want to know how two independent variables (marriage and having kids) affect the perception of Filipinos on the use of contraceptives.
 We have four groups of four subjects each.
 The levels of factor A (marriage) are arranged in the rows, and the levels of factor B (kids) are arranged in the columns.
                          KIDS (B)
MARRIAGE (A)       w/ KIDS            NO KIDS            TOTAL
                   X1      X1²        X2      X2²
MARRIED            8       64         9       81
                   10      100        5       25
                   7       49         7       49
                   9       81         7       49
  Sum              34      294        28      204
                   N = 4, X̅ = 8.5    N = 4, X̅ = 7.0    N = 8, X̅ = 7.75

NOT MARRIED        6       36         2       4
                   4       16         1       1
                   8       64         1       1
                   6       36         2       4
  Sum              24      152        6       10
                   N = 4, X̅ = 6.0    N = 4, X̅ = 1.5    N = 8, X̅ = 3.75

TOTAL              58      446        34      214
                   N = 8, X̅ = 7.25   N = 8, X̅ = 4.25   N = 16, X̅ = 5.75

Null hypothesis (H0): (μ1 = μ2) Marriage and kids HAVE NO EFFECT on use of contraceptives.
Research hypothesis (Ha): (μ1 ≠ μ2) Marriage and kids HAVE AN EFFECT on use of contraceptives.
Find the MEAN FOR EACH GROUP

$\bar{X}_{group} = \Sigma X_{group} / N_{group}$
$= 34 / 4 = 8.5$ (married / with kids)
$= 28 / 4 = 7.0$ (married / no kids)
$= 24 / 4 = 6.0$ (not married / with kids)
$= 6 / 4 = 1.5$ (not married / no kids)
Find the TOTAL MEAN

$\Sigma X_{total} = 34 + 28 + 24 + 6 = 92$
$N_{total} = 4 + 4 + 4 + 4 = 16$
$\bar{X}_{total} = \Sigma X_{total} / N_{total} = 92 / 16 = 5.75$
Find the MEAN FOR EACH LEVEL OF FACTOR A

Row married: $\Sigma X_A = 34 + 28 = 62$
Row not married: $\Sigma X_A = 24 + 6 = 30$

$\bar{X}_A = \Sigma X_A / N_A$
$= 62 / 8 = 7.75$ (married)
$= 30 / 8 = 3.75$ (not married)

Find the MEAN FOR EACH LEVEL OF FACTOR B

Column w/ kids: $\Sigma X_B = 34 + 24 = 58$
Column no kids: $\Sigma X_B = 28 + 6 = 34$

$\bar{X}_B = \Sigma X_B / N_B$
$= 58 / 8 = 7.25$ (with kids)
$= 34 / 8 = 4.25$ (no kids)
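The cell means, the means for each level of the two factors, and the total mean can all be computed from the raw scores in the two-way table. A minimal sketch; the dictionary keys and the helper names `mean` and `level_mean` are assumptions made for illustration.

```python
# Raw scores for each (marriage, kids) cell from the two-way table.
cells = {
    ("married", "kids"): [8, 10, 7, 9],
    ("married", "no kids"): [9, 5, 7, 7],
    ("not married", "kids"): [6, 4, 8, 6],
    ("not married", "no kids"): [2, 1, 1, 2],
}


def mean(values):
    return sum(values) / len(values)


def level_mean(position, level):
    """Mean of all scores whose cell key has `level` at `position`
    (0 = factor A, marriage; 1 = factor B, kids)."""
    pooled = [x for key, xs in cells.items() if key[position] == level for x in xs]
    return mean(pooled)


cell_means = {key: mean(xs) for key, xs in cells.items()}
mean_a = {level: level_mean(0, level) for level in ("married", "not married")}
mean_b = {level: level_mean(1, level) for level in ("kids", "no kids")}
grand_mean = mean([x for xs in cells.values() for x in xs])

print("Cell means:", cell_means)
print("Factor A means:", mean_a)
print("Factor B means:", mean_b)
print("Total mean:", grand_mean)
```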
Find the TOTAL SUM OF SQUARES (with $\bar{X}_{total} = 5.75$, $N_{total} = 16$, $N_{group} = 4$)

$\Sigma X^2_{total} = \Sigma X^2_1 + \Sigma X^2_2 + \Sigma X^2_3 + \Sigma X^2_4$
$\Sigma X^2_{total} = 294 + 204 + 152 + 10 = 660$

$SS_{total} = \Sigma X^2_{total} - N_{total} \cdot \bar{x}^2_{total}$
$SS_{total} = 660 - (16)(5.75)^2$
$SS_{total} = 131$

Find the ‘WITHIN-GROUPS’ SUM OF SQUARES

$SS_{within} = \Sigma X^2_{total} - \Sigma (N_{group} \cdot \bar{x}^2_{group})$
$SS_{within} = 660 - [(4)(8.5)^2 + (4)(6.0)^2 + (4)(7.0)^2 + (4)(1.5)^2]$
$SS_{within} = 22$
Find the SUM OF SQUARES OF FACTOR A (with $\bar{X}_{total} = 5.75$, $\bar{X}_a = 7.75$ and $3.75$)

$SS_A = \Sigma (N_a \cdot \bar{x}^2_a) - N_{total} \cdot \bar{x}^2_{total}$
$SS_A = [(8)(7.75)^2 + (8)(3.75)^2] - (16)(5.75)^2$
$SS_A = 64$

Find the SUM OF SQUARES OF FACTOR B (with $\bar{X}_b = 7.25$ and $4.25$)

$SS_B = \Sigma (N_b \cdot \bar{x}^2_b) - N_{total} \cdot \bar{x}^2_{total}$
$SS_B = [(8)(7.25)^2 + (8)(4.25)^2] - (16)(5.75)^2$
$SS_B = 36$
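Find the SUM OF SQUARES OF THE INTERACTION OF FACTORS A & B

$SS_{AB} = \Sigma (N_{group} \cdot \bar{x}^2_{group}) - \Sigma (N_a \cdot \bar{x}^2_a) - \Sigma (N_b \cdot \bar{x}^2_b) + N_{total} \cdot \bar{x}^2_{total}$
$SS_{AB} = [(4)(8.5)^2 + (4)(6.0)^2 + (4)(7.0)^2 + (4)(1.5)^2] - [(8)(7.75)^2 + (8)(3.75)^2] - [(8)(7.25)^2 + (8)(4.25)^2] + (16)(5.75)^2$
$SS_{AB} = 638 - 593 - 565 + 529$
$SS_{AB} = 9$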
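All five sums of squares for the two-way case can be checked in one pass. This is a sketch under the assumption of equal cell sizes (four subjects per cell, as in the example); the variable names are illustrative.

```python
import math

n_cell = 4  # subjects per cell

# Sum of scores per cell and the overall sum of squared scores.
cell_sums = {
    ("married", "kids"): 34,
    ("married", "no kids"): 28,
    ("not married", "kids"): 24,
    ("not married", "no kids"): 6,
}
sum_x2_total = 294 + 204 + 152 + 10  # 660

n_total = n_cell * len(cell_sums)
grand_mean = sum(cell_sums.values()) / n_total
correction = n_total * grand_mean ** 2  # N_total * grand_mean^2


def level_term(position):
    """Sum over the levels of one factor of N_level * level_mean^2."""
    totals = {}
    for key, s in cell_sums.items():
        totals[key[position]] = totals.get(key[position], 0) + s
    n_level = n_total // len(totals)
    return sum(n_level * (t / n_level) ** 2 for t in totals.values())


cell_term = sum(n_cell * (s / n_cell) ** 2 for s in cell_sums.values())
a_term = level_term(0)  # marriage
b_term = level_term(1)  # kids

ss_total = sum_x2_total - correction
ss_within = sum_x2_total - cell_term
ss_a = a_term - correction
ss_b = b_term - correction
ss_ab = cell_term - a_term - b_term + correction

print("SS total =", ss_total, "SS within =", ss_within)
print("SS A =", ss_a, "SS B =", ss_b, "SS AB =", ss_ab)

# The four components should add up to the total.
print("Partition holds:", math.isclose(ss_total, ss_a + ss_b + ss_ab + ss_within))
```

The partition check gives 64 + 36 + 9 + 22 = 131, matching the total sum of squares.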
Find the DEGREES OF FREEDOM FOR WITHIN

$df_{within} = N_{total} - ab = 16 - (2)(2) = 12$

Find the DEGREES OF FREEDOM FOR MAIN EFFECTS

$df_A = a - 1 = 2 - 1 = 1$
$df_B = b - 1 = 2 - 1 = 1$

Find the DEGREES OF FREEDOM FOR INTERACTIONS

$df_{AB} = (a - 1)(b - 1) = (1)(1) = 1$
Find the MEAN SQUARES FOR ‘WITHIN-GROUPS’, MAIN EFFECTS, AND INTERACTION

$MS_{within} = SS_{within} / df_{within} = 22 / 12 = 1.833$
$MS_A = SS_A / df_A = 64 / 1 = 64$
$MS_B = SS_B / df_B = 36 / 1 = 36$
$MS_{AB} = SS_{AB} / df_{AB} = 9 / 1 = 9$
Find the F ratios by dividing each mean square ($MS_A$, $MS_B$, $MS_{AB}$) by $MS_{within}$ (with $MS_{within} = 1.833$)

$F_A = MS_A / MS_{within} = 64 / 1.833 = 34.909$
$F_B = MS_B / MS_{within} = 36 / 1.833 = 19.636$
$F_{AB} = MS_{AB} / MS_{within} = 9 / 1.833 = 4.909$
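The mean squares and F ratios for the two-way case follow the same pattern for each source of variation, which the sketch below makes explicit. It assumes the sums of squares from the previous steps and a 2 × 2 design; the dictionary names are illustrative.

```python
# Sums of squares from the previous steps.
ss = {"A (marriage)": 64, "B (kids)": 36, "AB (interaction)": 9}
ss_within = 22

a, b = 2, 2    # number of levels of factor A and factor B
n_total = 16

df = {
    "A (marriage)": a - 1,
    "B (kids)": b - 1,
    "AB (interaction)": (a - 1) * (b - 1),
}
df_within = n_total - a * b

ms_within = ss_within / df_within

# Each effect's mean square is divided by the within-groups mean square.
ms = {name: ss[name] / df[name] for name in ss}
f_ratio = {name: ms[name] / ms_within for name in ss}

print("MS within =", round(ms_within, 3), "with df =", df_within)
for name in ss:
    print(f"{name}: MS = {ms[name]:.1f}, F = {f_ratio[name]:.3f}")
```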
 All three calculated F ratios (34.909, 19.636, and 4.909), each with 1 and 12 degrees of freedom, are MORE THAN the critical (table) F value of 4.75 at the α = .05 significance level, so we REJECT the NULL HYPOTHESIS and conclude that there are differences between groups.
 There are significant ‘main effects’ of marriage and kids, as well as a significant interaction between them.

Critical values of F at .05 significance level

df for the numerator    df for the denominator    α = .05
1                       12                        4.75

Source              SS     df    Mean square    Calculated F ratio
Marriage (A)        64     1     64.0           34.909
Kids (B)            36     1     36.0           19.636
Interaction (AB)    9      1     9.0            4.909
Within group        22     12    1.833
Total               131    15
