
Trust that everything is

fine with you people!!!!

Nobody will be like

Everyone will be like

What
NEXT?

First, let's revisit what TESTS we have covered so far.

We started with
PARAMETRIC TESTS!
First we took tests related to MEAN.
We have ONE Sample Mean Test Large
Sample.
We have ONE Sample Proportion Test
Large Sample.

Then, we took TWO


SAMPLE TESTS!
We took tests related to MEAN.
We have TWO Samples Difference of
Mean Test Large Sample:
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

We have TWO Sample Difference of


Proportion Test Large Sample.
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

Then, we move to cases where the population is NORMAL but the standard deviation of the population is NOT known; and, above all, the sample size is SMALL.

We used the t-Test, which is a PARAMETRIC TEST!
We took tests related to MEAN.
We have ONE Sample Mean Test Small
Sample.
We have ONE Sample Proportion Test
Small Sample.

Then, we took TWO


SAMPLE TESTS!
We took tests related to MEAN.
We have TWO Samples Difference of
Mean Test Small Sample:
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

We have TWO Sample Difference of


Proportion Test Small Sample.
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

Then, we move from

Testing about Mean

Testing about VARIANCE


One Sample

Use Chi-Square ($\chi^2$)

Testing about VARIANCES


TWO Samples

Use F-TEST (F)

Lets apply.

These are the marks


obtained by YOU
people in the second
quiz out of 20.

The Mean marks obtained are 7.42 with a standard deviation of 3.95.

What do you think: which is more important from an evaluation point of view, Mean or Standard Deviation?

Assume that
There are 31 students who appeared in the QuizII.
Desirable level of variation should NOT be less
than 4.

Marks obtained by the students are independent of each other's marks.

Hypothesis Testing About Variance


Hypotheses:

$H_0: \sigma^2 \le 4^2$
$H_a: \sigma^2 > 4^2$

It is a ONE-Tailed Test.
Test Statistic

$\chi^2 = \dfrac{(n-1)\,s^2}{\sigma_0^2}$

The critical value of $\chi^2$ for 30 df and $\alpha$ = 5% is 43.773.

The value of $\chi^2$ is (30 x 3.95^2)/(4^2) = 29.2547, which is less than the Critical Value of 43.773; hence, the null hypothesis is not rejected.
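
The arithmetic above can be checked with a short sketch. It uses only the figures quoted in the slides (n = 31, s = 3.95, hypothesised standard deviation 4, critical value 43.773); the variable names are illustrative, and the upper-tail decision rule is an assumption read off from the way the critical value is used.

```python
# Chi-square test for a single variance, using the quiz figures quoted above.
n = 31              # students who appeared in the quiz
s = 3.95            # sample standard deviation of the marks
sigma0 = 4.0        # hypothesised population standard deviation
chi2_crit = 43.773  # critical value for 30 df at 5%, as quoted in the slides

df = n - 1
chi2_stat = df * s**2 / sigma0**2

# Assumed decision rule: reject H0 only if the statistic exceeds the critical value.
reject_h0 = chi2_stat > chi2_crit

print(f"chi-square = {chi2_stat:.4f} on {df} df; reject H0: {reject_h0}")
```

The statistic stays well below the critical value, which is why the null hypothesis is retained.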

What's the conclusion?

Any Question before we proceed?

Is there a relation between F-Test Statistics and t-Statistics?

Yes, there is a relation!


Remember that F is a ratio of TWO VARIANCES!

$t^2 = F$

It shows that the SQUARE OF t is equal to F.
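
One setting where this relation holds exactly is the comparison of two independent groups: the square of the pooled-variance t statistic equals the one-way ANOVA F statistic. The sketch below illustrates this with two small samples that are hypothetical (they do not come from the slides); the function names are also illustrative.

```python
from statistics import mean

def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """One-way ANOVA F ratio: MS(between) / MS(within)."""
    all_x = [x for g in groups for x in g]
    grand = mean(all_x)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_x) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data, for illustration only.
sample_a = [12, 15, 11, 14, 13]
sample_b = [16, 18, 15, 17, 19, 14]

t_stat = pooled_t(sample_a, sample_b)
f_stat = one_way_f([sample_a, sample_b])
print(f"t^2 = {t_stat**2:.6f}, F = {f_stat:.6f}")
```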

What's Next?

Let's start with something interesting

Badnaam
Kulfi

What do you think


Nitin
can you prove that this kind of
marketing strategy results in more
sales?

Let's come back to our MAIN BUSINESS, the one which we have to do!!!

Are these on an average different?

What do
you say?
Are they
different?

Do you think
that there are
differences?

Do you think
that there are
differences?

ANALYSIS OF
VARIANCE

Let's consider an example

Assume that you have collected the following information
Motivation Level
(Measured on a composite index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

What would you like to


infer from the data?
Motivation Level
(Measured on a composite index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

Are you looking at .???

Is Motivation f ( Plants ) ?
Dependent
Variable

Ratio/Interval Scale
Variable

Independent
Variable

Categorical
Variable

If you are looking for an answer

Is the motivational level of workers affected by the plant in which they are working?

If yes, then we should go for

ANALYSIS OF
VARIANCE


ANOVA


ANALYSIS OF VARIANCE ...


It is applied when one has more than one sample under study, coming from different populations or from the same population with different treatments.
It is a hypothesis testing procedure used to determine whether mean differences exist for two or more samples or treatments.
The term ANALYSIS OF VARIANCE appears to be a misnomer, since the objective is to analyse differences among the means of the samples rather than the variances of the samples.

ANALYSIS OF VARIANCE
(continued)

If we assume that all the samples are coming from the same population, or that the treatments have no effect, then the variation among the means and the standard deviations across the groups will be due only to sampling error. If so, the variation within groups and the variation among groups should be the same. Testing whether these variations are equal is the main purpose of ANOVA.
Therefore, ANOVA tests equality of means among a number of groups by testing the equality of the variation WITHIN GROUPS and the variation AMONG GROUPS.

ANALYSIS OF VARIANCE
(continued)

When the observations are subjected to an experiment related to one factor, the analysis is called ONE-WAY ANALYSIS OF VARIANCE.
In it, if the differences are found to be significant, it could be concluded that the differences are due to the treatment of ONE FACTOR.
ANOVA divides the TOTAL VARIATION into two: AMONG-GROUP VARIATION and WITHIN-GROUP VARIATION.

One-Way Analysis of
Variance: Partitioning of
TOTAL SUM OF SQUARES OF
VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS


DUE TO TREATMENT
(TREATMENT SUM OF SQUARES)

VARIATION WITHIN GROUPS


(ERROR SUM OF SQUARES)

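Composition of VARIATION
Let
SStot = Sum of Squares (TOTAL)
SSbet = Sum of Squares (BETWEEN GROUPS)
SSwith = Sum of Squares (WITHIN A GROUP)
Then,

$SS_{tot} = \sum_{i=1}^{N} (x_i - \text{Grand Mean})^2$

$SS_{bet} = \sum_{j=1}^{C} n_j\,(\text{Group Mean}_j - \text{Grand Mean})^2$

$SS_{with} = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (x_{ji} - \text{Group Mean}_j)^2$

where N is the total number of observations, C is the number of groups and $n_j$ is the number of observations in group j.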

ANALYSIS OF VARIANCE
(continued)

ANOVA proceeds with a Null Hypothesis of no


differences among the means of populations.
That is-

Null Hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \dots$


And,
Alternative Hypothesis H1: At least two
means are different.
It assumes that there is no difference between K
samples behaviour and hence, their means should
be same; alternatively their means should not be

ANALYSIS OF VARIANCE
(continued)
Therefore, the variation between the groups and the variation within the groups must be the same. Thus, it uses a test statistic called the F statistic, which is a ratio defined as below:

$F = \dfrac{\text{VARIATION BETWEEN THE GROUPS}}{\text{VARIATION WITHIN THE GROUPS}}$

And, the total variation is equal to the sum of the variation within the groups and the variation between the groups.
If the Null Hypothesis were true, the F statistic would be approximately equal to 1; otherwise it would be substantially larger than 1.

ANALYSIS OF VARIANCE
(continued)

Note the following points about ANOVA -

It is a test of equality among the


means and not that of variances.
That is to say that it has nothing to
test about the variances of the
samples.
It says that at least two means are
different out of the K-means of the
K-samples.
It assumes that the variances of the

ANOVA TABLE...
ANOVA TABLE ---- ONE WAY

SOURCE OF    SUM OF        DEGREES OF   MEAN SUM OF     F-RATIO
VARIATION    SQUARES       FREEDOM      SQUARES
BETWEEN      BETWEEN SS    (C-1)        MSS(BETWEEN)    MSS(BETWEEN) / MSS(WITHIN)
WITHIN       WITHIN SS     (N-C)        MSS(WITHIN)
TOTAL        TOTAL SS      (N-1)

Let's consider the following problem:

Plant (Employee Age)
   1     2     3
  29    32    25
  27    33    24
  30    31    24
  27    34    25
  28    30    26

Can we say that the average age of employees across the plants is different?

Working of the problem:

Plant (Employee Age)
               1       2       3
              29      32      25
              27      33      24
              30      31      24
              27      34      25
              28      30      26
Sum          141     160     124
Count          5       5       5
Mean        28.2    32.0    24.8

Grand Mean  28.33

Working of the problem:

TREATMENT EFFECT                     RESIDUALS
(Group Mean - Grand Mean)            (X - Mean of Each Group)
Plant (Employee Age)                 Plant (Employee Age)
    1       2       3                    1       2       3
 -0.13    3.67   -3.53                 0.80    0.00    0.20
 -0.13    3.67   -3.53                -1.20    1.00   -0.80
 -0.13    3.67   -3.53                 1.80   -1.00   -0.80
 -0.13    3.67   -3.53                -1.20    2.00    0.20
 -0.13    3.67   -3.53                -0.20   -2.00    1.20

Working of the problem:

TREATMENT EFFECT (SQUARE)            RESIDUALS (SQUARE)
Plant (Employee Age)                 Plant (Employee Age)
     1         2         3                1        2        3
 0.0178   13.4444   12.4844           0.6400   0.0000   0.0400
 0.0178   13.4444   12.4844           1.4400   1.0000   0.6400
 0.0178   13.4444   12.4844           3.2400   1.0000   0.6400
 0.0178   13.4444   12.4844           1.4400   4.0000   0.0400
 0.0178   13.4444   12.4844           0.0400   4.0000   1.4400

ANOVA TABLE

Source of Variation      SS         df     MS        F
Between Groups           129.733     2     64.867    39.714
Within Groups             19.6      12      1.633
Total                    149.333    14
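
The working above can be reproduced with a few lines of plain Python. This is a minimal sketch rather than the method used to prepare the slides: the assignment of ages to plants is inferred from the column sums (141, 160, 124), and the function and variable names are illustrative.

```python
from statistics import mean

# Employee ages by plant; the grouping is inferred from the column sums in the slides.
plants = {
    "Plant 1": [29, 27, 30, 27, 28],
    "Plant 2": [32, 33, 31, 34, 30],
    "Plant 3": [25, 24, 24, 25, 26],
}

def one_way_anova(groups):
    """Return SS(between), SS(within), their degrees of freedom and the F ratio."""
    all_x = [x for g in groups for x in g]
    grand_mean = mean(all_x)
    # Treatment effect: each group mean against the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residuals: each observation against its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_x) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(list(plants.values()))
print(f"SS between = {ss_b:.3f} (df {df_b}), SS within = {ss_w:.3f} (df {df_w}), F = {f_ratio:.3f}")
```

An F ratio this far above 1 suggests that the average ages differ across plants; the formal decision still requires comparing it with the critical F value for (2, 12) degrees of freedom.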

Let's consider another problem:
Machine Operator (Valve Opening)
1

6.33

6.26

6.44

6.29

6.26

6.36

6.38

6.23

6.31

6.23

6.58

6.19

6.29

6.27

6.54

6.21

6.40

6.19

6.56

6.50

6.34

6.19

6.58

6.22

Can we say that the average Opening of


Valves across the machine operators are
different?

Go to

Revisiting our earlier example


Motivation Level
(Measured on a composite index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

Can we conclude that motivational level is


dependent on which plant?

Let's consider another example..

Consider the following:

An experiment was conducted to understand recall mechanism among the people. 50 people were selected and they were randomly assigned into five groups: four incidental-learning groups
Counting Group
Rhyming Group
Adjective Group
Imagery Group
and one intentional-learning group.

Each group was given a list of 27 words and each one was asked to read them carefully 3 times.

IS recall mechanism a function of which group a person belongs to?

No. of words recalled as a function of level of processing

NO. OF WORDS RECALLED AS A FUNCTION OF LEVEL OF PROCESSING

COUNTING   RHYMING   ADJECTIVE   IMAGERY   INTENTIONAL
    9         7         11         12          10
    8         9         13         11          19
    6         6          8         16          14
    8         6          6         11           5
   10         6         14          9          10
    4        11         11         23          11
    6         6         13         12          14
    5         3         13         10          15
    7         8         10         19          11
    7         7         11         11          11

What is here which you would like to put on test?

Anova: Single Factor

SUMMARY
Groups         Count    Sum    Average    Variance
COUNTING        10       70     7.00        3.33
RHYMING         10       69     6.90        4.54
ADJECTIVE       10      110    11.00        6.22
IMAGERY         10      134    13.40       20.27
INTENTIONAL     10      120    12.00       14.00

ANOVA
Source of Variation     SS        df     MS        F        P-value    F crit
Between Groups          351.52     4     87.880    9.085    0.000      2.579
Within Groups           435.3     45      9.673
Total                   786.82    49

What can you conclude here?

Does the story


end here?

First, Test of Homogeneity of Variance

Test of Homogeneity of Variances
NUMBER OF WORDS RECALLED

Levene Statistic    df1    df2    Sig.
2.529                4      45    .054

What do you say?
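
The Levene statistic reported above can be approximated by running a one-way ANOVA on the absolute deviations of each observation from its own group mean. The sketch below assumes this mean-centred variant (other variants centre on the median); function names are illustrative, and the significance value is not computed because it needs the F distribution, which the standard library does not provide.

```python
from statistics import mean

# Number of words recalled, by level of processing (data from the slides).
recall = {
    "COUNTING":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "RHYMING":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "ADJECTIVE":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "IMAGERY":     [12, 11, 16, 11, 9, 23, 12, 10, 19, 11],
    "INTENTIONAL": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

def one_way_f(groups):
    """One-way ANOVA F ratio: MS(between) / MS(within)."""
    all_x = [x for g in groups for x in g]
    grand = mean(all_x)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(all_x) - len(groups)))

def levene_statistic(groups):
    """Mean-centred Levene statistic: ANOVA F on absolute deviations from group means."""
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    return one_way_f(abs_dev)

levene_w = levene_statistic(list(recall.values()))
print(f"Levene statistic = {levene_w:.3f} with df1 = 4, df2 = 45")
```

With Sig. = .054 just above .05, equality of variances is not rejected at the 5% level, although only narrowly.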

Second, POST-HOC ANALYSIS

Multiple Comparisons
Dependent Variable: NUMBER OF WORDS RECALLED
Bonferroni

(I) LEVEL OF    (J) LEVEL OF    Mean Difference   Std.            95% Confidence Interval
PROCESSING      PROCESSING      (I-J)             Error   Sig.    Lower Bound   Upper Bound
COUNTING        RHYMING           0.10            1.39    1.000     -4.01          4.21
                ADJECTIVE        -4.00            1.39     .061     -8.11           .11
                IMAGERY          -6.40*           1.39     .000    -10.51         -2.29
                INTENTIONAL      -5.00*           1.39     .008     -9.11          -.89
RHYMING         COUNTING         -0.10            1.39    1.000     -4.21          4.01
                ADJECTIVE        -4.10            1.39     .051     -8.21           .006
                IMAGERY          -6.50*           1.39     .000    -10.61         -2.39
                INTENTIONAL      -5.10*           1.39     .006     -9.21          -.99
ADJECTIVE       COUNTING          4.00            1.39     .061      -.11          8.11
                RHYMING           4.10            1.39     .051      -.006         8.21
                IMAGERY          -2.40            1.39     .913     -6.51          1.71
                INTENTIONAL      -1.00            1.39    1.000     -5.11          3.11
IMAGERY         COUNTING          6.40*           1.39     .000      2.29         10.51
                RHYMING           6.50*           1.39     .000      2.39         10.61
                ADJECTIVE         2.40            1.39     .913     -1.71          6.51
                INTENTIONAL       1.40            1.39    1.000     -2.71          5.51
INTENTIONAL     COUNTING          5.00*           1.39     .008       .89          9.11
                RHYMING           5.10*           1.39     .006       .99          9.21
                ADJECTIVE         1.00            1.39    1.000     -3.11          5.11
                IMAGERY          -1.40            1.39    1.000     -5.51          2.71

*. The mean difference is significant at the .05 level.
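
The building blocks of the Bonferroni table can be reproduced from the one-way ANOVA output. The sketch below takes the group means and the within-groups mean square (9.673) from the slides and computes the common standard error, the pairwise mean differences and the per-comparison significance level. The adjusted p-values themselves are not computed here, because they need the t distribution; variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Group means and MS(within) as reported in the ANOVA output above.
group_means = {
    "COUNTING": 7.00,
    "RHYMING": 6.90,
    "ADJECTIVE": 11.00,
    "IMAGERY": 13.40,
    "INTENTIONAL": 12.00,
}
n_per_group = 10
ms_within = 9.673

# Every pair shares the same standard error because the groups are of equal size.
std_error = sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))

# Bonferroni: split the overall 5% level across all pairwise comparisons.
pairs = list(combinations(group_means, 2))
alpha_per_comparison = 0.05 / len(pairs)

for first, second in pairs:
    diff = group_means[first] - group_means[second]
    print(f"{first:>11} - {second:<11} diff = {diff:6.2f}   t = {diff / std_error:6.2f}")

print(f"Std. Error = {std_error:.2f}, comparisons = {len(pairs)}, "
      f"alpha per comparison = {alpha_per_comparison:.3f}")
```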

Third, MEAN-PLOT.

[Figure: Mean of NUMBER OF WORDS RECALLED plotted against LEVEL OF PROCESSING (COUNTING, RHYMING, ADJECTIVE, IMAGERY, INTENTIONAL)]

Now, we get some sense


out of the
story!!!!!!!!!!!!!!!!

Meet this boy


Sudhanshu!!!
Should I major
in Marketing,
HR or Finance?

He contacted his seniors who had majored in these three


areas and collected information about their pay-package.

HR      Marketing   Finance
2.70    2.30        4.80
2.20    3.60        3.50
3.30    2.70        4.60
2.50    4.40        3.60
3.80    3.90        2.80
2.90    3.20        2.90

(Numbers are annual pay-package in Rs. lakhs)

How do I know whether there are significant differences in the pay-packages?.... Ohhhhh!!!
Yes! I can use ANOVA!!!

Sudhanshu got ANOVA


results

I think Finance
would be the
right choice for
me!!!

Anova: Single Factor

SUMMARY
Groups       Count    Sum      Average    Variance
HR             6      17.4     2.90       0.332
Marketing      6      20.1     3.35       0.603
Finance        6      28.05    4.675      0.9958

ANOVA
Source of Variation     SS          df    MS        F        P-value    F crit
Between Groups          10.2175      2    5.1088    7.938    0.0045     3.6823
Within Groups            9.65375    15    0.6436
Total                   19.87125    17

Sudhanshu! It is not
the Area that matters,
it is the Grade that
matters in getting a
good package.

Is it
really
so?!!!

Sudhanshu also collected information about the grades of the


seniors and the collected data is ...

Now, he is lost!!!
What to do with
the data?

Grades    HR      Marketing   Finance
A+        3.80    4.40        5.80
A         3.30    3.90        5.75
A-        2.90    3.60        4.80
B+        2.70    3.20        4.60
B         2.50    2.70        3.60
B-        2.20    2.30        3.50

(Numbers are annual pay-package in Rs. lakhs)

For that we have to stretch ourselves and

Stretching ourselves and


...

MOVING FROM
ONE FACTOR TO
TWO FACTORS

Mr. Padam Jain is working in a firm that does electrical work in Offices.
In India, there is a lot of variation in the voltage of the electricity supplied, which may have a bad impact on the life of electrical appliances. Mr. Padam Jain wants to study the impact of Voltage on the Life of Tubes. For that he took 10 tubes and experimented with variations in the voltage. He took three levels of voltage: Low, Medium (the correct one) and High. And, he collected data about their lives, measured in hours.

Data
LIFE OF TUBES IN (THOUSAND HOURS)

SLOW    MEDIUM    HIGH
3.7     4.5       3.1
3.4     3.9       2.8
3.5     4.1       3.0
3.2     3.5       2.6
3.9     4.8       3.4

Picture speaks

What do you say about the impact of variations in Voltage on the life of tubes?

Assume that Mr. Jain took these five tubes from five different manufacturers

THINK!!!!!
###$$
$?????!!
!!!!!!!!!!
???

So
what?

How to filter the effects of Brands so as to get the impact of only voltage on the life of tubes?

Mr. Jain provides the


following information

LIFE OF TUBES IN (THOUSAND HOURS)

BRAND      SLOW    MEDIUM    HIGH
SURYA      3.7     4.5       3.1
BAJAJ      3.4     3.9       2.8
PHILIPS    3.5     4.1       3.0
OSRAM      3.2     3.5       2.6
HAVELLS    3.9     4.8       3.4

Understand what a chart


speaks!!!

That's a challenge for all of us!

TWO-WAY ANALYSIS
OF VARIANCE
WITHOUT
REPLICATION
(RANDOMIZED BLOCK DESIGN )

TWO-WAY ANALYSIS OF VARIANCE


WITHOUT REPLICATION
(RANDOMIZED BLOCK DESIGN )
When the observations are subjected to an experiment related to two factors, the analysis is called TWO-WAY ANALYSIS OF VARIANCE.
If the variation among the samples is due to TWO FACTORS, then a researcher may be interested in knowing whether the variation due to each factor is significant.
The advantage of such an experimental design over a ONE-WAY design is that it can better explain the variation among the observations and, therefore, the variation due to the error term is reduced. Further, such a model has more power to explain the observed variations.

SCHEME OF ANALYSIS OF VARIANCE:


RANDOMIZED BLOCK DESIGN
Are all the three machines properly adjusted in a manner that they are on the
average filling same quantity of cola drink given the fact that these machines
are operated at different points of time by different operators?

BLOCKING VARIABLE (OPERATORS)

Measurements of quantity filled - Individual observations

What are you looking for


in this example .???
Is Measurements of Quantity Filled f (Time , Operator ) ?

Dependent
Variable
Ratio/Interval Scale
Variable

Independent
Variables
Categorical
Variable

Lets consider the


following example

TWO-WAY ANALYSIS OF
VARIANCE (continued)
In the case of TWO-WAY ANOVA without replication, no value of a particular factor is allowed to repeat in the trial. As a result, the values of the factors are independent and so are their effects. In such a case, there will not be any interaction effect between the factors.


BLOCKING
VARIABLE

TWO-WAY ANALYSIS OF
VARIANCE (continued)
In it, if differences are found to be significant then they could be attributed to either of the factors.

ANOVA divides the TOTAL VARIATION into two: AMONG-GROUP VARIATION, which is further segregated into variations due to each factor, and WITHIN-GROUP VARIATION. That is-

RANDOMIZED BLOCK DESIGN


Analysis of Variance: Partitioning of
TOTAL SUM OF SQUARES OF VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS


DUE TO TREATMENT
(TREATMENT SUM OF SQUARES)

VARIATION EXPLAINED BY THE


BLOCKING VARIABLE
(BLOCKS SUM OF SQUARES)

VARIATION WITHIN GROUPS


(ERROR SUM OF SQUARES)

VARIATION DUE TO ERROR


(NEW ERROR SUM OF SQUARES)

ANALYSIS OF VARIANCE..

TWO-WAY ANOVA TABLE -- RANDOMIZED BLOCK DESIGN --

SOURCE OF VARIATION        SUM OF SQUARES    d.f.    MEAN SUM OF SQUARES    F-RATIOS
BETWEEN THE TREATMENTS
WITHIN THE BLOCKS
RESIDUAL/ERROR
TOTAL

Assuming that there is no


second factor in our
example

Assuming that there is a


blocking factor store in our
example

SPSS output
Tests of Between-Subjects Effects
Dependent Variable: SALES OF CHAIRS

Source     Type III Sum of Squares    df    Mean Square    F          Sig.
Model      45324.417a                 10    4532.442       130.140    .000
STYLE        935.083                   2     467.542        13.425    .001
STORERS      698.667                   7      99.810         2.866    .044
Error        487.583                  14      34.827
Total      45812.000                  24

a. R Squared = .989 (Adjusted R Squared = .982)

SPSS Graph

[Figure: Estimated Marginal Means of SALES OF CHAIRS plotted against STORES, with one line per STYLE OF CHAIR (CHAIR STYLE#1, CHAIR STYLE#2, CHAIR STYLE#3)]

Sudhanshu got the


results!!!!
Which is more
important
Area or
Grades?

Anova: Two-Factor Without Replication

SUMMARY      Count    Sum      Average    Variance
A+             3      14.00    4.67       1.05
A              3      12.95    4.32       1.63
A-             3      11.30    3.77       0.92
B+             3      10.50    3.50       0.97
B              3       8.80    2.93       0.34
B-             3       8.00    2.67       0.52

HR             6      17.40    2.90       0.33
Marketing      6      20.10    3.35       0.60
Finance        6      28.05    4.68       1.00

ANOVA
Source of Variation    SS             df    MS        F         P-value     F crit
Rows (Grades)           8.982916667    5    1.7966    26.781    1.74E-05    3.3258
Columns (Area)         10.2175         2    5.1088    76.155    8.88E-07    4.1028
Error                   0.670833333   10    0.0671
Total                  19.87125       17
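
The two-factor table above can be reproduced directly from the grade-by-area pay data. The following is a minimal sketch in plain Python (function and variable names are illustrative, and P-values are omitted because the standard library has no F distribution). It shows how removing the variation due to Grades shrinks the error term from 9.65 in the single-factor analysis to 0.67 here.

```python
from statistics import mean

# Annual pay-package (Rs. lakhs): one row per grade, columns are HR, Marketing, Finance.
pay = {
    "A+": [3.80, 4.40, 5.80],
    "A":  [3.30, 3.90, 5.75],
    "A-": [2.90, 3.60, 4.80],
    "B+": [2.70, 3.20, 4.60],
    "B":  [2.50, 2.70, 3.60],
    "B-": [2.20, 2.30, 3.50],
}

def two_way_without_replication(table):
    """Sums of squares and F ratios for a rows-by-columns layout with one value per cell."""
    rows = list(table.values())
    r, c = len(rows), len(rows[0])
    grand = mean(x for row in rows for x in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(row[j] for row in rows) for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "f_rows": (ss_rows / (r - 1)) / ms_error,
        "f_cols": (ss_cols / (c - 1)) / ms_error,
    }

result = two_way_without_replication(pay)
for name, value in result.items():
    print(f"{name:>9} = {value:.4f}")
```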

Compare results !!!

Anova: Single Factor

SUMMARY
Groups       Count    Sum      Average    Variance
HR             6      17.4     2.90       0.332
Marketing      6      20.1     3.35       0.603
Finance        6      28.05    4.675      0.9958

ANOVA
Source of Variation     SS          df    MS        F        P-value    F crit
Between Groups          10.2175      2    5.1088    7.938    0.0045     3.6823
Within Groups            9.65375    15    0.6436
Total                   19.87125    17

Anova: Two-Factor Without Replication

SUMMARY      Count    Sum      Average    Variance
A+             3      14.00    4.67       1.05
A              3      12.95    4.32       1.63
A-             3      11.30    3.77       0.92
B+             3      10.50    3.50       0.97
B              3       8.80    2.93       0.34
B-             3       8.00    2.67       0.52
HR             6      17.40    2.90       0.33
Marketing      6      20.10    3.35       0.60
Finance        6      28.05    4.68       1.00

ANOVA
Source of Variation    SS             df    MS        F         P-value     F crit
Rows (Grades)           8.982916667    5    1.7966    26.781    1.74E-05    3.3258
Columns (Area)         10.2175         2    5.1088    76.155    8.88E-07    4.1028
Error                   0.670833333   10    0.0671
Total                  19.87125       17

Let's look at SPSS output

Tests of Between-Subjects Effects
Dependent Variable: Annual Pay Package (Rs. in Lakhs)

Source                  Type III Sum of Squares    df    Mean Square    F         Sig.
Grades    Hypothesis     8.983                      5    1.797          26.781    .000
          Error           .671                     10     .067a
Area      Hypothesis    10.217                      2    5.109          76.155    .000
          Error           .671                     10     .067a

a. MS(Error)

Plot

Another example: Mr. Jain's Experiment!

LIFE OF TUBES IN (THOUSAND HOURS)

BRAND      SLOW    MEDIUM    HIGH
SURYA      3.7     4.5       3.1
BAJAJ      3.4     3.9       2.8
PHILIPS    3.5     4.1       3.0
OSRAM      3.2     3.5       2.6
HAVELLS    3.9     4.8       3.4

Single Factor

Anova: Single Factor

SUMMARY
Groups    Count    Sum     Average    Variance
SLOW        5      17.7    3.54       0.073
MEDIUM      5      20.8    4.16       0.258
HIGH        5      14.9    2.98       0.092

ANOVA
Source of Variation    SS       df    MS       F         P-value    F crit
Between Groups         3.484     2    1.742    12.355    0.0012     3.8853
Within Groups          1.692    12    0.141
Total                  5.176    14

Two Factor Model

Anova: Two-Factor Without Replication

SUMMARY     Count    Sum     Average    Variance
SURYA         3      11.3    3.767      0.493
BAJAJ         3      10.1    3.367      0.303
PHILIPS       3      10.6    3.533      0.303
OSRAM         3       9.3    3.100      0.210
HAVELLS       3      12.1    4.033      0.503

SLOW          5      17.7    3.540      0.073
MEDIUM        5      20.8    4.160      0.258
HIGH          5      14.9    2.980      0.092

ANOVA
Source of Variation    SS        df    MS        F          P-value    F crit
Rows (Brands)          1.5493     4    0.3873    21.7196    0.0002     3.8379
Columns (Voltage)      3.4840     2    1.7420    97.6822    0.0000     4.4590
Error                  0.1427     8    0.0178
Total                  5.176     14

How does Two-Way ANOVA filter out the effects of FACTORS?

What's the MYSTERY?

TOTAL VARIANCE: Mr. Jain's Experiment!

Total Variance
LIFE OF TUBES IN (THOUSAND HOURS)

BRAND      SLOW    MEDIUM    HIGH    MEAN
SURYA      3.7     4.5       3.1     3.77
BAJAJ      3.4     3.9       2.8     3.37
PHILIPS    3.5     4.1       3.0     3.53
OSRAM      3.2     3.5       2.6     3.10
HAVELLS    3.9     4.8       3.4     4.03
MEANS      3.54    4.16      2.98    3.56

Look at the following:

Notice!!! The values are the Same Across Voltage and Different Across Brand.

LIFE OF TUBES IN (THOUSAND HOURS)

BRAND      SLOW    MEDIUM    HIGH    MEAN
SURYA      3.77    3.77      3.77    3.77
BAJAJ      3.37    3.37      3.37    3.37
PHILIPS    3.53    3.53      3.53    3.53
OSRAM      3.10    3.10      3.10    3.10
HAVELLS    4.03    4.03      4.03    4.03
MEANS      3.56    3.56      3.56    3.56

In this way, we can capture Variation Due to Brands

Look at the following:

Notice!!! The values are the Same Across Brand and Different Across Voltage.

LIFE OF TUBES IN (THOUSAND HOURS)

BRAND      LOW     MEDIUM    HIGH    MEAN
SURYA      3.54    4.16      2.98    3.56
BAJAJ      3.54    4.16      2.98    3.56
PHILIPS    3.54    4.16      2.98    3.56
OSRAM      3.54    4.16      2.98    3.56
HAVELLS    3.54    4.16      2.98    3.56
MEANS      3.54    4.16      2.98    3.56

In this way, we can capture Variation Due to Voltage

This is how we demystify the filtering process of variation.

Take means
and filter out
the variation!
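
The "take means and filter" idea can be written out explicitly for Mr. Jain's data. In this sketch (variable names are illustrative) every cell is first replaced by its brand mean to capture the variation due to brands, then by its voltage mean to capture the variation due to voltage, and whatever remains after removing both effects is the error.

```python
from statistics import mean

voltages = ["SLOW", "MEDIUM", "HIGH"]
life = {  # life of tubes in thousand hours, one row per brand
    "SURYA":   [3.7, 4.5, 3.1],
    "BAJAJ":   [3.4, 3.9, 2.8],
    "PHILIPS": [3.5, 4.1, 3.0],
    "OSRAM":   [3.2, 3.5, 2.6],
    "HAVELLS": [3.9, 4.8, 3.4],
}

rows = list(life.values())
grand = mean(x for row in rows for x in row)
brand_mean = {brand: mean(values) for brand, values in life.items()}
voltage_mean = [mean(row[j] for row in rows) for j in range(len(voltages))]

# Cells replaced by brand means: same across voltage, different across brand.
ss_brand = sum((brand_mean[b] - grand) ** 2 for b in life for _ in voltages)

# Cells replaced by voltage means: same across brand, different across voltage.
ss_voltage = sum((m - grand) ** 2 for _ in life for m in voltage_mean)

# Residual: what is left in each cell once both effects are removed.
ss_error = sum(
    (x - brand_mean[b] - voltage_mean[j] + grand) ** 2
    for b, row in life.items()
    for j, x in enumerate(row)
)

ss_total = sum((x - grand) ** 2 for row in rows for x in row)
print(f"brands = {ss_brand:.4f}, voltage = {ss_voltage:.4f}, "
      f"error = {ss_error:.4f}, total = {ss_total:.4f}")
```

The three pieces add up to the total, so the brand variation that sat inside the error term of the single-factor analysis (1.692) is now accounted for separately.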

Now. Lets move to

Meet this curious boy


He has data about the Life of Inverter Batteries produced by three manufacturers and sold under two models: Standard and Premium.

He is curious to know whether the life of batteries differs across models and manufacturers.

Is life of batteries function


of Model and of
Manufacturers?

Is Life = f(Model, Manufacturer)?

He has collected the following


data

                    STANDARD    PREMIUM
MANUFACTURER-1      61.50       66.00
                    66.00       63.00
                    62.50       63.00
MANUFACTURER-2      64.50       67.00
                    65.00       70.50
                    67.00       67.50
MANUFACTURER-3      65.50       61.50
                    68.00       63.00
                    66.50       66.00

Life in thousand hours.

Picture speaks
something!!!

Can you help this boy in


answering the question?

Is Life = f(Model, Manufacturer)?

For that we have to stretch ourselves and

ANALYSIS OF
VARIANCE:

TWO-WAY
FACTORIAL
DESIGN

But, first consider the following problem:
A professor from MDI was curious to know whether students score more in a closed-book examination or in an open-book one. He took eight students at random from each of PGPM, NMP and EMP; four students from each programme were administered a closed-book examination and the remaining four students from each programme took the same question paper but under open-book examination. The scores obtained by the students are

Marks obtained out of 100

MARKS OUT OF 100
                            PGPM    NMP    EMP
OPEN-BOOK EXAMINATION        75     58     61
                             68     56     63
                             71     61     65
                             75     60     64
CLOSED-BOOK EXAMINATION      66     62     61
                             70     60     66
                             68     59     63
                             68     68     61

THINK OF THE ISSUES.


ONE..
TWO..

ANOTHER OTHER

SCHEME OF ANALYSIS OF VARIANCE:

TWO-WAY FACTORIAL DESIGN


Are the students from PGPM, NMP, and EMP scoring the same marks for the same question paper AND THAT TOO IN OPEN-BOOK AND CLOSED-BOOK EXAMINATION?
PROGRAMMES (COLUMN TREATMENT): PGPM, NMP, EMP
TYPE OF EXAM (ROW TREATMENT): 1, 2
Here, 1 means OPEN-BOOK and 2 means CLOSED-BOOK.
Cell entries: Marks obtained in the examination - Individual observations

EACH CELL MAY HAVE A NUMBER OF OBSERVATIONS.

TWO-WAY FACTORIAL DESIGN


Analysis of Variance: Partitioning of
TOTAL SUM OF SQUARES OF VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS DUE


TO ROW TREATMENT
(ROW TREATMENT SUM OF
SQUARES)
VARIATION AMONG GROUPS DUE
TO COLUMN TREATMENT
(COLUMN TREATMENT SUM OF
SQUARES)

VARIATION WITHIN GROUPS


(ERROR SUM OF SQUARES)

VARIATION AMONG GROUPS DUE


TO INTERACTION BETWEEN ROW
AND COLUMN TREATMENTS
(INTERACTION SUM OF SQUARES)

ANALYSIS OF VARIANCE..

TWO-WAY ANOVA TABLE -- FACTORIAL DESIGN --

Anova: Two-Factor With Replication

SUMMARY                      PGPM        NMP         EMP         Total
OPEN-BOOK EXAMINATION
Count                          4.0000      4.0000      4.0000     12.0000
Sum                          289.0000    235.0000    253.0000    777.0000
Average                       72.2500     58.7500     63.2500     64.7500
Variance                      11.5833      4.9167      2.9167     39.6591

CLOSED-BOOK EXAMINATION
Count                          4.0000      4.0000      4.0000     12.0000
Sum                          272.0000    249.0000    251.0000    772.0000
Average                       68.0000     62.2500     62.7500     64.3333
Variance                       2.6667     16.2500      5.5833     14.0606

Total
Count                          8.0000      8.0000      8.0000
Sum                          561.0000    484.0000    504.0000
Average                       70.1250     60.5000     63.0000
Variance                      11.2679     12.5714      3.7143

ANOVA
Source of Variation    SS          df         MS          F          P-value    F crit
Sample                   1.0417     1.0000      1.0417     0.1423    0.7104     4.4139
Columns                399.0833     2.0000    199.5417    27.2619    0.0000     3.5546
Interaction             60.0833     2.0000     30.0417     4.1044    0.0340     3.5546
Within                 131.7500    18.0000      7.3194
Total                  591.9583    23.0000
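
The factorial partition (row treatment, column treatment, interaction and within-cell error) can be checked against the exam data with the sketch below. It assumes a balanced design, that is, the same number of observations in every cell; the function and key names are illustrative, and P-values are left out because they need the F distribution.

```python
from statistics import mean

# Marks out of 100, keyed by (type of exam, programme); four students per cell.
marks = {
    ("OPEN-BOOK", "PGPM"): [75, 68, 71, 75],
    ("OPEN-BOOK", "NMP"): [58, 56, 61, 60],
    ("OPEN-BOOK", "EMP"): [61, 63, 65, 64],
    ("CLOSED-BOOK", "PGPM"): [66, 70, 68, 68],
    ("CLOSED-BOOK", "NMP"): [62, 60, 59, 68],
    ("CLOSED-BOOK", "EMP"): [61, 66, 63, 61],
}

def two_way_with_replication(cells):
    """Sums of squares and F ratios for a balanced two-factor factorial design."""
    row_levels = sorted({r for r, _ in cells})
    col_levels = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))  # observations per cell (assumed equal)
    all_x = [x for obs in cells.values() for x in obs]
    grand = mean(all_x)

    def level_mean(index, level):
        # Mean of all observations whose key matches `level` at position `index`.
        return mean(x for key, obs in cells.items() if key[index] == level for x in obs)

    ss_rows = n * len(col_levels) * sum((level_mean(0, r) - grand) ** 2 for r in row_levels)
    ss_cols = n * len(row_levels) * sum((level_mean(1, c) - grand) ** 2 for c in col_levels)
    ss_within = sum((x - mean(obs)) ** 2 for obs in cells.values() for x in obs)
    ss_total = sum((x - grand) ** 2 for x in all_x)
    ss_inter = ss_total - ss_rows - ss_cols - ss_within

    df_rows, df_cols = len(row_levels) - 1, len(col_levels) - 1
    df_inter = df_rows * df_cols
    df_within = len(all_x) - len(row_levels) * len(col_levels)
    ms_within = ss_within / df_within
    return {
        "ss_rows": ss_rows, "ss_cols": ss_cols,
        "ss_inter": ss_inter, "ss_within": ss_within, "ss_total": ss_total,
        "f_rows": (ss_rows / df_rows) / ms_within,
        "f_cols": (ss_cols / df_cols) / ms_within,
        "f_inter": (ss_inter / df_inter) / ms_within,
    }

anova = two_way_with_replication(marks)
for name, value in anova.items():
    print(f"{name:>9} = {value:.4f}")
```

Here "rows" corresponds to Sample (type of exam) and "cols" to Columns (programme) in the spreadsheet output.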

Tests of Between-Subjects Effects

Dependent Variable: MARKS OBTAINED OUT OF 100

Source             Type III Sum of Squares    df    Mean Square    F            Sig.
Corrected Model        460.208a                5       92.042         12.575    .000
Intercept            99975.042                 1    99975.042      13658.829    .000
PROG                   399.083                 2      199.542         27.262    .000
EXAM                     1.042                 1        1.042           .142    .710
PROG * EXAM             60.083                 2       30.042          4.104    .034
Error                  131.750                18        7.319
Total               100567.000                24
Corrected Total        591.958                23

a. R Squared = .777 (Adjusted R Squared = .716)

[Figure: Estimated Marginal Means of MARKS OBTAINED OUT OF 100 plotted against PROGRAMME (PGPM, NMP, EMP), with one line per TYPE OF EXAMINATION (OPEN-BOOK EXAMINATION, CLOSED-BOOK EXAMINATION)]

Revisiting our earlier


example

Consider the following:

An experiment was conducted to understand recall mechanism among the people. 100 people were selected, 50 OLD and 50 YOUNG; and they were randomly assigned into five groups: four incidental-learning groups
Counting Group
Rhyming Group
Adjective Group
Imagery Group
and one intentional-learning group.

Each group was given a list of 27 words and each one was asked to read them carefully 3 times.

This time the people are divided into 2 groups: OLD and YOUNG.

No. of words recalled as a function of level of processing

NO. OF WORDS RECALLED AS A FUNCTION OF LEVEL OF PROCESSING

AGE GROUP   COUNTING   RHYMING   ADJECTIVE   IMAGERY   INTENTIONAL
OLD             9         7         11         12          10
                8         9         13         11          19
                6         6          8         16          14
                8         6          6         11           5
               10         6         14          9          10
                4        11         11         23          11
                6         6         13         12          14
                5         3         13         10          15
                7         8         10         19          11
                7         7         11         11          11
YOUNG           8        10         14         20          21
                6         7         11         16          19
                4         8         18         16          17
                6        10         14         15          15
                7         4         13         18          22
                6         7         22         16          16
                5        10         17         20          22
                7         6         16         22          22
                9         7         12         14          18
                7         7         11         19          21

Anova: Two-Factor With Replication

SUMMARY      COUNTING    RHYMING     ADJECTIVE   IMAGERY     INTENTIONAL   Total
OLD
Count         10.0000     10.0000     10.0000     10.0000     10.0000       50.0000
Sum           70.0000     69.0000    110.0000    134.0000    120.0000      503.0000
Average        7.0000      6.9000     11.0000     13.4000     12.0000       10.0600
Variance       3.3333      4.5444      6.2222     20.2667     14.0000       16.0576

YOUNG
Count         10.0000     10.0000     10.0000     10.0000     10.0000       50.0000
Sum           65.0000     76.0000    148.0000    176.0000    193.0000      658.0000
Average        6.5000      7.6000     14.8000     17.6000     19.3000       13.1600
Variance       2.0556      3.8222     12.1778      6.7111      7.1222       33.4841

Total
Count         20.0000     20.0000     20.0000     20.0000     20.0000
Sum          135.0000    145.0000    258.0000    310.0000    313.0000
Average        6.7500      7.2500     12.9000     15.5000     15.6500
Variance       2.6184      4.0921     12.5158     17.4211     24.0289

ANOVA
Source of Variation    SS           df         MS          F          P-value    F crit
Sample                  240.2500     1.0000    240.2500    29.9356    0.0000     3.9469
Columns                1514.9400     4.0000    378.7350    47.1911    0.0000     2.4729
Interaction             190.3000     4.0000     47.5750     5.9279    0.0003     2.4729
Within                  722.3000    90.0000      8.0256
Total                  2667.7900    99.0000

[Figure: Estimated Marginal Means of NUMBER OF WORDS RECALLED plotted against AGE GROUP (OLD, YOUNG), with one line per LEVEL OF PROCESSING (COUNTING, RHYMING, ADJECTIVE, IMAGERY, INTENTIONAL)]

[Figure: Estimated Marginal Means of NUMBER OF WORDS RECALLED plotted against LEVEL OF PROCESSING (COUNTING, RHYMING, ADJECTIVE, IMAGERY, INTENTIONAL), with one line per AGE GROUP (OLD, YOUNG)]

Revisiting the Problem

Is Life = f(Model, Manufacturer)?

He has collected the following


data

                    STANDARD    PREMIUM
MANUFACTURER-1      61.50       66.00
                    66.00       63.00
                    62.50       63.00
MANUFACTURER-2      64.50       67.00
                    65.00       70.50
                    67.00       67.50
MANUFACTURER-3      65.50       61.50
                    68.00       63.00
                    66.50       66.00

Life in thousand hours.

The Result of Two-Way ANOVA

Tests of Between-Subjects Effects
Dependent Variable: LIFE OF BATTERY IN HOURS THOUSANDS

Source           Type III Sum of Squares    df    Mean Square    F            Sig.
Model            76630.5000                  6    12771.7500     3693.0361    0.000
MANUF               31.8611                  2       15.9306        4.6064    0.033
MODEL                0.0556                  1        0.0556        0.0161    0.901
MANUF * MODEL       27.6944                  2       13.8472        4.0040    0.047
Error               41.5                    12        3.4583
Total            76672                      18
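
One way to look at the interaction is through the cell means: if the two factors did not interact, each cell mean would equal its manufacturer mean plus its model mean minus the grand mean. The sketch below (names are illustrative) computes these departures for the battery data and recovers the MANUF * MODEL line of the table; it relies on the design being balanced, with three batteries per cell.

```python
from statistics import mean

manufacturers = ["MANUFACTURER-1", "MANUFACTURER-2", "MANUFACTURER-3"]
models = ["STANDARD", "PREMIUM"]
n_per_cell = 3

# Battery life in thousand hours, keyed by (manufacturer, model).
life = {
    ("MANUFACTURER-1", "STANDARD"): [61.5, 66.0, 62.5],
    ("MANUFACTURER-1", "PREMIUM"):  [66.0, 63.0, 63.0],
    ("MANUFACTURER-2", "STANDARD"): [64.5, 65.0, 67.0],
    ("MANUFACTURER-2", "PREMIUM"):  [67.0, 70.5, 67.5],
    ("MANUFACTURER-3", "STANDARD"): [65.5, 68.0, 66.5],
    ("MANUFACTURER-3", "PREMIUM"):  [61.5, 63.0, 66.0],
}

grand = mean(x for obs in life.values() for x in obs)
cell_mean = {key: mean(obs) for key, obs in life.items()}
# With equal cell sizes, a marginal mean is the plain average of its cell means.
manuf_mean = {m: mean(cell_mean[(m, d)] for d in models) for m in manufacturers}
model_mean = {d: mean(cell_mean[(m, d)] for m in manufacturers) for d in models}

# Departure of each cell mean from what the two main effects alone would predict.
interaction = {
    (m, d): cell_mean[(m, d)] - manuf_mean[m] - model_mean[d] + grand
    for m in manufacturers
    for d in models
}

ss_interaction = n_per_cell * sum(e ** 2 for e in interaction.values())
ss_error = sum((x - cell_mean[key]) ** 2 for key, obs in life.items() for x in obs)
df_interaction = (len(manufacturers) - 1) * (len(models) - 1)
df_error = len(manufacturers) * len(models) * (n_per_cell - 1)
f_interaction = (ss_interaction / df_interaction) / (ss_error / df_error)

for key, effect in interaction.items():
    print(f"{key[0]} / {key[1]:<8} cell mean = {cell_mean[key]:.2f}, interaction = {effect:+.2f}")
print(f"SS interaction = {ss_interaction:.4f}, F = {f_interaction:.4f}")
```

The departures change sign between manufacturers (Premium lasts longer for MANUFACTURER-2 but not for MANUFACTURER-3), which is the pattern the significant MANUF * MODEL term reflects.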

Is there any interaction


effect?

That's all about Analysis of Variance!!!!!

Any Question?