
Trust that everything is

What
NEXT?

First, let's revisit what TESTS we have covered so far.

We started with
PARAMETRIC TESTS!
First we took tests related to MEAN.
We have ONE Sample Mean Test Large
Sample.
We have ONE Sample Proportion Test
Large Sample.

Then, we took TWO

SAMPLE TESTS!
We took tests related to MEAN.
We have TWO Samples Difference of
Mean Test Large Sample:
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

We have TWO Sample Difference of

Proportion Test Large Sample.
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

Then, we move to cases where the population is NORMAL but the standard deviation of the population is NOT known; and, above all, the sample size is SMALL.

We used the t-Test, which is a PARAMETRIC TEST!
We took tests related to MEAN.
We have ONE Sample Mean Test Small
Sample.
We have ONE Sample Proportion Test
Small Sample.

Then, we took TWO

SAMPLE TESTS!
We took tests related to MEAN.
We have TWO Samples Difference of
Mean Test Small Sample:
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

We have TWO Sample Difference of

Proportion Test Small Sample.
Samples coming from SAME population
Samples coming from TWO INDEPENDENT
POPULATION.

Then, we move from

One Sample

Use Chi-Square (χ²)

TWO Samples

Let's apply.

These are the marks obtained by YOU people in the second quiz out of 20.

The Mean marks obtained are 7.42 with a standard deviation of 3.95.

What do you think: which is more important from an evaluation point of view, Mean or Standard Deviation?

Assume that
There are 31 students who appeared in Quiz II.
Desirable level of variation should NOT be less than 4.
Marks obtained by the students are independent of each other's marks.

Hypotheses:

$H_0: \sigma^2 \le 4^2$
$H_a: \sigma^2 > 4^2$

It is a ONE-Tailed Test.
Test Statistic

$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$

The critical value of χ² for 30 df and α = 5% is 43.773

The value of χ² is (30 × 3.95²)/(4²) = 29.2547, which is less than the critical value of 43.773; hence, the null hypothesis is not rejected.
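The arithmetic of this chi-square test for a single variance can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, the critical value 43.773 is copied from the slide rather than computed, and the upper-tail rejection rule is assumed because that is the comparison the slide makes.

```python
# Chi-square test for a single variance, using the quiz figures from the slide.
n = 31             # students who appeared in Quiz II
s = 3.95           # sample standard deviation of the marks
sigma0 = 4.0       # hypothesised population standard deviation
critical = 43.773  # upper 5% point of chi-square with 30 df (quoted on the slide)

df = n - 1
chi_sq = df * s ** 2 / sigma0 ** 2

# Assumed decision rule: reject H0 when the statistic exceeds the upper critical value.
reject_null = chi_sq > critical

print(f"chi-square = {chi_sq:.4f} on {df} df")
print("reject H0" if reject_null else "do not reject H0")
```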

Any Question before we proceed?

Is there a relation between the F Test and the t Test?

Yes, there is a relation!

Remember that F is a ratio of TWO VARIANCES!
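t² = F

The square of a t statistic with ν degrees of freedom is an F statistic with (1, ν) degrees of freedom, which relates t to F.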
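The relation between t and F can be seen numerically: for two groups, the square of the pooled-variance t statistic equals the one-way ANOVA F ratio. The sketch below uses two small hypothetical samples (not data from these slides) and illustrative function names.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def one_way_f(groups):
    """F ratio = mean square between groups / mean square within groups."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical samples, chosen only for illustration.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [16.0, 18.0, 15.0, 17.0, 19.0]

t_stat = pooled_t(group_a, group_b)
f_stat = one_way_f([group_a, group_b])
print(f"t = {t_stat:.4f}, t squared = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```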

What's Next?

interesting

Kulfi

What do you think

Nitin
can you prove that this kind of
marketing strategy results in more
sales?

Lets come back to our

which we have to do!!!

What do
you say?
Are they
different?

Do you think
that there are
differences?

Do you think
that there are
differences?

ANALYSIS OF
VARIANCE

Assume that you have collected

following information
Motivation Level
(Measured on a composit index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

What would you like to

infer from the data?
Motivation Level
(Measured on a composit index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

Are you looking at .???

Is Motivation f ( Plants ) ?
Dependent
Variable

Ratio/Interval Scale
Variable

Independent
Variable

Categorical
Variable

Is the motivational level of workers affected by the plant in which they are working?

ANALYSIS OF
VARIANCE

ANOVA

ANALYSIS OF VARIANCE ...

It is applied when one has more than one sample under study, coming from
different populations or from the same population
with different treatments.
It is a hypothesis testing procedure used
to determine whether mean differences
exist for two or more samples or
treatments.
The term ANALYSIS OF VARIANCE appears
to be a misnomer since the objective is to
analyse differences among means of the
samples rather than the variances of the

ANALYSIS OF VARIANCE
(continued)

If we assume that all the samples come from the same population, or that the treatments have no effect, then the variation among the means and the standard deviations across the groups will be due only to sampling error. If so, the variation within groups and the variation among groups should be the same. Testing whether these variations are equal is the main purpose of ANOVA. Therefore, ANOVA tests the equality of means among a number of groups by testing the equality of the variation WITHIN GROUPS and the variation AMONG GROUPS.

ANALYSIS OF VARIANCE
(continued)

When the observations are subjected to experiment related

to one factor then such an analysis is called as ONE-WAY
ANALYSIS OF VARIANCE.
In it, if differences are found to be significant then it
could be concluded that the differences are due to the
treatment of ONE FACTOR.
ANOVA divides the TOTAL VARIATION into two: AMONG-GROUP VARIATION and WITHIN-GROUP VARIATION.

One-Way Analysis of
Variance: Partitioning of
TOTAL SUM OF SQUARES OF
VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS

DUE TO TREATMENT
(TREATMENT SUM OF SQUARES)

VARIATION WITHIN GROUPS

(ERROR SUM OF SQUARES)

Composition of VARIATION
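Let
SStot = Sum of Squares (TOTAL)
SSbet = Sum of Squares (BETWEEN GROUPS)
SSwith = Sum of Squares (WITHIN A GROUP)
Then, for k groups with n_j observations x_{ji} in group j,

$SS_{tot} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ji} - \text{Grand Mean})^2$

$SS_{bet} = \sum_{j=1}^{k} n_j (\text{Group Mean}_j - \text{Grand Mean})^2$

$SS_{with} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ji} - \text{Group Mean}_j)^2$

and $SS_{tot} = SS_{bet} + SS_{with}$.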

ANALYSIS OF VARIANCE
(continued)

ANOVA proceeds with a Null Hypothesis of no

differences among the means of populations.
That is-

Null Hypothesis H0: μ1 = μ2 = μ3 = ...

And,
Alternative Hypothesis H1: At least two
means are different.
It assumes that there is no difference between K
samples behaviour and hence, their means should
be same; alternatively their means should not be

ANALYSIS OF VARIANCE
(continued)
Therefore, the variation between the groups and the variation within the
groups must be the same. Thus, it uses a test statistic called the F
statistic, which is a ratio defined as below:

F = (VARIATION BETWEEN THE GROUPS) / (VARIATION WITHIN THE GROUPS)

And, the total variation is equal to the sum of the variations

within the groups and the variation between the groups.
If Null Hypothesis were to be true then F statistic is
approximately equal to 1 otherwise it would be substantially
larger than 1.

ANALYSIS OF VARIANCE
(continued)

It is a test of equality among the

means and not that of variances.
That is to say that it has nothing to
test about the variances of the
samples.
It says that at least two means are
different out of the K-means of the
K-samples.
It assumes that the variances of the

ANOVA TABLE...
ANOVA TABLE ---- ONE WAY

SOURCE OF     SUM OF        DEGREES OF    MEAN SUM OF      F-RATIO
VARIATION     SQUARES       FREEDOM       SQUARES
BETWEEN       BETWEEN SS    (C-1)         MSS(BETWEEN)     MSS(BETWEEN) / MSS(WITHIN)
WITHIN        WITHIN SS     (N-C)         MSS(WITHIN)
TOTAL         TOTAL SS      (N-1)

Let's consider the following problem:
Plant (Employee Age)
   1      2      3
  29     32     25
  27     33     24
  30     31     24
  27     34     25
  28     30     26

Can we say that the average age of

employees across the plants are different?

Working of the problem:

Plant (Employee Age)
                 1       2       3
                29      32      25
                27      33      24
                30      31      24
                27      34      25
                28      30      26
Sum            141     160     124
Count            5       5       5
Mean          28.2    32.0    24.8
Grand Mean   28.33

TREATMENT EFFECT               RESIDUALS
    1       2       3             1       2       3
-0.13    3.67   -3.53          0.80    0.00    0.20
-0.13    3.67   -3.53         -1.20    1.00   -0.80
-0.13    3.67   -3.53          1.80   -1.00   -0.80
-0.13    3.67   -3.53         -1.20    2.00    0.20
-0.13    3.67   -3.53         -0.20   -2.00    1.20

Working of the problem:

Plant (Employee Age)

TREATMENT EFFECT (SQUARE)            RESIDUALS (SQUARE)
     1         2         3              1        2        3
0.0178   13.4444   12.4844         0.6400   0.0000   0.0400
0.0178   13.4444   12.4844         1.4400   1.0000   0.6400
0.0178   13.4444   12.4844         3.2400   1.0000   0.6400
0.0178   13.4444   12.4844         1.4400   4.0000   0.0400
0.0178   13.4444   12.4844         0.0400   4.0000   1.4400

ANOVA TABLE
Source of Variation        SS     df        MS         F
Between Groups        129.733      2    64.867    39.714
Within Groups            19.6     12     1.633
Total                 149.333     14
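The working above can be reproduced with a short script. This is a minimal sketch in plain Python, with an illustrative function name; the ages are grouped by plant in the way that matches the sums 141, 160 and 124 shown in the working.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Treatment effect: squared distance of each group mean from the grand mean,
    # counted once per observation in the group.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Residuals: squared distance of each observation from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


plant_ages = [
    [29, 27, 30, 27, 28],  # Plant 1
    [32, 33, 31, 34, 30],  # Plant 2
    [25, 24, 24, 25, 26],  # Plant 3
]

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(plant_ages)
print(f"Between: SS={ss_b:.3f}, df={df_b}, MS={ss_b / df_b:.3f}")
print(f"Within:  SS={ss_w:.3f}, df={df_w}, MS={ss_w / df_w:.3f}")
print(f"F = {f_ratio:.3f}")
```

The F ratio still has to be compared with a critical value from an F table for (2, 12) degrees of freedom before drawing a conclusion.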

Let's consider another problem:
Machine Operator (Valve Opening)
1

6.33

6.26

6.44

6.29

6.26

6.36

6.38

6.23

6.31

6.23

6.58

6.19

6.29

6.27

6.54

6.21

6.40

6.19

6.56

6.50

6.34

6.19

6.58

6.22

Can we say that the average Opening of

Valves across the machine operators are
different?

Go to

Revisiting our earlier example

Motivation Level
(Measured on a composit index of 20 points)
Plant 1

Plant 2

Plant 3

Respondent 1

14

Respondent 2

10

14

Respondent 3

11

Respondent 4

Can we conclude that motivational level is

dependent on which plant?

example..

Consider the following:

An experiment was conducted to understand the recall
mechanism among people. 50 people were
selected and they were randomly assigned to five
groups: four incidental-learning groups
Counting Group
Rhyming Group
Imagery Group
and one intentional-learning group.

Is the recall mechanism a function of the group to which a person belongs?

Each group was given a list of 27 words and each

No. of words recalled as a

function of level of
processing
NO. OF WORDS RECALLED AS A FUNCTION OF LEVEL OF

COUNTING
9
8
6
8
10
4
6
5
7
7

RHYMING
7
9
6
6
6
11
6
3
8
7

11
13
8
6
14
11
13
13
10
11

IMAGERY
12
11
16
11
9
23
12
10
19
11

INTENTIONAL
10
19
14
5
10
11
14
15
11
11

Anova: Single Factor

SUMMARY
Groups         Count    Sum    Average   Variance
COUNTING          10     70       7.00       3.33
RHYMING           10     69       6.90       4.54
                  10    110      11.00       6.22
IMAGERY           10    134      13.40      20.27
INTENTIONAL       10    120      12.00      14.00

ANOVA
Source of Variation       SS     df        MS        F    P-value   F crit
Between Groups        351.52      4    87.880    9.085      0.000    2.579
Within Groups          435.3     45     9.673
Total                 786.82     49

end here?

First, Test of Homogeneity of Variance

Test of Homogeneity of Variances
NUMBER OF WORDS RECALLED
Levene Statistic    df1    df2    Sig.
           2.529      4     45    .054

What do you say?
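One way to see where a Levene statistic comes from: it is a one-way ANOVA F ratio computed on the absolute deviations of each observation from its own group mean. The sketch below assumes this mean-centred version and applies it to the recall data; the function name is illustrative, and the third group is left without a name because its label does not appear in the data table.

```python
def levene_statistic(groups):
    """Mean-centred Levene statistic: one-way F ratio on |x - group mean|."""
    abs_dev = []
    for g in groups:
        m = sum(g) / len(g)
        abs_dev.append([abs(x - m) for x in g])

    all_dev = [d for g in abs_dev for d in g]
    grand = sum(all_dev) / len(all_dev)
    dev_means = [sum(g) / len(g) for g in abs_dev]

    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(abs_dev, dev_means))
    ss_within = sum((d - m) ** 2 for g, m in zip(abs_dev, dev_means) for d in g)
    df1 = len(groups) - 1
    df2 = len(all_dev) - len(groups)
    return (ss_between / df1) / (ss_within / df2), df1, df2


recall = [
    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],            # COUNTING
    [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],            # RHYMING
    [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],     # third group (label missing in the table)
    [12, 11, 16, 11, 9, 23, 12, 10, 19, 11],    # IMAGERY
    [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],    # INTENTIONAL
]

w, df1, df2 = levene_statistic(recall)
print(f"Levene statistic = {w:.3f} on ({df1}, {df2}) df")
```

The significance value (.054 in the output above) needs the F distribution and is not computed here.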

Second, POST-HOC
ANALYSIS
Multiple Comparisons

Bonferroni

(I) LEVEL OF
PROCESSING
COUNTING

RHYMING

IMAGERY

INTENTIONAL

(J) LEVEL OF
PROCESSING
RHYMING
IMAGERY
INTENTIONAL
COUNTING
IMAGERY
INTENTIONAL
COUNTING
RHYMING
IMAGERY
INTENTIONAL
COUNTING
RHYMING
INTENTIONAL
COUNTING
RHYMING
IMAGERY

Mean
Difference
(I-J)
1.00E-01
-4.00
-6.40*
-5.00*
-1.00E-01
-4.10
-6.50*
-5.10*
4.00
4.10
-2.40
-1.00
6.40*
6.50*
2.40
1.40
5.00*
5.10*
1.00
-1.40

Std. Error
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39
1.39

Sig.
1.000
.061
.000
.008
1.000
.051
.000
.006
.061
.051
.913
1.000
.000
.000
.913
1.000
.008
.006
1.000
1.000

Lower Bound
Upper Bound
-4.01
4.21
-8.11
.11
-10.51
-2.29
-9.11
-.89
-4.21
4.01
-8.21
6.11E-03
-10.61
-2.39
-9.21
-.99
-.11
8.11
-6.11E-03
8.21
-6.51
1.71
-5.11
3.11
2.29
10.51
2.39
10.61
-1.71
6.51
-2.71
5.51
.89
9.11
.99
9.21
-3.11
5.11
-5.51
2.71
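The Mean Difference and Std. Error columns of the multiple-comparison output can be rebuilt from the one-way ANOVA results. The sketch below assumes equal group sizes of 10 and takes the within-groups mean square (9.673) from the ANOVA table. The half-width of the Bonferroni intervals (about 4.106) is read off the printed bounds rather than derived from a t table, and "GROUP_3" is a placeholder name for the group whose label is missing.

```python
from itertools import combinations
from math import sqrt

group_means = {
    "COUNTING": 7.0,
    "RHYMING": 6.9,
    "GROUP_3": 11.0,      # placeholder name: label missing in the source table
    "IMAGERY": 13.4,
    "INTENTIONAL": 12.0,
}
ms_within = 9.673         # within-groups mean square from the ANOVA table
n_per_group = 10

# Standard error of a difference between two group means.
std_error = sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))

# Assumption: interval half-width taken from the printed bounds
# (e.g. -4.10 + 4.106 gives the upper bound 6.11E-03).
half_width = 4.106

significant_pairs = []
for a, b in combinations(group_means, 2):
    diff = group_means[a] - group_means[b]
    flag = "*" if abs(diff) > half_width else " "
    if flag == "*":
        significant_pairs.append((a, b))
    print(f"{a:12s} vs {b:12s} diff = {diff:6.2f}{flag}  SE = {std_error:.2f}")
```

A pair is flagged when its interval excludes zero, which matches the starred differences in the output above.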

Third, MEAN-PLOT.
[Plot: Mean of NUMBER OF WORDS RECALLED (vertical axis) against LEVEL OF PROCESSING (horizontal axis: COUNTING, RHYMING, IMAGERY, INTENTIONAL)]

Now, we get some sense

out of the
story!!!!!!!!!!!!!!!!

Sudhanshu!!!
Should I major
in Marketing,
HR or Finance?

He contacted his seniors who had majored in these three

areas and collected information about their pay-package.

HR      Marketing      Finance

2.70

2.30

4.80

2.20

3.60

3.50

3.30

2.70

4.60

2.50

4.40

3.60

3.80

3.90

2.80

2.90

3.20

2.90

How do I know whether there are significant differences in the pay-packages?.... Ohhhhh!!!
Yes! I can use ANOVA!!!

results

I think Finance
would be the
right choice for
me!!!

Anova: Single Factor

SUMMARY
Groups       Count      Sum    Average   Variance
HR               6     17.4      2.90      0.332
Marketing        6     20.1      3.35      0.603
Finance          6    28.05     4.675     0.9958

ANOVA
Source of Variation         SS     df       MS        F    P-value   F crit
Between Groups         10.2175      2   5.1088    7.938     0.0045   3.6823
Within Groups          9.65375     15   0.6436
Total                 19.87125     17

Sudhanshu! It is not
the Area that matters,
matters in getting a
good package.

Is it
really
so?!!!

seniors and the collected data is ...

Now, he is lost!!!
What to do with
the data?

s

HR

A+

3.80

4.40

5.80

3.30

3.90

5.75

A-

2.90

3.60

4.80

B+

2.70

3.20

4.60

2.50

2.70

3.60

B-

2.20

2.30

3.50

Marketing Finance

(Numbers are annual pay-package in Rs. lakhs)

For that we have to stretch ourselves and

...

MOVING FROM
ONE FACTOR TO
TWO FACTORS

Mr. Padam Jain is working in a firm that does electrical work in offices.
In India, there is a lot of variation in the voltage of
electricity supplied, which may have a bad impact
on the life of electrical appliances. Mr. Padam Jain
wants to study the impact of Voltage on the Life
of Tubes. For that he took 10 tubes and
experimented with variations in the voltage: he
took three levels of voltage, Low, Medium
(the correct one) and High. And he collected the data
about their lives, measured in hours.

Data
LIFE OF TUBES (IN THOUSAND HOURS)
 LOW    MEDIUM    HIGH
 3.7      4.5      3.1
 3.4      3.9      2.8
 3.5      4.1      3.0
 3.2      3.5      2.6
 3.9      4.8      3.4

Picture speaks

What do say
impact of
variations in
Voltage on the
life of tubes?

Assume that Mr. Jain took these five tubes from five different manufacturers

THINK!!!!!
###\$\$
\$?????!!
!!!!!!!!!!
???

So
what?

How to filter out the effects of Brands so as to get the impact of only voltage on the life of tubes?

Mr. Jain provides the

following information

BRAND       LOW    MEDIUM    HIGH
SURYA       3.7      4.5      3.1
BAJAJ       3.4      3.9      2.8
PHILIPS     3.5      4.1      3.0
OSRAM       3.2      3.5      2.6
HAVELLS     3.9      4.8      3.4

speaks!!!

Thats a challenge for all

of us!

TWO-WAY ANALYSIS
OF VARIANCE
WITHOUT
REPLICATION
(RANDOMIZED BLOCK DESIGN )

TWO-WAY ANALYSIS OF VARIANCE

WITHOUT REPLICATION
(RANDOMIZED BLOCK DESIGN )
When the observations are subjected to experiment related
to two factors then such an analysis is called as TWO-WAY
ANALYSIS OF VARIANCE.
If the variation among the samples is due to TWO
FACTORS, then a researcher may be interested in
knowing whether variation due to each factor is significant.
The advantage of such an experimental design over the ONE-WAY design is that it could better explain the variation
among the observations and therefore, the variation due to
error term would be minimized. Further, such a model has
more power of explaining the observed variations.

SCHEME OF ANALYSIS OF VARIANCE:

RANDOMIZED BLOCK DESIGN
Are all the three machines properly adjusted in a manner that they are on the
average filling same quantity of cola drink given the fact that these machines
are operated at different points of time by different operators?

3
BLOCKING

Measurements
of quantity
filledIndividual
observations

VARIABLE
(OPERATORS)

What are you looking for

in this example .???
Is Measurements of Quantity Filled f (Time , Operator ) ?

Dependent
Variable
Ratio/Interval Scale
Variable

Independent
Variables
Categorical
Variable

Lets consider the

following example

TWO-WAY ANALYSIS OF
VARIANCE (continued)
In case of TWO-WAY ANOVA without replication, no value
of a particular factor is allowed to repeat in the trial. As a
result, the values of the factors are independent, and so are their
effects. In such a case, there will not be any interaction
effect between the factors.


BLOCKING
VARIABLE

TWO-WAY ANALYSIS OF
VARIANCE (continued)
In it, if differences are found to be significant then
they could be attributed to either of factors.

VARIATION

which

is

further

segregated into variations due to each factor and

WITHIN-GROUP VARIATION.That is-

RANDOMIZED BLOCK DESIGN

Analysis of Variance: Partitioning of
TOTAL SUM OF SQUARES OF VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS

DUE TO TREATMENT
(TREATMENT SUM OF SQUARES)

VARIATION EXPLAINED BY THE

BLOCKING VARIABLE
(BLOCKS SUM OF SQUARES)

VARIATION WITHIN GROUPS

(ERROR SUM OF SQUARES)

VARIATION DUE TO ERROR

(NEW ERROR SUM OF SQUARES)

ANALYSIS OF VARIANCE..

VARIATION
BETWEEN
THE
TREATMENTS
WITHIN THE
BLOCKS
RESIDUAL/
ERROR
TOTAL

SUM OF
SQUARES

d.f.

MEAN SUM OF
SQUARES

FRATIOS

Assuming that there is no

second factor in our
example

Assuming that there is a

blocking factor store in our
example

SPSS output
Tests of Between-Subjects Effects
Dependent Variable: SALES OF CHAIRS
Source     Sum of Squares    df    Mean Square          F    Sig.
Model          45324.417a    10       4532.442    130.140    .000
STYLE             935.083     2        467.542     13.425    .001
STORERS           698.667     7         99.810      2.866    .044
Error             487.583    14         34.827
Total           45812.000    24

SPSS Graph
[Plot: Estimated Marginal Means of SALES OF CHAIRS against STORES, one line per STYLE OF CHAIR (CHAIR STYLE#1, CHAIR STYLE#2, CHAIR STYLE#3)]

results!!!!
Which is more
important
Area or

Anova: Two-Factor Without Replication

SUMMARY      Count      Sum    Average   Variance
A+               3    14.00      4.67       1.05
A                3    12.95      4.32       1.63
A-               3    11.30      3.77       0.92
B+               3    10.50      3.50       0.97
B                3     8.80      2.93       0.34
B-               3     8.00      2.67       0.52
HR               6    17.40      2.90       0.33
Marketing        6    20.10      3.35       0.60
Finance          6    28.05      4.68       1.00

ANOVA
Source of Variation            SS     df       MS         F     P-value   F crit
Rows                  8.982916667      5   1.7966    26.781    1.74E-05   3.3258
Columns (Area)            10.2175      2   5.1088    76.155    8.88E-07   4.1028
Error                 0.670833333     10   0.0671
Total                    19.87125     17

Compare results !!!

Anova: Single Factor

SUMMARY
Groups       Count      Sum    Average   Variance
HR               6     17.4      2.90      0.332
Marketing        6     20.1      3.35      0.603
Finance          6    28.05     4.675     0.9958

ANOVA
Source of Variation         SS     df       MS        F    P-value   F crit
Between Groups         10.2175      2   5.1088    7.938     0.0045   3.6823
Within Groups          9.65375     15   0.6436
Total                 19.87125     17

Anova: Two-Factor Without Replication

SUMMARY      Count      Sum    Average   Variance
A+               3    14.00      4.67       1.05
A                3    12.95      4.32       1.63
A-               3    11.30      3.77       0.92
B+               3    10.50      3.50       0.97
B                3     8.80      2.93       0.34
B-               3     8.00      2.67       0.52
HR               6    17.40      2.90       0.33
Marketing        6    20.10      3.35       0.60
Finance          6    28.05      4.68       1.00

ANOVA
Source of Variation            SS     df       MS         F     P-value   F crit
Rows                  8.982916667      5   1.7966    26.781    1.74E-05   3.3258
Columns (Area)            10.2175      2   5.1088    76.155    8.88E-07   4.1028
Error                 0.670833333     10   0.0671
Total                    19.87125     17

Lets look at SPSS output

Tests of Between-Subjects Effects
Dependent Variable:Annual Pay Package (Rs. in Lakhs)
Type III Sum of
Source

Squares

Sig.

8.983

.671

10

10.217

.671

10

1.797 26.781

.000

.067a

Hypothesis

Error
a. MS(Error)

Mean Square

Hypothesis

Error
Area

df

5.109 76.155
.067a

.000

Plot

Another example Mr.

Jains Experiment!
LIFE OF TUBES (IN THOUSAND HOURS)
BRAND       LOW    MEDIUM    HIGH
SURYA       3.7      4.5      3.1
BAJAJ       3.4      3.9      2.8
PHILIPS     3.5      4.1      3.0
OSRAM       3.2      3.5      2.6
HAVELLS     3.9      4.8      3.4

Single Factor
Anova: Single Factor

SUMMARY
Groups     Count     Sum    Average   Variance
LOW            5    17.7      3.54      0.073
MEDIUM         5    20.8      4.16      0.258
HIGH           5    14.9      2.98      0.092

ANOVA
Source of Variation       SS     df       MS         F    P-value   F crit
Between Groups         3.484      2    1.742    12.355     0.0012   3.8853
Within Groups          1.692     12    0.141
Total                  5.176     14

Two Factor Model

Anova: Two-Factor Without Replication

SUMMARY     Count     Sum    Average   Variance
SURYA           3    11.3     3.767      0.493
BAJAJ           3    10.1     3.367      0.303
PHILIPS         3    10.6     3.533      0.303
OSRAM           3     9.3     3.100      0.210
HAVELLS         3    12.1     4.033      0.503
LOW             5    17.7     3.540      0.073
MEDIUM          5    20.8     4.160      0.258
HIGH            5    14.9     2.980      0.092

ANOVA
Source of Variation        SS     df       MS          F    P-value   F crit
Rows (Brands)          1.5493      4   0.3873    21.7196     0.0002   3.8379
Columns (Voltage)      3.4840      2   1.7420    97.6822     0.0000   4.4590
Error                  0.1427      8   0.0178
Total                   5.176     14
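The two-factor table for Mr. Jain's tubes can be reproduced by splitting the total sum of squares into a brand part, a voltage part and what is left over. The sketch below is plain Python with illustrative names; it assumes one observation per brand and voltage combination, as in the data table.

```python
def two_way_anova_no_replication(table):
    """table[i][j]: single observation for row level i and column level j."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)   # blocking factor
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)   # treatment factor
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # what neither factor explains

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "ss_rows": ss_rows, "ss_cols": ss_cols,
        "ss_error": ss_error, "ss_total": ss_total,
        "df": (df_rows, df_cols, df_error),
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


# Rows: SURYA, BAJAJ, PHILIPS, OSRAM, HAVELLS; columns: low, medium, high voltage.
tube_life = [
    [3.7, 4.5, 3.1],
    [3.4, 3.9, 2.8],
    [3.5, 4.1, 3.0],
    [3.2, 3.5, 2.6],
    [3.9, 4.8, 3.4],
]

result = two_way_anova_no_replication(tube_life)
print(f"SS brands  = {result['ss_rows']:.4f}, F = {result['f_rows']:.4f}")
print(f"SS voltage = {result['ss_cols']:.4f}, F = {result['f_cols']:.4f}")
print(f"SS error   = {result['ss_error']:.4f}")
```

Compared with the single-factor run, the voltage sum of squares is unchanged, but the error term shrinks from 1.692 to about 0.143 because the brand variation has been taken out of it.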

How does Two-Way ANOVA filter out the effects of FACTORS?

TOTAL VARIANCE: Mr. Jain's Experiment!

Total Variance
BRAND        LOW    MEDIUM    HIGH    MEAN
SURYA        3.7      4.5      3.1    3.77
BAJAJ        3.4      3.9      2.8    3.37
PHILIPS      3.5      4.1      3.0    3.53
OSRAM        3.2      3.5      2.6    3.10
HAVELLS      3.9      4.8      3.4    4.03
MEANS       3.54     4.16     2.98    3.56

Notice!!!

LIFE OF TUBES (IN THOUSAND HOURS)
BRAND        LOW    MEDIUM    HIGH    MEAN
SURYA       3.77     3.77     3.77    3.77
BAJAJ       3.37     3.37     3.37    3.37
PHILIPS     3.53     3.53     3.53    3.53
OSRAM       3.10     3.10     3.10    3.10
HAVELLS     4.03     4.03     4.03    4.03
MEANS       3.56     3.56     3.56    3.56

Same Across Voltage
Different Across Brand

Notice!!!

BRAND        LOW    MEDIUM    HIGH    MEAN
SURYA       3.54     4.16     2.98    3.56
BAJAJ       3.54     4.16     2.98    3.56
PHILIPS     3.54     4.16     2.98    3.56
OSRAM       3.54     4.16     2.98    3.56
HAVELLS     3.54     4.16     2.98    3.56
MEANS       3.54     4.16     2.98    3.56

This shows how we can demystify the filtering process of variation.

Take means
and filter out
the variation!

Meet this curious boy

Life of Inverters Batteries
produced
by
three
manufacturers and sold
under
two
model

He is curious to know
whether life of batteries
differs across model and
manufacturers?

of Model and of
Manufacturers?

He has collected the following

data

STANDARD

MANUFACTURER-1

61.50

66.00

66.00

63.00

62.50

63.00

MANUFACTURER-2

64.50

67.00

65.00

70.50

67.00

67.50

MANUFACTURER-3

65.50

61.50

68.00

63.00

66.50

66.00
Life in thousand hours.

Picture speaks
something!!!

Is Life = f(Model, Manufacturer)?

For that we have to stretch ourselves and

ANALYSIS OF
VARIANCE:

TWO-WAY
FACTORIAL
DESIGN

But, first consider the

following problem:
A professor from MDI was curious to know
whether students score more in a
closed-book examination or in an
open-book one. He took eight students at
random from each of PGPM, NMP and EMP;
four students from each programme took a question
paper under closed-book examination,
and the remaining four students from
each programme took the same question
paper but under open-book examination.
The scores obtained by the students
are:

MARKS OUT OF 100
                PGPM    NMP    EMP
OPEN-BOOK         75     58     61
                  68     56     63
                  71     61     65
                  75     60     64
CLOSED-BOOK       66     62     61
                  70     60     66
                  68     59     63
                  68     68     61

ONE..
TWO..

ANOTHER OTHER

TWO-WAY FACTORIAL DESIGN

Are the students from PGPM, NMP, and EMP scoring the same marks for the
same question paper AND THAT TOO IN OPEN-BOOK AND CLOSED-BOOK EXAMINATION?
PROGRAMMES (COLUMN TREATMENT)

PGPM
Marks
obtained in
the
examination
-Individual
observations

NMP

EMP

1
TYPE OF

Here, 1 means: OPEN-BOOK

and
2 means CLOSED-BOOK.

EXAM
(ROW
TREATMENT)
2

TWO-WAY FACTORIAL DESIGN

Analysis of Variance: Partitioning of
TOTAL SUM OF SQUARES OF VARIATION
TOTAL VARIATION
(TOTAL SUM OF SQUARES)

VARIATION AMONG GROUPS DUE

TO ROW TREATMENT
(ROW TREATMENT SUM OF
SQUARES)
VARIATION AMONG GROUPS DUE
TO COLUMN TREATMENT
(COLUMN TREATMENT SUM OF
SQUARES)

VARIATION WITHIN GROUPS

(ERROR SUM OF SQUARES)

VARIATION AMONG GROUPS DUE

TO INTERACTION BETWEEN ROW
AND COLUMN TREATMENTS
(INTERACTION SUM OF SQUARES)

ANALYSIS OF VARIANCE..

TWO-WAY ANOVA TABLE ---- FACTORIAL DESIGN ----

Anova: Two-Factor With Replication

SUMMARY                        PGPM         NMP         EMP       Total
OPEN-BOOK EXAMINATION
Count                        4.0000      4.0000      4.0000     12.0000
Sum                        289.0000    235.0000    253.0000    777.0000
Average                     72.2500     58.7500     63.2500     64.7500
Variance                    11.5833      4.9167      2.9167     39.6591

CLOSED-BOOK EXAMINATION
Count                        4.0000      4.0000      4.0000     12.0000
Sum                        272.0000    249.0000    251.0000    772.0000
Average                     68.0000     62.2500     62.7500     64.3333
Variance                     2.6667     16.2500      5.5833     14.0606

Total
Count                        8.0000      8.0000      8.0000
Sum                        561.0000    484.0000    504.0000
Average                     70.1250     60.5000     63.0000
Variance                    11.2679     12.5714      3.7143

ANOVA
Source of Variation          SS         df          MS          F    P-value   F crit
Sample                   1.0417     1.0000      1.0417     0.1423     0.7104   4.4139
Columns                399.0833     2.0000    199.5417    27.2619     0.0000   3.5546
Interaction             60.0833     2.0000     30.0417     4.1044     0.0340   3.5546
Within                 131.7500    18.0000      7.3194
Total                  591.9583    23.0000
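The factorial table for the examination marks splits the total variation into a row part (type of exam), a column part (programme), an interaction part, and a within-cell part. The sketch below shows one way to compute it in plain Python, assuming a balanced design with the same number of observations in every cell; the function and variable names are illustrative.

```python
def two_way_anova_with_replication(cells):
    """cells[i][j]: list of observations for row level i and column level j."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    all_values = [x for row in cells for cell in row for x in cell]
    grand = sum(all_values) / len(all_values)

    cell_means = [[sum(cell) / n for cell in row] for row in cells]
    row_means = [sum(means) / b for means in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]

    ss_rows = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_cols = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_cells = n * sum((cell_means[i][j] - grand) ** 2
                       for i in range(a) for j in range(b))
    ss_inter = ss_cells - ss_rows - ss_cols
    ss_within = sum((x - cell_means[i][j]) ** 2
                    for i in range(a) for j in range(b) for x in cells[i][j])

    df_rows, df_cols = a - 1, b - 1
    df_inter = df_rows * df_cols
    df_within = a * b * (n - 1)
    ms_within = ss_within / df_within
    return {
        "ss": (ss_rows, ss_cols, ss_inter, ss_within),
        "df": (df_rows, df_cols, df_inter, df_within),
        "f_rows": (ss_rows / df_rows) / ms_within,
        "f_cols": (ss_cols / df_cols) / ms_within,
        "f_inter": (ss_inter / df_inter) / ms_within,
    }


# Rows: open-book, closed-book; columns: PGPM, NMP, EMP.
marks = [
    [[75, 68, 71, 75], [58, 56, 61, 60], [61, 63, 65, 64]],
    [[66, 70, 68, 68], [62, 60, 59, 68], [61, 66, 63, 61]],
]

result = two_way_anova_with_replication(marks)
print("SS (exam, programme, interaction, within):",
      [round(v, 4) for v in result["ss"]])
print(f"F exam = {result['f_rows']:.4f}, F programme = {result['f_cols']:.4f}, "
      f"F interaction = {result['f_inter']:.4f}")
```

In the Excel-style output, "Sample" is the row factor (type of exam) and "Columns" is the programme.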

Tests of Between-Subjects Effects

Dependent Variable: MARKS OBTAINED OUT OF 100
Source             Type III Sum of Squares    df    Mean Square            F    Sig.
Corrected Model                   460.208a     5         92.042       12.575    .000
Intercept                        99975.042     1      99975.042    13658.829    .000
PROG                               399.083     2        199.542       27.262    .000
EXAM                                 1.042     1          1.042         .142    .710
PROG * EXAM                         60.083     2         30.042        4.104    .034
Error                              131.750    18          7.319
Total                           100567.000    24
Corrected Total                    591.958    23

[Plot: Estimated Marginal Means of MARKS OBTAINED OUT OF 100 against PROGRAMME (PGPM, NMP, EMP), one line per TYPE OF EXAMINATION (OPEN-BOOK EXAMINATION, CLOSED-BOOK EXAMINATION)]

example

Consider the following:

An experiment was conducted to understand the recall
mechanism among people. 100 people were
selected, 50 OLD and 50 YOUNG; and they were
randomly assigned to five groups: four
incidental-learning groups
Counting Group
Rhyming Group
Imagery Group
and one intentional-learning group.

This time the people are divided into 2 groups: OLD and YOUNG.

Each group was given a list of 27 words and each

No. of words recalled as a

function of level of
processing
AGE
GROUP

COUNTING
OLD

YOUNG

9
8
6
8
10
4
6
5
7
7

8
6
4
6
7
6
5
7
9
7

RHYMING
7
9
6
6
6
11
6
3
8
7

10
7
8
10
4
7
10
6
7
7

11
13
8
6
14
11
13
13
10
11

14
11
18
14
13
22
17
16
12
11

IMAGERY
12
11
16
11
9
23
12
10
19
11

20
16
16
15
18
16
20
22
14
19

INTENTIONAL
10
19
14
5
10
11
14
15
11
11

21
19
17
15
22
16
22
22
18
21

Anova: Two-Factor With Replication

SUMMARY

COUNTING

RHYMING

INTENTIONAL Total

OLD
Count
Sum

10.0000
70.0000

10.0000
69.0000

10.0000
110.0000

10.0000
134.0000

10.0000
120.0000

50.0000
503.0000

7.0000
3.3333

6.9000
4.5444

11.0000
6.2222

13.4000
20.2667

12.0000
14.0000

10.0600
16.0576

10.0000
65.0000

10.0000
76.0000

10.0000
148.0000

10.0000
176.0000

10.0000
193.0000

50.0000
658.0000

Average

6.5000

7.6000

14.8000

17.6000

19.3000

13.1600

Variance

2.0556

3.8222

12.1778

6.7111

7.1222

33.4841

Count

20.0000

20.0000

20.0000

20.0000

20.0000

Sum

135.0000

145.0000

258.0000

310.0000

313.0000

Average

6.7500

7.2500

12.9000

15.5000

15.6500

Variance

2.6184

4.0921

12.5158

17.4211

24.0289

Average
Variance
YOUNG
Count
Sum

Total

ANOVA
Source of Variation           SS         df          MS          F    P-value   F crit
Sample                  240.2500     1.0000    240.2500    29.9356     0.0000   3.9469
Columns                1514.9400     4.0000    378.7350    47.1911     0.0000   2.4729
Interaction             190.3000     4.0000     47.5750     5.9279     0.0003   2.4729
Within                  722.3000    90.0000      8.0256
Total                  2667.7900    99.0000

[Plot: Estimated Marginal Means of WORDS RECALLED against AGE GROUP (OLD, YOUNG), one line per LEVEL OF PROCESSING (COUNTING, RHYMING, IMAGERY, INTENTIONAL)]

[Plot: Estimated Marginal Means of WORDS RECALLED against LEVEL OF PROCESSING (COUNTING, RHYMING, IMAGERY, INTENTIONAL), one line per AGE GROUP (OLD, YOUNG)]

He has collected the following

data

STANDARD

MANUFACTURER-1

61.50

66.00

66.00

63.00

62.50

63.00

MANUFACTURER-2

64.50

67.00

65.00

70.50

67.00

67.50

MANUFACTURER-3

65.50

61.50

68.00

63.00

66.50

66.00
Life in thousand hours.

The Result of Two-Way ANOVA
Tests of Between-Subjects Effects
Dependent Variable: LIFE OF BATTERY IN HOURS THOUSANDS
Source           Type III Sum of Squares    df    Mean Square            F    Sig.
Model                         76630.5000     6     12771.7500    3693.0361    0.000
MANUF                            31.8611     2        15.9306       4.6064    0.033
MODEL                             0.0556     1         0.0556       0.0161    0.901
MANUF * MODEL                    27.6944     2        13.8472       4.0040    0.047
Error                               41.5    12         3.4583
Total                              76672    18

effect?