Dec 06, 2019

ANOVA

Difference among more than two sample means

• The training director wanted to evaluate three different training methods to determine whether there were any differences in the effectiveness of the methods.

• After completion of the training period, she chose 16 new employees who had been assigned at random to the three training methods (5, 5 and 6 trainees respectively, as the data show).

• Counting the production output of these 16 trainees, she summarized the data and calculated the mean production of the trainees under each method.

To determine the grand mean, either of two methods can be followed: average all the observations, or use the column totals

• Grand Mean = {(15+18+19+22+11) + (22+27+18+21+17) + (18+24+19+16+22+15)} / 16 = 304 / 16 = 19

• Method-1   Method-2   Method-3
     15         22         18
     18         27         24
     19         18         19
     22         21         16
     11         17         22
                           15
     85        105        114    (column totals)

Grand Mean = (85 + 105 + 114) / 16 = 19
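The two routes to the grand mean can be checked with a short Python sketch. It assumes the grouping of the 16 values into Method-1, Method-2 and Method-3 implied by the parentheses in the sum above; the variable names are illustrative only.

```python
# Production output per trainee, grouped by training method
# (grouping assumed from the parentheses in the grand-mean sum).
samples = {
    "Method-1": [15, 18, 19, 22, 11],
    "Method-2": [22, 27, 18, 21, 17],
    "Method-3": [18, 24, 19, 16, 22, 15],
}

# Way 1: pool every observation and take a plain average.
pooled = [x for values in samples.values() for x in values]
grand_mean_pooled = sum(pooled) / len(pooled)

# Way 2: weight each sample mean by its sample size.
n_total = sum(len(v) for v in samples.values())
grand_mean_weighted = sum(
    len(v) * (sum(v) / len(v)) for v in samples.values()
) / n_total

print(n_total, grand_mean_pooled, grand_mean_weighted)
```

Both routes give 19. Note that a simple average of the three sample means is only guaranteed to match when the samples are the same size, which is why the second route weights by sample size.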

Statement of the Hypotheses

• The question is whether these three samples were drawn from populations having the same mean (a population here is the total number of employees who could be trained by that method).

• Null hypothesis H0: μ1 = μ2 = μ3. Alternative hypothesis H1: μ1, μ2 and μ3 are not all equal.

• If the population means do not differ significantly, we can infer that the three training methods have the same effect on employee productivity.

• Otherwise, we could adjust our training program accordingly.

ANOVA

• ANOVA assumes that each of the samples is drawn from a normal population and that each of the populations has the same variance.

• If the sample sizes are large, the normality assumption is not required.

• If the null hypothesis is true, classifying the data into three columns is unnecessary, and the entire set of 16 measurements of productivity can be thought of as a sample from one population having a common variance.

Comparison of Estimates

• ANOVA is based on a comparison of two different estimates of the variance of the overall population.

• We can calculate one of these estimates by examining the variance among the three sample means (17, 21, 19).

• The other estimate can be determined from the variation within the three samples themselves (sample variances 17.5, 15.5, 12.0).

• We then compare these two estimates of the population variance.

• Because both are estimates of the same variance, they should be approximately equal in value when the null hypothesis is true.

• If the null hypothesis is not true, these two estimates will differ considerably.

Variability among sample means: the variance between the samples (20) provides a good estimate of the population variance only if the null hypothesis is true. If the null hypothesis is false, it overestimates the variance.

Variability of data within samples: the variance within the samples (14.769) provides a good estimate of the population variance in either case.

When the populations are not the same, the between-sample variance tends to be larger than the within-sample variance, so F, the ratio of the two, tends to be large, which leads to rejection of the null hypothesis. (Here F = 20 / 14.769 = 1.354.)
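The figures 20, 14.769 and 1.354 can be reproduced with a minimal Python sketch using only the standard library. The three lists are the training data from the grand-mean calculation; the variable names are illustrative.

```python
from statistics import mean, variance

samples = [
    [15, 18, 19, 22, 11],      # Method-1
    [22, 27, 18, 21, 17],      # Method-2
    [18, 24, 19, 16, 22, 15],  # Method-3
]

k = len(samples)                           # number of samples
n_total = sum(len(s) for s in samples)     # total observations
grand_mean = sum(sum(s) for s in samples) / n_total

# Between-sample estimate: squared deviations of the sample means from the
# grand mean, weighted by sample size, divided by (k - 1).
between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples) / (k - 1)

# Within-sample estimate: sample variances pooled with weights (n_j - 1),
# divided by (n_total - k).
within = sum((len(s) - 1) * variance(s) for s in samples) / (n_total - k)

f_ratio = between / within
print(round(between, 3), round(within, 3), round(f_ratio, 3))
```

The degrees of freedom are 2 for the numerator and 13 for the denominator. Assuming a 0.05 significance level (the slides do not state one), the tabled critical value of F is about 3.81, so an F of 1.354 would not lead to rejection of the null hypothesis.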

Problem

McDonald, a fast-food chain, feels it is gaining a bad reputation because it takes too long to serve its customers. Because the chain has four restaurants in a city, it is concerned with whether all four restaurants have the same average service time. One of the owners of the fast-food chain has decided to visit each of the stores and monitor the service time for five randomly selected customers. At his four noontime visits, he records the following service times in minutes:

Restaurant-1   3   4     5.5   3.5   4
Restaurant-2   3   3.5   4.5   4     5.5
Restaurant-3   2   3.5   5     6.5   6
Restaurant-4   3   4     5.5   2.5   3

(a) Using a 0.05 significance level, do all the restaurants have the same mean service time?

(b) Based on his results, should the owner make any policy recommendations to any of the restaurant managers?
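One way to approach part (a) is sketched below in Python. The helper `one_way_f` is a hypothetical name introduced here, and the critical value 3.24 is the tabled F value for 3 and 16 degrees of freedom at the 0.05 level, entered by hand rather than computed.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Sum of squares between groups and within groups.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

service_times = [
    [3, 4, 5.5, 3.5, 4],    # Restaurant-1
    [3, 3.5, 4.5, 4, 5.5],  # Restaurant-2
    [2, 3.5, 5, 6.5, 6],    # Restaurant-3
    [3, 4, 5.5, 2.5, 3],    # Restaurant-4
]

f_ratio, df1, df2 = one_way_f(service_times)
F_CRITICAL_05 = 3.24  # table value for F(3, 16) at 0.05, entered by hand
reject_h0 = f_ratio > F_CRITICAL_05
print(round(f_ratio, 3), df1, df2, reject_h0)
```

The computed F is well below the critical value, so on this sketch the data give no evidence that mean service times differ between restaurants, which bears on part (b).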

A survey conducted over the last 25 years indicated that in 10 years the winter was mild, in 8 years it was cold, and in the remaining years it was very cold. A company sells 1000 woolen coats in a mild year, 1300 in a cold year and 2000 in a very cold year. Find the yearly expected profit of the company if a woolen coat that costs Rs. 1730/- is sold for Rs. 2480/- on average.
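A minimal Python sketch of the expected-value calculation follows. It assumes that the relative frequencies over the 25 years serve as probabilities and that profit per coat is simply selling price minus cost, with no other expenses.

```python
from fractions import Fraction

total_years = 25
years = {"mild": 10, "cold": 8}
years["very cold"] = total_years - sum(years.values())  # remaining 7 years

coats_sold = {"mild": 1000, "cold": 1300, "very cold": 2000}
profit_per_coat = 2480 - 1730  # Rs. 750 per coat (assumed: price minus cost)

# Expected yearly sales = sum of P(winter type) * coats sold in that type.
expected_coats = sum(
    Fraction(years[w], total_years) * coats_sold[w] for w in years
)
expected_profit = expected_coats * profit_per_coat
print(expected_coats, expected_profit)
```

Under these assumptions the expected yearly sales are 1376 coats and the expected yearly profit is Rs. 10,32,000.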

Chi-square as a test of independence

• The chi-square test can be used to test whether more than two population proportions can be considered equal.

• One can classify a population into several categories with respect to two attributes and use this test to determine whether the attributes are independent or whether one influences the other.

• If the null hypothesis is true, one can combine the data from the samples and then estimate the common proportion.

Problem (Attitude about job interview)

                  N-E   S-E   Central   West Coast   Total
Present Method     68    75      57         79        279
New Method         32    45      33         31        141
Total             100   120      90        110        420
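The chi-square statistic for a table like this one can be sketched in Python as follows. Expected frequencies are computed as (row total × column total) / grand total, which is the combined-proportion estimate under the null hypothesis; the variable names are illustrative.

```python
observed = [
    [68, 75, 57, 79],  # Present Method
    [32, 45, 33, 31],  # New Method
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if attitude were independent of region.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom = (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 2), dof)
```

The statistic comes to about 2.76 with 3 degrees of freedom. The slide does not give a significance level; at an assumed 0.05 level the tabled critical value is 7.815, so the null hypothesis of equal proportions would not be rejected.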

Problem-1

The number of car accidents per month in a certain city was as follows:

Are these frequencies in agreement with the belief that accident conditions were the same during this 10-month period?

Problem-2

The theory predicts that the proportions of an item in the four groups A, B, C, D should be 9:3:3:1. In an experiment among 1600 items, the numbers in the groups were 882, 313, 287 and 118. Does this experiment support the theory?
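A goodness-of-fit calculation for this problem might look like the Python sketch below. The problem does not state a significance level, so the 0.05 level and its tabled critical value for 3 degrees of freedom (7.815) are assumptions.

```python
observed = [882, 313, 287, 118]   # groups A, B, C, D
ratio = [9, 3, 3, 1]              # proportions predicted by the theory

total = sum(observed)
# Expected counts: split the 1600 items in the ratio 9:3:3:1.
expected = [total * r / sum(ratio) for r in ratio]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

CHI2_CRITICAL_05 = 7.815  # assumed level 0.05, 3 degrees of freedom
reject_h0 = chi_square > CHI2_CRITICAL_05
print(expected, round(chi_square, 3), reject_h0)
```

The expected counts are 900, 300, 300 and 100, and the statistic is about 4.727, below the assumed critical value, so on this sketch the experiment does not contradict the theory.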

Problem-3

Records taken of the number of male and female births in 800 families having 4 children are given as:

Male   Female   Families
  0       4        32
  1       3       178
  2       2       290
  3       1       236
  4       0        64

Test whether the data are consistent with the hypothesis that male and female births are equally likely.
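If male and female births are equally likely, the number of boys in a family of 4 follows a binomial distribution with p = 0.5. The Python sketch below uses that to build expected frequencies; the 0.05 level and the critical value 9.488 for 4 degrees of freedom are assumptions, since the problem names no level.

```python
from math import comb

# Number of male children -> number of families observed.
families = {0: 32, 1: 178, 2: 290, 3: 236, 4: 64}
n_families = sum(families.values())
children = 4

# Expected families with m boys: N * C(4, m) * (1/2)^4.
expected = {
    m: n_families * comb(children, m) * 0.5 ** children for m in families
}

chi_square = sum(
    (families[m] - expected[m]) ** 2 / expected[m] for m in families
)
dof = len(families) - 1

CHI2_CRITICAL_05 = 9.488  # assumed level 0.05, 4 degrees of freedom
reject_h0 = chi_square > CHI2_CRITICAL_05
print(expected, round(chi_square, 3), reject_h0)
```

The expected frequencies are 50, 200, 300, 200 and 50, and the statistic is about 19.633, which exceeds the assumed critical value, so the data would not be consistent with equally likely births at that level.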

Problem-4

An educator has the opinion that the grades high school students make depend on the amount of time they spend listening to music. To test this theory, he has given a questionnaire to 400 randomly selected students. The questionnaire contains two questions: "How many hours a week do you listen to music?" and "What is the average grade for all your classes?" The data from the survey are in the following table. Using a 5% significance level, test whether grades and the time spent listening to music are independent or dependent.

Hours spent listening      Average grade
to music                  A    B    C    D    F
Less than 5 hours        13   10   11   16    5
5 to 10 hours            20   27   27   19    2
10 to 15 hours            9   27   71   16   32
More than 20 hours        8   11   41   24   11

Problem-5

The distribution of typing mistakes committed by a typist is given as:

Mistakes/page     0     1    2    3   4   5
No. of pages    142   156   69   27   5   1

expected number of pages containing 0, 1, 2, 3, 4, 5 mistakes respectively. Test the hypothesis that there is no significant difference between the observed and expected numbers of pages containing the mistakes.

Find the correlation coefficient

Sales   Expenses (lakh)
 50         11
 50         13
 55         14
 60         16
 65         16
 65         15
 65         15
 60         14
 60         13
 50         13
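A minimal Python sketch for the Pearson correlation coefficient of these ten pairs, written out from the definition so that each sum is visible; the variable names are illustrative.

```python
from math import sqrt

sales = [50, 50, 55, 60, 65, 65, 65, 60, 60, 50]
expenses = [11, 13, 14, 16, 16, 15, 15, 14, 13, 13]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(expenses) / n

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, expenses))
s_xx = sum((x - mean_x) ** 2 for x in sales)
s_yy = sum((y - mean_y) ** 2 for y in expenses)

r = s_xy / sqrt(s_xx * s_yy)
print(mean_x, mean_y, s_xy, s_xx, s_yy, round(r, 4))
```

With means of 58 and 14, the deviation sums are 70, 360 and 22, giving r of roughly 0.79, a fairly strong positive correlation.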

Weight at age 12

Age   Weight
  1      5
  2      8
  3      9
  4     11
  5     14
  6     15
  7     17
  8     18
  9     20
 10     25
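The slide title suggests estimating the weight at age 12 from these ten observations. The Python sketch below does this with a least-squares regression line of weight on age. That choice of method is an assumption, as the slide does not name one, and the estimate extrapolates beyond the observed ages.

```python
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
weights = [5, 8, 9, 11, 14, 15, 17, 18, 20, 25]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(weights) / n

# Least-squares slope and intercept for weight = intercept + slope * age.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, weights))
s_xx = sum((x - mean_x) ** 2 for x in ages)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Extrapolated estimate at age 12 (outside the observed range of 1 to 10).
predicted_weight_12 = intercept + slope * 12
print(round(slope, 3), round(intercept, 3), round(predicted_weight_12, 2))
```

On this sketch the fitted line is roughly weight = 3.27 + 1.99 × age, which gives an estimate of about 27.1 at age 12.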

