

Faculty Development Program 2015

Research Methodology

Session 11-15

t-Test

The t-test assesses whether the means of two groups are
statistically different from each other.


t = (difference between the two means) / (variability or dispersion of the scores)


• A teacher wants to know if his introductory RM class has a
  good grasp of basic concepts. Six participants are chosen at
  random from the class and given a statistics proficiency test.
  The teacher wants the class to be able to score above 70 on the
  test. The six students get scores of 62, 92, 75, 68, 83, and 95.
  Can the professor have 90 percent confidence that the mean
  score for the class on the test would be above 70?

Because the computed t-value of 1.71 is larger than the critical value in
the table, the null hypothesis can be rejected, and the professor has
evidence that the class mean on the statistics test is above 70.
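A minimal sketch of this one-sample test with scipy; 90 percent confidence implies a one-tailed test at α = 0.10 (the alternative= argument assumes scipy ≥ 1.6):

```python
# One-sample, one-tailed t-test for the example above.
from scipy import stats

scores = [62, 92, 75, 68, 83, 95]
t_stat, p_value = stats.ttest_1samp(scores, popmean=70, alternative="greater")

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # t = 1.71, p ≈ 0.074
# p < 0.10, so the null hypothesis (mean <= 70) is rejected at this level.
```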

Two Independent Samples

• Users and non-users of a brand differ in terms of their brand
  perceptions
• High-income groups spend more time on entertainment than
  low-income groups


You have two samples, which may come from one distribution or two.
To assess the likelihood, find how many standard deviations apart
the means of the two populations are:

t = (μ1 − μ2) / pooled SD

[Figure: two overlapping distributions with means μ1 and μ2; how many SDs apart?]

• Homogeneity of Variance: the amount of variability in
  each of the two groups is equal

t = (X̄1 − X̄2) / √( [(n1−1)s1² + (n2−1)s2²] / (n1+n2−2) × (n1+n2) / (n1·n2) )


Do visual aids and examples increase the learning of the students?

• Two groups were taught
• Group 1 was taught in a normal classroom
• Group 2 was taught in a classroom with lots of visual
  aids and examples
• Is there any difference in the learning of the students?


      Group 1              Group 2
  7    5    5          5    3    4
  3    4    7          4    2    3
  3    6    1          4    5    2
  2   10    9          5    4    7
  3   10    2          5    4    6
  8    5    5          7    6    2
  8    1    2          8    7    8
  5    1   12          8    7    9
  8    4   15          9    5    7
  5    3    4          8    6    6

s1 = 3.42              s2 = 2.06

• Step 1: State the null hypothesis
• Step 2: Set the level of risk
• Step 3: Select the appropriate test statistic: the t-test for
  independent means
• Step 4: Compute the t-value (see the sketch below)
• Step 5: Determine the critical t-value
• Step 6: Compare the two values
• Step 7: Decide
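A minimal sketch of Steps 4 through 7 with scipy, assuming the first three columns of the table above are Group 1 and the last three are Group 2:

```python
# Pooled-variance (independent-samples) t-test on the slide data.
from scipy import stats

group1 = [7, 5, 5, 3, 4, 7, 3, 6, 1, 2, 10, 9, 3, 10, 2,
          8, 5, 5, 8, 1, 2, 5, 1, 12, 8, 4, 15, 5, 3, 4]
group2 = [5, 3, 4, 4, 2, 3, 4, 5, 2, 5, 4, 7, 5, 4, 6,
          7, 6, 2, 8, 7, 8, 8, 7, 9, 9, 5, 7, 8, 6, 6]

t_stat, p_value = stats.ttest_ind(group1, group2)  # assumes equal variances
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # t ≈ -0.14: no difference here
# If p exceeds the chosen level of risk, the null hypothesis stands.
```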


If the obtained value does not exceed the critical value, the
null hypothesis is the most attractive explanation.

• OK, there is a significant difference, but what about the
  magnitude of the difference?

• How different are the two groups from one another?

ES = (X̄1 − X̄2) / √( (σ1² + σ2²) / 2 )

Effect size expresses the relative position of one group to another.
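A sketch of this effect-size formula on the two-group example, using the slide's standard deviations; the group means (≈5.43 and ≈5.53) are computed from the table rather than given on the slide:

```python
# Effect size (Cohen's d style, averaging the two variances).
mean1, mean2 = 5.43, 5.53   # computed from the table above
s1, s2 = 3.42, 2.06         # from the slide

es = (mean1 - mean2) / ((s1**2 + s2**2) / 2) ** 0.5
print(f"ES = {es:.2f}")  # about -0.04: a negligible effect
```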


Two Dependent (Paired) Samples

• Do people differ in terms of their attitudes towards
  corruption and towards ministers?
• Is there any difference between a 15-second and a 30-second
  TV commercial?

t-Test for Related Groups

• The difference between students' scores on the pretest and
  on the posttest
• Participants are being tested more than once
• There are two groups of scores
• The appropriate test statistic is the t-test for dependent means


n = number of pairs of observations

t = ΣD / √( [nΣD² − (ΣD)²] / (n−1) )

Pretest Posttest
3 7
5 8
4 6
6 7
5 8
5 9
4 6
5 6
3 7
6 8
7 8
8 7
7 9
6 10
7 9
8 9


Pretest   Posttest   Difference (D)   D²
  3          7             4          16
  5          8             3           9
  4          6             2           4
  6          7             1           1
  5          8             3           9
  5          9             4          16
  4          6             2           4
  5          6             1           1
  3          7             4          16
  6          8             2           4
  7          8             1           1
  8          7            −1           1
  7          9             2           4
  6         10             4          16
  7          9             2           4
  8          9             1           1
                       ΣD = 35    ΣD² = 107

(ΣD)² = 1225

n = number of pairs of observations

t = ΣD / √( [nΣD² − (ΣD)²] / (n−1) )
  = 35 / √( [16 × 107 − 1225] / 15 )
  = 6.14
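A minimal sketch checking this with scipy's paired test (differences taken as posttest minus pretest):

```python
# Paired (dependent-samples) t-test on the pretest/posttest scores.
from scipy import stats

pretest  = [3, 5, 4, 6, 5, 5, 4, 5, 3, 6, 7, 8, 7, 6, 7, 8]
posttest = [7, 8, 6, 7, 8, 9, 6, 6, 7, 8, 8, 7, 9, 10, 9, 9]

t_stat, p_value = stats.ttest_rel(posttest, pretest)
print(f"t = {t_stat:.2f}, p = {p_value:.5f}")  # t = 6.14
```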


Examining differences between groups on one or more variables:

Are the same participants being tested more than once?
  Yes → How many groups are you dealing with?
          Two groups / More than two groups
  No  → How many groups are you dealing with?
          Two groups / More than two groups

• What to do when there are more than TWO groups?

Probability of at least one Type I error = 1 − (1 − α)^k

α = Type I error rate per comparison
k = number of comparisons
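A quick check of why this matters, assuming α = 0.05 and three pairwise comparisons:

```python
# Familywise Type I error rate for k comparisons at level alpha.
alpha, k = 0.05, 3
print(f"{1 - (1 - alpha) ** k:.3f}")  # 0.143, far above the nominal 0.05
```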


Analysis of Variance

How to decide when means are different enough, relative to the
spread of the observations in each group

[Figure: two pairs of distributions with means Ȳ1 and Ȳ2, one pair
with small spread and one with large spread]


ANOVA looks at the way groups differ internally
versus what the difference is between them.

Determine the existence of a statistically significant
difference among several group means.

The test uses variances to help determine if the means are
equal or not.

Three basic assumptions:
Each population from which a sample is taken is assumed
to be normal.
Each sample is randomly selected and independent.
The populations are assumed to have equal standard
deviations (or variances).


SSB = sum of squares between groups
SSW = sum of squares within groups
TSS = total sum of squares

TSS = SSW + SSB

F ratio = MSB / MSW

                             WITHIN                       BETWEEN
                       (data − group mean)       (group mean − overall mean)
data   group   mean      diff     squared           diff      squared
5.3      1     6.00     -0.70      0.490           -0.44       0.194
6.0      1     6.00      0.00      0.000           -0.44       0.194
6.7      1     6.00      0.70      0.490           -0.44       0.194
5.5      2     5.95     -0.45      0.203           -0.49       0.240
6.2      2     5.95      0.25      0.063           -0.49       0.240
6.4      2     5.95      0.45      0.203           -0.49       0.240
5.7      2     5.95     -0.25      0.063           -0.49       0.240
7.5      3     7.53     -0.03      0.001            1.09       1.195
7.2      3     7.53     -0.33      0.109            1.09       1.195
7.9      3     7.53      0.37      0.137            1.09       1.195
TOTAL                              1.757                       5.127
TOTAL/df                           0.251 (df = 7)              2.564 (df = 2)

overall mean: 6.44         F = 2.564 / 0.251 = 10.22
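A sketch verifying this worked example with scipy's one-way ANOVA:

```python
# One-way ANOVA on the three groups from the table above.
from scipy import stats

g1 = [5.3, 6.0, 6.7]
g2 = [5.5, 6.2, 6.4, 5.7]
g3 = [7.5, 7.2, 7.9]

f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # F ≈ 10.22
```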


If the Between Group Variation is significantly greater than the
Within Group Variation, then it is likely that there is a
statistically significant difference between the groups.

Analysis of Variance for days


Source DF SS MS F P
treatment 2 34.74 17.37 6.45 0.006
Error 22 59.26 2.69
Total 24 94.00


ANOVA

• The test statistic for ANOVA is an F-ratio, which is a ratio
  of two sample variances. In the context of ANOVA, the
  sample variances are called mean squares, or MS values.

• The top of the F-ratio, MSbetween, measures the size of mean
  differences between samples. The bottom of the ratio,
  MSwithin, measures the magnitude of differences that would
  be expected without any treatment effects.


ANOVA

• The F-ratio has the same basic structure as the independent-
  measures t statistic.

      obtained mean differences (including treatment effects)     MSbetween
F = ─────────────────────────────────────────────────────────── = ─────────
     differences expected by chance (without treatment effects)   MSwithin



ANOVA

The differences (or variance) between means can be caused
by two sources:
1. Treatment Effects: If the treatments have different effects,
   this could cause the mean for one treatment to be higher (or
   lower) than the mean for another treatment.
2. Chance or Sampling Error: If there is no treatment effect at
   all, you would still expect some differences between samples.
   Mean differences from one sample to another are an example
   of random, unsystematic sampling error.



ANOVA

• Within-Treatments Variability: MSwithin measures the size
  of the differences that exist inside each of the samples.

• Because all the individuals in a sample receive exactly the
  same treatment, any differences (or variance) within a
  sample cannot be caused by different treatments.


ANOVA

Thus, these differences are caused by only one source:

1. Chance or Error: The unpredictable differences that exist
   between individual scores are not caused by any
   systematic factors and are simply considered to be
   random chance or error.



ANOVA

• Considering these sources of variability, the structure of
  the F-ratio becomes:

      treatment effect + chance/error
F = ──────────────────────────────────
             chance/error



ANOVA

• To supplement the hypothesis test, it is recommended that
  you calculate a measure of effect size.

• For an analysis of variance, the common technique for
  measuring effect size is to compute the percentage of
  variance that is accounted for by the treatment effects.


ANOVA

      SSbetween treatments
η² = ──────────────────────
           SStotal
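A sketch of η² for the earlier worked example (SSB and SSW taken from that table):

```python
# Eta squared: proportion of total variance explained by the treatment.
ssb, ssw = 5.127, 1.757   # from the worked ANOVA table above
eta_sq = ssb / (ssb + ssw)
print(f"eta^2 = {eta_sq:.2f}")  # ≈ 0.74
```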



Product Moment Correlation

• The product moment correlation, r, summarizes the
  strength of association between two metric (interval or ratio
  scaled) variables, say X and Y.

• It is an index used to determine whether a linear or straight-
  line relationship exists between X and Y.

• Also known as the Pearson correlation coefficient.
  It is also referred to as simple correlation, bivariate
  correlation, or merely the correlation coefficient.
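A minimal sketch of r with scipy, on made-up metric data for X and Y:

```python
# Pearson product moment correlation on hypothetical (X, Y) pairs.
from scipy import stats

x = [2, 4, 5, 7, 8, 10]
y = [1, 4, 4, 6, 8, 9]
r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```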

Scatter Diagram


Simple Linear Regression


Simple Linear Regression

Y = α + βX + e

[Figure: scatter of (X, Y) points with fitted line]

Assumptions
• The mean or expected value of e is zero
• The relationship between X and Y is linear



Least squares criterion: min Σ(Yi − Ŷi)²

β (slope) = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

α (intercept) = Ȳ − βX̄
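A sketch of these formulas on made-up (x, y) data:

```python
# Least-squares slope and intercept, computed from the formulas above.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()
print(f"Y_est = {alpha:.2f} + {beta:.2f} X")
```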

Simple Linear Regression

[Figure: decomposition of a point (Xi, Yi) around the fitted line
Ŷ = 60 + 5X, showing Yi − Ȳ = (Yi − Ŷi) + (Ŷi − Ȳ)]


Total sum of squares: SST = Σ(Yi − Ȳ)²

Sum of squares due to regression
(explained deviation): SSR = Σ(Ŷi − Ȳ)²

Sum of squares due to error
(unexplained deviation): SSE = Σ(Yi − Ŷi)²

SST = SSR + SSE

Coefficient of Determination: how well the estimated
regression equation fits the data

R² = SSR / SST

Correlation coefficient = (sign of β) √(coefficient of determination)
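Continuing the least-squares sketch above (x, y, alpha, beta as defined there):

```python
# Decompose the variation in y and compute R^2.
y_est = alpha + beta * x

sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((y_est - y.mean()) ** 2)
sse = np.sum((y - y_est) ** 2)

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"R^2 = {ssr / sst:.4f}")  # close to 1 for this nearly linear data
```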


MSE = SSE / (n − k − 1)

MSR = SSR / k        (k = number of independent variables)

F ratio = MSR / MSE

If F > Fc: reject H0
If F < Fc: do not reject H0


Multiple Regression
How a dependent variable is related to two or more
independent variables

Y = α + β1X1 + β2X2 + β3X3 + β4X4 + … + e

Mean(Y) = α + β1X1 + β2X2 + β3X3 + β4X4 + …

Multiple coefficient of determination: R² = SSR / SST

Adjusted multiple coefficient of determination:
Ra² = R² − k(1 − R²) / (n − k − 1)
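A sketch of the adjusted R² formula using Model 2 of the SPSS output below (n = 468 cases, k = 4 predictors; R² taken as SSR/SST):

```python
# Adjusted R-squared from the slide's formula.
ssr, sst = 29.948, 210.493   # Model 2 in the output below
n, k = 468, 4

r2 = ssr / sst
adj_r2 = r2 - k * (1 - r2) / (n - k - 1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")  # 0.142, 0.135
```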

Model           Sum of Squares   df   Mean Square      F       Sig.
1  Regression        4.272         3      1.424       3.204   .023(a)
   Residual        206.221       464       .444
   Total           210.493       467
2  Regression       29.948         4      7.487      19.200   .000(b)
   Residual        180.545       463       .390
   Total           210.493       467

a  Predictors: (Constant), TENURE, GENDER, SALARY
b  Predictors: (Constant), TENURE, GENDER, SALARY, HAPPINESS
c  Dependent Variable: PERFORMANCE


Testing the assumptions for Regression

Normality (interval level variables)
  – Skewness & kurtosis must lie within acceptable limits
    (-1 to +1)
• How to test?
  – You can examine a histogram, but SPSS also provides
    procedures, and these have convenient rules that can be
    applied (see following slides)
• If condition violated?
  – The regression procedure can overestimate significance, so
    you should add a note of caution to the interpretation of
    results (increases Type I error rate)
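A sketch of the skewness/kurtosis screen with scipy, on simulated data (the -1 to +1 rule of thumb is the one quoted above):

```python
# Normality screen: sample skewness and (excess) kurtosis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=200)   # stand-in for an interval-level variable

print(f"skew = {stats.skew(x):.2f}, kurtosis = {stats.kurtosis(x):.2f}")
# Both within -1 to +1 suggests the normality assumption is tenable.
```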

Testing the assumptions for Regression

• Linearity & homoscedasticity for interval level
  variables
• How to test?
  – Scatterplot
• If condition violated?
  – Can underestimate significance


The scatterplot for evaluating linearity

• Homoscedasticity refers to the assumption that the
  dependent variable exhibits similar amounts of
  variance across the range of values of an
  independent variable.


The scatterplot for evaluating homoscedasticity

Multicollinearity

• Tolerance: the amount of variability in the selected independent
  variable not explained by the other independent variables

Variance Inflation Factor = 1 / Tolerance
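A sketch of tolerance and VIF on simulated predictors: regress one IV on the others, take that R², and tolerance is 1 − R² (numpy's least squares stands in for the SPSS collinearity diagnostics):

```python
# Tolerance and VIF for predictor x1 given x2 and x3.
import numpy as np

rng = np.random.default_rng(1)
x2 = rng.normal(size=100)
x3 = rng.normal(size=100)
x1 = 0.8 * x2 + 0.1 * x3 + rng.normal(scale=0.5, size=100)  # collinear with x2

X = np.column_stack([np.ones(100), x2, x3])
coef, *_ = np.linalg.lstsq(X, x1, rcond=None)
resid = x1 - X @ coef
r2 = 1 - resid.var() / x1.var()

tolerance = 1 - r2
print(f"Tolerance = {tolerance:.2f}, VIF = {1 / tolerance:.2f}")
```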


Transformations

• Three common transformations that we use: the
  logarithmic transformation, the square root
  transformation, and the inverse transformation.
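A one-line sketch of each transformation with numpy:

```python
# Logarithmic, square root, and inverse transformations.
import numpy as np

x = np.array([1.0, 4.0, 9.0, 16.0, 25.0])
print(np.log(x))   # logarithmic
print(np.sqrt(x))  # square root
print(1.0 / x)     # inverse
```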

Scale Development

Item Generation

Scale Purification

Reliability and Validity analysis


Factor Analysis

Condense (summarize) the information contained in
several items into a smaller set of new, composite
dimensions or variates (factors) with a minimum loss of
information
Identifying structure through data summarization

Data Reduction

Structure of relationships may exist among items or
among respondents

Visit to a Bank


Assumptions in Factor Analysis

Ensure that the data matrix has sufficient correlations to justify
the application of factor analysis.

Bartlett test of Sphericity: statistical probability that the
correlation matrix has significant correlations at least among
some of the items/variables.

Measure of Sampling Adequacy (0 to 1): degree of
intercorrelations among the items/variables.

MSA
.80 and above    Meritorious
.70 and above    Middling
.60 and above    Mediocre
below .50        Unacceptable

Visit to a Bank

Respondent   Item 1   Item 2   Item 3   …   …   Item 18
             5        4        3        6   4   3


Correlation Matrix for Item 1 through Item 18

          2     3     …     …     18
Item 1   .87   .12   .11   .92   .08
Item 2         .07   .17   .89   .13
Item 3               .86   .10   .88
…                          .14   .91
Item 17                          .06



         Component
Item     1       2       3
1 -.542 .216 .643
2 -.553 .212 .604
3 -.441 .248 .637
4 -.284 .591 -.213
5 -.317 .622 -.192
6 -.247 .582 -.201
7 -.336 .546 -.223
8 -.388 .547 -.230
9 .600 .093 .034
10 .632 .012 -.046
11 .716 .183 .014
12 .793 .158 .080
13 .760 .277 .044
14 .742 .132 .120
15 .711 .230 .172
16 .694 .216 .130
17 .693 .207 .154
18 .710 .230 .172

Criteria for number of Factors to Extract


r      r²
.20    .04
.30    .09
.70    .49
.75    .5625
.80    .64



Criteria for number of Factors to Extract

The Kaiser criterion: retain only factors with eigenvalues
greater than 1.
The eigenvalue for a given factor measures the variance
in all the variables/items which is accounted for by
that factor.
If a factor has a low eigenvalue, then it is contributing
little to the explanation of variance in the
variables/items and may be ignored as redundant with
more important factors.

Initial Eigenvalues

Component   Total   % of Variance   Cumulative %
1 6.301 35.007 35.007
2 2.181 12.117 47.124
3 1.535 8.529 55.654
4 .856 4.754 60.407
5 .761 4.229 64.636
6 .727 4.041 68.677
7 .705 3.919 72.596
8 .633 3.514 76.110
9 .599 3.327 79.437
10 .544 3.022 82.459
11 .493 2.740 85.199
12 .474 2.631 87.830
13 .443 2.460 90.290
14 .418 2.323 92.613
15 .373 2.070 94.683
16 .335 1.859 96.542
17 .318 1.767 98.309
18 .304 1.691 100.000
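A sketch applying the Kaiser criterion to the eigenvalues above:

```python
# Kaiser criterion: retain components with eigenvalue > 1.
eigenvalues = [6.301, 2.181, 1.535, 0.856, 0.761, 0.727, 0.705, 0.633,
               0.599, 0.544, 0.493, 0.474, 0.443, 0.418, 0.373, 0.335,
               0.318, 0.304]
retained = [ev for ev in eigenvalues if ev > 1]
print(f"Retain {len(retained)} factors: {retained}")  # 3 factors
```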


Visit to a Bank

[Figure: unrotated factor plot of the items, labeled A through H]


Rotation does not change the amount of variance
accounted for but simply redistributes the variance
across the factors to facilitate interpretation.

Rotation methods:
• Orthogonal: Varimax, Quartimax, Equimax
• Oblique

[Figure: rotated factor axes with the items A through H]

THANK YOU

