
Chi-square (χ²) analysis
Types of measurement
Nominal Scale: categorical data, such as
hair colour (Brown, Blonde, Red)
Ordinal Scale: ranked data, such as good,
better, best
Interval data: continuous data with equal
intervals between points which may not
have an absolute zero, such as temperature
in Celsius and Fahrenheit
Ratio data: similar to interval data, but has
a true zero, such as length or mass
Chi-square (χ²) analysis
Used for the analysis of frequency data grouped in categorical
variables (e.g. gender (male or female), pregnancy (yes or no),
voting in elections, preferences, age group).
Tests whether there is a relationship between two categorical variables.
Allows comparison of the observed frequencies in those categories
to the frequencies expected by chance if the variables were unrelated.
The data are arranged in a contingency table, and the expected
frequencies are those that would occur if the patterning of the
frequency values were constant across all the columns and rows.
The null hypothesis is that there is no difference between the
observed and expected frequencies.
Consider the following experiment:
200 cats were trained to dance, with either food or affection as the
reward for dance-like behaviour (data from Field, 2009).

                 Reward
Dance?    Food   Affection   Total
Yes         28          48      76
No          10         114     124
Total       38         162     200

The first step is to calculate the expected frequencies


Calculation of expected frequencies
Expected = (Row total × Column total)/n

                    Reward
Dance?          Food   Affection   Total
Yes     Obs       28          48      76
        Exp    14.44       61.56
No      Obs       10         114     124
        Exp    23.56      100.44
Total             38         162   n = 200
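
The expected-frequency formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `expected_frequencies` is an assumption made for this example, not something from the slides. It should reproduce the Exp rows in the table (14.44, 61.56, 23.56, 100.44).

```python
# Observed counts from the cat example:
# rows = danced (Yes, No), columns = reward (Food, Affection).
observed = [[28, 48],
            [10, 114]]


def expected_frequencies(table):
    """Return (row total * column total) / n for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
```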
The next step is to calculate each cell's contribution to chi-square.
Calculation of contribution to Chi-Sq
Chi-Sq contribution = (Obs − Exp)²/Exp
                    Reward
Dance?          Food   Affection   Total
Yes     Obs       28          48      76
        Exp    14.44       61.56
        χ²     12.73        2.99
No      Obs       10         114     124
        Exp    23.56      100.44
        χ²      7.80        1.83
Total             38         162   n = 200
Total Chi-Sq = 12.73 + 2.99 + 7.80 + 1.83 = 25.35
Degrees of freedom = (r-1)(c-1) = (2-1)(2-1) = 1
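
As a rough check on the arithmetic, the cell contributions, their total and the degrees of freedom can be computed as below. This is a sketch with illustrative variable names. Note that working from unrounded values gives a total of about 25.36; the 25.35 quoted above comes from summing the contributions after rounding each to two decimal places.

```python
observed = [[28, 48],
            [10, 114]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell: (Obs - Exp)^2 / Exp
contributions = []
for i, row in enumerate(observed):
    contributions.append([])
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        contributions[i].append((obs - exp) ** 2 / exp)

chi_sq = sum(sum(row) for row in contributions)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print([[round(c, 2) for c in row] for row in contributions])
print(round(chi_sq, 2), df)
```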
Summary of analysis
χ²(1) = 25.35, P < 0.01
Critical values (from tables) of Chi-Sq for df = 1 are:
P = 0.05: 3.84
P = 0.01: 6.63
As the value of Chi-Sq is greater than the critical value
for P < 0.01, then H0 may be rejected.
Therefore there is a significant relationship between
the two variables.
Consequently there are some cells where there is a
significant difference between the observed and
expected frequencies.
But which ones?
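
The comparison with the tabulated critical values can also be sketched in code. The snippet below checks the total against both critical values and, as an extra not shown on the slides, estimates the P value directly. It relies on the fact that, for df = 1 only, chi-square is the square of a standard normal z-score, so the upper-tail probability is erfc(√(x/2)). The variable names are illustrative.

```python
import math

chi_sq = 25.35                        # total chi-square from the slides
critical = {0.05: 3.84, 0.01: 6.63}   # tabulated critical values for df = 1

# H0 is rejected at a given level when chi-square exceeds the critical value.
rejected = {alpha: chi_sq > value for alpha, value in critical.items()}
print(rejected)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))
print(p_value < 0.01)
```

For tables with more than one degree of freedom this shortcut does not apply, and a statistics library or printed tables would be needed instead.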
Standardized residuals
A significant chi-square test may be viewed similarly to a
significant ANOVA test: there is variation in the data but
the test doesn't identify where!
Analysis of the standardized residual for each cell allows
significant differences between observed and expected
frequencies to be identified.

Standardized residual = (observed − expected)/√expected
                          Reward
Dance?                  Food   Affection   Total
Yes     Obs               28          48      76
        Exp            14.44       61.56
        χ²             12.73        2.99
        St Residual    3.568      −1.728
No      Obs               10         114     124
        Exp            23.56      100.44
        χ²              7.80        1.83
        St Residual   −2.794       1.353
Total                     38         162   n = 200
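
A minimal sketch of the residual calculation for the cat data follows; the function name `standardized_residuals` is an assumption for this example. The sign of each residual shows whether the observed count is above (positive) or below (negative) the expected count.

```python
import math

observed = [[28, 48],
            [10, 114]]


def standardized_residuals(table):
    """Return (Obs - Exp) / sqrt(Exp) for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        residuals.append([])
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            residuals[i].append((obs - exp) / math.sqrt(exp))
    return residuals


residuals = standardized_residuals(observed)
for row in residuals:
    print([round(r, 3) for r in row])
```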
Interpretation of Standardized residuals
Each of the standardized residuals is a z-score.
Thus, if the value is outside ±1.96, then P<0.05
if the value is outside ±2.58, then P<0.01
if the value is outside ±3.29, then P<0.001
This allows the probability for each standardized residual
to be determined.
                          Reward
Dance?                  Food   Affection   Total
Yes     Obs               28          48      76
        Exp            14.44       61.56
        χ²             12.73        2.99
        St Residual    3.568      −1.728
                     P<0.001      P>0.05
No      Obs               10         114     124
        Exp            23.56      100.44
        χ²              7.80        1.83
        St Residual   −2.794       1.353
                      P<0.01      P>0.05
Total                     38         162   n = 200
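
The z-score thresholds can be applied to each residual as sketched below. The helper names `significance_label` and `two_tailed_p` and the cell labels are assumptions for this illustration. The two-tailed probability uses the standard normal relation P(|Z| > z) = erfc(|z|/√2).

```python
import math


def significance_label(z):
    """Map a standardized residual to the P band used on the slides."""
    size = abs(z)
    if size > 3.29:
        return "P<0.001"
    if size > 2.58:
        return "P<0.01"
    if size > 1.96:
        return "P<0.05"
    return "P>0.05"


def two_tailed_p(z):
    """Two-tailed probability of a z-score at least this extreme."""
    return math.erfc(abs(z) / math.sqrt(2))


residuals = {
    "Yes/Food": 3.568,
    "Yes/Affection": -1.728,
    "No/Food": -2.794,
    "No/Affection": 1.353,
}
labels = {cell: significance_label(z) for cell, z in residuals.items()}
for cell, z in residuals.items():
    print(cell, labels[cell], round(two_tailed_p(z), 4))
```

On this reading, only the Food column departs significantly from the expected frequencies: more cats than expected danced for food, and fewer than expected did not.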
Revision of z-scores and normal distribution
Each of the standardized residuals is a z-score
Z-scores are important because they are related to specific
percentage values of the standard normal distribution curve
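Normal distribution
[Figure: standard normal distribution curve, with the x-axis in standard deviations from the mean. 68.26% of values lie within ±1 SD of the mean, 95.44% within ±2 SD and 99.73% within ±3 SD. Areas on each side of the mean: 34.13% (0 to 1 SD), 13.59% (1 to 2 SD), 2.14% (2 to 3 SD), 0.13% (beyond 3 SD).]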


Normal Distribution
Mean = 3.8, Standard Deviation = 4.3
68.26% of population = 3.8 ± 4.3 = −0.5 to 8.1
95.44% of population = 3.8 ± (2 × 4.3) = −4.8 to 12.4
95% of population = mean ± (1.96 × standard deviation) = −4.628 to 12.228. Sometimes
referred to as the 95% confidence interval of the population
99% of population = mean ± (2.58 × standard deviation) = −7.294 to 14.894
99.9% of population = mean ± (3.29 × standard deviation) = −10.347 to 17.947
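
The ranges above all follow the same pattern, mean ± (z × standard deviation), which can be sketched as follows. The function name `central_range` is an assumption for this example.

```python
mean, sd = 3.8, 4.3


def central_range(mean, sd, z):
    """Return the interval mean - z*sd to mean + z*sd."""
    return mean - z * sd, mean + z * sd


# Coverage labels and z multipliers as given on the slide.
multipliers = [("68.26%", 1.0), ("95.44%", 2.0), ("95%", 1.96),
               ("99%", 2.58), ("99.9%", 3.29)]

ranges = {label: central_range(mean, sd, z) for label, z in multipliers}
for label, (low, high) in ranges.items():
    print(f"{label} of population: {low:.3f} to {high:.3f}")
```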
Chi-square (χ²) analysis
Consider the following table:

90 60 30
60 40 20
30 20 10

Here the patterning of the data is constant across all rows and
columns.
Because the patterning is constant the expected frequencies
are the same as the observed frequencies.
Obs    90    60    30    180
Exp    90    60    30
Obs    60    40    20    120
Exp    60    40    20
Obs    30    20    10     60
Exp    30    20    10
      180   120    60    n = 360
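
To see why a constant pattern gives expected frequencies identical to the observed ones, the same (row total × column total)/n rule can be applied to this table. This is a small sketch with illustrative variable names; because every row is in the same 3:2:1 proportion, each expected value equals its observed value and chi-square is zero.

```python
observed = [[90, 60, 30],
            [60, 40, 20],
            [30, 20, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum((obs - exp) ** 2 / exp
             for obs_row, exp_row in zip(observed, expected)
             for obs, exp in zip(obs_row, exp_row))

print(expected)
print(chi_sq)
```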
Now consider this table:

135 60 30
60 40 20
30 20 10

Here the patterning is not as constant across all rows and columns.
Because of this, some minor differences start to appear
between the observed and expected frequencies.
Obs    135     60     30    225
Exp    125   66.7   33.3
Obs     60     40     20    120
Exp   66.7   35.6   17.8
Obs     30     20     10     60
Exp   33.3   17.8    8.9
       225    120     60    n = 405
Obs    135     60     30    225
Exp    125   66.7   33.3
χ²    0.80   0.67   0.33
Obs     60     40     20    120
Exp   66.7   35.6   17.8
χ²    0.67   0.56   0.28
Obs     30     20     10     60
Exp   33.3   17.8    8.9
χ²    0.33   0.28   0.14
       225    120     60    n = 405
Now consider this table:

180 60 30
60 40 20
30 20 10

Here the patterning is even further removed from the
constant pattern seen earlier.
Consequently, even larger differences start to appear between
the observed and expected frequencies.
Obs    180    60    30    270
Exp    162    72    36
Obs     60    40    20    120
Exp     72    32    16
Obs     30    20    10     60
Exp     36    16     8
       270   120    60    n = 450
Obs    180    60    30    270
Exp    162    72    36
χ²       2     2     1
Obs     60    40    20    120
Exp     72    32    16
χ²       2     2     1
Obs     30    20    10     60
Exp     36    16     8
χ²       1     1   0.5
       270   120    60    n = 450
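
The three example tables can be compared side by side with a small general-purpose function. This is a sketch only; the function name `chi_square` and the table labels are assumptions for this illustration. The totals are not stated on the slides, but working from unrounded expected values they should come out at roughly 0, 4.05 and 12.5, each with (3-1)(3-1) = 4 degrees of freedom, showing how chi-square grows as the pattern departs further from constant.

```python
def chi_square(table):
    """Return (total chi-square, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            total += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return total, df


tables = {
    "constant pattern": [[90, 60, 30], [60, 40, 20], [30, 20, 10]],
    "slightly uneven": [[135, 60, 30], [60, 40, 20], [30, 20, 10]],
    "more uneven": [[180, 60, 30], [60, 40, 20], [30, 20, 10]],
}

results = {name: chi_square(table) for name, table in tables.items()}
for name, (total, df) in results.items():
    print(f"{name}: chi-square = {total:.2f}, df = {df}")
```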
Assumptions of Chi-square
Each person, item or entity contributes to one cell only;
the test is therefore not appropriate for repeated
measures designs.
Expected frequencies should be greater than 5.
For large tables it is acceptable to have up to 20% of
expected frequencies below 5.
Even in very large tables, no expected frequency should be below 1.
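
The expected-frequency assumptions can be screened before running the test. The sketch below reports the smallest expected frequency and the share of cells with an expected frequency below 5; the function name `check_expected_frequencies` is an assumption, and the second table is a made-up example included purely to show a violation.

```python
def check_expected_frequencies(table):
    """Summarise expected frequencies for the chi-square assumptions."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    cells = [r * c / n for r in row_totals for c in col_totals]
    below_5 = sum(1 for e in cells if e < 5)
    return {
        "min_expected": min(cells),
        "share_below_5": below_5 / len(cells),
    }


cats = [[28, 48], [10, 114]]   # cat data from the slides
sparse = [[3, 7], [9, 41]]     # hypothetical table, not from the slides

cats_check = check_expected_frequencies(cats)
sparse_check = check_expected_frequencies(sparse)
print(cats_check)
print(sparse_check)
```

For the cat data the smallest expected frequency is 14.44, so the assumption holds. In the hypothetical table one cell in four has an expected frequency of 2, which would call the test into question.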
