
Calculating Phi: Correlation with Nominal Data

Phi is the correlation procedure that is used with nominal data. In order to use the Phi correlation, your data must be arranged in a bifurcated format. Bifurcation means that the sets of nominal data must be divided into two categories. (Bifurcation is an important point and I will return to it later.) Since nominal data are the least sophisticated type of data, there are very few statistical procedures that can be used to analyze relationships. The Phi correlation is one of the most widely used.

As our first example, consider the following data representing the numbers of males and females who passed or failed a math exam. Of the males in our fictitious experiment 63 passed and 25 failed, whereas 34 of the females passed and 53 failed. Our question becomes "Is gender correlated with success on the exam?" (Notice that gender has been bifurcated into males or females and success on the exam has been bifurcated into pass or fail.) The null-hypothesis would state that there is no relationship between gender and performance on the math exam. Notice that in the example I have chosen to let the letters A through D represent our numbers. The data are summarized below:

Math exam outcomes

             Passed    Failed
Males        63 (A)    25 (B)
Females      34 (C)    53 (D)

The formula for Phi is as follows:

\[ \phi = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}} \]

In our example,

\[ \phi = \frac{(63)(53) - (25)(34)}{\sqrt{(88)(87)(97)(78)}} = \frac{2489}{\sqrt{57925296}} = \frac{2489}{7610.87} = 0.327 \]
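The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `phi_coefficient` and its argument order are my own choices for illustration, with the four cells labeled A through D as in the table.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a two by two table laid out as rows (A, B) and (C, D)."""
    numerator = a * d - b * c
    # Product of the two row totals and the two column totals.
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a row or column total is zero; Phi is undefined")
    return numerator / denominator

# Math exam example: males 63 passed / 25 failed, females 34 passed / 53 failed.
phi = phi_coefficient(63, 25, 34, 53)
print(round(phi, 3))  # 0.327
```

Entering the females' row first, as in `phi_coefficient(34, 53, 63, 25)`, gives the same magnitude with the opposite sign.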

In a Phi correlation you can disregard the sign of the correlation. It has no meaning. Can you see how we could change the positive correlation to a negative simply by exchanging the males' and the females' rows in the above data? But, we do need to determine if our correlation is significant. To accomplish this test of significance, we need to transform our Phi to a Chi-square. Use the following formula to determine if Phi is significant.
\[ \chi^2 = N\phi^2 \]

N is the number of males and females in the study (In our case N = 175). Therefore,


\[ \chi^2 = (175)(0.32703^2) = 18.72 \]
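The conversion from Phi to Chi-square, and the comparison against the critical value, can be sketched in Python as follows. The function names and the constant name are hypothetical, chosen only for this illustration; the value 3.84 is the critical value for 1 degree of freedom at the .05 level used in this chapter.

```python
# Critical value used in this chapter: chi-square, 1 degree of freedom, .05 level.
CRITICAL_VALUE_DF1_05 = 3.84

def chi_square_from_phi(phi, n):
    """Transform a Phi correlation into a Chi-square: N times Phi squared."""
    return n * phi ** 2

def is_significant(chi_square, critical_value=CRITICAL_VALUE_DF1_05):
    """True when the Chi-square exceeds the critical value."""
    return chi_square > critical_value

# Math exam example: unrounded Phi of 0.3270324 with N = 175.
chi_square = chi_square_from_phi(0.3270324, 175)
print(round(chi_square, 2))        # 18.72
print(is_significant(chi_square))  # True
```

Because Phi is squared, its sign has no effect on the Chi-square.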

If your Chi-square is larger than 3.84, the critical value, then your Phi is significant at the .05 level of confidence. In our example, we exceeded 3.84. Therefore, we can conclude that we have found a significant relationship between gender and performance on the math exam, with males tending to perform better than females. Since we have a significant relationship (our Chi-square calculation exceeded 3.84), we have found that our correlation falls outside of the "zero" range on the correlation line and we can reject our null-hypothesis.[1] On the other hand, if our Chi-square calculation had failed to exceed 3.84, we would have concluded that our correlation falls within the "zero" range, and therefore, we would not have sufficient justification to reject the null-hypothesis. (Note that it is logically incorrect to say you accept your null-hypothesis.)

As another example, consider the following problem where we would like to determine the correlation between ethnic groups and pay grade within a company. For the purpose of demonstration suppose that our data appeared as follows, where the numbers represent frequencies:

                         Ethnic Groups
Pay Grades     Whites    Blacks    Hispanics    Others
Upper            35         5          2           3
Middle           20        10         12           8
Lower            15        25         11           9
Beginning         5         5          5           5

[1] The critical value of 3.84 is used since it represents the critical value for a chi-square with 1 degree of freedom. More about chi-square can be found in the chapter dealing with the test.

As you can most likely recognize, the data is not bifurcated, a necessary condition for the application of the Phi correlation. To bifurcate the data, you will need to combine the rows and columns of the original data matrix in a logical, understandable manner to arrive at bifurcated sets of nominal data. I have decided to bifurcate the data on the ethnic group variable into whites and nonwhites and on the pay grade variable into an upper category (upper plus middle combined) and a lower category (lower plus beginning combined). The reorganized data is presented below:

               Upper    Lower
Whites           55       20
Non-Whites       40       60

The Phi for this problem is 0.33 (you may want to check my calculations). The tentative interpretation of the correlation is that there appears to be a relationship between ethnic status and pay grades, with whites tending to be more represented in the upper pay grades and nonwhites in the lower pay grades. However, have we found a meaningful relationship or can we attribute the Phi value to random fluctuations in the data? The answer to this question is provided by the Chi-square analysis that you perform after obtaining the Phi. The Chi-square was 19.19 and the critical value that we needed to exceed was 3.84. Therefore, we can conclude that we have found a significant correlation and the null-hypothesis of no relationship can be rejected.

How to Perform the Calculations with Stats.exe

The program that performs the computations is called Phi. I will show you how to perform the calculations for the first example presented in this chapter. Select the "Phi" option from the main menu and then select the "Help" option. The program will respond as follows:

This is a program to compute a Phi correlation for a two by two table of bifurcated nominal data. Your data should be arranged like this.

:-------:-------:
:   A   :   B   :
:-------:-------:
:   C   :   D   :
:-------:-------:

Now, select the "Start Calculations" option from the menu bar.

Enter the value of A (enter 63)
Enter the value of B (enter 25)
Enter the value of C (enter 34)
Enter the value of D (enter 53)

Phi = .3270324
Chi-square = 18.71628
Critical value is 3.84
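If you do not have Stats.exe at hand, the same three results can be sketched in Python. This is not the Stats.exe program, only an illustration under my own naming (`phi_report` is a hypothetical function); here it is applied to the pay grade example as a check on the values reported earlier.

```python
import math

def phi_report(a, b, c, d, critical_value=3.84):
    """Return (Phi, Chi-square, significant) for a two by two table.

    The default critical value is the one used in this chapter:
    chi-square with 1 degree of freedom at the .05 level.
    """
    n = a + b + c + d
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    chi_square = n * phi ** 2
    return phi, chi_square, chi_square > critical_value

# Pay grade example: whites 55 upper / 20 lower, nonwhites 40 upper / 60 lower.
phi, chi_square, significant = phi_report(55, 20, 40, 60)
print(f"Phi = {phi:.2f}")                # Phi = 0.33
print(f"Chi-square = {chi_square:.2f}")  # Chi-square = 19.19
print(f"Significant: {significant}")     # Significant: True
```

Calling `phi_report(63, 25, 34, 53)` should agree with the program output shown above for the math exam example.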

Chapter Exercises

1. Suppose that a company wants to know if education is related to the tendency to be promoted. Education was separated into four categories (high school graduate, some college, college graduate, and graduate studies) and the tendency to be promoted was divided into three categories (promotable, but not promoted in the last year; not promotable and not promoted in the last year; and promoted in the last year). Following are the data. Perform the correlation, present the null-hypothesis, and discuss your conclusion.

                                 H.S.     Some       College    Grad.
                                 Grad.    College    Grad.      School
Promotable, not promoted           5        6          21         6
Not promotable, not promoted      25        8           2         1
Promoted                           3        4          35        25

2. Following are two sets of data: gender (males and females, which is a nominal set of data) and subjective ratings of the desire to have children (measured on a 100-point scale, with high scores indicating greater desire, which is an ordinal set of data). Correlate the two sets of data.

Gender    Desire      Gender    Desire      Gender    Desire
male        51        female      99        male        57
female      63        male        86        female      59
female      81        male        22        male        19
female      58        female      10        female      72
male        15        male        11        male         9
female      96        female      80        male        83
male        21        male        55        female      79
female     100        female      64        male        53
