multinomial logit
The data for this exercise come from the 1991 General Social Survey. The categorical
dependent variable occ is coded as follows:
The independent variables are: educ is years of schooling; age is age in years; sexx
is coded 1 male, 0 female; rural is coded 1 if grew up in a rural area, 0 otherwise.
1. tab occ
2. mlogit occ,base(0)
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
_cons | .3659343 .0992281 3.688 0.000 .1714508 .5604177
---------+--------------------------------------------------------------------
2 |
_cons | .2137977 .1025124 2.086 0.037 .0128771 .4147183
------------------------------------------------------------------------------
(Outcome occ==0 is the comparison group)
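The coefficients above are on the log odds scale. In particular, they are the log odds
of being in occupation 1 versus 0 and 2 versus 0. Because this model has no predictors,
they should equal the logs of the ratios of the category counts reported by tab occ,
that is, ln(n1/n0) and ln(n2/n0).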
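As a rough check on the intercept-only output, the sketch below (in Python rather than
Stata, purely for the arithmetic) goes the other way: it exponentiates the two
intercepts to get the odds relative to occ 0 and converts them to the sample shares
they imply. The tab occ counts are not reproduced here, so the shares are derived from
the intercepts only; the variable names are illustrative.

```python
import math

# Intercepts copied from the intercept-only mlogit output (base category occ == 0)
b1 = 0.3659343  # log odds of occ 1 versus occ 0
b2 = 0.2137977  # log odds of occ 2 versus occ 0

# Exponentiating gives the odds relative to the base category,
# which in a model with no predictors are the count ratios n1/n0 and n2/n0
odds1 = math.exp(b1)
odds2 = math.exp(b2)

# Convert the odds to the implied share of each category
denom = 1.0 + odds1 + odds2
shares = {0: 1.0 / denom, 1: odds1 / denom, 2: odds2 / denom}

for cat in sorted(shares):
    print(f"occ {cat}: implied sample share = {shares[cat]:.4f}")
```

These shares should agree with the percentages from tab occ, up to rounding.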
3. mlogit occ educ, base(0)
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
educ | .2175129 .0495753 4.388 0.000 .120347 .3146788
_cons | -2.341483 .6221847 -3.763 0.000 -3.560943 -1.122024
---------+--------------------------------------------------------------------
2 |
educ | .7404903 .0630034 11.753 0.000 .6170059 .8639747
_cons | -9.937645 .8608307 -11.544 0.000 -11.62484 -8.250448
------------------------------------------------------------------------------
(Outcome occ==0 is the comparison group)
To get the coefficients on the odds ratio scale (Stata labels them relative risk ratios,
RRR), we just add the rrr option, like so:
4. mlogit occ educ, base(0) rrr
------------------------------------------------------------------------------
occ | RRR Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
educ | 1.242981 .0616212 4.388 0.000 1.127888 1.369819
---------+--------------------------------------------------------------------
2 |
educ | 2.096963 .1321158 11.753 0.000 1.853371 2.372572
------------------------------------------------------------------------------
(Outcome occ==0 is the comparison group)
The interpretation of the odds ratio is analogous to logistic regression. For
category 1, exp(.2175129)=1.242981, and similarly for category 2. This means that one
additional year of schooling multiplies the odds of being in occupation 1 rather than
0 by 1.243, i.e., one year of schooling increases the odds of being in category 1
instead of 0 by about 24%. Similarly, the odds of being in category 2 instead of 0
are more than doubled (multiplied by 2.097) for each one-year increase in schooling.
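The conversion from coefficient to odds ratio can be checked by hand. The following is
a minimal sketch in Python (not Stata), using the educ coefficients copied from the
output with occ 0 as the base category; the dictionary names are illustrative.

```python
import math

# educ coefficients from the mlogit output with base category occ == 0
b_educ = {1: 0.2175129, 2: 0.7404903}

rrr = {}
for cat, b in b_educ.items():
    rrr[cat] = math.exp(b)              # odds ratio for one more year of schooling
    pct_change = 100 * (rrr[cat] - 1)   # percent change in the odds
    print(f"occ {cat} vs 0: exp({b}) = {rrr[cat]:.6f} "
          f"({pct_change:.1f}% change in odds per year)")
```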
Hence, if one additional year of schooling increases the log odds of occ 2 instead of 0
by .7404, and increases the log odds of 1 instead of 0 by .2175, then it increases the
log odds of 2 versus 1 (taking occ 1 as the base category) by .7404-.2175=.5229.
To get the odds ratio, we just take exp(.5229)=1.687. Note that this is identical
(aside from rounding error) to the ratio of the odds ratio for category 2 to the odds
ratio for category 1 from the regression above with 0 as the base category:
2.097/1.243=1.687.
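The change of base category can be verified with a few lines of arithmetic. This is a
sketch in Python (not Stata) that uses only the coefficients reported with occ 0 as
the base; the variable names are illustrative.

```python
import math

# Coefficients from the model with base category occ == 0
b_educ_1, cons_1 = 0.2175129, -2.341483   # occ 1 versus 0
b_educ_2, cons_2 = 0.7404903, -9.937645   # occ 2 versus 0

# Contrast of occ 2 versus occ 1: subtract the two sets of coefficients
b_educ_2_vs_1 = b_educ_2 - b_educ_1
cons_2_vs_1 = cons_2 - cons_1

# The odds ratio is exp of the difference, which equals the ratio of odds ratios
or_2_vs_1 = math.exp(b_educ_2_vs_1)
ratio_of_ors = math.exp(b_educ_2) / math.exp(b_educ_1)

print(f"educ, 2 vs 1:  {b_educ_2_vs_1:.7f}")
print(f"_cons, 2 vs 1: {cons_2_vs_1:.6f}")
print(f"odds ratio:    {or_2_vs_1:.4f} (ratio of odds ratios: {ratio_of_ors:.4f})")
```

The two differences should match, up to rounding, the educ and _cons rows for
outcome 2 when the model is refit with occ 1 as the comparison group.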
5. mlogit occ educ, base(1)
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 |
educ | -.2175129 .0495753 -4.388 0.000 -.3146788 -.120347
_cons | 2.341483 .6221847 3.763 0.000 1.122024 3.560943
---------+--------------------------------------------------------------------
2 |
educ | .5229774 .0514263 10.169 0.000 .4221837 .6237711
_cons | -7.596161 .7404896 -10.258 0.000 -9.047494 -6.144828
------------------------------------------------------------------------------
(Outcome occ==1 is the comparison group)
Note that the education coefficient for the comparison of occupation 0 to occupation 1
is identical in magnitude but opposite in sign to the education coefficient for the
comparison of occ 1 to occ 0.
Now that you know how to recover coefficients and odds ratios by hand, here’s a
command that does it automatically and covers all possibilities:
6. listcoef educ
Odds comparing |
Group 1 - Group 2 | b z P>|z| e^b e^bStdX
------------------+---------------------------------------------
1 -2 | -0.52298 -10.169 0.000 0.5928 0.2415
1 -0 | 0.21751 4.388 0.000 1.2430 1.8056
2 -1 | 0.52298 10.169 0.000 1.6870 4.1403
2 -0 | 0.74049 11.753 0.000 2.0970 7.4758
0 -1 | -0.21751 -4.388 0.000 0.8045 0.5538
0 -2 | -0.74049 -11.753 0.000 0.4769 0.1338
----------------------------------------------------------------
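The b and e^b columns of this table can be rebuilt from the two base-0 coefficients
alone. The sketch below (Python, not Stata) sets the coefficient of the base category
to 0 and takes every pairwise difference. It does not reproduce e^bStdX, which also
needs the standard deviation of educ, and the names used are illustrative.

```python
import math
from itertools import permutations

# educ coefficients relative to occ 0; the base category's coefficient is 0 by construction
b = {0: 0.0, 1: 0.2175129, 2: 0.7404903}

contrasts = {}
for j, k in permutations(b, 2):
    diff = b[j] - b[k]                      # log odds of category j versus k per year of educ
    contrasts[(j, k)] = (diff, math.exp(diff))
    print(f"{j} - {k}: b = {diff:9.5f}   e^b = {math.exp(diff):.4f}")
```

Each reversed pair has the same b with the opposite sign, and its e^b is the
reciprocal of the original.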
Probability interpretations
How about computing the probability of being in each occupation for a given value of
schooling? To do this, ask Stata to compute the predicted probabilities with the
following command:
7. predict p0 p1 p2
Now to get these for each value of years of schooling, I did the following (which, by
the way, destroys your original data file):
educ p0 p1 p2 summ_p
1. 3 0.8438 0.1559 0.0004 1
2. 4 0.8127 0.1866 0.0008 1
3. 5 0.7768 0.2217 0.0015 1
4. 6 0.7359 0.2611 0.0030 1
5. 7 0.6899 0.3042 0.0059 1
6. 8 0.6385 0.3499 0.0115 1
7. 9 0.5817 0.3963 0.0220 1
8. 10 0.5192 0.4396 0.0412 1
9. 11 0.4506 0.4743 0.0751 1
10. 12 0.3763 0.4923 0.1314 1
11. 13 0.2977 0.4842 0.2181 1
12. 14 0.2194 0.4435 0.3371 1
13. 15 0.1485 0.3731 0.4784 1
14. 16 0.0919 0.2871 0.6210 1
15. 17 0.0525 0.2038 0.7437 1
16. 18 0.0281 0.1358 0.8360 1
17. 19 0.0144 0.0866 0.8990 1
18. 20 0.0072 0.0536 0.9392 1
Notice that for each value of education, the probabilities (as given by summ_p) sum to
1.
Here’s an example of computing the probabilities. I do this for the case of educ=16
years. As an exercise, you should try to compute some of the other probabilities at
other levels of education to make sure you know how.
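One way to carry out this calculation is sketched below in Python (not Stata), using
the coefficients from the model with occ 0 as the base category. The probability of
category j is exp(xb_j) divided by the sum of exp(xb_k) over all categories, with
xb_0 = 0 for the base category. The function name predicted_probs is hypothetical.

```python
import math

# (_cons, educ) coefficients from the model with base category occ == 0
COEF = {1: (-2.341483, 0.2175129), 2: (-9.937645, 0.7404903)}

def predicted_probs(educ):
    """Return {category: probability} at the given years of schooling."""
    # The linear predictor of the base category is 0, so exp(0) = 1
    exps = {0: 1.0}
    for cat, (cons, b_educ) in COEF.items():
        exps[cat] = math.exp(cons + b_educ * educ)
    total = sum(exps.values())
    return {cat: e / total for cat, e in exps.items()}

p16 = predicted_probs(16)
for cat in sorted(p16):
    print(f"educ=16, occ {cat}: {p16[cat]:.4f}")
```

Calling predicted_probs with other values between 3 and 20 should reproduce the
corresponding rows of the table, up to rounding.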
Note that the effect of a one-year change in schooling on the probability of, say,
being in occupation 2 depends on the value of schooling that you start from. This is
just like the binary case and is due to the fact that the probabilities are a
nonlinear function of schooling.
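To illustrate the nonlinearity, the sketch below (Python, not Stata) compares the
change in the probability of occupation 2 from one more year of schooling at two
starting points, 10 and 15 years. It uses the coefficients from the model with occ 0
as the base category; the function name prob_occ2 is hypothetical.

```python
import math

# (_cons, educ) coefficients from the model with base category occ == 0
COEF = {1: (-2.341483, 0.2175129), 2: (-9.937645, 0.7404903)}

def prob_occ2(educ):
    """Predicted probability of occ == 2 at the given years of schooling."""
    e1 = math.exp(COEF[1][0] + COEF[1][1] * educ)
    e2 = math.exp(COEF[2][0] + COEF[2][1] * educ)
    return e2 / (1.0 + e1 + e2)   # the 1.0 is exp(0) for the base category

# Effect of one additional year, starting from 10 years versus 15 years
delta_from_10 = prob_occ2(11) - prob_occ2(10)
delta_from_15 = prob_occ2(16) - prob_occ2(15)

print(f"10 -> 11 years: change in P(occ 2) = {delta_from_10:.4f}")
print(f"15 -> 16 years: change in P(occ 2) = {delta_from_15:.4f}")
```

The same one-year increase moves the probability much more from 15 years than from
10, even though the coefficient on educ is constant.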