Nested Logit
Brad Jones
Department of Political Science
University of California, Davis
April 30, 2008
Jones POL 213: Research Methods

Nested Logit
- Interesting model that does not have the IIA property.
- Possible candidate model for structured choice situations.
- Conceptual example:
- J political parties a voter i could choose from.
- Say: Green, Workers, Social Dem., Moderate, CR, Extreme Right
- Models?
- Conditional logit or MNL?
- IIA property could be an issue.

Nested Logit
- IIA follows from assuming that the disturbances are independent and homoskedastic.
- Odds are assumed to remain the same if some alternative is removed.
- Problem: one left party is (possibly) a close substitute for another.
- If CD voters split their vote across two leftist parties, eliminating one from the choice set does not imply they will randomly distribute over the remaining choices.
- That is, they most likely will gravitate to the remaining leftist party.
- If so, odds ratios will change because of nonrandom redistribution.
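
The IIA point can be checked numerically. What follows is a minimal sketch in Python, assuming made-up utilities for the six parties; the numbers and the helper name mnl_probs are illustrative and do not come from the lecture. Under MNL, dropping one leftist party leaves the odds between any two remaining parties unchanged, and every remaining party's share grows by the same factor, which is exactly the "random redistribution" the bullets above call implausible.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit choice probabilities from a dict of utilities."""
    denom = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / denom for name, v in utilities.items()}

# Hypothetical systematic utilities for one voter (illustrative values only).
utilities = {
    "Green": 0.5, "Workers": 0.2, "Social Dem.": 1.0,
    "Moderate": 0.8, "CR": 0.3, "Extreme Right": -0.5,
}

full = mnl_probs(utilities)

# Drop one leftist party from the choice set and recompute.
reduced = mnl_probs({k: v for k, v in utilities.items() if k != "Workers"})

# Odds between two surviving parties, before and after the removal.
odds_full = full["Green"] / full["Social Dem."]
odds_reduced = reduced["Green"] / reduced["Social Dem."]

# Growth factor of a leftist party's share versus a far-right party's share.
scale_green = reduced["Green"] / full["Green"]
scale_right = reduced["Extreme Right"] / full["Extreme Right"]

print("odds Green : Social Dem.", odds_full, odds_reduced)
print("share growth factors    ", scale_green, scale_right)
```

The two odds agree, and so do the two growth factors: the model cannot let the Green party absorb more of the Workers vote than the Extreme Right does.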

Nested Logit
- Under NL (or MNNL), the idea is to group comparable alternatives and then structure the choice setting as a tree.
- Voter i decides to vote leftist, centrist, or rightist.
- Call this the top-level choice.
- Once this choice is made, the voter must decide which outcome to choose:
- Left: Green, Workers; Center: SD, Moderate; Right: CR, Extreme Right
- Basic result from conditional probability: $\Pr_{ij} = \Pr_{j|i}\,\Pr_i$
- J outcomes (i.e. parties) and i branches.

Nested Logit
- Conditional probability says the probability of the bottom-level choice is equal to the conditional probability of selecting j given branch i times the probability that branch i was selected.
- Two levels of probability because two levels of decisions.
- Consider the conditional probability statement, $\Pr_{j|i}$.
- Suppose we specify a utility model:
\[ U_{ij} = \alpha' x_{ij} + \beta' w_i \]
- As in the CL presentation, the $x_{ij}$ are covariates that can change over the choices (bottom level) and the $w_i$ are covariates that are attributes of the choice sets (top level).

Nested Logit
- The conditional probabilities can only be a function of the $x_{ij}$:
\[ \Pr_{j|i} = \frac{\exp(\alpha' x_{ij})\,\exp(\beta' w_i)}{\exp(\beta' w_i)\sum_{k=1}^{N_i}\exp(\alpha' x_{ik})} = \frac{\exp(\alpha' x_{ij})}{\sum_{k=1}^{N_i}\exp(\alpha' x_{ik})} \]
- The top-level probability is defined by first identifying what is sometimes called the inclusive value:
\[ I_i = \log\left(\sum_{k=1}^{N_i}\exp(\alpha' x_{ik})\right) \]
- The probability of branch i is then
\[ \Pr_i = \frac{\exp(\beta' w_i + \tau_i I_i)}{\sum_{m=1}^{C}\exp(\beta' w_m + \tau_m I_m)} \]

Nested Logit
- The inclusive value parameter, $\tau$, is the weight accorded each of the branches.
- Under CL (or MNL), we assume this weight is fixed at 1.
- Estimation is done via full information maximum likelihood:
\[ \log L = \sum_{i}^{N} \log\left(\Pr_{j|i}\,\Pr_i\right). \]
- Model has many parameters.
- It requires a lot of work to interpret.
- My job to show you how . . .
- Stata is actually quite good w/this model.
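
The claim that CL fixes the weight at 1 can be verified in a few lines. The derivation below is a sketch that uses only the definitions already given (the utility model, the conditional probability, the inclusive value, and the branch probability); no new symbols are introduced beyond $N_m$, the number of alternatives in branch m.

```latex
% Step 1: the conditional probability in terms of the inclusive value.
\Pr_{j|i} = \frac{\exp(\alpha' x_{ij})}{\sum_{k=1}^{N_i}\exp(\alpha' x_{ik})}
          = \frac{\exp(\alpha' x_{ij})}{\exp(I_i)}

% Step 2: set every \tau_m = 1 in the branch probability.
\Pr_i = \frac{\exp(\beta' w_i + I_i)}{\sum_{m=1}^{C}\exp(\beta' w_m + I_m)}

% Step 3: multiply; \exp(I_i) cancels, and
% \exp(I_m) = \sum_{k=1}^{N_m}\exp(\alpha' x_{mk}) expands the denominator.
\Pr_{ij} = \Pr_{j|i}\,\Pr_i
         = \frac{\exp(\alpha' x_{ij} + \beta' w_i)}
                {\sum_{m=1}^{C}\sum_{k=1}^{N_m}\exp(\alpha' x_{mk} + \beta' w_m)}
```

The last line is the conditional logit probability over all alternatives at once, so the nesting has no effect when every $\tau$ equals 1.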

Nested Logit: Illustration
- I'm going to continue with the Stata data set provided on the Stata website.
- We used it with conditional logit.
- Let's consider the data structure.

. list family_id restaurant chosen kids rating distance cost income in 1/21
+---------------------------------------------------------------------------------+
| family~d restaurant chosen kids rating distance cost income |
|---------------------------------------------------------------------------------|
1. | 1 Freebirds 1 1 0 1.245553 5.444695 39 |
2. | 1 MamasPizza 0 1 1 2.82493 6.19446 39 |
3. | 1 CafeEccell 0 1 2 4.21293 8.182085 39 |
4. | 1 LosNortenos 0 1 3 4.167634 9.861741 39 |
5. | 1 WingsNmore 0 1 2 6.330531 9.667909 39 |
|---------------------------------------------------------------------------------|
6. | 1 Christophers 0 1 4 10.19829 25.95777 39 |
7. | 1 MadCows 0 1 5 5.601388 28.99846 39 |
8. | 2 Freebirds 0 3 0 4.162657 5.26874 58 |
9. | 2 MamasPizza 0 3 1 2.865081 5.728618 58 |
10. | 2 CafeEccell 0 3 2 5.337799 7.054855 58 |
|---------------------------------------------------------------------------------|
11. | 2 LosNortenos 1 3 3 4.282864 10.78514 58 |
12. | 2 WingsNmore 0 3 2 8.133914 8.313948 58 |
13. | 2 Christophers 0 3 4 8.664631 21.2801 58 |
14. | 2 MadCows 0 3 5 9.119597 25.87567 58 |
15. | 3 Freebirds 1 3 0 2.112586 4.616315 30 |
|---------------------------------------------------------------------------------|
16. | 3 MamasPizza 0 3 1 2.215329 5.992166 30 |
17. | 3 CafeEccell 0 3 2 6.978715 7.980528 30 |
18. | 3 LosNortenos 0 3 3 5.117877 10.0605 30 |
19. | 3 WingsNmore 0 3 2 5.312941 8.76644 30 |
20. | 3 Christophers 0 3 4 9.551273 23.64499 30 |
|---------------------------------------------------------------------------------|
21. | 3 MadCows 0 3 5 5.539806 24.72128 30 |
+---------------------------------------------------------------------------------+

. nlogitgen type=restaurant(fast: Freebirds | MamasPizza,
family: CafeEccell | LosNortenos | WingsNmore, fancy: Christophers | MadCows)
This returns:
new variable type is generated with 3 groups
label list lb_type
lb_type:
1 fast
2 family
3 fancy
. nlogittree restaurant type <-GIVES US THE TREE STRUCTURE.

Type is the branch; restaurants are the "twigs."
tree structure specified for the nested logit model
top --> bottom
type restaurant
--------------------------
fast Freebirds
MamasPizza
family CafeEccell
LosNorte~s
WingsNmore
fancy Christop~s
MadCows

. nlogit chosen (restaurant= cost rating distance)
(type = incFast incFancy kidFast kidFancy), group(family_id) nolog
Nested logit estimates
Levels = 2 Number of obs = 2100
Dependent variable = chosen LR chi2(10) = 199.6293
Log likelihood = -483.9584 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
restaurant |
cost | -.0944352 .03402 -2.78 0.006 -.1611131 -.0277572<-These are the alpha parms.
rating | .1793759 .126895 1.41 0.157 -.0693338 .4280855
distance | -.1745797 .0433352 -4.03 0.000 -.2595152 -.0896443
-------------+----------------------------------------------------------------
type |
incFast | -.0287502 .0116242 -2.47 0.013 -.0515332 -.0059672 <-WHY DO I HAVE THESE?
incFancy | .0458373 .0089109 5.14 0.000 .0283722 .0633024 <-These are the beta parms.
kidFast | -.0704164 .1394359 -0.51 0.614 -.3437058 .2028729
kidFancy | -.3626381 .1171277 -3.10 0.002 -.5922041 -.1330721
-------------+----------------------------------------------------------------
(incl. value |
parameters) |
type |
/fast | 5.715758 2.332871 2.45 0.014 1.143415 10.2881 <-These are the tau parms.
/family | 1.721222 1.152002 1.49 0.135 -.5366608 3.979105
/fancy | 1.466588 .4169075 3.52 0.000 .6494642 2.283711
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(3)= 9.90 Prob > chi2 = 0.0194
------------------------------------------------------------------------------

For fun.
. nlogit chosen (restaurant= cost rating distance) (type = incFast incFancy kidFast kidFancy),
group(family_id) nolog ivc(fast=1, family=1, fancy=1) notree <---CONSTRAINING TAU TO 1
User-defined constraints:
IV constraints:
[fast]_cons = 1
[family]_cons = 1
[fancy]_cons = 1
Nested logit regression
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
restaurant |
cost | -.1367799 .0358479 -3.82 0.000 -.2070404 -.0665193
rating | .3066626 .1418291 2.16 0.031 .0286827 .5846424
distance | -.1977508 .0471653 -4.19 0.000 -.2901931 -.1053085
-------------+----------------------------------------------------------------
type |
incFast | -.0390182 .0094018 -4.15 0.000 -.0574454 -.020591
incFancy | .0407053 .0080405 5.06 0.000 .0249462 .0564644
kidFast | -.2398756 .1063674 -2.26 0.024 -.4483517 -.0313994
kidFancy | -.3893868 .1143797 -3.40 0.001 -.6135669 -.1652067
-------------+----------------------------------------------------------------
(incl. value |
parameters) |
type |
/fast | 1 . . . . .
/family | 1 . . . . .
/fancy | 1 . . . . .
------------------------------------------------------------------------------

Constraining tau=1 should recover conditional logit:
. clogit chosen cost rating dist incFast incFancy kidFast kidFancy, group(family_id)
Conditional (fixed-effects) logistic regression Number of obs = 2100
LR chi2(7) = 189.73
Prob > chi2 = 0.0000
Log likelihood = -488.90834 Pseudo R2 = 0.1625
------------------------------------------------------------------------------
chosen | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cost | -.1367799 .0358479 -3.82 0.000 -.2070404 -.0665193
rating | .3066622 .1418291 2.16 0.031 .0286823 .584642
distance | -.1977505 .0471653 -4.19 0.000 -.2901927 -.1053082
incFast | -.0390183 .0094018 -4.15 0.000 -.0574455 -.0205911
incFancy | .0407053 .0080405 5.06 0.000 .0249462 .0564644
kidFast | -.2398757 .1063674 -2.26 0.024 -.448352 -.0313994
kidFancy | -.3893862 .1143797 -3.40 0.001 -.6135662 -.1652061
-----------------------------------------------------------------------------
(And it does; verify from previous slide)
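
The two log likelihoods reported so far also let us reproduce the LR test printed under the unconstrained model. This is a minimal sketch in Python, assuming the test statistic is twice the log-likelihood difference with 3 degrees of freedom (one per constrained tau); the helper name chi2_sf_3df is mine, and it uses the closed-form upper tail of a chi-square with 3 degrees of freedom rather than a statistics library.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

ll_nested = -483.9584    # unconstrained nested logit, from the output above
ll_clogit = -488.90834   # conditional logit, i.e. all three tau fixed at 1

# Likelihood-ratio statistic: twice the gain in log likelihood from freeing the taus.
lr_stat = 2.0 * (ll_nested - ll_clogit)
p_value = chi2_sf_3df(lr_stat)

print(round(lr_stat, 2), round(p_value, 4))
```

Up to rounding, the result should agree with the "LR test of homoskedasticity (iv = 1)" line in the nested logit output.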

But since the LR test rejects tau = 1 (so IIA doesn't hold), we should continue with the unconstrained
nested logit.
Nested logit regression

------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
restaurant |
cost | -.0944352 .03402 -2.78 0.006 -.1611131 -.0277572
rating | .1793759 .126895 1.41 0.157 -.0693338 .4280855
distance | -.1745797 .0433352 -4.03 0.000 -.2595152 -.0896443
-------------+----------------------------------------------------------------
type |
incFast | -.0287502 .0116242 -2.47 0.013 -.0515332 -.0059672
incFancy | .0458373 .0089109 5.14 0.000 .0283722 .0633024
kidFast | -.0704164 .1394359 -0.51 0.614 -.3437058 .2028729
kidFancy | -.3626381 .1171277 -3.10 0.002 -.5922041 -.1330721
-------------+----------------------------------------------------------------
(incl. value |
parameters) |
type |
/fast | 5.715758 2.332871 2.45 0.014 1.143415 10.2881
/family | 1.721222 1.152002 1.49 0.135 -.5366608 3.979105
/fancy | 1.466588 .4169075 3.52 0.000 .6494642 2.283711
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(3)= 9.90 Prob > chi2 = 0.0194
------------------------------------------------------------------------------

- There are clearly many parameters here.
- Let's figure out what all of this means.
- I'm going to make use of Stata's predict options to back out various quantities.
- Note, any of these quantities could be retrieved by hand using the functions from above.

- predict pb will return the probability of choosing restaurant j.
- predict p1, p1 will return the probability of branch i.
- predict condpb, condpb will return $\Pr_{j|i}$.
- predict xbb, xbb will return the linear prediction for the bottom-level choice.
- predict xb1, xb1 will return the linear prediction for the top-level choice.
- predict ivb, ivb will return the inclusive value.

. list family_id chosen pb p1 condpb restaurant type in 1/14
+----------------------------------------------------------------------------+
| family~d chosen pb p1 condpb restaurant type |
|----------------------------------------------------------------------------|
1. | 1 1 .0831245 .1534534 .5416919 Freebirds fast |
2. | 1 0 .070329 .1534534 .4583081 MamasPizza fast |
3. | 1 0 .2763391 .7266538 .3802899 CafeEccell family |
4. | 1 0 .284375 .7266538 .3913486 LosNortenos family |
5. | 1 0 .1659397 .7266538 .2283615 WingsNmore family |
|----------------------------------------------------------------------------|
6. | 1 0 .0399215 .1198928 .3329766 Christophers fancy |
7. | 1 0 .0799713 .1198928 .6670234 MadCows fancy |
8. | 2 0 .01176 .0286579 .4103599 Freebirds fast |
9. | 2 0 .0168978 .0286579 .5896401 MamasPizza fast |
10. | 2 0 .2942401 .7521651 .3911909 CafeEccell family |
|----------------------------------------------------------------------------|
11. | 2 1 .2975767 .7521651 .3956268 LosNortenos family |
12. | 2 0 .1603483 .7521651 .2131824 WingsNmore family |
13. | 2 0 .1277234 .219177 .582741 Christophers fancy |
14. | 2 0 .0914536 .219177 .417259 MadCows fancy |
+-------------------------------------------------------------------------------+
| family~d chosen xbb xb1 ivb restaurant type |
|-------------------------------------------------------------------------------|
1. | 1 1 -.731619 -1.191674 -.1185611 Freebirds fast |
2. | 1 0 -.8987747 -1.191674 -.1185611 MamasPizza fast |
3. | 1 0 -1.149417 0 -.1825957 CafeEccell family |
4. | 1 0 -1.120752 0 -.1825957 LosNortenos family |
5. | 1 0 -1.659421 0 -.1825957 WingsNmore family |
|-------------------------------------------------------------------------------|
6. | 1 0 -3.514237 1.425016 -2.414554 Christophers fancy |
7. | 1 0 -2.819484 1.425016 -2.414554 MadCows fancy |
8. | 2 0 -1.22427 -1.878761 -.3335493 Freebirds fast |
9. | 2 0 -.8617923 -1.878761 -.3335493 MamasPizza fast |
10. | 2 0 -1.239346 0 -.3007865 CafeEccell family |
|-------------------------------------------------------------------------------|

11. | 2 1 -1.22807 0 -.3007865 LosNortenos family |
12. | 2 0 -1.846394 0 -.3007865 WingsNmore family |
13. | 2 0 -2.804756 1.570648 -2.264743 Christophers fancy |
14. | 2 0 -3.138791 1.570648 -2.264743 MadCows fancy |
+-------------------------------------------------------------------------------+

Where do the numbers come from?
xbb: Linear prediction for the bottom level
It's a function of the covariates cost, rating, and distance.

For the first observation, we see this is:
. display _b[cost]*cost+_b[rating]*rating+_b[distance]*distance
-.73161902
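
The same arithmetic can be done outside Stata. This is a minimal sketch in Python that hard-codes the three bottom-level coefficients from the output and the covariates for family 1's two fast-food rows from the data listing; the helper name linear_prediction is mine. Because the coefficients are typed in at displayed precision, the results match Stata's xbb only to about six decimals.

```python
# Bottom-level coefficients (the alphas) from the nested logit output.
alpha = {"cost": -0.0944352, "rating": 0.1793759, "distance": -0.1745797}

# Covariates for family 1's two fast-food options, from the data listing.
rows = {
    "Freebirds":  {"cost": 5.444695, "rating": 0, "distance": 1.245553},
    "MamasPizza": {"cost": 6.19446,  "rating": 1, "distance": 2.82493},
}

def linear_prediction(coefs, x):
    """Sum of coefficient times covariate over the named covariates."""
    return sum(coefs[name] * x[name] for name in coefs)

xbb = {restaurant: linear_prediction(alpha, x) for restaurant, x in rows.items()}

for restaurant, value in xbb.items():
    print(restaurant, round(value, 6))
```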
---------------
condpb: Conditional probability of restaurant j given branch i (from the equation on an earlier slide):
. display exp(-.731619)/(exp(-.731619)+exp(-.8987747))
.54169189
for "Freebirds" and
. display exp(-.8987747)/(exp(-.731619)+exp(-.8987747))
.45830811
for "MamasPizza."
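
The fast-food branch has only two twigs, so it is worth repeating the calculation for a branch with three. This is a minimal sketch in Python for family 1's "family" branch, using the xbb values from the listing; the helper name conditional_probs is mine.

```python
import math

def conditional_probs(xbb_by_twig):
    """Pr(j | i): a logit over the twigs inside one branch, using only xbb."""
    denom = sum(math.exp(v) for v in xbb_by_twig.values())
    return {twig: math.exp(v) / denom for twig, v in xbb_by_twig.items()}

# xbb for family 1's three "family" restaurants, from the predict listing.
xbb_family_branch = {
    "CafeEccell": -1.149417,
    "LosNortenos": -1.120752,
    "WingsNmore": -1.659421,
}

condpb = conditional_probs(xbb_family_branch)

for twig, p in condpb.items():
    print(twig, round(p, 6))
```

The three values should agree with the condpb column for rows 3 to 5 of the listing, and they sum to one within the branch.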
-----------------
xb1: Linear prediction for branch i
This is the linear prediction for the top-level model (or the branches):
. display -.0287502*incFast + .0458373*incFancy + -.0704164*kidFast + -.3626381*kidFancy
-1.1916742
(The parms are the betas from the model output).
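
To see why xb1 is 0 for the family branch, it helps to write out the top-level covariates. This is a sketch in Python under one assumption: incFast, incFancy, kidFast, and kidFancy are income and kids multiplied by a dummy for the row's branch, with "family" as the omitted base. The slides do not show how these variables were built, but this construction reproduces the xb1 column for both families listed. The function names are mine; the coefficients are the top-level ones (the betas) from the output.

```python
# Top-level coefficients (the betas) from the nested logit output.
beta = {"incFast": -0.0287502, "incFancy": 0.0458373,
        "kidFast": -0.0704164, "kidFancy": -0.3626381}

def branch_covariates(branch, income, kids):
    """ASSUMED construction: income and kids interacted with branch dummies.
    'family' is the base category, so all four terms are zero there."""
    return {
        "incFast": income * (branch == "fast"),
        "incFancy": income * (branch == "fancy"),
        "kidFast": kids * (branch == "fast"),
        "kidFancy": kids * (branch == "fancy"),
    }

def xb1(branch, income, kids):
    """Linear prediction for the top-level (branch) choice."""
    w = branch_covariates(branch, income, kids)
    return sum(beta[name] * w[name] for name in beta)

# Family 1 has income 39 and 1 kid; family 2 has income 58 and 3 kids.
family1 = {b: xb1(b, income=39, kids=1) for b in ("fast", "family", "fancy")}
family2 = {b: xb1(b, income=58, kids=3) for b in ("fast", "family", "fancy")}

print(family1)
print(family2)
```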

---------------

OK. Now what about the "inclusive value parameters"?
These parameters essentially give us the "weight" the
chooser ascribes to each branch. Under conditional logit, this weight is assumed
to be uniform and therefore 1. We see in our model that these parameters are not
jointly 1 (the LR test rejects this, which provides evidence in favor of the nested logit model).
Above, I refer to these parameters as the tau. The question at
hand now is where do the I (the inclusive values, "ivb") come from? For the first family in the data set, note the following:
. display log(exp( -.731619)+exp(-.8987747))
-.1185611
. display log(exp( -1.149417)+exp(-1.120752)+exp(-1.659421))
-.18259554
. display log(exp( -3.514237)+exp(-2.819484))
-2.4145539
What do the numbers represent? The numbers in parentheses are
our linear predictions for the "bottom level" choices, that is,
the "xbb." Note, then, what the
inclusive value gives us: it summarizes the bottom-level utilities available within each
"branch," and the tau is the weight that summary receives in the top-level choice.
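
The three display commands are the same log-sum-exp applied branch by branch. This is a minimal sketch in Python that does all three at once for family 1, using the xbb values from the listing; the helper name inclusive_value is mine, and subtracting the maximum before exponentiating is a standard numerical precaution rather than something the lecture requires.

```python
import math

def inclusive_value(xbb_values):
    """log(sum(exp(xbb))) over the twigs in one branch, computed stably."""
    m = max(xbb_values)
    return m + math.log(sum(math.exp(v - m) for v in xbb_values))

# Bottom-level linear predictions for family 1, grouped by branch.
xbb_family1 = {
    "fast":   [-0.731619, -0.8987747],
    "family": [-1.149417, -1.120752, -1.659421],
    "fancy":  [-3.514237, -2.819484],
}

ivb = {branch: inclusive_value(values) for branch, values in xbb_family1.items()}

for branch, value in ivb.items():
    print(branch, round(value, 6))
```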

OK, almost done. Now what about the top-level probabilities
(i.e. the probability of choosing fast food, family, or fancy)?
In lecture, I give the function. To compute it directly, we do the following:
. display exp(-1.191674 + _b[/fast]*-.1185611)/(exp(-1.191674 + _b[/fast]*-.1185611) + exp(1.425016 + _b[/fancy]*-2.414554) + exp(0 + _b[/family]*-.1825957))
.15345345
Note where these numbers come from: they are the taus, the "ivb," and the "xb1."
In doing this exercise, we reproduce p1. Interpretation?
The probability of choosing a fast food restaurant is .15 for a person
with this covariate profile.
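
The display command gives only the fast-food branch. This is a minimal sketch in Python that computes all three branch probabilities for family 1 from the same ingredients: the estimated taus, the xb1 values, and the ivb values, all copied at displayed precision from the output and listings above.

```python
import math

# Estimated inclusive value parameters (the taus).
tau = {"fast": 5.715758, "family": 1.721222, "fancy": 1.466588}

# Family 1: top-level linear predictions and inclusive values from the listing.
xb1 = {"fast": -1.191674, "family": 0.0, "fancy": 1.425016}
ivb = {"fast": -0.1185611, "family": -0.1825957, "fancy": -2.414554}

# Branch index: beta'w_i + tau_i * I_i.
index = {b: xb1[b] + tau[b] * ivb[b] for b in tau}

# Logit over the three branches.
denom = sum(math.exp(v) for v in index.values())
p1 = {b: math.exp(v) / denom for b, v in index.items()}

for branch, p in p1.items():
    print(branch, round(p, 5))
```

The values should match the p1 column for family 1 up to rounding of the inputs.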

Finally, we can compute the "bottom-level" probability.
It is the simple conditional probability result. For the first observation, it is:
. display p1*condpb
.08312449
We could then "fill in the tree" for observation 1 (if we wanted to).
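
Filling in the tree for family 1 is a matter of repeating that multiplication for each of the seven restaurants. This is a minimal sketch in Python that takes p1 and condpb straight from the predict listing; the last line, the log-likelihood contribution of family 1, is my addition and follows from the log L formula given earlier, since this family chose Freebirds.

```python
import math

# Branch probabilities for family 1 (p1 column of the listing).
p1 = {"fast": 0.1534534, "family": 0.7266538, "fancy": 0.1198928}

# Conditional probabilities within each branch (condpb column).
condpb = {
    "fast":   {"Freebirds": 0.5416919, "MamasPizza": 0.4583081},
    "family": {"CafeEccell": 0.3802899, "LosNortenos": 0.3913486,
               "WingsNmore": 0.2283615},
    "fancy":  {"Christophers": 0.3329766, "MadCows": 0.6670234},
}

# Unconditional probability of each restaurant: Pr(j | i) * Pr(i).
pb = {twig: p1[branch] * cond
      for branch, twigs in condpb.items()
      for twig, cond in twigs.items()}

for twig, p in pb.items():
    print(twig, round(p, 7))

# Family 1 chose Freebirds, so its log-likelihood contribution is log(pb).
loglik_family1 = math.log(pb["Freebirds"])
print(round(loglik_family1, 4))
```

The seven probabilities sum to one, as they must when the tree covers every restaurant in the choice set.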

- So what would we get from this model if we fully interpreted it?
- The probability of choice j. That is, the unconditional probability.
- The conditional probability of choice j given the selection of branch i.
- The probability of choosing branch i.
- A direct test of the weight associated with each branch, given chooser attributes.
- Seems a useful empirical model for testing rational choice predictions.
- Data requirements are substantial, as is theory for nesting choices.
