Next: The Conditional Logit model
Up: Multinomial and Ordered Logit
Previous: Multinomial and Ordered Logit
Contents
- Let's return to our example from last semester where we used years of education to predict abortion attitudes. Respondents were asked whether they strongly disagree (1), disagree (2), had no opinion (3), agree (4), or strongly agree (5) that a woman should be able to seek an abortion for any reason. Previously we treated these responses as integer scores and used OLS regression to get:
| Variable | Coefficient | SE |
|---|---|---|
| Intercept | 1.912 | 0.188 |
| Education | 0.079 | 0.014 |
- We now realize that it is a bit crude to treat what is clearly a categorical response as a continuous response. We know how to put a dichotomous outcome on the left-hand side (logistic regression), and we know how to model association when both variables are categorical (log-linear models), but how do we deal with a polytomous dependent variable?
- What if we broke this question up into four different contrasts?
- For five categories, create four contrasts: (1) strongly disagree vs. disagree, (2) strongly disagree vs. no opinion, (3) strongly disagree vs. agree, and (4) strongly disagree vs. strongly agree. (Notice that I kept one of the categories in each contrast the same.)
- For each contrast run a logistic regression on the individual-level data with the same independent variables: in our case years of education. Be sure to only use individuals who fall into one of the categories being contrasted.
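- The results will tell you how education affects the likelihood of being in each category vs. the reference (strongly disagree). Formally, you will have a set of coefficients $\beta_{0j}, \beta_{1j}$ for the $j$th contrast. Then,
$$\log\left(\frac{\pi_{ij}}{\pi_{i1}}\right) = \beta_{0j} + \beta_{1j} x_i, \qquad j = 2, \ldots, 5,$$
where $\pi_{ij}$ is the probability that individual $i$ falls in category $j$, category 1 is the reference (strongly disagree), and $x_i$ is years of education.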
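The contrast-by-contrast procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey data are not available here, so the sketch simulates synthetic respondents from a hypothetical multinomial logit (the `TRUE` coefficients, the 8 to 20 year education range, and all function names are assumptions, not part of the original analysis) and then fits one binary logistic regression per contrast, each time keeping only the respondents in the two categories being contrasted.

```python
import math
import random

# ASSUMPTION: hypothetical (intercept, slope) pairs used only to simulate data.
# Category 1 (strongly disagree) is the reference and is fixed at (0, 0).
TRUE = {2: (0.542, -0.049), 3: (-0.864, -0.003),
        4: (-0.708, 0.062), 5: (-2.131, 0.143)}

def simulate(n, seed=0):
    """Draw synthetic (education, response) pairs; not real survey data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        educ = rng.randint(8, 20)
        weights = [1.0] + [math.exp(a + b * educ) for a, b in TRUE.values()]
        response = rng.choices([1, 2, 3, 4, 5], weights=weights)[0]
        data.append((educ, response))
    return data

def fit_binary_logit(xs, ys, iterations=25):
    """Newton-Raphson for logit P(y = 1) = a + b * x with one covariate."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def fit_contrasts(data, reference=1):
    """One binary logit per contrast, using only the two categories involved."""
    results = {}
    for category in (2, 3, 4, 5):
        subset = [(x, y) for x, y in data if y in (reference, category)]
        xs = [x for x, _ in subset]
        ys = [1 if y == category else 0 for _, y in subset]
        results[category] = fit_binary_logit(xs, ys)
    return results

estimates = fit_contrasts(simulate(20000))
for category, (intercept, slope) in estimates.items():
    print(category, round(intercept, 3), round(slope, 3))
```

With a large simulated sample, each fitted slope should land close to the slope used to generate the data, which is the sense in which the separate regressions approximate the joint model.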
- If we do this for our data we get the following values:
| | Disagree vs. Strongly Disagree | No Opinion vs. Strongly Disagree | Agree vs. Strongly Disagree | Strongly Agree vs. Strongly Disagree |
|---|---|---|---|---|
| Intercept | 0.498 | -0.865 | -0.728 | -2.079 |
| Education | -0.046 | -0.003 | 0.063 | 0.139 |
- What are these parameters telling us?
- The intercept tells us the expected log-odds of falling into the given category vs. the reference category when a person has zero years of education.
- The slope tells us how the log-odds of falling into the given category vs. the reference change with every additional year of education.
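- It is clear that education does not simply increase tolerance uniformly at all levels of the dependent variable. Relative to strongly disagreeing, more education lowers the log-odds of disagreeing (-0.046) and raises the log-odds of agreeing (0.063) and strongly agreeing (0.139), so it also seems to create greater polarization.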
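Because the slopes are on the log-odds scale, exponentiating them gives odds ratios, which are often easier to read. A minimal sketch using the education slopes from the table of separate logistic regressions (the variable names are illustrative only):

```python
import math

# Education slopes (log-odds scale) for each category vs. strongly disagree,
# copied from the table of separate logistic regressions.
slopes = {
    "disagree": -0.046,
    "no opinion": -0.003,
    "agree": 0.063,
    "strongly agree": 0.139,
}

# exp(slope) is the multiplicative change in the odds per additional year.
odds_ratios = {category: math.exp(slope) for category, slope in slopes.items()}

for category, ratio in odds_ratios.items():
    print(f"{category:>15}: odds multiplied by {ratio:.3f} per year of education")
```

An odds ratio below 1 means that each additional year of education makes that category less likely relative to strongly disagree, and a ratio above 1 means it becomes more likely.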
- The model we have fit is not quite right because we don't correctly specify the error distribution of the polytomous dependent variable. We can do this more effectively by specifying a multinomial logit model, which estimates all four contrasts simultaneously. In practice, it produces results very similar to the separate logistic regressions:
| | Disagree vs. Strongly Disagree | No Opinion vs. Strongly Disagree | Agree vs. Strongly Disagree | Strongly Agree vs. Strongly Disagree |
|---|---|---|---|---|
| Intercept | 0.542 | -0.864 | -0.708 | -2.131 |
| Education | -0.049 | -0.003 | 0.062 | 0.143 |
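- I have already shown you the form of this equation. We are fitting the log-odds of membership in each category of the dependent variable vs. some baseline category as a linear function of covariates:
$$\log\left(\frac{\pi_{ij}}{\pi_{i1}}\right) = \mathbf{x}_i'\boldsymbol{\beta}_j$$
where $i$ indexes the $i$th individual, $j$ indexes the $j$th category of the dependent variable, and $\pi_{ij}$ is the probability that individual $i$ falls in category $j$. It is necessary to make one of the categories the baseline category (here $j = 1$, so that $\boldsymbol{\beta}_1 = 0$).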
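Since the baseline category has all coefficients fixed at zero, the fitted log-odds can be turned back into predicted probabilities by exponentiating each linear predictor and dividing by the sum over all five categories. The sketch below does this with the multinomial logit estimates from the table above; the function name and the choice of 14 years of education are illustrative assumptions.

```python
import math

# (intercept, education slope) for each category vs. the baseline,
# copied from the multinomial logit table.
COEFS = {
    "disagree": (0.542, -0.049),
    "no opinion": (-0.864, -0.003),
    "agree": (-0.708, 0.062),
    "strongly agree": (-2.131, 0.143),
}
BASELINE = "strongly disagree"

def predicted_probabilities(education):
    """Return P(category) for all five categories at a given education level."""
    # The baseline has linear predictor 0, so exp(0) = 1.
    exp_eta = {BASELINE: 1.0}
    for category, (intercept, slope) in COEFS.items():
        exp_eta[category] = math.exp(intercept + slope * education)
    total = sum(exp_eta.values())
    return {category: value / total for category, value in exp_eta.items()}

probs_14 = predicted_probabilities(14)  # 14 years is an arbitrary example
for category, p in probs_14.items():
    print(f"{category:>17}: {p:.3f}")
```

The probabilities always sum to one, and the ratio of any category's probability to the baseline's reproduces the exponentiated linear predictor for that contrast.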
- There are two important cautions in interpreting coefficients from a multinomial model:
- In our previous models, each covariate had only one coefficient. Now, with $J$ categories of the dependent variable, each covariate will have $J - 1$ coefficients: one for each contrast.
- The decision about which category to set as baseline is arbitrary. It will not affect the overall fit of the model, but will affect interpretation. To get the coefficient for the contrast between categories $j$ and $j'$, subtract their coefficients against the common baseline:
$$\beta_{j \text{ vs. } j'} = \beta_j - \beta_{j'}$$
For example, let's say we were interested in the effect of education on the log-odds of being in the agree vs. no opinion group:
$$0.062 - (-0.003) = 0.065$$
And the effect of education on the log-odds of being in the strongly agree vs. agree group is
$$0.143 - 0.062 = 0.081$$
This can also be done by rerunning the model with a different reference group.
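The same subtraction works for any pair of categories, so it is easy to tabulate. A minimal sketch, using the education slopes from the multinomial logit table (the helper name `contrast` is hypothetical):

```python
# Education slopes vs. the baseline (strongly disagree), from the
# multinomial logit table. The baseline's own slope is 0 by construction.
slopes = {
    "strongly disagree": 0.0,
    "disagree": -0.049,
    "no opinion": -0.003,
    "agree": 0.062,
    "strongly agree": 0.143,
}

def contrast(category, reference):
    """Education effect on the log-odds of `category` vs. `reference`."""
    return slopes[category] - slopes[reference]

agree_vs_no_opinion = contrast("agree", "no opinion")
strongly_agree_vs_agree = contrast("strongly agree", "agree")

print(round(agree_vs_no_opinion, 3))
print(round(strongly_agree_vs_agree, 3))
```

Note that this recovers only the point estimates. Standard errors for the new contrasts need the covariance between the coefficients, which is one reason to rerun the model with a different reference group instead.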
- The Independence of Irrelevant Alternatives (IIA) assumption
- The most serious assumption within the multinomial logit framework is the assumption of the independence of irrelevant alternatives.
- This assumption is that the relative odds between any two outcomes are independent of the number and nature of other outcomes being simultaneously considered.
- The clearest case of a violation of this property is when certain outcomes serve as substitutes for others.
- Let's say we were interested in individuals' transportation choices for commuting. What if we broke our transportation categories into four: red bus, blue bus, car, and train?
- Now let's suppose that everyone is equally distributed among these categories, so that:
| Choice | Red Bus | Blue Bus | Car | Train |
|---|---|---|---|---|
| Proportion | 0.25 | 0.25 | 0.25 | 0.25 |
- Let's say we removed blue buses as an option by repainting all of our blue buses red. If these are truly distinct categories, then the blue bus group should distribute evenly among the remaining categories:
| Choice | Red Bus | Car | Train |
|---|---|---|---|
| Proportion | 0.33 | 0.33 | 0.33 |
- What is more likely, however, is that the blue bus and red bus are perfect substitutes for one another, so that individuals who used the blue bus will now use the red bus:
| Choice | Red Bus | Car | Train |
|---|---|---|---|
| Proportion | 0.50 | 0.25 | 0.25 |
- Multinomial logistic regression assumes that none of the categories can serve as substitutes. If they can serve as substitutes, then the results of multinomial logistic regression might not be very realistic.
- There are methods discussed in the book for testing the IIA assumption. There are also more complex models which allow the researcher to assess the degree to which two categories are serving as substitutes for one another.
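The red bus / blue bus example can be checked numerically. Under IIA, removing an alternative leaves the odds between every remaining pair unchanged, which amounts to rescaling the remaining shares so they sum to one. The sketch below (function names are illustrative) compares that prediction with the perfect-substitutes outcome described above.

```python
def drop_alternative(shares, dropped):
    """IIA prediction: remove one option and rescale the rest proportionally."""
    remaining = {k: v for k, v in shares.items() if k != dropped}
    total = sum(remaining.values())
    return {k: v / total for k, v in remaining.items()}

def odds(shares, a, b):
    """Odds of choosing option a rather than option b."""
    return shares[a] / shares[b]

shares = {"red bus": 0.25, "blue bus": 0.25, "car": 0.25, "train": 0.25}

# What multinomial logit implies after the blue buses are repainted.
iia_shares = drop_alternative(shares, "blue bus")

# What happens if blue-bus riders all switch to the red bus instead.
substitute_shares = {"red bus": 0.50, "car": 0.25, "train": 0.25}

print("red bus vs. car odds, before:       ", odds(shares, "red bus", "car"))
print("red bus vs. car odds, IIA:          ", odds(iia_shares, "red bus", "car"))
print("red bus vs. car odds, substitution: ", odds(substitute_shares, "red bus", "car"))
```

The odds of red bus vs. car stay at 1 under IIA but double under substitution, which is exactly the pattern the model cannot represent.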
Aaron
2005-12-21