Logistic Regression

The coefficients returned from a logistic regression model are log-odds ratios. They tell us how the log-odds of a "success" change with a one-unit change in the independent variable. Increasing the log-odds of a success increases its probability, and decreasing the log-odds decreases its probability. Therefore, the sign of the log-odds ratio indicates the direction of the relationship: + means a positive relationship between $ x_{i}$ and the probability of a success, and - means a negative relationship. To get an intuitive sense of how much things are changing, we need to exponentiate the log-odds ratio, which gives us the odds ratio itself. Let's return to our example from yesterday looking at survival on the Titanic by gender:

$\displaystyle \log \frac{p_{i}}{1-p_{i}}=\beta_{0}+\beta_{1}x_{i}
$

$\displaystyle \log \frac{p_{i}}{1-p_{i}}=-1.44+2.42x_{i}
$

The positive coefficient indicates that women ($ x_{i}=1$) were more likely to survive the Titanic than men ($ x_{i}=0$). But our coefficients relate to the log-odds of survival. Let's exponentiate both sides to see how they relate to the odds of survival.

$\displaystyle \frac{p_{i}}{1-p_{i}}=e^{-1.44}e^{2.42x_{i}}=(0.24)(11.25)^{x_{i}}
$

The odds for any individual are the product of a "baseline" odds and the "odds ratios" associated with their characteristics. The predicted odds for a man are:

$\displaystyle \frac{p_{i}}{1-p_{i}}=(0.24)(11.25)^{0}=0.24
$

The odds for a woman are:

$\displaystyle \frac{p_{i}}{1-p_{i}}=(0.24)(11.25)^{1}\approx 2.7
$
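
The arithmetic above can be checked with a short Python sketch. It uses the rounded coefficients quoted in the text (-1.44 and 2.42). The names `odds` and `probability` are illustrative only, and the conversion back to a probability, p = odds / (1 + odds), follows from the definition of odds.

```python
import math

# Rounded coefficients quoted in the text (intercept and female indicator).
B0, B1 = -1.44, 2.42


def odds(female):
    """Predicted odds of survival; female is 1 for women and 0 for men."""
    return math.exp(B0 + B1 * female)


def probability(female):
    """Convert the predicted odds back to a probability: p = odds / (1 + odds)."""
    o = odds(female)
    return o / (1 + o)


for label, female in [("men", 0), ("women", 1)]:
    print(f"{label}: odds = {odds(female):.2f}, probability = {probability(female):.2f}")

# The ratio of the two predicted odds recovers the odds ratio exp(B1).
print(f"odds ratio = {odds(1) / odds(0):.2f}")
```

With these rounded coefficients the sketch gives odds of roughly 0.24 for men and 2.66 for women, which correspond to survival probabilities of about 0.19 and 0.73. Small differences from hand calculations come from rounding the intermediate factors.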

For the log-odds ratios, a negative value indicates a negative relationship. But all odds-ratios are positive values. The distinction regarding a positive or negative relationship in the odds ratios is given by which side of 1 they fall on. 1 indicates no relationship. Less than one indicates a negative relationship and greater than one indicates a positive relationship.
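
As a small illustration of this rule, the sketch below exponentiates three log-odds ratios and reports which side of 1 the resulting odds ratio falls on. The coefficient values and the function name `direction` are hypothetical, chosen only to show one negative, one zero, and one positive case.

```python
import math


def direction(log_odds_ratio):
    """Classify the relationship by which side of 1 the odds ratio falls on."""
    odds_ratio = math.exp(log_odds_ratio)
    if odds_ratio > 1:
        return "positive"
    if odds_ratio < 1:
        return "negative"
    return "none"


# Hypothetical coefficients, not estimates from the Titanic data.
for coef in (-0.5, 0.0, 0.5):
    print(f"log-odds ratio {coef:+.1f} -> odds ratio {math.exp(coef):.3f} ({direction(coef)})")
```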

The interpretation is similar with continuous variables. Let's take the case of predicting survival by fare paid.

$\displaystyle \log \frac{p_{i}}{1-p_{i}}=-0.882+0.012x_{i}
$

Once again let's exponentiate this to get the results in terms of the odds of survival.

$\displaystyle \frac{p_{i}}{1-p_{i}}=e^{-0.882}e^{0.012x_{i}}=(0.414)(1.012)^{x_{i}}
$

Once again we have a multiplicative relationship. Let's take three cases, where the passenger paid zero, one, or two pounds for his/her ticket.

$\displaystyle \frac{p_{1}}{1-p_{1}}=(0.414)(1.012)^{0}=0.414$

$\displaystyle \frac{p_{2}}{1-p_{2}}=(0.414)(1.012)^{1}=0.414(1.012)$

$\displaystyle \frac{p_{3}}{1-p_{3}}=(0.414)(1.012)^{2}=0.414(1.012)(1.012)$

For the first person, the odds of survival are simply the exponential of the intercept term, which in this case is 0.414. For the second person, the odds of survival increase by a factor of 1.012 because this person paid a pound more than the first. For the third person, the odds of survival increase by a further factor of 1.012 because this person paid a pound more than the second. The exponential of the coefficient thus gives the expected odds ratio between two individuals who differ by only one unit on the given independent variable.
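
The same three cases can be reproduced numerically. This is a minimal sketch using the rounded coefficients from the fare model (-0.882 and 0.012); the function name `odds` is an illustrative choice.

```python
import math

# Rounded coefficients quoted in the text (intercept and fare in pounds).
B0, B1 = -0.882, 0.012


def odds(fare):
    """Predicted odds of survival for a passenger who paid `fare` pounds."""
    return math.exp(B0 + B1 * fare)


fares = (0, 1, 2)
for fare in fares:
    print(f"fare = {fare}: odds = {odds(fare):.3f}")

# Each extra pound multiplies the odds by the same factor, exp(B1).
for lower, higher in zip(fares, fares[1:]):
    print(f"odds({higher}) / odds({lower}) = {odds(higher) / odds(lower):.3f}")
```

Because the relationship is multiplicative, a difference of several units compounds: under this model, paying ten pounds more multiplies the odds by exp(10 × 0.012) = exp(0.12), or about 1.13.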

We can think of interactions in a similar way: they tell us how much the odds ratio for one variable differs between groups. Let's now interact gender and fare in our Titanic example.

$\displaystyle \log \frac{p_{i}}{1-p_{i}}=-1.61+1.919x_{i1}+0.006x_{i2}+0.015x_{i1}x_{i2}
$

where $ x_{i1}$ is a female indicator and $ x_{i2}$ is the fare paid. Exponentiate again:

$\displaystyle \frac{p_{i}}{1-p_{i}}=(0.20)(6.81)^{x_{i1}}(1.006)^{x_{i2}}(1.015)^{x_{i1}x_{i2}}
$

For men:

$\displaystyle \frac{p_{i}}{1-p_{i}}=(0.20)(1.006)^{x_{i2}}
$

For women:

$\displaystyle \frac{p_{i}}{1-p_{i}}=(0.20)(6.81)(1.006)^{x_{i2}}(1.015)^{x_{i2}}=(1.36)(1.021)^{x_{i2}}
$

The exponential of the gender coefficient (6.81) gives the odds ratio between women and men at a fare of zero, while the exponential of the interaction term (1.015) tells us how much higher or lower, in a multiplicative sense, the odds ratio associated with fare is for women than for men. Here each additional pound multiplies the odds of survival by 1.006 for men but by 1.021 for women, so gender differences in survival increased with fare.
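
To see how the gender odds ratio changes with fare, the interaction model can be evaluated directly. The sketch below uses the rounded coefficients quoted above; the function names `odds` and `gender_odds_ratio` and the example fares are illustrative assumptions.

```python
import math

# Rounded coefficients quoted in the text.
B0 = -1.61             # intercept
B_FEMALE = 1.919       # female indicator
B_FARE = 0.006         # fare paid, in pounds
B_INTERACTION = 0.015  # female x fare


def odds(female, fare):
    """Predicted odds of survival under the interaction model."""
    log_odds = B0 + B_FEMALE * female + B_FARE * fare + B_INTERACTION * female * fare
    return math.exp(log_odds)


def gender_odds_ratio(fare):
    """Odds for a woman divided by odds for a man who paid the same fare."""
    return odds(1, fare) / odds(0, fare)


# The female-to-male odds ratio starts at exp(B_FEMALE) and grows with fare.
for fare in (0, 10, 50):
    print(f"fare = {fare}: female/male odds ratio = {gender_odds_ratio(fare):.2f}")

# Per-pound odds ratio for fare within each group.
print(f"men:   {odds(0, 1) / odds(0, 0):.3f}")
print(f"women: {odds(1, 1) / odds(1, 0):.3f}")
```

At a fare of zero the ratio equals exp(1.919), about 6.81, and it is multiplied by exp(0.015) for every additional pound.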

In a multivariate model, each odds ratio gives the change in the odds for a one-unit change in that variable, holding all the other variables constant.
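
A brief sketch of what "holding the other variables constant" means in practice. The coefficients here are hypothetical, not estimates from the Titanic data, and the model is assumed to have two variables and no interaction term. Under that assumption, the odds ratio for a one-unit change in one variable is the same wherever the other variable is held.

```python
import math

# HYPOTHETICAL coefficients for a two-variable model with no interaction term;
# they are not estimates from the Titanic data.
B0, B1, B2 = -1.0, 0.8, -0.3


def odds(x1, x2):
    """Predicted odds from log-odds = B0 + B1*x1 + B2*x2."""
    return math.exp(B0 + B1 * x1 + B2 * x2)


def odds_ratio_x1(x1, x2):
    """Odds ratio for raising x1 by one unit while x2 stays fixed."""
    return odds(x1 + 1, x2) / odds(x1, x2)


# The ratio equals exp(B1) regardless of the value at which x2 is held.
for x2 in (0, 1, 5):
    print(f"x2 = {x2}: odds ratio for x1 = {odds_ratio_x1(0, x2):.4f}")
print(f"exp(B1) = {math.exp(B1):.4f}")
```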


Aaron 2005-12-21