Logistic Regression Coefficient Interpretation

Hello, I'm working on interpreting logistic regression coefficients, and I'm not sure I understand it fully. Can you help me with it? I'd really appreciate it.

This is my sample data. I want to study the factors affecting college attendance. The dependent variable is a dummy variable for whether a person attends college. The independent variables are family income (a continuous variable) and whether the person lives in a city (a dummy variable).

# Set seed for reproducibility

# Number of observations
n <- 20

# Simulate data
attend_college <- rbinom(n, 1, 0.5)  # 50% chance of attending college
family_income <- rnorm(n, mean = 50000, sd = 15000)  # Family income with mean = $50,000, sd = $15,000
live_in_city <- rbinom(n, 1, 0.6)  # 60% chance of living in a city

# Create a dataframe
simulated_df <- data.frame(AttendCollege = attend_college, FamilyIncome = family_income, LiveInCity = live_in_city)

# Display the first few rows of the dataframe
head(simulated_df)

# Perform logistic regression
model <- glm(AttendCollege ~ FamilyIncome + LiveInCity, family = binomial(link = "logit"), data = simulated_df)

# Display the summary of the logistic regression model
summary(model)

This is the output.

glm(formula = AttendCollege ~ FamilyIncome + LiveInCity, family = binomial(link = "logit"), 
    data = simulated_df)

               Estimate Std. Error z value Pr(>|z|)
(Intercept)   3.397e-01  1.285e+00   0.264    0.792
FamilyIncome  1.421e-05  2.506e-05   0.567    0.571
LiveInCity   -5.751e-01  1.058e+00  -0.544    0.587

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 25.898  on 19  degrees of freedom
Residual deviance: 25.367  on 17  degrees of freedom
AIC: 31.367

Number of Fisher Scoring iterations: 4

Can I interpret the relationship between family income and college attendance this way: each additional dollar of family income increases the odds of attending college by approximately (1.421e-05) * 100 = 0.001421%, all else being equal? For example, if family income increases by 1000 dollars, the likelihood of attending college increases by around 0.001421% * 1000 = 1.421%. Or do I need to calculate exp(0.001421)? I appreciate your help.

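You have it right, except that calling it a change in the "likelihood" (probability) is wrong. As you said first, it is the odds that change: the probability of attending college divided by the probability of not attending. Strictly, the exact percentage change in the odds is (exp(b) - 1) * 100, but for a coefficient this small that is very close to b * 100.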
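To make the odds interpretation concrete, here is a minimal sketch that plugs in the FamilyIncome estimate copied from the output above and compares the exact percentage change in the odds with the linear shortcut from the question. The variable names are only illustrative.

```r
# Coefficient for FamilyIncome, copied from the summary output above
b_income <- 1.421e-05

# Exact multiplicative change in the odds (odds ratio)
# for a 1-dollar and a 1000-dollar increase in family income
or_1    <- exp(b_income)
or_1000 <- exp(b_income * 1000)

# Exact percentage change in the odds
pct_1    <- (or_1 - 1) * 100
pct_1000 <- (or_1000 - 1) * 100

# Linear shortcut used in the question: coefficient * 1000 * 100
approx_1000 <- b_income * 1000 * 100

round(c(exact = pct_1000, approx = approx_1000), 3)
```

For a 1000-dollar increase, the exact figure is about 1.431% versus 1.421% from the shortcut, so the two agree closely here. With a fitted model object, `exp(coef(model))` returns the odds ratios for all terms at once.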

(Also, the estimates aren't statistically significant. So don't rely on them much.)

Thanks for your help. Yeah, it's only simulated data, but thanks for the reminder. :smile:

So if I want to calculate the change in probability, do I need to use P(attend) / [1 - P(attend)] = odds = 0.001421% * 1000 = 1.421%, so that P(attend) = 1.421% / (1 + 1.421%)?

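It's messier than that. The 1.421% is the relative change in the odds, not the odds themselves, and the resulting change in probability depends on the values of the predictors you start from. See if the Stack Overflow question "Converting logistic regression output from log odds to probability" helps.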
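As a rough illustration of why the probability change is messier, the sketch below converts log odds to probabilities with the inverse logit (`plogis` in base R), using the coefficients copied from the output above. The helper `p_attend` and the two starting incomes are hypothetical choices made only for this example.

```r
# Coefficients copied from the summary output above
b0     <- 3.397e-01   # (Intercept)
b_inc  <- 1.421e-05   # FamilyIncome
b_city <- -5.751e-01  # LiveInCity

# Hypothetical helper: predicted probability of attending college
p_attend <- function(income, city) {
  log_odds <- b0 + b_inc * income + b_city * city
  plogis(log_odds)  # inverse logit: exp(x) / (1 + exp(x))
}

# Change in probability for the same 1000-dollar increase,
# evaluated at two different (assumed) starting incomes
delta_a <- p_attend(51000, city = 0) - p_attend(50000, city = 0)
delta_b <- p_attend(91000, city = 0) - p_attend(90000, city = 0)

c(from_50k = delta_a, from_90k = delta_b)
```

The same 1000-dollar increase multiplies the odds by the same factor in both cases, yet the two probability changes differ, which is why there is no single "percent change in probability" to report. With a fitted model, `predict(model, newdata = ..., type = "response")` gives the predicted probabilities directly.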
