I am building a logistic regression model for prediction. The outcome is whether or not a person has a disease (yes/no). The predictors are smoking status (binary) and percentage (continuous).
I am confused about how to interpret the model after adding an interaction term and would be grateful for any help.
Can someone please help me interpret the interaction term? For example, do I need to state that the odds ratio for the interaction is adjusted for percentage and smoking status (this doesn't seem right)?
Call:
glm(formula = disease ~ percentage + SmokingNA + percentage:SmokingNA,
    family = binomial, data = final)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.2361  -1.0196   0.4236   0.8969   1.6511

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                  1.81383    0.96998   1.870   0.0615 .
percentage                  -0.06994    0.03754  -1.863   0.0625 .
SmokingNAsmoking            -2.25392    1.45208  -1.552   0.1206
percentage:SmokingNAsmoking  0.13922    0.05922   2.351   0.0187 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 61.513  on 46  degrees of freedom
Residual deviance: 51.366  on 43  degrees of freedom
AIC: 59.366

Number of Fisher Scoring iterations: 5
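To make the question concrete, here is a minimal sketch of one common way to read these coefficients. It assumes the estimates and standard error are copied by hand from the summary output, and the variable names (`b`, `or_pct_smoking`, and so on) are hypothetical, chosen only for illustration. Because "non-smoking" is the reference level of `SmokingNA`, the `percentage` coefficient describes non-smokers, and the interaction coefficient is the difference in the `percentage` slope between smokers and non-smokers on the log-odds scale.

```r
# Estimates copied by hand from the summary output (log-odds scale).
b <- c(intercept   =  1.81383,
       percentage  = -0.06994,
       smoking     = -2.25392,
       interaction =  0.13922)
se_interaction <- 0.05922  # Std. Error of the interaction term, also copied

# Odds ratio for a 1-unit increase in percentage, separately by group.
or_pct_nonsmoking <- exp(b[["percentage"]])                       # reference group
or_pct_smoking    <- exp(b[["percentage"]] + b[["interaction"]])  # smokers

# exp(interaction) is a ratio of those two odds ratios, not an odds ratio
# "adjusted for" the main effects.
ratio_of_ors <- exp(b[["interaction"]])

# Approximate 95% Wald interval for that ratio (normal approximation).
ratio_ci <- exp(b[["interaction"]] + c(-1, 1) * qnorm(0.975) * se_interaction)

# The smoking vs non-smoking odds ratio is not a single number any more:
# it depends on the value of percentage at which it is evaluated.
or_smoking_at <- function(p) exp(b[["smoking"]] + b[["interaction"]] * p)

print(c(nonsmoking = or_pct_nonsmoking, smoking = or_pct_smoking,
        ratio = ratio_of_ors))
print(ratio_ci)
print(or_smoking_at(c(0, 20, 40)))
```

Under this reading, the interaction would be reported as how much the per-unit odds ratio for `percentage` differs between smokers and non-smokers, and the effect of either predictor would be stated at specific values of the other. With only 47 observations, a Wald interval is a rough approximation at best.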
Here is my data:
structure(list(percentage = c(5.5, 72.1, 7.9, 80.6, 56.3, 11.5,
15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8,
38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2,
19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3,
38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9),
SmokingNA = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L), .Label = c("non-smoking",
"smoking"), class = "factor"), disease = structure(c(1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("none", "disease"), class = "factor")), row.names = c(NA,
-47L), class = "data.frame")