I am fitting a logistic regression and would like to apply a spline transformation to my continuous predictor (percentage). I have a few questions and would be very grateful for any guidance.
- Below is my attempt to incorporate the spline transformation into the model. Am I doing this correctly?
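library(splines)
# Interior knots at the quartiles of the predictor
knots <- quantile(final$percentage, probs = c(0.25, 0.5, 0.75))
# Logistic regression with a cubic B-spline basis for percentage
model <- glm(disease ~ bs(percentage, knots = knots), data = final, family = binomial)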
- If so, how should I interpret the output below? For example, how can I obtain an odds ratio from these coefficients?
glm.fit: fitted probabilities numerically 0 or 1 occurred
Call:
glm(formula = disease ~ bs(percentage, knots = knots), family = binomial,
    data = final)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9502  -1.2835   0.5687   0.9548   1.3618

Coefficients:
                                 Estimate Std. Error z value Pr(>|z|)
(Intercept)                        0.4845     1.1982   0.404    0.686
bs(percentage, knots = knots)1     2.7183     3.3578   0.810    0.418
bs(percentage, knots = knots)2    -1.7655     2.3966  -0.737    0.461
bs(percentage, knots = knots)3     2.1134     3.1838   0.664    0.507
bs(percentage, knots = knots)4    -6.1535    10.9851  -0.560    0.575
bs(percentage, knots = knots)5    -8.1928    56.9180  -0.144    0.886
bs(percentage, knots = knots)6 10636.9163  8119.2865   1.310    0.190

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 61.513  on 46  degrees of freedom
Residual deviance: 51.817  on 40  degrees of freedom
AIC: 65.817

Number of Fisher Scoring iterations: 14
Here is my dataset (output of dput(final)):
structure(list(percentage = c(5.5, 72.1, 7.9, 80.6, 56.3, 11.5,
15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8,
38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2,
19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3,
38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9),
SmokingNA = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L), .Label = c("non-smoking",
"smoking"), class = "factor"), disease = structure(c(1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("none", "disease"), class = "factor")), row.names = c(NA,
-47L), class = "data.frame")
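To make the odds ratio question concrete, here is a minimal sketch of the kind of calculation I have in mind, assuming the dput() output above is stored as `final` (the SmokingNA column is left out because the model does not use it). Since the spline makes the effect of percentage nonlinear, my understanding is that no single coefficient is an odds ratio by itself; instead, two predictor values are compared on the log-odds scale. The values 20 and 40 are arbitrary choices for illustration, and I am not certain this is the recommended approach.

```r
library(splines)

# Data copied from the dput() output; disease codes: 1 = "none", 2 = "disease"
final <- data.frame(
  percentage = c(5.5, 72.1, 7.9, 80.6, 56.3, 11.5,
                 15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8,
                 38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2,
                 19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3,
                 38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9),
  disease = factor(c(1, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2, 2,
                     1, 1, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 2,
                     2, 1, 2, 1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2, 2,
                     2),
                   levels = 1:2, labels = c("none", "disease"))
)

# Same model as in the question (glm may warn about fitted probabilities of 0 or 1)
knots <- quantile(final$percentage, probs = c(0.25, 0.5, 0.75))
model <- glm(disease ~ bs(percentage, knots = knots), data = final, family = binomial)

# Illustrative comparison: percentage = 40 versus percentage = 20 (arbitrary values)
newdata <- data.frame(percentage = c(20, 40))
log_odds <- predict(model, newdata = newdata, type = "link")

# The difference in log-odds, exponentiated, is the odds ratio for 40 vs 20
or_40_vs_20 <- unname(exp(log_odds[2] - log_odds[1]))
print(or_40_vs_20)
```

An odds ratio obtained this way depends on which two values of percentage are compared, which I take to be the main difference from a model with a single linear term.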