I'm currently working on a curve-fitting problem where I need to find the breakpoint of a curve. I initially used a two-step approach, fitting the outcome model twice with avg_predictions in between, because the segmented() call I tried (seg.fit in the code below) doesn't handle spline terms.
Are there more efficient ways to find the breakpoint of a curve that involves splines? Am I missing a simpler method that handles this directly? For fit.lm, is it better to use glm with family=quasibinomial, or should I use a simple lm model? Any advice or suggestions would be greatly appreciated!
My code looks something like this:
library(dplyr)
library(splines)
library(marginaleffects)
library(ggplot2)
library(segmented)
d <- my_dataset
d$binary_outcome <- factor(d$binary_outcome)
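# Step 1: outcome model with a natural spline on x.
# glm() already adds an intercept, so ns() should not add its own
# (intercept = TRUE here makes the design matrix rank-deficient).
fit <- glm(binary_outcome ~ splines::ns(x_axis_variable, df = 5),
           data = d,
           family = "binomial")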
values <- seq(min(d$x_axis_variable, na.rm = TRUE),
              max(d$x_axis_variable, na.rm = TRUE),
              length.out = nrow(d))
p <- avg_predictions(fit,
                     variables = list(x_axis_variable = values),
                     byfun = function(...) qnorm(mean(...)),
                     transform = pnorm)
# Step 2: refit the averaged predictions and search for a single breakpoint
fit.lm <- glm(estimate ~ x_axis_variable, data = p, family = quasibinomial)
seg.fit <- segmented(fit.lm, seg.Z = ~ x_axis_variable, npsi = 1)
breakpoint <- summary(seg.fit)$psi[, "Est."]
print(breakpoint)
ggplot(p, aes(x = x_axis_variable, y = estimate)) +
  geom_point(alpha = 0.5) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high),
              alpha = .3,
              fill = "slategray") +
  geom_line(aes(y = predict(seg.fit, type = "response")), color = "blue") +
  geom_vline(xintercept = breakpoint, color = "red", linetype = "dashed")
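
Since I can't share my_dataset, here is a minimal sketch of simulated data that could stand in for it when running the code above. The seed, sample size, coefficients, and the breakpoint at 50 are all assumptions chosen for illustration, not properties of my real data.

```r
# Hypothetical stand-in for my_dataset: simulated data with a known breakpoint.
set.seed(42)                       # arbitrary seed, for reproducibility only
n <- 500                           # assumed sample size
true_break <- 50                   # assumed true breakpoint on the x axis

x_axis_variable <- runif(n, min = 0, max = 100)

# Log-odds are flat below the breakpoint and rise linearly above it
log_odds <- -2 + 0.08 * pmax(x_axis_variable - true_break, 0)

# Draw the binary outcome from the implied probabilities
binary_outcome <- rbinom(n, size = 1, prob = plogis(log_odds))

my_dataset <- data.frame(x_axis_variable, binary_outcome)
```

With data like this, the estimated breakpoint can be compared against true_break to check whether the two-step approach recovers it.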