I have been trying to fit a multiple logistic regression model to a dataset in R, but I get 'strange results' for some variables: the p-values and the confidence intervals do not seem consistent with each other. However, when I fit exactly the same model in Stata, the results are consistent. Is there a way to handle this? Is there an option I need to specify? How should I proceed?
library(tidyverse)
set.seed(2021)
testdata <- tibble(
  var1 = rbinom(1114, 1, 0.12),
  var2 = rbinom(1114, 1, 0.82),
  var3 = rbinom(1114, 1, 0.60),
  var4 = rbinom(1114, 1, 0.18),
  var5 = rbinom(1114, 1, 0.12),
  var6 = rbinom(1114, 1, 0.05),
  var7 = rbinom(1114, 1, 0.63),
  var8 = rbinom(1114, 1, 0.20),
  var9 = rbinom(1114, 1, 0.06),
  var10 = rbinom(1114, 1, 0.40),
  var11 = rbinom(1114, 1, 0.35),
  var12 = rbinom(1114, 1, 0.32),
  outcome = rbinom(1114, 1, 0.04)
) %>%
  mutate(across(.cols = everything(),
                ~factor(., levels = c(0, 1),
                        labels = c("No", "Yes"))))
mvariate.regress <- function(outcome, covariates, mydata) {
  form <- paste(outcome, "~",
                paste(covariates, collapse = " + "))
  model1 <- glm(as.formula(form),
                data = mydata, family = binomial)
  model1
}
ipvars <- paste0("var", 1:12)
mlogitfit <- mvariate.regress("outcome", ipvars, testdata)
summary(mlogitfit)
confint(mlogitfit)
For var1 and var2, the p-values reported by summary() do not agree with the confidence intervals reported by confint().
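
One possible explanation worth checking (a guess, not verified against the Stata output): summary() on a glm object reports Wald z-tests, while confint() on a glm object computes profile-likelihood intervals, so the two can disagree near the 0.05 boundary. Stata's logit reports Wald intervals by default. The sketch below compares both kinds of interval on simulated data; the data frame `d` and the variables `x1`, `x2`, `y` are made up for illustration and are not the dataset above.

```r
# Illustrative sketch on simulated data (hypothetical names d, x1, x2, y).
set.seed(1)
n <- 500
d <- data.frame(
  x1 = rbinom(n, 1, 0.4),
  x2 = rbinom(n, 1, 0.5)
)
d$y <- rbinom(n, 1, 0.3)

fit <- glm(y ~ x1 + x2, data = d, family = binomial)
est <- coef(summary(fit))          # estimates, SEs, Wald z and p-values

# Wald intervals: estimate +/- qnorm(0.975) * SE (same basis as the p-values)
wald_ci <- confint.default(fit)

# Profile-likelihood intervals: what confint() returns for a glm
prof_ci <- suppressMessages(confint(fit))

# A Wald interval excludes 0 exactly when the Wald p-value is below 0.05
wald_sig   <- est[, "Pr(>|z|)"] < 0.05
wald_excl0 <- wald_ci[, 1] > 0 | wald_ci[, 2] < 0

print(cbind(wald_ci, prof_ci))
print(data.frame(wald_sig, wald_excl0))
```

Applied to the model in the question, confint.default(mlogitfit) would give the Wald intervals, and exp(confint.default(mlogitfit)) would put them on the odds-ratio scale for comparison with Stata.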