Dear Braintrust,
I'm facing an interesting (at least for me) challenge, especially because I'm not used to wide datasets.
My dataset (my_data) has 50 potential covariates (X's that can be either numeric or factors).
I'd like to first run a univariable analysis using a Poisson model (my dependent variable Y is a count), then automatically select the variables that go on to the next step, multivariable modelling.
I've created a vector of the X's of interest (u_var) to pass through the univariable analysis:
lapply(u_var,  # u_var is the vector of X names
  function(var) {
    # Poisson model with an offset
    formula <- as.formula(paste("Y ~", var, "+ offset(n)"))
    res.pois.uni <- glm(formula, data = my_data, family = poisson(link = "log"))
    summary(res.pois.uni)
  })
This gives me my 50 univariable models.
However, I would like to keep only those with a p-value below a specific threshold (0.15 to 0.2 is commonly used in my field of research).
I then want to use the remaining variables for multivariable modelling.
Is there a simple way to specify this within my lapply command?
I would then like to keep only the variables of interest, which I could submit to a backward elimination strategy.
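To make the goal concrete, here is a rough sketch of the workflow I have in mind. It is only an illustration and rests on assumptions of mine: the toy data are simulated stand-ins for my_data, the names p_cut, null_fit, p_uni and keep are made up for the example, n is assumed to be on the log scale already, and I use one likelihood-ratio test per covariate (rather than the Wald p-values from summary()) so that a factor with several levels gets a single p-value. I'm not sure this is the recommended way to do it.

```r
# Toy data standing in for my_data (simulated, purely illustrative)
set.seed(1)
my_data <- data.frame(
  n  = rep(log(100), 200),   # offset, assumed to be on the log scale already
  x1 = rnorm(200),           # numeric covariate
  x2 = factor(sample(c("a", "b", "c"), 200, replace = TRUE)),  # factor covariate
  x3 = rnorm(200)            # numeric covariate
)
my_data$Y <- rpois(200, lambda = exp(my_data$n - 3 + 0.5 * my_data$x1))

u_var <- c("x1", "x2", "x3")  # names of the candidate covariates
p_cut <- 0.2                  # screening threshold (hypothetical choice)

# Offset-only model used as the reference for each likelihood-ratio test.
# Note: this comparison assumes no missing values, so that both models
# are fitted on the same rows.
null_fit <- glm(Y ~ 1 + offset(n), data = my_data,
                family = poisson(link = "log"))

# One overall p-value per covariate
p_uni <- sapply(u_var, function(var) {
  f   <- as.formula(paste("Y ~", var, "+ offset(n)"))
  fit <- glm(f, data = my_data, family = poisson(link = "log"))
  anova(null_fit, fit, test = "Chisq")[2, "Pr(>Chi)"]
})

# Covariates passing the univariable screen
keep <- names(p_uni)[p_uni < p_cut]

# Multivariable model with the retained covariates, then backward elimination
full_formula <- reformulate(c(keep, "offset(n)"), response = "Y")
full_fit  <- glm(full_formula, data = my_data, family = poisson(link = "log"))
final_fit <- step(full_fit, direction = "backward", trace = 0)
```

Here p_uni is a named vector of p-values, so the screening step is just a subset on names(p_uni), and step() performs the backward elimination by AIC rather than by p-value.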
Thanks for your help!