I appreciate this discussion, and the help I've received thus far.
@lionel, I'll be honest, I spent yesterday well into the night trying to figure out lazyeval or whatever the proper term would be, to no avail. Maybe I'm just a novice, maybe worse, but point is, I didn't see how what you put applied to my scenario.
Back to John:
At the end of the day, what you wrote handles only one outcome variable at a time, and only one data frame at a time.
I have a list of data frames that differ only in one independent variable's column. I don't see how I could adapt your code without generating a separate model for each case by hand, at which point it scales even less well for my scenario. Not to bash your code, just stating its limitations for my case.
Finally, the whole reason I'm trying to resolve this lazyeval/tidy-eval thing is so that the stored $call is meaningful, and not $call: lm(formula = ., data = df). With a call like that, sjPlot can't properly reconstruct where the original data came from. That's the whole reason for this headache.
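For what it's worth, the closest I can picture to what I want is plain base R bquote() substitution, something like the sketch below. This is only a sketch: fit_named is just a placeholder name of mine, and it assumes df1 actually exists under that name in the workspace.

fit_named <- function(fml_chr, df_name) {
  # bquote() splices the actual formula and the data-frame symbol into the
  # call before it is evaluated, so the fitted model stores a real call
  eval(bquote(lm(.(as.formula(fml_chr)), data = .(as.name(df_name)))))
}

fit <- fit_named("y1 ~ x1 + x2 + x3", "df1")
fit$call
#> lm(formula = y1 ~ x1 + x2 + x3, data = df1)

That would presumably give sjPlot a call it can actually re-evaluate, but I haven't managed to wire something like this into my purrr loop cleanly.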
My hope is that I can somehow retain the reference to the right data frame in my list of data frames, so that when I purrr through them to generate my fitted results, the references are all properly maintained. Otherwise, the code I've come up with to generate all my results for all my data is below, if anyone is interested.
library(purrr)

dff_ALL <- list(df1, df2, df3)   # the data frames differ only in the "x3" column
variables <- c("y1", "y2", "y3")
models_test <- map(variables, ~ paste0(.x, " ~ x1 + x2 + x3"))

allresults <- map(dff_ALL, function(x) {
  map(models_test, function(z) {
    tmp <- lm(as.formula(z), data = x)
    tmp$call <- z   # overwrite the useless call so at least the formula is recorded
    tmp
  })
})

# Set names from the model formulas (otherwise the reference from the list is lost)
for (i in seq_along(allresults)) {
  allresults[[i]] <- set_names(allresults[[i]], models_test)
}
At that point you can just broom::tidy/glance/augment the results. Yes, I could emulate sjPlot's marginal-effects functionality myself, but that becomes cumbersome very quickly.
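To be concrete, with the allresults object from the code above, tidying everything would look roughly like this (tidy_all is just a name I made up):

library(purrr)
library(broom)

# One row per coefficient, labelled by which data frame and which formula it came from
tidy_all <- map_dfr(
  allresults,
  ~ map_dfr(.x, tidy, .id = "model"),
  .id = "dataset"
)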
Conceptually: I have a set of dependent variables I want to test a set of predictors on, and I want to test those models on a set of data frames that are identical in dimensions, cases, and variables; only the values in one column differ, as I'm now stating for a third time. Maybe there's an easier way to approach this, once you know that.
I want to keep it tidy, start to finish, and I want the ability to expand my analysis very easily, just by adding another string to variables or to the models_test construction of the predictor variables.
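Roughly, what I'm imagining is a fully crossed setup like the sketch below, where adding a string to either vector grows the whole analysis. Again a sketch only: design is my placeholder name, and the bquote() trick from above assumes df1, df2, df3 exist by name in the workspace.

library(purrr)
library(tidyr)   # expand_grid()

# One row per (data frame, formula) combination
design <- expand_grid(
  df_name = c("df1", "df2", "df3"),
  formula = paste0(c("y1", "y2", "y3"), " ~ x1 + x2 + x3")
)

# Fit each combination; bquote() keeps the real formula and data name in $call
design$fit <- pmap(design, function(df_name, formula) {
  eval(bquote(lm(.(as.formula(formula)), data = .(as.name(df_name)))))
})

But I haven't actually gotten something like this working end to end, which is part of why I'm asking.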
I'm actually pretty surprised that I didn't find a single "many models" approach that doesn't nest the data frames. My case evidently doesn't work with nesting, and I haven't managed to get the horrid-looking tribble approach, as shown here, to work.