Complete questions attract more informed answers. See the FAQ: How to do a minimal reproducible example (reprex) for beginners.
Given a model for one data frame, it's not difficult to write a function to apply the same model to multiple data frames, provided the data frames have consistent variable names. (If not, some preprocessing will be required to conform each data frame to the exemplar. This raises an important consideration in workflow design—the separation of analysis and preparation. Use short names with a lookup table if needed and save the descriptive longer names for presentation tables.)
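As a rough sketch of that preparation step, a named character vector can serve as the lookup table. The descriptive column names, the lookup vector, and the conform() helper below are all hypothetical, made up for illustration; only mpg and drat come from mtcars.

```r
# Hypothetical lookup table: descriptive (presentation) name -> short (analysis) name
lookup <- c(
  "Miles per gallon" = "mpg",
  "Rear axle ratio"  = "drat"
)

# Hypothetical helper: rename any column found in the lookup, leave the rest alone
conform <- function(df, lookup) {
  hits <- names(df) %in% names(lookup)
  names(df)[hits] <- unname(lookup[names(df)[hits]])
  df
}

# A data frame that arrives with descriptive names instead of the short ones
raw <- data.frame(
  "Miles per gallon" = mtcars$mpg,
  "Rear axle ratio"  = mtcars$drat,
  check.names = FALSE
)

clean <- conform(raw, lookup)
names(clean)
#> [1] "mpg"  "drat"
```

Keeping the renaming in its own function means the modelling code never has to know about the original column names, and the same lookup can be reversed later to label presentation tables.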
To pick a naive example, assume we have a series of mtcars-like data frames, structurally identical but differing in the makes and models of cars, and we are interested in regressing mpg on drat.
suppressPackageStartupMessages({
  library(purrr)
})
# Fit the same model to any data frame that has mpg and drat columns
make_mod <- function(x) lm(mpg ~ drat, data = x)
summary(make_mod(mtcars))
#>
#> Call:
#> lm(formula = mpg ~ drat, data = x)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -9.0775 -2.6803 -0.2095  2.2976  9.0225
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)   -7.525      5.477  -1.374     0.18
#> drat           7.678      1.507   5.096 1.78e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 4.485 on 30 degrees of freedom
#> Multiple R-squared: 0.464, Adjusted R-squared: 0.4461
#> F-statistic: 25.97 on 1 and 30 DF, p-value: 1.776e-05
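# Collect the data frames in a named list (the same data twice, for illustration)
dfs <- list(mtcars = mtcars, mtcars2 = mtcars)
# Fit the model to each element; the result is a named list of lm objects
results <- dfs %>% map(make_mod)
summary(results$mtcars)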
#>
#> Call:
#> lm(formula = mpg ~ drat, data = x)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -9.0775 -2.6803 -0.2095  2.2976  9.0225
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)   -7.525      5.477  -1.374     0.18
#> drat           7.678      1.507   5.096 1.78e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 4.485 on 30 degrees of freedom
#> Multiple R-squared: 0.464, Adjusted R-squared: 0.4461
#> F-statistic: 25.97 on 1 and 30 DF, p-value: 1.776e-05
summary(results$mtcars2)
#>
#> Call:
#> lm(formula = mpg ~ drat, data = x)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -9.0775 -2.6803 -0.2095 2.2976 9.0225
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -7.525 5.477 -1.374 0.18
#> drat 7.678 1.507 5.096 1.78e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 4.485 on 30 degrees of freedom
#> Multiple R-squared: 0.464, Adjusted R-squared: 0.4461
#> F-statistic: 25.97 on 1 and 30 DF, p-value: 1.776e-05
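Printing every summary in full gets unwieldy once there are more than a handful of data frames. One possible way to compare the fits side by side, sketched here under the same assumptions as above (identical column names, the same make_mod function), is to pull one number per model with purrr's map_dbl. The comparison data frame and its column names are illustrative choices, not part of the original example.

```r
library(purrr)

# Same model-fitting function and list of data frames as above
make_mod <- function(x) lm(mpg ~ drat, data = x)
dfs <- list(mtcars = mtcars, mtcars2 = mtcars)
results <- map(dfs, make_mod)

# One row per data frame: slope on drat and R-squared of each fit
comparison <- data.frame(
  data      = names(results),
  slope     = map_dbl(results, ~ coef(.x)[["drat"]]),
  r_squared = map_dbl(results, ~ summary(.x)$r.squared),
  row.names = NULL
)
comparison
```

Because both list elements are the same data here, the two rows should be identical; with genuinely different data frames the table would show how the slope and fit vary across them.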