Hi,
Let us say that we have some data frames named S1,S2,S3... that I want to join independently by column names to dataframe X. This is how they look like:
Now the problem is that I have 500 plus data frames (S1,S2,S3,S4,S5.....S500) that I want to join to dataframe "X" by any function. For example rbind. Is it possible to use some loop/function for that.
What I have best tried is create a list (nested list of dataframes) that I can later split into individual dataframes:
output = lapply(list(S1=S1,S2=S2,S3=S3),rbind,X)
The output is what I expected. So it works The only problem is, how do I specify S1 to S500?
The code below generates a named list with each of the S* data frames bound to X. This approach assumes all 500 of your data frames will be created in your environment first, puts them all in a list using the ls() function, and then walks through each one, calling it to bind to X via eval(parse(text = )). Finally, if desired, you can name all of the newly formed data frames in the list to match the naming convention you shared (i.e. S1_X).
# some data frames
S1 <- data.frame(
values = c (1:5),
numbers = c(11:15))
S2 <- data.frame(
values = c (1:5),
numbers = c(11:15))
S3 <- data.frame(
values = c (1:5),
numbers = c(16:20))
# create a list of the data frames
list_of_dataframes = ls()
# data frame to bind
X <- data.frame(
values = c (26:30),
numbers = c(501:505))
# function to bind a data frame to X
f = function(i) {
rbind(eval(parse(text = i)), X)
}
# step through the list of data frames, binding each to X
output = lapply(list_of_dataframes, f)
# set the names of each newly bound data frame
names(output) = paste0(list_of_dataframes, '_X')
output
#> $S1_X
#> values numbers
#> 1 1 11
#> 2 2 12
#> 3 3 13
#> 4 4 14
#> 5 5 15
#> 6 26 501
#> 7 27 502
#> 8 28 503
#> 9 29 504
#> 10 30 505
#>
#> $S2_X
#> values numbers
#> 1 1 11
#> 2 2 12
#> 3 3 13
#> 4 4 14
#> 5 5 15
#> 6 26 501
#> 7 27 502
#> 8 28 503
#> 9 29 504
#> 10 30 505
#>
#> $S3_X
#> values numbers
#> 1 1 16
#> 2 2 17
#> 3 3 18
#> 4 4 19
#> 5 5 20
#> 6 26 501
#> 7 27 502
#> 8 28 503
#> 9 29 504
#> 10 30 505
An alternative and a more general solution would be to use Reduce() with merge() or purrr::reduce() with dplyr::*_join(), which allow for different dataframe structures as long as there is a key to join.