I want to eventually combine multiple data frames into one big data frame while keeping track of which file each row came from. I was thinking of using map_dfr to add a new column (with the header "Day_") to each file and then combining the files into one big data frame, so that I can run the rest of my code on it. How might I go about doing this?
This can be done conveniently with dplyr.
I don't have your CSVs, so we'll use the iris dataset to generate an example.
library(tidyverse)
# example frames
df_1 <- slice(iris, 1)
df_2 <- slice(iris, 2)
df_3 <- slice(iris, 3)

# collect the names of the example frames
(getnames <- ls(pattern = "df_"))

# fetch each frame by name and keep the names on the list
as_a_list <- map(getnames, ~ get(.)) %>%
  set_names(getnames)

# binding and recording source
(result_df <- bind_rows(as_a_list, .id = "dfsource"))

# cleanup
rm(list = getnames)
rm(getnames)
rm(as_a_list)
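For what it's worth, the map/get/set_names step above could probably be shortened with base R's mget(), which returns a named list directly. This is only a sketch on the same iris example; the stricter pattern "^df_[0-9]+$" is my own choice to avoid picking up unrelated objects, not something from the original post.

```r
library(dplyr)

# example frames, as in the reply above
df_1 <- slice(iris, 1)
df_2 <- slice(iris, 2)
df_3 <- slice(iris, 3)

# mget() looks up several objects by name and returns them as a named list
as_a_list <- mget(ls(pattern = "^df_[0-9]+$"))

# the list names end up in the "dfsource" column
result_df <- bind_rows(as_a_list, .id = "dfsource")
result_df$dfsource
```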
A slightly more concise solution:
files <- list.files(pattern = "df_") # adapt as required
result_df <- purrr::map_dfr(files, read.csv, .id = "file")
Hello, thanks for the response. When I run this, "files" comes back as an empty character vector, and result_df is empty too (0 observations of 0 variables).
Sorry, I was copying the pattern from the previous reply's example.
Try this:
files <- list.files(pattern = "Day") # adapt as required
result_df <- purrr::map_dfr(files, read.csv, .id = "file")
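If "files" still comes back empty, it may be that the working directory is not the folder holding the CSVs, since list.files() searches the current directory by default. Here is a rough, self-contained sketch of how one might check for that. The folder name and the "Day_1.csv"-style file names are made up for illustration, so adapt them to your own data.

```r
# Hypothetical setup: write three small CSVs into a temporary folder
data_dir <- file.path(tempdir(), "day_files")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(iris[i, ],
            file.path(data_dir, paste0("Day_", i, ".csv")),
            row.names = FALSE)
}

# Point list.files() at the folder explicitly and ask for full paths,
# so that read.csv() can find the files regardless of getwd()
files <- list.files(path = data_dir,
                    pattern = "^Day_.*\\.csv$",
                    full.names = TRUE)

# Stop early with a readable message if nothing matched
if (length(files) == 0) {
  stop("No files matched in ", data_dir)
}

result_df <- purrr::map_dfr(files, read.csv, .id = "file")
```

Printing "files" before the map_dfr step is usually the quickest way to see whether the pattern or the folder is the problem.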
I think map_dfr is similar to bind_rows in that the default is just to get a numeric index as the ID, which might be all that is needed in many cases.
Admittedly my example is a little overloaded / inelegant, but that is to allow for 'better naming' in the resulting dataset.
I'm thinking that, in your map_dfr approach, something like
names(files) <- files
before the map_dfr step would cover that, since the names of the input are then used for the ID column.
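To make the naming idea concrete, here is a minimal sketch that names the file vector before mapping, so the ID column holds something like "Day_1" instead of an index. The temporary folder, the file names, and the choice to strip the path and extension with basename() and tools::file_path_sans_ext() are my assumptions for the example; plain names(files) <- files would work too and would keep the full file name.

```r
# Hypothetical setup: three small CSVs in a temporary folder
data_dir <- file.path(tempdir(), "named_day_files")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(iris[i, ],
            file.path(data_dir, paste0("Day_", i, ".csv")),
            row.names = FALSE)
}

files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each path after its file, minus folder and extension
names(files) <- tools::file_path_sans_ext(basename(files))

# The names of `files` are written to the "Day" column
result_df <- purrr::map_dfr(files, read.csv, .id = "Day")
unique(result_df$Day)
```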
map_dfr() just combines map() and bind_rows() into one step.
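A small sketch of that equivalence, in case it helps. The list "pieces" and the "ratio" column are invented for the example. As far as I know, recent purrr versions (1.0.0 and later) mark map_dfr() as superseded in favour of map() followed by list_rbind(), but map_dfr() still works.

```r
library(purrr)
library(dplyr)

# Hypothetical named list of small data frames
pieces <- list(a = slice(iris, 1), b = slice(iris, 2))

# Helper applied to every piece (illustrative only)
add_ratio <- function(df) mutate(df, ratio = Sepal.Length / Sepal.Width)

# One step: map over the list and row-bind the results
one_step <- map_dfr(pieces, add_ratio, .id = "source")

# Two steps: map first, then bind the rows
two_steps <- pieces %>%
  map(add_ratio) %>%
  bind_rows(.id = "source")

# Compare the two results
all.equal(one_step, two_steps)
```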