I have a package under development that includes an initial workflow for importing up to 12 .csv files at a time using purrr::map(), validating each of them, and then creating a tibble of the validation results. The number of .csv files is not predictable except that it is 2 <= files <= 12.
I've created a reprex below that implements a very simple version of this process (while also creating some sample data). The workflow itself is rather complex, but I've tried to distill it down here as best I can.
The reprex:
- Creates two sample data frames named a and b
- Writes both to a temporary .csv file
- Imports them both back into the session using map() to simulate the actual workflow
- Names the first list item (a) red and the second list item (b) blue
- Creates a simple validation function
- Uses map() to apply the validation function to both a and b
- Prints the validation results
Herein lies the challenge - I want to take the name of each list item (i.e. red and blue) and add them as observations in the validation results. I have the process down as a for loop, which is the last step in the reprex before I print the type of output I ultimately want to create. I cannot for the life of me figure out how to do this final step (of writing list names in as observations) with purrr as opposed to with the loop. Any suggestions would be greatly appreciated!
# load packages
suppressMessages(library(dplyr))
library(purrr)
library(readr)
# create data
a <- data.frame(
  id = c(1, 2, 3, 4, 5),
  group = c("red", "red", "red", "red", "red"),
  outcome = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)
b <- data.frame(
  id = c(1, 2, 3, 4, 5),
  group = c("blue", "blue", "blue", "blue", "blue"),
  outcome = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)
# save as csv to tempdir
a_file <- tempfile(pattern = "", fileext = ".csv")
# pass the file positionally: newer readr releases deprecate `path` in favour of `file`
write_csv(a, a_file)
b_file <- tempfile(pattern = "", fileext = ".csv")
write_csv(b, b_file)
# create list of files (pattern is a regular expression, not a glob)
files <- dir(path = tempdir(), pattern = "\\.csv$")
# import each file into a single list using map()
files %>%
  map(~ suppressMessages(suppressWarnings(read_csv(file.path(tempdir(), .))))) -> data
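# name the two items in data
# (dir() returns file names alphabetically and tempfile() names are random,
# so this assumes the file written from a is listed first)
names(data) <- c("red", "blue")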
# validation function
validate <- function(item){
  # logic check 1 - does it have 3 cols?
  a <- ncol(item) == 3
  # logic check 2 - is it a tibble?
  # (inherits() still matches when "tbl_df" is not the first class)
  b <- inherits(item, "tbl_df")
  # concatenate and return results
  c(a, b)
}
# validate items by iterating over list
data %>%
  purrr::map(validate) -> result
# print results
result
#> $red
#> [1] TRUE TRUE
#>
#> $blue
#> [1] TRUE TRUE
# add name as observation
for (i in seq_along(result)){
  result[[i]] <- c(result[[i]], names(result)[i])
}
# print results again
result
#> $red
#> [1] "TRUE" "TRUE" "red"
#>
#> $blue
#> [1] "TRUE" "TRUE" "blue"
Created on 2018-10-07 by the reprex package (v0.2.0).
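One possible way to do the final step without the loop, offered as a sketch rather than a definitive answer, is purrr::imap(), which passes each element of a named list together with its name to the function. The `result` list below is a hand-written stand-in for the one produced in the reprex, and the column names `has_3_cols` and `is_tibble` are hypothetical labels chosen only for illustration.

```r
suppressMessages(library(dplyr))
library(purrr)

# hypothetical stand-in for the `result` list created in the reprex
result <- list(
  red  = c(TRUE, TRUE),
  blue = c(TRUE, TRUE)
)

# imap() calls the function with the element as .x and its name as .y,
# so the name can be appended without an explicit loop;
# c() coerces the logicals to character, exactly as the for loop does
named <- imap(result, ~ c(.x, .y))
named$red
#> [1] "TRUE" "TRUE" "red"

# variant: build one tibble row per list item, keeping the checks logical
# (column names here are assumptions, not part of the original workflow)
results_tbl <- bind_rows(
  imap(result, ~ tibble(name = .y, has_3_cols = .x[1], is_tibble = .x[2]))
)
```

The first form reproduces the character vectors from the loop. The second may suit the stated goal of a tibble of validation results better, since it avoids turning TRUE/FALSE into strings.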

