There's a few ways that you can do this, but I like to use the purrr
package for tasks like this. Here's a couple of options for doing so.
First, I'd get a vector of file locations:
library(tidyverse)
library(readxl)
# Get the paths to each file:
files <- list.files(
# Currently set to working directory location
path = ".",
# Look for Excel workbooks
pattern = ".xlsx",
# Include the full path to the file
full.names = TRUE)
One option would then be to read in the data for one sheet at a time (i.e., all files, but same sheet from each file):
# Store the names of the sheets we'll be accessing, using one of the files
# as the template for the sheet names
all_sheets <- excel_sheets(files[1])
# .x is our vector of file paths. map_df gives us a single data frame/tibble as its output
# We're saying "for each file path..."
first_sheet_contents <- map_df(.x = files,
# "...read the xlsx file, but only for the the sheet with this name"
.f = ~ read_excel(path = .x,
sheet = all_sheets[1]))
# Here's an example of the output from multiple sheets condensed into one data frame
first_sheet_contents
# # A tibble: 4 x 2
# first_col second_col
# <dbl> <chr>
# 1 1 a
# 2 4 b
# 3 8 a
# 4 5 f
# And then you could do the same for the second sheet, etc...
second_sheet_contents <- map_df(.x = files,
.f = ~ read_excel(path = .x,
sheet = all_sheets[2]))
An alternative option might be to read all sheets from all files in one step. We'll get a list as our output. Within each list item will be a sublist. Main list items correspond to files, and their sub-items correspond to sheets in the file. For example:
all_file_contents <- map(.x = files,
.f = ~ {
temp_file <- .x
map(.x = excel_sheets(.x),
.f = ~ read_excel(path = temp_file,
sheet = .x) %>%
# Include sheet name. Not required,
# but might be helpful for checking
mutate(sheet_name = .x))
})
all_file_contents
# [[1]] # This is file one...
# [[1]][[1]]
# # A tibble: 2 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 1 a C_1
# 2 4 b C_1
# [[1]][[2]]
# # A tibble: 2 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 1 d C_2
# 2 6 b C_2
# [[2]] # This is file two...
# [[2]][[1]]
# # A tibble: 2 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 8 a C_1
# 2 5 f C_1
# [[2]][[2]]
# # A tibble: 2 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 5 a C_2
# 2 6 b C_2
You can invert the list so that you have data from the same sheet IDs grouped together
transpose(all_file_contents) %>%
map(.x = .,
.f = ~ reduce(.x = .x, .f = bind_rows))
# [[1]]
# # A tibble: 4 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 1 a C_1
# 2 4 b C_1
# 3 8 a C_1
# 4 5 f C_1
# [[2]]
# # A tibble: 4 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 1 d C_2
# 2 6 b C_2
# 3 5 a C_2
# 4 6 b C_2
OR, you could just read all the sheets in as a single data frame:
transpose(all_file_contents) %>%
map_df(.x = .,
.f = ~ .x)
# # A tibble: 8 x 3
# first_col second_col sheet_name
# <dbl> <chr> <chr>
# 1 1 a C_1
# 2 4 b C_1
# 3 8 a C_1
# 4 5 f C_1
# 5 1 d C_2
# 6 6 b C_2
# 7 5 a C_2
# 8 6 b C_2
Created on 2021-03-24 by the reprex package (v1.0.0)