Please, I have an assignment involving a large volume of data saved in 10 different files. I need an explanation and the code procedure for merging them into one single R data frame for analysis.
These files all have the same number of columns and the same column titles.
Note, they are not different sheets in one Excel file but separate Excel files. Kindly help. (Kindly find the image attached below.)
To load all the files, you could use this (remember to set the working directory first):
setwd("~/path")  # set the working directory to the folder that holds the files
library(readxl)
files.list <- list.files(pattern = "\\.xlsx$")  # all the files are .xlsx
df.list <- lapply(files.list, read_excel)  # read each workbook into a list of data frames
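That gives you a list of ten data frames, one per file. To finish the merge, you still need to stack them into a single data frame; a minimal way, assuming the columns really do match across files, is dplyr's bind_rows():

library(dplyr)
df <- bind_rows(df.list)  # row-bind the list of data frames into one data frame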
You can break this task up into three steps: retrieving the file paths, reading the files, and merging them.
library(readxl)
library(dplyr)
library(tidyr)
Data_Files <- data.frame(excel_path = list.files(full.names = TRUE, pattern = "\\.xlsx$",
                                                 path = getwd()))  # find the files in your working directory/path
Data_Files <- Data_Files %>%
  mutate(excel_files = lapply(excel_path, read_xlsx)) %>%  # read each xlsx file into a list-column
  unnest(excel_files)  # expand the list-column, binding the rows of all files together
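After unnest(), Data_Files holds one row per observation across all ten files, and the excel_path column records which file each row came from. A quick sanity check:

nrow(Data_Files)   # total rows across all files
names(Data_Files)  # the shared column titles, plus excel_path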
Load the data and combine them using bind_rows(), assuming the columns match. If your files are xlsx, use .xlsx where I have used .csv.
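The code this answer refers to isn't shown, but a minimal sketch of the approach, assuming .csv files in the working directory and readr for reading, would be:

library(readr)
library(dplyr)
csv_files <- list.files(pattern = "\\.csv$")
combined <- bind_rows(lapply(csv_files, read_csv))  # read every file and stack the rows, matching columns by name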
Use purrr (a minimal sketch follows below) or the approach shown in the articles. Everything you need is already there; just read the articles. You can't expect people to do your work for you.
Whatever you do, don't load the files individually: that wastes your time and is poor practice, since you would just be repeating the same call ten times. Also, use RStudio Projects rather than setting the working directory; it is simply good practice.
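For reference, a purrr sketch, assuming the .xlsx files sit in the project directory and share the same column titles (map_dfr() row-binds via dplyr, so dplyr must also be installed):

library(readxl)
library(purrr)
files <- list.files(pattern = "\\.xlsx$")
combined <- map_dfr(files, read_excel)  # read each file, then row-bind everything into one data frame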