readxl::read_xlsx is really slow reading from my network drives. This has started within the past few months. Code that I have been running for years is almost un-usable now.
Do you think this is due to the architecture of the network drives? I am wondering if this is something that I can alleviate or if I will need to modify my approach to copy files local or use a different package.
Here is a reprex
localpath <- "path to a folder on my computer"
networkpath <- "path to a shared network drive"
#create a test tibble with 26 rows and 2 columns
df <- tibble::tibble(column_a = letters, column_b = 1:26)
time test #1
#write it to local drive
writexl::write_xlsx(df, path = stringr::str_c(localpath, "testfile.xlsx"))
#read in from local drive.
microbenchmark::microbenchmark(
readxl::read_xlsx(stringr::str_c(localpath, "testfile.xlsx"))
)
#this takes an average of 11 miliseconds
time test #2
#write it to a network drive
writexl::write_xlsx(df, path = stringr::str_c(networkpath, "testfile.xlsx"))
#read it in from a network drive
microbenchmark::microbenchmark(
readxl::read_xlsx(stringr::str_c(networkpath, "testfile.xlsx"))
)
#this takes an average of 679 miliseconds which is 61 times slower than above.
this section does the same tests using csv files to verify the issue is specific to readxl::read_xlsx
readr::write_csv(df, stringr::str_c(localpath, "testfile.csv"))
#this takes 177 miliseconds
microbenchmark::microbenchmark(
readr::read_csv(stringr::str_c(localpath, "testfile.csv"))
)
readr::write_csv(df, stringr::str_c(networkpath, "testfile.csv"))
#this takes 336 miliseconds which is 2 times slower
microbenchmark::microbenchmark(
readr::read_csv(stringr::str_c(networkpath, "testfile.csv"))
)
Here is my sessionInfo()
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.4.2 writexl_1.4.2
loaded via a namespace (and not attached):
[1] compiler_4.0.3 tools_4.0.3 cellranger_1.1.0