Hi all,
I am trying to render ~3000 parameterized reports through pagedown::chrome_print.
I usually run a few samples to make sure that things are rendering correctly.
The initial problem:
- somewhere in the region of 170 ~ 210 files render perfectly, then RStudio loses its connection
The initial attempted solution:
- run the script from the command-line (macOS Catalina 10.15.7)
Result:
- either an endless stall, or errors for every file after number 200ish.
A bit of googling made me think that it has to do with the default limit on the number of files that can be open at once(?) (https://github.com/jupyterlab/jupyterlab/issues/6727)
ulimit -n
which is indeed 256
Changing it to 10000 in the same terminal window immediately before running the script did not affect the result: still 170 ~ 210 files, and then either a stall part-way through that never ends, or errors from then on (which at least let me save the errors to an RDS file).
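One way to check which limit actually reaches R is to ask a shell started from inside the R session. This is a minimal sketch, assuming a Unix-like system where `system()` runs its command through `sh`; the variable names are only illustrative. It also shows why a `system("ulimit -n 10000")` call inside a script is unlikely to help: `ulimit` is a shell builtin, so it only changes the short-lived shell that `system()` starts.

```r
# Soft limit on open files as seen by a shell launched from this R session.
# Child processes inherit the limits of the R process, so this reflects R's own limit.
limit_before <- system("ulimit -n", intern = TRUE)

# This changes the limit only inside the temporary shell started by system();
# that shell exits right away and the R process keeps its previous limit.
system("ulimit -n 10000")

limit_after <- system("ulimit -n", intern = TRUE)

print(c(before = limit_before, after = limit_after))
```

If both values are identical, a call like this inside the script has no effect, and only the limit set in the terminal before launching R can matter.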
Here is my best attempt at a reprex:
reprex.Rmd:
---
title: "reprex"
date: "12/4/2020"
output:
  pagedown::html_paged:
    toc: false
    number_sections: false
params:
  a: "a"
  b: 1
knit: pagedown::chrome_print
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Parameter "a" is: `r params$a`
Parameter "b" is: `r params$b`
render.R:
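# try to make sure the ulimit is high enough
# (note: system() runs this in a child shell, so it may not change
# the limit of the R process itself)
system(command = "ulimit -n 10000")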
# 400 sets of parameters
parameters <- tidyr::expand_grid(a = c("a", "b", "c", "d"), b = runif(100))
# make the parameters compatible with YAML
# side note - is there a better way to do this?
params <- as.list(rlist::list.parse(parameters))
# prep the outpaths
args <- list(a = parameters$a, b = parameters$b)
paths <- purrr::pmap(args, ~ paste0("./", ..1, "/", ..1, "_", ..2, ".pdf"))
# keep output/errors
safe_render <- purrr::safely(pagedown::chrome_print)
# render
output <- purrr::map2(
  params, paths,
  ~ safe_render(
    rmarkdown::render("reprex.rmd", params = .x, envir = new.env()),
    output = .y
  )
)
# make errors etc available after running from the command line
saveRDS(output, "output.RDS")
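On the side note in render.R about making the parameters YAML-compatible: one possible alternative is to build one named list per row in base R. This is only a sketch. The small data frame stands in for the `expand_grid()` result, and I believe `purrr::pmap(parameters, list)` gives the same shape, though I have not checked it against this exact workflow.

```r
# Stand-in for the parameter grid built in render.R (values are made up)
parameters <- data.frame(
  a = rep(c("a", "b"), each = 2),
  b = c(0.25, 0.5, 0.25, 0.5),
  stringsAsFactors = FALSE
)

# One named list per row: a list of lists, each usable as `params` in rmarkdown::render()
params <- lapply(
  seq_len(nrow(parameters)),
  function(i) as.list(parameters[i, , drop = FALSE])
)

str(params[[1]])
```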
Result:
- Command-line
processing file: reprex.rmd
...
...
Output created: reprex.html
200ish times, then
Error: unexpected ',' in "ms = .x,"
Execution halted
(translated from the original Japanese error message)
- troubleshooting post-hoc
(sorry, I am still pretty bad with working with lists)
``` {r}
library(magrittr)
tibble::tibble(output = readRDS("output.RDS")) %>%
tidyr::unnest_wider(output) %>%
dplyr::count(error)
```
> | |error|n|
> | --- | --- | --- |
> |1|NULL|200|
> |2|list(message = "Failed to generate output. Reason: Failed to open http://127.0.0.1:5306/reprex.html (HTTP status code: 500)", call = force(expr))|1|
> |3|list(message = **"Cannot create pipe when running '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome' (system error 24, Too many open files)** @unix/processx.c:455 (processx_exec)", call = rethrow_call(c_processx_exec, command, c(command, args), stdin, stdout, stderr, pty, pty_options, connections, env, windows_verbatim_args, windows_hide_window, private, cleanup, wd, encoding, paste0("PROCESSX_", private$tree_id, "=YES")), `_nframe` = 16, `_ignore` = list(c(17, 21)))|4|
> |4|list(**message = "cannot make processx socketpair (system error 24, Too many open files) @unix/processx.c:408 (processx__make_socketpair)**", call = rethrow_call(c_processx_connection_create_pipepair, encoding, nonblocking), `_nframe` = 17, `_ignore` = list(c(18, 22)))|195|
The parts of the error messages that I think are pertinent are in bold.
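For anyone else who struggles with the nested lists, the same tally can be sketched in base R by pulling just the message out of each `safely()` result. The `output` list below is a made-up stand-in for `readRDS("output.RDS")`, with the same `result`/`error` structure that `purrr::safely()` returns.

```r
# Stand-in for readRDS("output.RDS"): each element is a list(result, error) pair
output <- list(
  list(result = "a_1.pdf", error = NULL),
  list(result = NULL, error = simpleError("Too many open files")),
  list(result = NULL, error = simpleError("Too many open files"))
)

# NA for successful renders, otherwise the text of the error condition
error_message <- vapply(
  output,
  function(x) if (is.null(x$error)) NA_character_ else conditionMessage(x$error),
  character(1)
)

# Count how often each message occurs, keeping the successes (NA) in the table
counts <- table(error_message, useNA = "ifany")
print(counts)
```

Counting only the message text avoids printing the long `call` component that makes the table above hard to read.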
I have had success rendering batches of this size and larger in the past with flexdashboard > png > imagemagick to convert to pdf (valueboxes don't render correctly when going directly to pdf), using an almost identical workflow, so I am trying to figure out what is going on. I imagine it must have something to do with headless Chrome?
I have been explicitly asked for printable, paged, pretty PDFs for this one, so in the worst case I will split it all up and render 100 reports at a time, but I don't want to waste an entire day doing so.
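If it does come down to batches, the splitting itself need not be done by hand. Below is a minimal base R sketch; `make_batches` is a hypothetical helper, not part of any package. The idea would be to render each batch in its own R session (for example one `Rscript` call per batch), on the assumption, not confirmed here, that whatever is holding files open is released when that session exits.

```r
# Split report indices 1..n into consecutive batches of at most `size` reports
make_batches <- function(n, size) {
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / size)))
}

batches <- make_batches(n = 3000, size = 100)

length(batches)      # number of batches
range(batches[[1]])  # first and last report index in the first batch
```

A batch size of 100 stays well below the 170 ~ 210 renders after which the failures start.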
Edit: session info for the reprex.
The error/result is identical on my office computer, which runs R 4.0.3; otherwise the platform and OS version are the same.
sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Catalina 10.15.7
>
> Matrix products: default
> BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods
> [7] base
>
> loaded via a namespace (and not attached):
> [1] rstudioapi_0.13 knitr_1.30 magrittr_2.0.1
> [4] tidyselect_1.1.0 R6_2.5.0 rlang_0.4.8
> [7] fansi_0.4.1 dplyr_1.0.2 tools_4.0.2
> [10] data.table_1.13.2 xfun_0.19 cli_2.2.0
> [13] htmltools_0.5.0 ellipsis_0.3.1 yaml_2.2.1
> [16] assertthat_0.2.1 digest_0.6.27 tibble_3.0.4
> [19] lifecycle_0.2.0 crayon_1.3.4 purrr_0.3.4
> [22] tidyr_1.1.2 vctrs_0.3.5 rlist_0.4.6.1
> [25] glue_1.4.2 evaluate_0.14 rmarkdown_2.5
> [28] compiler_4.0.2 pillar_1.4.7 generics_0.1.0
> [31] pagedown_0.12 pkgconfig_2.0.3