We have an internal Python library that we want to build an R package/wrapper for. The Python package only needs to be installed once, not every time the R package is loaded. The documentation says to use py_require() over virtualenv_create(), but the former creates a temporary environment and doesn't work if the user is offline or not behind the firewall. It also slows down dashboards, etc., because it downloads the Python packages each time the R package is loaded.
What it seems the docs suggest:
.onLoad <- function(libname, pkgname) {
  reticulate::py_require(
    c(
      "git+https://github.com/astral-sh/ruff", # reprex
      # "git+https://github.our-enterprise-account/py/pkg", # actually need
      "pandas",
      "pyarrow"
    )
  )
}
We have chosen to use the virtualenv route and created a function to install the packages, so that .onLoad() looks like this instead:
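.onLoad <- function(libname, pkgname) {
  # virtualenv_name is the name (or path) of the environment created by install_our_pkg()
  if (!reticulate::virtualenv_exists(virtualenv_name)) {
    return(message("First run 'install_our_pkg()'"))
  }
  reticulate::use_virtualenv(virtualenv_name)
}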
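For context, here is a minimal sketch of what an install function like install_our_pkg() might look like. It is an illustration rather than our actual code: the environment name "r-our-pkg" and the force argument are hypothetical, and the package list simply mirrors the reprex above.

```r
# Hypothetical one-time installer; meant to be called interactively by the
# user, not from .onLoad(). "r-our-pkg" is a placeholder environment name.
install_our_pkg <- function(envname = "r-our-pkg",
                            packages = c(
                              "git+https://github.com/astral-sh/ruff", # reprex
                              "pandas",
                              "pyarrow"
                            ),
                            force = FALSE) {
  env_exists <- reticulate::virtualenv_exists(envname)

  # Nothing to do if the environment is already there and no rebuild was requested.
  if (env_exists && !force) {
    message("Virtualenv '", envname, "' already exists; use force = TRUE to rebuild it.")
    return(invisible(envname))
  }

  # Rebuild from scratch when force = TRUE.
  if (env_exists) {
    reticulate::virtualenv_remove(envname, confirm = FALSE)
  }

  # Create the persistent environment, then install the Python dependencies into it.
  reticulate::virtualenv_create(envname)
  reticulate::virtualenv_install(envname, packages = packages)

  invisible(envname)
}
```

With something like this, the user runs install_our_pkg() once while online and behind the firewall, and later loads of the R package only need to find and activate the existing environment.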
What is the best practice here?