I manage a Linux server for a group of ~8 users. We are all lab-based scientists and most are relatively new users.
We all use instances of RStudio Server running from the same R installation. I have encouraged people to use renv, so we all have separate R package libraries and renv caches. Eventually I may want to upgrade to R 4.2. Is there a way to upgrade the entire renv cache from 4.1 to 4.2 at once, and then update the symlinks project by project? What I want to avoid is the catastrophic situation where each user must re-install every package in every project.
Unfortunately, there isn't currently a direct way to replicate an renv cache for a new version of R. Instead, I would recommend something like the following. For each project:
1. Using R 4.1.x, call renv::snapshot() to ensure the project lockfile is up-to-date.
2. Using R 4.2.x, call renv::restore() to install the packages associated with each project.
Alternatively, if you felt comfortable building something by hand, you could use internal renv APIs like renv:::renv_cache_list() to find installed packages in the cache (for R 4.1.x), and then devise a scheme to install those packages (via renv::install() or similar).
Hopefully, this is something you'd be able to automate beforehand (to run on behalf of your users) so that this could be done in batch before the "official" migration. You might need to adjust the arguments provided to snapshot() and restore() depending on how the migration is eventually structured; please see the documentation for more details.
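As a rough illustration of how that batch step might look, here is a minimal sketch. It assumes every project sits somewhere under a single root directory and contains an renv.lock, and that both R versions are installed side by side. The helper names (find_renv_projects, run_in_project, migrate_projects) and the paths in the example call are hypothetical, not part of renv.

```r
# Find every directory under `root` that contains an renv.lock file.
find_renv_projects <- function(root) {
  lockfiles <- list.files(root, pattern = "^renv\\.lock$",
                          recursive = TRUE, full.names = TRUE)
  unique(dirname(lockfiles))
}

# Run a string of R code with a specific Rscript binary, using the project
# as the working directory so the project's .Rprofile can activate renv.
run_in_project <- function(rscript, project, expr) {
  old <- setwd(project)
  on.exit(setwd(old), add = TRUE)
  system2(rscript, c("-e", shQuote(expr)))
}

# Snapshot each project with the old R, then restore it with the new R.
migrate_projects <- function(root, rscript_old, rscript_new) {
  for (project in find_renv_projects(root)) {
    run_in_project(rscript_old, project, "renv::snapshot(prompt = FALSE)")
    run_in_project(rscript_new, project, "renv::restore(prompt = FALSE)")
  }
}

# Example call (placeholder paths, adjust to your installation):
# migrate_projects("/home", "/opt/R/4.1.3/bin/Rscript", "/opt/R/4.2.0/bin/Rscript")
```

Running this as each user (rather than as root) would keep file ownership in the project libraries intact; how to arrange that depends on the server setup.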
That said, it's important to be aware that the versions of packages installed with R 4.1.x may or may not be functional with R 4.2.x. In such a case, it might be better to install the latest-available versions of the packages in use for a project, rather than the exact versions installed before. This is especially true if users are making use of packages from Bioconductor, as Bioconductor releases are normally intended to be used only for a particular version of R.
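If you go the "latest available versions" route, one possible approach is to ask renv which packages a project actually uses, install current versions of those under the new R, and then re-snapshot. The sketch below assumes it is run with R 4.2.x; refresh_project and packages_to_install are hypothetical helper names, and Bioconductor packages would likely need separate handling.

```r
# Names of the packages that ship with R itself and cannot be reinstalled.
base_packages <- function() {
  rownames(utils::installed.packages(priority = "base"))
}

# Reduce a vector of discovered dependencies to the ones worth installing.
packages_to_install <- function(used, skip = c("R", base_packages())) {
  sort(setdiff(unique(used), skip))
}

# Install the latest available versions of the packages a project uses,
# then record the result in the project's lockfile.
refresh_project <- function(project) {
  deps <- renv::dependencies(project)        # scans the project's source files
  pkgs <- packages_to_install(deps$Package)
  renv::install(pkgs, project = project)
  renv::snapshot(project = project, prompt = FALSE)
}

# Example call (placeholder path):
# refresh_project("/home/alice/projects/analysis")
```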
You might also want to consider allowing users to make use of a shared renv cache. This can be done by setting the RENV_PATHS_CACHE environment variable to point to a shared directory that is readable and writable by all users. See the "Introduction to renv" vignette for more details.
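A small sketch of what that could look like from within R. The directory /srv/renv/cache is a placeholder, and cache_is_usable is a hypothetical helper for sanity-checking permissions, not an renv function.

```r
# Hypothetical shared location; it must be readable and writable by all users.
shared_cache <- "/srv/renv/cache"
Sys.setenv(RENV_PATHS_CACHE = shared_cache)

# TRUE only if the directory exists and the current user can read and write it.
cache_is_usable <- function(path) {
  dir.exists(path) &&
    unname(file.access(path, mode = 2) == 0) &&   # write permission
    unname(file.access(path, mode = 4) == 0)      # read permission
}

# With renv installed, this should report the cache path renv will use:
# renv::paths$cache()
```

To apply the setting to every user and every session, the same variable can instead be placed in the site-wide Renviron.site file as a line of the form RENV_PATHS_CACHE=/srv/renv/cache.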
I've been trying to distribute rather than centralize or share things for the lab, but in this case a shared renv cache is probably the better choice to reduce frustration. Plus, we have lots of large data packages that should not be duplicated for each user.
Follow-up question. My plan is to:
1. create the 4.2 cache in a shared directory
2. populate it with the latest available versions of all packages in use
3. update all of the renv.lock files with renv::snapshot()
4. propagate the lockfiles to users via git
5. redirect user cache paths as described in the documentation
At this point renv will warn users that their libraries and lockfiles are out of sync and suggest restore(). I have to admit I am often mystified by the behavior of restore(). When I think it should link from the cache, it downloads and rebuilds instead. And then things like this happen:
* The project may be out of sync -- use `renv::status()` for more details.
[ master ] > renv::status()
The following package(s) are out of sync:
Package Lockfile Version Library Version
brew 1.0-7 1.0-7
desc 1.4.1 1.4.1
dplyr 1.0.8 1.0.8
markdown 1.1 1.1
pkgbuild 1.3.1 1.3.1
reshape2 1.4.4 1.4.4
sessioninfo 1.2.2 1.2.2
waldo 0.4.0 0.4.0
Use `renv::snapshot()` to save the state of your library to the lockfile.
Use `renv::restore()` to restore your library from the lockfile.
[ master ] > renv::restore()
The following package(s) will be updated:
# CRAN ===============================
- brew [1.0-7 -> 1.0-7]
- desc [1.4.1 -> 1.4.1]
- dplyr [1.0.8 -> 1.0.8]
- markdown [1.1 -> 1.1]
- pkgbuild [1.3.1 -> 1.3.1]
- reshape2 [1.4.4 -> 1.4.4]
- sessioninfo [1.2.2 -> 1.2.2]
- waldo [0.4.0 -> 0.4.0]
Do you want to proceed? [y/N]: y
Installing brew [1.0-7] ...
OK [linked cache]
Installing desc [1.4.1] ...
OK [linked cache]
Installing dplyr [1.0.8] ...
OK [linked cache]
Installing markdown [1.1] ...
OK [linked cache]
Installing pkgbuild [1.3.1] ...
OK [linked cache]
Installing reshape2 [1.4.4] ...
OK [linked cache]
Installing sessioninfo [1.2.2] ...
OK [linked cache]
Installing waldo [0.4.0] ...
OK [linked cache]
[ master ] > renv::status()
The following package(s) are out of sync:
Package Lockfile Version Library Version
brew 1.0-7 1.0-7
desc 1.4.1 1.4.1
dplyr 1.0.8 1.0.8
markdown 1.1 1.1
pkgbuild 1.3.1 1.3.1
reshape2 1.4.4 1.4.4
sessioninfo 1.2.2 1.2.2
waldo 0.4.0 0.4.0
Use `renv::snapshot()` to save the state of your library to the lockfile.
Use `renv::restore()` to restore your library from the lockfile.
In this case, all of this is happening with R 4.1. At first I thought that the repository or the hash must differ for these packages, which would explain why they are out of sync (although it is unclear why this would happen only for these). But after restore() they are still out of sync, until I run snapshot(). I point this out only to show that I don't have a good grasp of what restore() does under various circumstances.
Obviously, after I reinstall all of the packages for R 4.2 into the new cache, I just want users to be able to link directly to the new cache and get back to work. renv::restore() will do that, right?
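One way to check afterwards whether a project library was really linked to the cache rather than rebuilt is to look at where the package directories point. This is a minimal sketch that assumes a Linux setup where renv links library entries into the cache via symlinks; linked_to_cache is a hypothetical helper, not an renv function.

```r
# For each entry in `library`, report whether it is a symlink that resolves
# to a location underneath `cache`.
linked_to_cache <- function(library, cache) {
  pkgs <- list.files(library, full.names = TRUE)
  targets <- Sys.readlink(pkgs)                 # "" for non-symlinks
  cache <- normalizePath(cache, mustWork = FALSE)
  resolved <- ifelse(nzchar(targets),
                     normalizePath(pkgs, mustWork = FALSE),
                     "")
  setNames(startsWith(resolved, paste0(cache, "/")), basename(pkgs))
}

# Example call inside an renv project (assumes a single cache path):
# linked_to_cache(renv::paths$library(), renv::paths$cache()[[1]])
```

Any package reported as FALSE was installed as a regular directory in the project library instead of being linked from the cache.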
That seems like it could be a bug in renv. Would you be able to share the output of the following?
data <- renv::status()
str(data$library$Packages$waldo)
str(data$lockfile$Packages$waldo)
renv:::renv_lockfile_diff_record(data$library$Packages$waldo, data$lockfile$Packages$waldo)
renv seems to believe the package records differ in some way for these packages, but it seems like they must differ in a way that should be considered inconsequential.
As you can see from the diff, all of this coincides with my starting to use r-universe for some lab packages. It is unclear to me whether that was actually the cause. In any case, yes, the hash is the same for these packages, so the difference is inconsequential in the end.
Thanks! Indeed, this looks like a missing piece in renv's support for r-universe repositories. Would you be willing to file this on the rstudio/renv issue tracker on GitHub, so that it doesn't get lost?