Identifying Linux system libraries for R packages

Hello all,

Our organization hosts a Shiny Server on CentOS. While package binaries are available for Ubuntu thanks to a lot of volunteer effort, our server has to install packages from source.

Some packages, like rgdal, list their Linux library dependencies as free text in the "SystemRequirements" field of their DESCRIPTION file, but the detail there is not enough to identify the specific libraries needed on a given Linux distribution. As a result, our admin team has to track down the libraries and their versions by hand and install them. We would like to make this process less labor-intensive.
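
To illustrate how little that field gives you to work with, here is a minimal Python sketch that pulls "SystemRequirements" out of a DESCRIPTION file and splits it into entries. The helper names and the sample DESCRIPTION text are made up for illustration (it is not rgdal's actual file), and the splitting rule (commas or semicolons outside parentheses) is an assumption, since the field is free text with no enforced format.

```python
def read_dcf_field(description_text, field):
    """Return the value of one field from DESCRIPTION (DCF) text, or None."""
    value_lines = None
    for line in description_text.splitlines():
        if value_lines is not None:
            # In DCF, continuation lines begin with a space or tab.
            if line[:1] in (" ", "\t"):
                value_lines.append(line.strip())
                continue
            break
        if line.startswith(field + ":"):
            value_lines = [line[len(field) + 1:].strip()]
    if value_lines is None:
        return None
    return " ".join(part for part in value_lines if part)


def split_requirements(value):
    """Split the free text on commas/semicolons that are outside parentheses."""
    parts, depth, current = [], 0, []
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch in ",;" and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


# Hypothetical DESCRIPTION content, for illustration only.
SAMPLE_DESCRIPTION = """Package: examplepkg
Version: 0.1.0
SystemRequirements: GDAL (>= 1.11.4), PROJ (>= 4.8.0),
    sqlite3
License: GPL-3
"""

if __name__ == "__main__":
    field = read_dcf_field(SAMPLE_DESCRIPTION, "SystemRequirements")
    for entry in split_requirements(field):
        print(entry)
```

Even when the field parses cleanly, the entries are upstream project names, not package names for yum or apt, so the mapping to a specific distribution still has to come from somewhere else.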

This problem occurs in other domains:

We scrape those lists, but they only contain libraries for packages that have been tested on r-hub or used on, respectively. Ideally, we would like to identify the libraries necessary for every package on CRAN.

It seems like CRAN must have this list, since it tests every package it hosts and builds binaries for Windows and OS X. Does anyone know if that is true? How do others address this problem?



I'm not sure if there is a real place to house this sort of information on CRAN currently. I think it's best practice to disclose dependencies within the if the package is on GitHub.

I've found the easiest way to identify system dependencies is to attempt to compile the source code and see whether it builds. You could download the source .tar.gz and run R CMD check on it from the command line, or you could use something like Docker: write a Dockerfile that installs the R packages you want, try to build the image, and see what errors come up. For example, rJava needs to link against -llzma and -lbz2 (that is, liblzma and libbz2), but I only found that out from building the package from source, and not from a tarball.
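
If you capture the build output to a file, the "see what errors come up" step can be partly automated. The following is a rough Python sketch, not a complete solution: it only looks for two message shapes that GNU ld and gcc commonly print ("cannot find -lNAME" and "fatal error: HEADER: No such file or directory"). The function name and the sample log are invented for illustration, and configure-script failures or other compilers would need their own patterns.

```python
import re

# Linker error from GNU ld, e.g. "/usr/bin/ld: cannot find -llzma"
MISSING_LIB = re.compile(r"cannot find -l([A-Za-z0-9_+.-]+)")
# Compiler error from gcc, e.g. "fatal error: gdal.h: No such file or directory"
MISSING_HEADER = re.compile(r"fatal error: ([^\s:]+): No such file or directory")


def find_missing_dependencies(log_text):
    """Collect missing libraries and headers mentioned in a build log."""
    libs = sorted({"lib" + name for name in MISSING_LIB.findall(log_text)})
    headers = sorted(set(MISSING_HEADER.findall(log_text)))
    return {"libraries": libs, "headers": headers}


# Invented log excerpt, for illustration only.
SAMPLE_LOG = """\
/usr/bin/ld: cannot find -llzma
/usr/bin/ld: cannot find -lbz2
collect2: error: ld returned 1 exit status
foo.c:3:10: fatal error: gdal.h: No such file or directory
"""

if __name__ == "__main__":
    found = find_missing_dependencies(SAMPLE_LOG)
    print("libraries:", ", ".join(found["libraries"]))
    print("headers:", ", ".join(found["headers"]))
```

The output still names libraries and headers, not distribution packages, so you would follow up with your package manager's file search to find which package provides each one.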

If you would like more information, you can check out my blog: