We are working on an RStudio Server (Pro) installation. Unfortunately, we can only use the RStudio GUI and don't have access to the server itself. Each time a user encounters a problem installing a new package, we need to approach the IT department and ask them to install the needed dependencies. This can be very tiresome and annoying, especially as many users are using the server, doing completely different analyses and requiring different packages.
I have looked for a way to automate such installations but couldn't find any list of needed (or possible) dependencies. I have found a few topics discussing it, mainly about the shinyapps.io server, or this one here, which is also not complete.
I was wondering if there is a better or more complete way of automating this tedious experience.
You've hit on a really interesting problem. Packages can be hard to install - especially when you're reliant on a ticketing system and external groups to manage system dependencies. We've seen this problem happening in all sorts of organizations, which is why RStudio Package Manager was created.
As of today, Package Manager is a tool for the centralized management of R packages; the focus being on serving customized repositories of R packages and understanding R package dependency trees. This enables reproducibility and optionally things like a shared package environment baseline, or ensuring the use of a pre-determined set of "validated" packages.
In future releases of Package Manager we would like to gather information about what underlying system requirements are necessary for packages to compile, and provide tools to make those system dependencies available. If you have a few minutes, this talk Sean Lopp gave at RStudio Conf might be interesting to you. The part about features coming soon is right at the end (around minute 16:40).
Take a look at my old post, "What are the main limits to R in a production environment?", on why we use the Nix functional package manager. As soon as your dependencies go beyond other R packages (system libraries, a newer C++ compiler, Java/Scala/Spark packages, Python packages...), which is very often, R package management cannot help. Nix also allows non-admin users to install all those external dependencies themselves without affecting anybody else. It is a solved problem, but I keep seeing the wheel reinvented (conda's LD_LIBRARY_PATH mess, etc.).
To give you a sense for what the RStudio Package Manager support @kellobri describes will look like, here is a sneak peek at the in-development version:
This is a screen capture of the httr package's page. Behind the scenes, RStudio Package Manager has identified that httr depends on the curl package, which in turn has a dependency on libssl. We've translated that system requirement into specific install commands based on the operating system you select (e.g. for Ubuntu 18.04 you need libcurl4-openssl-dev and libssl-dev, whereas for CentOS 6 you need libcurl-devel and openssl-devel).
Regardless of how you go about getting those dependencies resolved (be it through a Nix installation, a Dockerfile modification, or just sending your sys-admin an email with Package Manager's recommended installation commands), you won't need to google around anymore or debug strange compilation errors.
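To make the idea concrete, here is a rough sketch of the kind of lookup described above. It is not how Package Manager is implemented. The package names are the ones from the httr example; the table layout, the function name and the apt-get/yum command prefixes are assumptions made for illustration.

```python
# Hypothetical lookup table. The package names come from the httr example
# above; the keys, function name and command prefixes are assumptions.
SYSTEM_PACKAGES = {
    "Ubuntu 18.04": ["libcurl4-openssl-dev", "libssl-dev"],
    "CentOS 6": ["libcurl-devel", "openssl-devel"],
}

# Assumed package-manager invocation for each operating system.
INSTALL_PREFIX = {
    "Ubuntu 18.04": "apt-get install -y",
    "CentOS 6": "yum install -y",
}


def install_command(os_name):
    """Return one shell command that installs the recorded system packages."""
    if os_name not in SYSTEM_PACKAGES:
        raise KeyError(f"no system requirements recorded for {os_name!r}")
    packages = " ".join(SYSTEM_PACKAGES[os_name])
    return f"{INSTALL_PREFIX[os_name]} {packages}"


if __name__ == "__main__":
    # Print the command a sys-admin could be sent for each known OS.
    for name in SYSTEM_PACKAGES:
        print(f"{name}: {install_command(name)}")
```

The output of install_command is the sort of line you could paste into a ticket, a Dockerfile or a playbook.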
We'd welcome any feedback on this approach as we wrap up the first pass.
To share our experience: we maintain an Ansible playbook in which we record every dependency we encounter, along with a yum repository containing all of these dependencies. We can then get the required dependencies installed by asking the IT team to execute our playbook.
It is not perfect, but it is something we put in place to ease the work with the IT server maintainers.
Also, there are not so many dependencies left once you have installed the main ones in a shared library. In the end, many users are using the same ones.
Thanks @kellobri for this informative reply. I think the idea is quite good and can be very helpful, but I'm sorry to say, not for that price.
Every time a package fails to install, R gives an error message that lets me know which OS-related library is missing. I pass this information on to our IT guys, who then install it for us. So I can't see the need to spend so much money on this package manager if it is not even able to install the missing packages.
Each OS has its own unique but constant command/method for installing new libraries, at least in most cases, so I think it is quite feasible to figure out how to install a new missing library, even for me with not much related background.
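As a minimal sketch of that manual workflow, the snippet below pulls package hints out of an install log and turns them into a command for IT. The sample log is illustrative only: it is modeled on the hint lines some R packages print when their configure step fails, and many packages print nothing like it. The function name and the apt-get/yum prefixes are assumptions.

```python
import re

# Illustrative sample only, modeled on the hint lines that some R packages
# print when their configure step fails. Not every package prints these.
SAMPLE_LOG = """\
Configuration failed because libcurl was not found. Try installing:
 * deb: libcurl4-openssl-dev (Debian, Ubuntu, etc)
 * rpm: libcurl-devel (Fedora, CentOS, RHEL)
ERROR: configuration failed for package 'curl'
"""

# Matches lines such as " * deb: libcurl4-openssl-dev (Debian, Ubuntu, etc)".
HINT = re.compile(r"^[ \t]*\*[ \t]*(deb|rpm):[ \t]*(\S+)", re.MULTILINE)

# Assumed install command for each package family.
COMMANDS = {"deb": "apt-get install -y", "rpm": "yum install -y"}


def suggested_commands(log_text):
    """Collect one install command per package family found in the log."""
    found = {}
    for family, package in HINT.findall(log_text):
        packages = found.setdefault(family, [])
        if package not in packages:  # avoid listing a package twice
            packages.append(package)
    return {
        family: f"{COMMANDS[family]} {' '.join(packages)}"
        for family, packages in found.items()
    }


if __name__ == "__main__":
    for family, command in suggested_commands(SAMPLE_LOG).items():
        print(f"{family}: {command}")
```

This only works when the error message names the package explicitly. When it does not, you are back to searching by hand.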
but thanks anyway
Thanks @alexv for this suggestion. I will definitely pass it along to my collaborators and we will discuss the possibility of using such a package manager.
As RStudio Package Manager is not free (very, very expensive in fact), I was looking for alternatives.
As far as I can see in the manual, Nix is free to use. Is that correct?
Another thing I would like to know: you mentioned that one doesn't need sudo permissions to install libraries. Does that mean they are installed locally?
If several users need the same library, would this mean that it will be installed multiple times (like conda)?
thanks
@Ayeroslaviz yes, Nix is free and open source. Nix has a very clever architecture: everything is installed in a common "Nix store", which is writable only by the Nix daemon, so the commands users run communicate with the daemon. All the packages (applications, libraries, R packages, Python packages, etc.) go into their own subdirectories, which have the hash code of the build as part of the directory name.
That way, if you build the same thing with different parameters (say, R with some custom version of BLAS), it will be in a different directory in the store. That's how different versions coexist. The Nix daemon manages the "profiles", which are basically collections of symlinks to the store. So if two users "install" the same package, the Nix daemon creates symlinks in their profiles to the same location in the store: no duplication.
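A toy model may make the store/profile idea easier to picture. This is an in-memory sketch, not Nix itself: the class name, the /toy/store path prefix and the way the hash is computed are all assumptions made for illustration, and real Nix derives its hashes from the full build description.

```python
import hashlib


class ToyStore:
    """In-memory stand-in for a shared, hash-addressed package store."""

    def __init__(self):
        self.paths = {}     # store path -> build description
        self.profiles = {}  # user -> {package name: store path}

    def build(self, name, **params):
        """Return the store path for this name and set of build parameters."""
        description = name + "|" + "|".join(
            f"{key}={value}" for key, value in sorted(params.items())
        )
        digest = hashlib.sha256(description.encode()).hexdigest()[:12]
        path = f"/toy/store/{digest}-{name}"
        self.paths.setdefault(path, description)  # stored at most once
        return path

    def install(self, user, name, **params):
        """Point the user's profile at the shared store path (the 'symlink')."""
        path = self.build(name, **params)
        self.profiles.setdefault(user, {})[name] = path
        return path


if __name__ == "__main__":
    store = ToyStore()
    a = store.install("alice", "R", blas="openblas")
    b = store.install("bob", "R", blas="openblas")
    c = store.install("carol", "R", blas="mkl")
    print(a == b)            # same build: both profiles share one store entry
    print(a == c)            # different parameters: a different directory
    print(len(store.paths))  # number of distinct builds held in the store
```

Here alice and bob end up pointing at the same path, while carol's build with a different BLAS gets its own directory next to it.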
If you have any nix-related questions there is a forum at https://discourse.nixos.org/