Questions regarding pak package


I found a new package called pak, which is "another approach to package installation".

As per the documentation, I need to run the command pak_setup() after installation. This will create a new private library (currently in the location C:\Users\Username\AppData\Local\R-pkg\Cache/lib/3.5), apart from the default one (currently in the location C:/Users/Username/R/win-library/3.5), and copy all the existing packages to that. As far as I can tell, for all future installations via pak, it downloads first to the default location, and then copies it to the private library.

I like this package, but I don't really want two copies of the same packages on my device. I'd like to remove one of them. So, is it possible to get rid of the private library? If not, will changing R's default library path to this location stop it from copying these (unnecessary) duplicates?
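One way to check how much is actually duplicated is to compare the two library directories directly. The following is only a rough sketch in base R: the helper names are made up for illustration, and the private library path in the usage comments is a placeholder to replace with the location on your own machine.

```r
# Hypothetical helper: names of the packages installed in a library,
# i.e. the sub-directories that contain a DESCRIPTION file.
packages_in <- function(lib_path) {
  dirs <- list.dirs(lib_path, full.names = TRUE, recursive = FALSE)
  basename(dirs[file.exists(file.path(dirs, "DESCRIPTION"))])
}

# Hypothetical helper: total size of all files under a directory, in MB.
dir_size_mb <- function(path) {
  files <- list.files(path, recursive = TRUE, full.names = TRUE,
                      all.files = TRUE)
  sum(file.info(files)$size, na.rm = TRUE) / 1024^2
}

# Packages present in both libraries, i.e. the actual duplicates.
duplicated_packages <- function(lib_a, lib_b) {
  intersect(packages_in(lib_a), packages_in(lib_b))
}

# Usage (the second path is a placeholder, not a real location):
# duplicated_packages(.libPaths()[1], "path/to/pak/private/library")
# dir_size_mb("path/to/pak/private/library")
```

If the list of duplicates is short and the size is small, the duplication is probably not worth worrying about.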

Any help will be appreciated.

It's fundamental to the safe operation of pak to have its own private library. Yes, today the packages will be duplicated, but having separate libraries ensures that pak is completely isolated from the packages that you install for data analysis (and vice versa).

To extend Hadley's reply and clarify a bit more, pak does not copy all existing packages to its private library. It only copies those that it needs for its own operation, i.e. its (recursive) dependencies. This is currently 43 packages, and it will likely be fewer in the future. It takes up 72MB on my macOS machine. You can forget about pak's private library; pak itself will update it as needed.

This duplication and extra disk space is a small price to pay to make pak and package installation much more robust. In particular:

  • If a package in your regular library is removed or it is broken, pak itself will still work properly, and can help you re-install that package.
  • pak and your "regular" library do not have to use the same package versions. If a project needs a package version that is not compatible with pak, that is completely fine.
  • On Windows you cannot update a package with compiled code, if that package is loaded. This is an R core bug, and such an update leads to a broken package. The consequence of this is that if pak loaded such a dependency (e.g. rlang or curl) from your regular library, then it could not update it, the update would lead to a broken installation, and a broken pak installation as well. This is circumvented by the pak private library.
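To see which packages would be affected by the last point in a given session, you can look at the compiled libraries R currently has loaded. This is a minimal sketch using base R's getLoadedDLLs(). The helper name is hypothetical, and it assumes that a package's DLL has the same name as the package, which is the usual convention but not guaranteed.

```r
# Hypothetical helper: which of the given packages currently have a
# compiled DLL / shared object loaded in this R session?
# Assumes the DLL is named after the package (the usual convention).
loaded_compiled <- function(pkgs) {
  intersect(pkgs, names(getLoadedDLLs()))
}

# Example: check a few packages that contain compiled code.
loaded_compiled(c("rlang", "curl", "stats"))
```

Any package returned here is in use by the running session, so on Windows it could not be safely updated in place.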

Yes, that makes much more sense. I guess I asked the question much earlier than I should have. I had made a fresh install of R and then installed pak as the first package. pak_setup then created copies of its dependencies, so I was worried about disk space and created this topic. When I installed other packages later on, this didn't happen again.

I've two questions based on your reply.

First, I can see that the private library contains 40 packages, as opposed to 43 as indicated in your answer.

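# Recursive dependencies of pak, as listed by the configured CRAN mirror
pak_recursive_dependencies <- tools::package_dependencies(packages = "pak",
                                                          recursive = TRUE)$pak
# installed.packages() returns a matrix; convert it so that `$` works
installed_packages <- as.data.frame(installed.packages(), stringsAsFactors = FALSE)
installed_base_packages <- installed_packages$Package[installed_packages$Priority %in% "base"]
# Drop base packages; the outer parentheses print the result
(expected_packages_in_private_library <- pak_recursive_dependencies[!(pak_recursive_dependencies %in% installed_base_packages)])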
#>  [1] "assertthat"  "base64enc"   "callr"       "cli"         "cliapp"     
#>  [6] "crayon"      "curl"        "desc"        "filelock"    "glue"       
#> [11] "jsonlite"    "lpSolve"     "pkgbuild"    "pkgcache"    "prettyunits"
#> [16] "processx"    "ps"          "R6"          "rematch2"    "rprojroot"  
#> [21] "tibble"      "fansi"       "prettycode"  "progress"    "selectr"    
#> [26] "withr"       "xml2"        "digest"      "rappdirs"    "rlang"      
#> [31] "uuid"        "magrittr"    "backports"   "pillar"      "pkgconfig"  
#> [36] "utf8"        "hms"         "stringr"     "Rcpp"        "stringi"

Created on 2019-03-01 by the reprex package (v0.2.1)

These are the packages I find in the private library. Am I missing something obvious again?

Second, you said that pak will update the private library itself as required. So, does that mean that I won't need to run pak_setup even when one of these dependencies gets updated?

  1. I was using a different version of pak, probably the dev version. Or a different version of a pak dependency, that had more dependencies. Anyway, it does not matter much, pak will only put packages in the private lib that are needed for the actual pak version. You do not need to worry about this.

  2. Right. Actually you don't even need to run pak_setup() after the installation, pak will do it later, when you start installing packages with it. As the README says:

    After installation, you might also want to run pak::pak_setup(); it’ll be run automatically when needed but you might want to do it now to save some time later.

    In practice pak only needs to update the private library if some packages are missing from it, or are broken, or if you update pak, and it needs a different set of dependencies.
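For the curious, the "missing from it" condition can be approximated by hand. This is only a sketch of the idea, not how pak itself does it: the helper name is hypothetical, it only detects packages that are absent (not broken ones), and it treats a package as present when its directory contains a DESCRIPTION file.

```r
# Hypothetical helper: which of the expected packages are not installed
# in the given library? A package counts as installed here if
# <lib_path>/<package>/DESCRIPTION exists.
missing_from_library <- function(expected, lib_path) {
  present <- file.exists(file.path(lib_path, expected, "DESCRIPTION"))
  expected[!present]
}

# Example against R's own system library, where base packages live.
missing_from_library(c("stats", "utils"), .Library)
```

You could pass the vector of expected dependencies and the private library path to the same helper, although in normal use there is no need to, since pak checks this itself.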


Thank you so much for the work on pak. Reducing the cost of package installation and avoiding error states in package libraries is very valuable. I have run into a lot of problems distributing R scripts to colleagues, and having to go back and forth with them trying to avoid breaking their library to run the scripts. pak is clear progress towards more robustness for this sort of use, and it's really pleasant to use as well.