If I understand you correctly, you are hosting a CRAN mirror, or something like it? When you go to install.packages, you point to this location with options("repos" = "http://my-simple-web-server") and pull packages from it?
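Just to check that we are picturing the same setup, here is a minimal sketch of what I mean. The URL is the placeholder from above, and `somepackage` is a hypothetical package name, so adjust both to your environment.

```r
# Point R at the internal repository instead of a public CRAN mirror.
# "http://my-simple-web-server" is a placeholder, not a real server.
options(repos = c(CRAN = "http://my-simple-web-server"))

# Confirm which repository install.packages() will use by default.
print(getOption("repos"))

# Hypothetical package name; this would download from the URL above.
# install.packages("somepackage")
```

If that matches what you are doing, then every install depends on what your script has copied to that server.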
Also, the script that you use to copy the "new packages": does it just parse changes from a page like CRAN Packages By Date?
Do you do anything to account for the archive of old package versions?
The danger of keeping only the latest version of each package is that a package occasionally introduces breaking changes, or an older version is desirable for reproducibility or some other reason. For that reason, CRAN archives old package versions so that a specific version can be installed again if necessary.
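To make the archive workflow concrete, here is a rough sketch of how an old version is typically fetched from CRAN itself. The helper `archive_url` is something I made up for illustration, and the package name and version are hypothetical. The path follows CRAN's `src/contrib/Archive/<package>/<package>_<version>.tar.gz` layout.

```r
# Build the URL of an archived source tarball on a CRAN-style repository.
# `archive_url` is a hypothetical helper, not part of base R.
archive_url <- function(pkg, version, repo = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", repo, pkg, pkg, version)
}

# Hypothetical package and version, for illustration only.
url <- archive_url("somepackage", "1.0.0")
print(url)

# Installing from the tarball needs network access and build tools:
# install.packages(url, repos = NULL, type = "source")
```

If your server holds only the latest versions, the equivalent path on it will not exist, so this kind of install would have to go back to a public CRAN mirror.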
Obviously, you can decide that you do not want to support that workflow, but there are inherent risks in doing so that I want to be sure you are aware of. In any case, it is a good time to be experiencing this pain! You might take a look at RStudio Package Manager, which just entered Beta this week. Package Manager addresses many of the problems you are attempting to address and optimizes storage. It also only downloads the packages/versions that you use (in lazy mode), while keeping all packages available.
Worth a look, at least!