Problem: I have an RProject directory that contains the files needed to write and compile a bookdown project. This directory is also a local git repository that I push to GitHub. I have been happily adding to it and committing/pushing with ease, but lately RStudio has gotten SUPER SLOW whenever I open this project. By slow I mean that opening files, saving or deleting files, and executing commands from a script (or in the console) all lag badly. Anything with git now takes a long time as well (I’ve stopped using the RStudio IDE to execute any git commands for this project). This slowness is not apparent with other projects.
My suspicion was that it was somehow related to a subdirectory of text files that I occasionally update and push to git. These text files amount to ~340 MB; it’s a big folder. These files by themselves shouldn’t be a problem for git; this is how I’ve stored them in the past, but not in an RProject folder.
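For anyone wanting to check a similar folder, here is a minimal sketch of how the file count and total size could be measured from R. The function name `folder_summary` and the demo directory are made up for illustration; the real folder path would go in place of `demo_dir`.

```r
# Summarise how many files a folder holds and their total size in MB.
# `folder_summary` and `dir_path` are hypothetical names for this sketch.
folder_summary <- function(dir_path) {
  # recursive = TRUE walks subfolders; directories themselves are not listed
  files <- list.files(dir_path, recursive = TRUE, full.names = TRUE)
  sizes <- file.size(files)
  list(
    n_files  = length(files),
    total_mb = sum(sizes, na.rm = TRUE) / 1024^2
  )
}

# Example on a throwaway directory so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "text_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(rep("some text", 1000), file.path(demo_dir, "a.txt"))
writeLines(rep("more text", 2000), file.path(demo_dir, "b.txt"))

demo_summary <- folder_summary(demo_dir)
print(demo_summary)
```

Both numbers matter here: git and the RStudio git pane have to look at every file, so a folder with many small files may behave differently from one with a few large ones.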
I've tried a few different things. At each attempt, I logged the time it took to run bookdown::render_book("index.Rmd", "bookdown::gitbook"). The final output is a very basic example; it loads some data and creates some plots, but there’s nothing in there that should take very long to run.
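A minimal sketch of how a timing like this could be logged, assuming base R's `system.time()`; the helper name `time_minutes` is made up for illustration and is not necessarily how the numbers below were recorded. The runnable part uses a stand-in workload, and the real render call is shown only as a comment.

```r
# Time an expression and return the elapsed wall-clock time in minutes.
# `time_minutes` is a hypothetical helper name for this sketch.
time_minutes <- function(expr) {
  # `expr` is evaluated lazily, so it is only run inside system.time()
  elapsed <- system.time(expr)[["elapsed"]]
  elapsed / 60
}

# Stand-in workload so the sketch runs without the book project.
demo_minutes <- time_minutes(Sys.sleep(0.2))
print(demo_minutes)

# For the real project the call would look like this (not run here):
# render_minutes <- time_minutes(
#   bookdown::render_book("index.Rmd", "bookdown::gitbook")
# )
```

Wall-clock ("elapsed") time is the relevant figure here, because the slowness may come from waiting on the file system rather than from R using the CPU.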
Record of attempts:
- Baseline: Run as is
  5.4 minutes, plus an extra 8 minutes just to load the time variable in the console in RStudio! RStudio just hangs. This is typical behaviour for this project.
- Start fresh: Copy all files minus git and minus large text files to a new directory
  50.2 seconds
- Add large folder of text files
  2.9 minutes
- Create new repository and project without text files
  “GitHub first, then RStudio” style, as per Jenny Bryan’s Happy Git with R instructions
  After getting the local repo set up, I again copied over the original files.
  1.67 minutes
- Add large folder to new repository
  I added the folder, knowing that it would take a long time to commit to git. Just adding it to the repo and restarting RStudio started causing serious lags. Opening a script took about a minute, making a small change and saving it took another minute, and selecting the new files to stage and commit failed. In fact, my computer crashed (generally I’d never try to do anything all at once with this many files; I generally add to the pile over time). But the fact that this was when the laggy RStudio behaviour began seems to confirm that this is the problem…
So, in summary:
It seems that RProject folders that are also git repos slow down badly when they contain many or large files that git has to track. This problem has been reported before, but it still seems to be an open issue given what I've tried here.
I had wanted to avoid having a separate repo for my text files, but it looks like if I want RStudio and git to play nice, I may have to do this. I’m not sure what this means down the line as I add more complex modelling to the scripts (I know setting cache options and using RData files are some ways to alleviate this…).
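To make the caching idea concrete, here is a minimal sketch of the "compute once, then reload from disk" pattern. It uses `saveRDS()`/`readRDS()` rather than RData files purely for brevity, and the names `cached`, `cache_file` and `slow_model` are made up for illustration. Inside R Markdown chunks, knitr's `cache = TRUE` chunk option is aimed at the same goal.

```r
# Run `compute()` once and store the result; later calls read it from disk.
# `cached`, `cache_path` and `compute` are hypothetical names for this sketch.
cached <- function(cache_path, compute) {
  if (file.exists(cache_path)) {
    return(readRDS(cache_path))
  }
  result <- compute()
  saveRDS(result, cache_path)
  result
}

cache_file <- tempfile(fileext = ".rds")
n_runs <- 0

slow_model <- function() {
  n_runs <<- n_runs + 1  # count how often the expensive step actually runs
  Sys.sleep(0.1)         # stand-in for a slow model fit
  mean(c(1, 2, 3))
}

first  <- cached(cache_file, slow_model)  # computes and writes the cache
second <- cached(cache_file, slow_model)  # reads the cached value instead
```

Note that cache files like these would themselves be worth keeping out of the git repo (for example via .gitignore), otherwise they add to the pile of files git has to scan.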
Are there other techniques I should try or things I should be doing to keep this from happening as the project gets bigger? I can move my text files for now, but I’m worried about the project’s future…
It is absolutely worth mentioning that this may very well be a limit of my machine, too. I am running macOS 10.14 on a 1.4 GHz i5 2014 MacBook Air, so, not the most powerful of beasts.
RStudio version 1.1.463
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14