I run a program on Databricks, and also test it locally. I'm about to use the renv package to have more control over the R environment on Databricks.
I know renv doesn't control everything about the R environment, that my local machine and Databricks run different operating systems, and that this falls short of containerizing with something like Docker. As a halfway point, though, I'm wondering if it makes sense to upload the renv lockfile (renv.lock) from my local machine so that I force Databricks to use the same package versions.
Would this work / do any good? Or would it likely fail because the underlying operating systems are different?
You will continue to run into challenges if you develop across platforms. Using renv will probably help ensure that package versions match across environments, but that’s about it. The trouble occurs when you depend on folder paths, database connections, or OS-level libraries (e.g. libxml2).
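As a rough sketch of what "packages match" can look like in practice: the lockfile is JSON with one `Packages` entry per package, so you can restore from it on Databricks and then compare it against what actually got installed. The path below is hypothetical, `read_locked_versions` and `version_mismatches` are made-up helpers rather than part of renv, and reading the lockfile with jsonlite is an assumption (it needs jsonlite installed).

```r
# Hypothetical location of the uploaded lockfile; adjust to your workspace.
lockfile_path <- "/dbfs/my_project/renv.lock"

# Install the versions recorded in the lockfile into the active library.
restore_from_lockfile <- function(path = lockfile_path) {
  renv::restore(lockfile = path, prompt = FALSE)
}

# Read package -> version pairs out of a renv.lock file (assumes jsonlite).
read_locked_versions <- function(path = lockfile_path) {
  lock <- jsonlite::fromJSON(path, simplifyVector = FALSE)
  vapply(lock$Packages, function(p) p$Version, character(1))
}

# Compare locked versions with installed ones.
# Both arguments are named character vectors (package name -> version).
# Returns one row per package that is missing or at a different version.
version_mismatches <- function(locked, installed) {
  pkgs <- names(locked)
  have <- unname(installed[pkgs])        # NA where a package is not installed
  want <- unname(locked)
  bad  <- is.na(have) | have != want
  data.frame(
    package   = pkgs[bad],
    locked    = want[bad],
    installed = have[bad],
    stringsAsFactors = FALSE
  )
}
```

On the cluster you would call something like `restore_from_lockfile()` and then `version_mismatches(read_locked_versions(), installed.packages()[, "Version"])`. An empty result only tells you the R package versions line up. A package that needs a system library such as libxml2 can still fail to build during the restore, which is the part renv doesn't cover.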
I’d suggest developing in an environment that is as close to the deployment environment as possible. If you can’t, retain a healthy fear of the issues above and make sure they’re handled (and tested) in your deployment approach.
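For the folder-path part, one way to handle it is to keep every environment-specific path behind a single function and test both branches. This is only a sketch: it assumes the `DATABRICKS_RUNTIME_VERSION` environment variable is set on your cluster, and both directory names are placeholders.

```r
# TRUE when running on a Databricks cluster.
# Assumption: the cluster sets DATABRICKS_RUNTIME_VERSION.
on_databricks <- function() {
  nzchar(Sys.getenv("DATABRICKS_RUNTIME_VERSION"))
}

# Return the data directory for the current environment.
# Both paths are placeholders for illustration.
data_dir <- function() {
  if (on_databricks()) "/dbfs/my_project/data" else "data"
}
```

The rest of the code then builds paths with `file.path(data_dir(), "input.csv")` (file name hypothetical) instead of hard-coding a local or DBFS location.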