Shiny and Git Best Practice

jpeck1989 · November 12, 2018, 2:46pm

Hi everyone.

My company is using Shiny and Shiny Dashboard to produce client-facing dashboards, RStudio Server and cronR to perform ETL jobs, and Shiny Apps to host our dashboards.

We are using Git (GitLab) for version control. What I'd like some help with is best practice for using Git repos with Shiny in a team environment. We are all already using the RStudio plugin to pull, commit, push, branch and so on. What I'd like input on is how to handle different package versions between team members, where to store our authorisation keys and details, what kind of workflow people use for continual amends (Git issues etc.), repository structure (separate repos for apps and ETL scripts), CI/CD into Shiny Apps, and so on.

As you can see there is a lot going on here that can be difficult to get right, so anybody's experiences and knowledge would be greatly appreciated.

Thanks in advance!

Jordan

wolfpack · November 12, 2018, 3:11pm

It sounds like your development environment is similar to ours. We use RStudio Server Pro, RStudio Connect, GitLab and Postgres across several RHEL servers. We create internal customer-facing dashboards that require significant validation and testing before rolling out. People use them daily to do their jobs, so it takes a lot of planning, organization and coordination to keep things up to date.

For version control, we have a mantra that if it's more than ad hoc analysis, it gets pushed. That way information is centrally stored and others can collaborate on it easily. For branching, simple projects only have a master branch, but more complex ones use one of two branching schemes:

Master/Develop: The published version lives on Master and additional development work goes into Develop. We use this for simpler apps that have a very narrow scope and few feature requests.
Master/Version: Like above, Master is the published version, but each new version of the software has a branch created for it. This allows us to assign issues/milestones to a specific branch version.

In general, I strongly recommend using the issue/milestone feature for complex apps. It's fantastic, and it provides much more transparency into development than most of us had seen before.

For testing and validation, we are moving away from paper testing, as it is slow and can't cover enough complexity. shinytest for validating changes to Shiny apps and testthat for unit testing are great tools. Definitely look into them and consider integrating them into your development pipeline. We don't use CI ourselves, but shinytest should let you do that in combination with GitLab. Here's a link.
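
As a rough illustration of the testthat side, here is a minimal sketch. It assumes the app's data logic has been pulled out into a plain function; `summarise_sales` and its column names are hypothetical and not taken from any real app.

```r
library(testthat)

# Hypothetical helper that a Shiny server function might call.
# Keeping logic like this outside the server function makes it unit-testable.
summarise_sales <- function(df) {
  stopifnot(all(c("region", "amount") %in% names(df)))
  totals <- aggregate(amount ~ region, data = df, FUN = sum)
  totals[order(totals$region), , drop = FALSE]
}

test_that("summarise_sales totals amounts per region", {
  df <- data.frame(
    region = c("north", "south", "north"),
    amount = c(10, 5, 2.5),
    stringsAsFactors = FALSE
  )
  result <- summarise_sales(df)
  expect_equal(as.character(result$region), c("north", "south"))
  expect_equal(result$amount, c(12.5, 5))
})

test_that("summarise_sales rejects data without the expected columns", {
  expect_error(summarise_sales(data.frame(x = 1)))
})
```

Tests like these can be run locally or from a CI job. For the app itself, shinytest's `recordTest()` and `testApp()` play a similar role by replaying recorded interactions and comparing the results against stored snapshots.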

For ETL-type jobs, we converted them to R Markdown documents. Anything that's a cron job can be converted into an R Markdown document. This accomplishes a few things:
1. Allows for documentation in GitLab
2. Creates transparency in how ETL works
3. Centralizes code development in RStudio Server
4. Centralizes code execution in RStudio Connect

One thing that isn't discussed here but is really important is commenting and naming conventions. If your group is like ours, you have a team of non-software developers performing software development. In many cases these folks have not been taught to follow any kind of logical naming convention or to break their code into modular, reproducible chunks. This is a cultural thing, but you can start by developing a standard naming convention for variables. It gets everyone trying to standardize the feel of the code, and it slows them down when writing. We haven't completely eliminated every "data_final_2b" variable, but it has helped.

For other best practices, you can see some of my other posts. Most of them are about the IT side of setting up and administering this type of environment, but others cover how to treat and develop different types of Shiny apps. You may find how we handle publishing different versions of an app to RStudio Connect useful.

jpeck1989 · November 12, 2018, 3:39pm

Thank you so much for this!

The CI/CD is the most important at the moment. We are currently deploying with rsconnect whilst also pushing to Git, which has already led to some problems, as Local, Remote and Shiny Apps can all potentially be entirely different.

I'm interested in how you remotely store authorisation keys/files/tokens in Git? Do they reside on another server somewhere, that your apps remotely download?

wolfpack · November 12, 2018, 3:50pm

We run our apps as a service account, which allows us to authenticate to systems using LDAP/AD. If that isn't feasible, Connect should allow you to store environment variables. You could also store credentials on a mounted file share for your production server. We have identical mirrored file shares for Dev/Prod, so an app pushed to Connect sees the same file structure in both.
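
A minimal sketch of the environment variable approach, so that no secret ever has to be committed to Git. The helper `get_secret` and the variable name `DEMO_API_TOKEN` are hypothetical; the real names depend on what you configure on the server.

```r
# Read a required credential from an environment variable and fail loudly
# if it is missing, rather than continuing with an empty value.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set", call. = FALSE)
  }
  value
}

# Demo only: in practice the variable is set on the server (or in a local
# .Renviron file), never in code that is pushed to the repository.
Sys.setenv(DEMO_API_TOKEN = "not-a-real-token")
token <- get_secret("DEMO_API_TOKEN")
```

For local development, the same variables can live in a `.Renviron` file in the project directory, which R reads at startup. Listing that file in `.gitignore` keeps it out of the repository.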

Ultimately this comes down to how secure is "secure enough". If external customers are looking at your dashboards, the security risk is much greater than with internal customers. If you work at a large organization, you may want to discuss this with your IT stakeholders.