Hello everyone! I would appreciate any help that guides me in the right direction.
I'm currently working on a database project for managing base data and technical equipment for a large environmental monitoring network.
My agency is technically trapped in the early 2000s and flooded with hundreds of Excel spreadsheets no one can keep track of anymore, so I'm working on a solution to restructure and reorganise all of our important data. The goal is basically a database with some simple one-to-many relationships and a nice-to-use GUI with some small dashboarding and evaluation functions, accessible to anyone within the corporate network.
I made a working prototype in MS Access, but I really have to say I hate Access.
It's slow, it's laborious to build a working and good-looking GUI in, and it massively lacks functionality for exporting data and integrating external sources. I don't feel like diving into VBA either. Since I'm pushing my career path towards data anyway and work with R in side projects, I had the idea to build the whole database/GUI as a Shiny application, since R is far more flexible and modern than Access and I love how easy it is to build good-looking frontends with Shiny.
But before I deep-dive into Shiny development, I need some questions answered to see whether it even makes sense to realise this task in Shiny.
The requirements are as follows:
Integration of external data that is only accessible via a REST API (GET requests) within the corporate network (easy with httr)
Multiuser capabilities
GUI to enter, store, change and pull structured, relational data
Self-hosted within the network
Self-contained database backend, since the network and IT are heavily restricted (SQLite should work?)
Merge external, queried data (not editable) with self-stored data (editable) in one frontend
Easy export of data in multiple formats
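For the REST API requirement, pulling data with httr really is only a few lines. A minimal sketch (the endpoint URL and token below are placeholders, not a real API):

```r
library(httr)
library(jsonlite)

# Hypothetical internal endpoint; replace with the real one
resp <- GET(
  "http://internal-api.example/stations",
  add_headers(Authorization = "Bearer <token>")  # only if the API needs auth
)
stop_for_status(resp)  # fail loudly on HTTP errors

# Parse the JSON body into a data frame
stations_ext <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
```

The resulting data frame can then be joined with your own stored tables before rendering, which covers the "merge external with self-stored data" point.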
A common dataset I need to store looks like this:
Station with some parameters (Station number, name, long/lat coordinates,...)
Station can have multiple problems we need to track (1 to many)
Station can have some technical equipment like dataloggers, modems... (1 to many)
Each datalogger can have multiple sensors and a sensor can have multiple coordinates (both 1 to many)
The whole database design is finished and working in Access.
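The one-to-many structure described above maps directly onto SQLite foreign keys. A minimal sketch with DBI/RSQLite, using illustrative table and column names (not taken from the actual Access design):

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "monitoring.sqlite")
dbExecute(con, "PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if enabled per connection

dbExecute(con, "
  CREATE TABLE IF NOT EXISTS station (
    station_id INTEGER PRIMARY KEY,
    station_no TEXT NOT NULL UNIQUE,
    name       TEXT NOT NULL,
    lon        REAL,
    lat        REAL
  )")

# One-to-many: each problem row points back to its station
dbExecute(con, "
  CREATE TABLE IF NOT EXISTS problem (
    problem_id  INTEGER PRIMARY KEY,
    station_id  INTEGER NOT NULL REFERENCES station(station_id),
    description TEXT,
    reported_on TEXT
  )")

dbDisconnect(con)
```

The datalogger and sensor tables would follow the same pattern, each carrying a foreign key to its parent.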
I've read a lot, but some things are still unclear to me.
Questions:
Can I somehow query a local server if Shiny is hosted outside of the network but used within it?
Is Shiny (with dplyr, DBI, pool) capable of managing relational data with multiple users at once, just like an Access database would?
How much SQL do I need to know? Can I use dplyr instead to retrieve, store and change data?
Is it possible to have a kind of split-view GUI like in Access, where I have a data table view of all stations at the top and details of the selected station at the bottom?
Does all of this make sense, and would you say Shiny is a far better tool for this job than Access?
I would really appreciate any opinion from you guys & girls!
Cheers
Yes, as long as your local server is accessible from outside the network, or at least from the location the app is deployed in.
Keep in mind that this functionality is not specific to Shiny, or even implemented by it, but rather by the database engine you use.
I think you are a little confused about the role R (Shiny, dplyr, etc.) is going to play in your application. The database engine is independent of the application, and access management is handled by the database engine, so this is not related to Shiny.
You can use dplyr-like syntax that gets translated to SQL commands in the background by using dbplyr, but it has limitations, and using SQL directly would be more efficient.
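To make the comparison concrete, here is the same read done both ways, assuming a "station" table like the one the OP describes (names are illustrative):

```r
library(DBI)
library(RSQLite)
library(dplyr)
library(dbplyr)

con <- dbConnect(RSQLite::SQLite(), "monitoring.sqlite")

# dbplyr: builds a lazy query, translated to SQL only when collect() runs
northern <- tbl(con, "station") %>%
  filter(lat > 48) %>%
  select(station_no, name) %>%
  collect()

# Equivalent raw SQL via DBI, with a bound parameter
northern2 <- dbGetQuery(
  con,
  "SELECT station_no, name FROM station WHERE lat > ?",
  params = list(48)
)

dbDisconnect(con)
```

One caveat worth knowing: dbplyr is read-oriented, so inserts and updates from the GUI would still go through DBI (dbExecute(), dbAppendTable()) rather than dplyr verbs.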
Yes, as long as you implement it, but there is no "one-click" solution for that.
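To show what "implementing it" roughly looks like: the DT package exposes the selected row of a rendered table, which gives you the Access-style master/detail split. A self-contained sketch with made-up sample data:

```r
library(shiny)
library(DT)

# Dummy data standing in for the station table
stations <- data.frame(
  station_no = c("A01", "A02"),
  name       = c("River North", "River South"),
  lon = c(13.4, 13.5),
  lat = c(52.5, 52.4)
)

ui <- fluidPage(
  DTOutput("station_table"),   # master view on top
  hr(),
  tableOutput("station_detail") # detail view below
)

server <- function(input, output, session) {
  output$station_table <- renderDT(
    stations, selection = "single", rownames = FALSE
  )
  # DT exposes the selected row index as input$<id>_rows_selected
  output$station_detail <- renderTable({
    req(input$station_table_rows_selected)
    stations[input$station_table_rows_selected, ]
  })
}

shinyApp(ui, server)
```

In a real app the detail pane would typically query the child tables (problems, equipment) for the selected station instead of echoing the same row.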
Yes, it makes a lot of sense. I use Shiny as a GUI to interact with databases very extensively, but I don't think you can compare Shiny with Access; they are two completely different pieces of software.
Wow, I love this question, and while I don't have the answers for all of it, I hope I can weigh in with something helpful.
As for your questions about whether this is do-able in Shiny, I feel pretty confident the answer is yes, it just depends on how much time and effort you're willing to put in, and how soon it needs to be done/how much Shiny experience you already have. Hopefully others are able to weigh in on specific aspects and the lift required.
However, I'm most curious about your hosting plans. You mentioned that something needs to be hosted within the network, but then asked if Shiny apps hosted outside the network could access things on the network. So where would you host your Shiny apps? Common hosting platforms include:
shinyapps.io (not within your network, some fees I think)
on-premise Shiny Server (this can be done for free, but then you/your org need to learn how to set it up, without professional support)
on-premise RStudio Connect (this is a paying option, but it comes with lots more features than a bare Shiny Server, and you get professional support from RStudio)
I know both server options can handle authentication for multiple users; I'm not sure if apps on shinyapps.io can do so as well.
Not OP but thanks for the reply. Do you have any resources where I can read about scaling database access through R applications?
I also work extensively with Shiny and relational DBs in my daily job, mainly MySQL and sometimes Postgres. I usually don't have any issues with reading/writing to and from the DB using a Shiny app, as at most 1 or 2 users are using the application at once. However, I need to scale the app soon to be available to dozens of users, who will potentially be interacting with the DB simultaneously. Is there any documentation on best practices from an R/Shiny perspective?
Not that I'm aware of. You can read about "production"-grade Shiny deployments using golem and ShinyProxy (which manage the GUI side of things), and about scaling up the database backend to handle the load with things like replication, parallel query processing and database engine fine-tuning (not a trivial task), but I have never heard of resources that address the specific combination of both.
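On the R side, the usual first step for multiuser apps is the pool package: one shared connection pool for all sessions instead of one connection per user. A sketch (connection details are placeholders):

```r
# Typically placed in global.R / app startup, so the pool is shared by all sessions
library(shiny)
library(pool)
library(RPostgres)

pool <- dbPool(
  RPostgres::Postgres(),
  dbname   = "monitoring",
  host     = "db.internal",            # hypothetical host
  user     = Sys.getenv("DB_USER"),
  password = Sys.getenv("DB_PASS")
)

# pool objects support the DBI verbs directly; a connection is checked out
# for the query and returned to the pool automatically
stations <- dbGetQuery(pool, "SELECT * FROM station")

# Release all connections when the app shuts down
onStop(function() poolClose(pool))
```

That handles connection management on the app side; concurrent-write safety and locking still live in the database engine, as noted above.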
I really like using PostgreSQL because it allows extremely fine-grained tuning and personalisation, and you can even extend its functionality as needed with extensions. You can get a broad overview of all its capabilities by reading this book.