I am building a clinical trial risk monitoring application and would love your guidance on the best way to architect and host it. We have the components built but need advice on how to put everything together for production.
What the Application Does
RiskView is an RBQM (Risk-Based Quality Management) dashboard for clinical trials. It allows risk monitors to:
- View Key Risk Indicator (KRI) dashboards with color-coded risk levels across clinical trial sites
- Drill into site-level risk scores, trend charts, and patient-level data
- Manage Quality Tolerance Limits (QTLs) — create, edit, archive thresholds
- Mute/unmute KRI alerts with documented reasons
- Link KRI flags to external issue trackers (Jira, etc.)
All write actions require a field-level audit trail for regulatory compliance.
The app will support 20-30 clinical studies, each with 50-200 sites and 15-50 KRIs. We're targeting 100-200 concurrent users — primarily clinical operations and data management teams.
What We Have
The R Analytics Engine (computation-heavy)
The core of the application is a set of 12 R analytics files (~3,000 lines) that perform statistical computation:
- KRI computation using survival models (survival), logistic regression (logistf), GAMs (mgcv), robust statistics (robustbase, psych, Hmisc), and distribution fitting (fitdistrplus)
- Risk color coding (4-pathway classification logic)
- Site Risk Index (SRI) scoring with weighted aggregation
- System KRIs (15 built-in), custom KRIs, and percentage-based KRIs
These computations run per study and process all sites/KRIs for that study. For a large study (200 sites, 50 KRIs), a full computation run can take several minutes. This needs to run on a scheduled basis (nightly or on-demand), not
interactively.
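To make the SRI step concrete, here is a toy sketch of the weighted roll-up we mean. The site IDs, KRI names, scores, and weights are all invented for illustration; the real computation pulls them from the gold tables and runs over every site and KRI in a study, which is why a 200-site run takes minutes rather than seconds.

```r
library(dplyr)
library(tibble)

# Toy illustration of the Site Risk Index roll-up: per-site KRI risk scores are
# combined into a single SRI using per-KRI weights. All values are invented.
kri_scores <- tribble(
  ~site_id, ~kri_id,         ~risk_score, ~weight,
  "S001",   "ae_rate",        2,           0.5,
  "S001",   "query_rate",     1,           0.3,
  "S001",   "missed_visits",  0,           0.2,
  "S002",   "ae_rate",        3,           0.5,
  "S002",   "query_rate",     2,           0.3,
  "S002",   "missed_visits",  1,           0.2
)

sri_by_site <- kri_scores |>
  group_by(site_id) |>
  summarise(sri = sum(risk_score * weight) / sum(weight), .groups = "drop")
```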
The Shiny Dashboard (display + writeback)
An R Shiny application (~7,600 lines, 47 R files) packaged as an R package. It uses:
- shiny, bslib, reactable, highcharter for the UI
- R6 for a central reactive data store
- dplyr, tidyr, tibble for data manipulation
- Modular architecture with table-driven page routing
The dashboard is read-heavy — it mostly displays pre-computed results. But it also needs write-back functionality for QTL management, KRI muting, and issue linking, with full audit trails.
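For context on how the dashboard is wired internally, here is a stripped-down version of the R6 reactive store pattern. The class name and fields are illustrative rather than our actual schema; roughly, one instance is created per session and handed to the page modules, which read from it reactively.

```r
library(R6)
library(shiny)

# Simplified sketch of the central reactive data store (illustrative field names).
RiskDataStore <- R6Class("RiskDataStore",
  public = list(
    rv = NULL,
    initialize = function() {
      self$rv <- reactiveValues(
        selected_study = NULL,
        kri_results    = NULL,   # pre-computed results read from the gold tables
        sri_scores     = NULL
      )
    },
    load_study = function(study_id, kri_results, sri_scores) {
      self$rv$selected_study <- study_id
      self$rv$kri_results    <- kri_results
      self$rv$sri_scores     <- sri_scores
    }
  )
)
```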
The Database
We have a Microsoft Fabric Lakehouse with Delta tables exposed via a SQL analytics endpoint (ODBC/TDS). The data is organized in a medallion architecture:
- Silver tables (5): raw clinical trial data (patients, visits, adverse events, queries, custom KRIs) — loaded by upstream ETL pipelines
- Gold tables (9): pre-computed analytics results (KRI values, risk colors, SRI scores, trending, errors, snapshots, job runs)
- Writeback tables (9): user actions (QTLs, muting, issue links, audit log)
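For concreteness, the way we connect to the SQL analytics endpoint from R today looks roughly like the following. The server, database name, and credentials are placeholders read from environment variables, and we authenticate with an Entra ID service principal rather than SQL logins.

```r
library(DBI)
library(odbc)

# Rough sketch of our current connection to the Fabric SQL analytics endpoint.
# Server, database, and credentials are placeholders read from the environment.
con <- dbConnect(
  odbc::odbc(),
  Driver         = "ODBC Driver 18 for SQL Server",
  Server         = Sys.getenv("FABRIC_SQL_ENDPOINT"),
  Database       = "riskview_lakehouse",              # placeholder lakehouse name
  Authentication = "ActiveDirectoryServicePrincipal", # Entra ID service principal
  UID            = Sys.getenv("FABRIC_CLIENT_ID"),
  PWD            = Sys.getenv("FABRIC_CLIENT_SECRET"),
  Encrypt        = "yes"
)
```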
Authentication
Microsoft Entra ID (Azure AD) — all users authenticate via corporate SSO. Write operations need user identity for the audit trail.
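What we have sketched so far on the Shiny side assumes Posit Connect exposes the authenticated username as session$user, and we stamp that onto every audit row. The input ID and column names below are placeholders, and the actual writeback call is omitted.

```r
library(shiny)

# Simplified sketch: capture the authenticated user for the audit trail.
# On Posit Connect, session$user holds the logged-in username; when the app
# runs locally it is NULL, so we fall back to a development placeholder.
server <- function(input, output, session) {
  current_user <- reactive({
    if (!is.null(session$user)) session$user else "local-dev"
  })

  observeEvent(input$save_qtl, {                       # placeholder input ID
    audit_entry <- data.frame(
      action     = "qtl_update",
      changed_by = current_user(),
      changed_at = format(Sys.time(), tz = "UTC", usetz = TRUE)
    )
    # append audit_entry and the QTL change to the writeback tables (omitted)
  })
}
```

What we're unsure about is whether this pattern still holds if the writes go through a separate Plumber API, which is part of the auth question below.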
What We Need Advice On
We have all the R code written — analytics, Shiny dashboard, and writeback logic. What we need help figuring out is the best way to host and connect everything. Specifically:
- Where should the analytics computation engine run? It needs full R 4.3.2 with heavy statistical packages (survival, logistf, mgcv, fitdistrplus, robustbase, psych, Hmisc, car, rms). It runs per-study on a schedule (not
interactively). Options we're considering include Azure Batch (Docker), Posit Connect scheduled reports, or something else entirely. What do you recommend?
- Where should the Shiny dashboard be hosted? We're assuming Posit Connect, but how should we configure it for 100-200 concurrent users on a read-heavy dashboard? What process model settings do you recommend?
- Where should the writeback/API layer live? The write operations (QTL CRUD, muting, issue linking) need authenticated endpoints with audit trails. Should this be a Plumber API on Posit Connect? Should it be part of the Shiny app? A separate service? How does authentication flow between the dashboard and the write layer?
- How should everything connect to the Fabric Lakehouse? Both the analytics engine and the dashboard need ODBC access to Fabric. What's the recommended approach for connection management (pooling, etc.) on Posit Connect with Microsoft Fabric's SQL endpoint? Are there known considerations with ODBC Driver 18 and Fabric? (The pooled-connection pattern we'd try by default is sketched after this list.)
- How should Entra ID authentication work across the components? Users authenticate via Entra ID. The dashboard needs to know who the user is, and write operations need to pass user identity for audit trails. What's the recommended
auth pattern for this on Posit Connect?
- What's the recommended CI/CD approach? We have GitHub Actions. How should we deploy the Shiny app, the API (if separate), and the analytics engine from one repo?
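For reference on the connection-management question above, this is the pooled pattern we would try by default on Posit Connect: one pool per app process, created at startup and closed on shutdown, with the same connection arguments as in the dbConnect() sketch earlier. We'd appreciate corrections if this isn't the right fit for Fabric's SQL endpoint.

```r
library(pool)
library(odbc)
library(shiny)

# Default pattern we would try on Posit Connect: one connection pool per app
# process, created at startup and closed when the app shuts down.
fabric_pool <- dbPool(
  odbc::odbc(),
  Driver         = "ODBC Driver 18 for SQL Server",
  Server         = Sys.getenv("FABRIC_SQL_ENDPOINT"),
  Database       = "riskview_lakehouse",              # placeholder lakehouse name
  Authentication = "ActiveDirectoryServicePrincipal",
  UID            = Sys.getenv("FABRIC_CLIENT_ID"),
  PWD            = Sys.getenv("FABRIC_CLIENT_SECRET"),
  Encrypt        = "yes"
)

onStop(function() poolClose(fabric_pool))   # shiny::onStop runs when the app stops
```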
What We're Hoping For
We'd love a recommended architecture for this type of application — where each component should live, how they should communicate, and any best practices for hosting a multi-user Shiny app with a Fabric backend and a writeback
layer. If there are reference architectures, documentation, or examples of similar deployments, that would be incredibly helpful.
Thank you!