Hello community,
I have a bit of an architectural question on rmarkdown, R and quarto. More like several questions. Sorry in advance for the length of the entry.
Context
I work on an open source project (cornflow) that handles the asynchronous execution of optimisation tasks, among other things. It stores a problem's input data, results, etc. using formal (but somewhat abstract) definitions modelled as classes: Instance, Solution, etc. These are customized for each optimization problem we build.
Each optimization problem we handle can be very different (e.g., vehicle routing problems, a sudoku, graph-coloring, task-scheduling, maximum flow in a network, etc.). But by structuring the input data and the results (via a jsonschema), we build pluggable functionality that shares a common interface. Some examples of functionalities: solution methods (i.e., engines / solvers that generate a solution), case storage and comparison, checks and validations, solve scheduling and queuing, a REST API, unit tests, user permissions, etc. Now I want to tackle a functionality that has so far eluded us: the user interface.
What I want
I want to have a catalogue of automated templates that take as input a "solved instance" (e.g., in json format) and produce a standard & pretty report ready to be consumed by a user. Each problem can have more than one report. And I want a user to ideally be able to ask for the report via the REST API that we already have.
More details
I'm a big fan of parametrized rmarkdown and I've used it in the past to successfully share the results of complex optimization problems with colleagues. I want to build pluggable automated reports that understand the input data and solution structure of a given problem and generate a pretty and powerful self-contained document (html, pdf, etc.). I imagine a REST API endpoint where the client asks "please, generate the report of solved case with id=1543" and the REST API returns the compiled document somehow.
Everything we have server-side is currently built in python.
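To make the idea concrete, here is a minimal sketch of how our python code could call a parametrized rmarkdown template through the command line. It is only an illustration under assumptions: the parameter name `data_path` is hypothetical (the template would need to declare it under `params:` in its YAML header), the function names are made up, and it assumes `Rscript` and the rmarkdown package are installed on the server.

```python
import subprocess
from pathlib import Path

# R expression executed by Rscript. The three trailing command-line
# arguments are read with commandArgs(), which avoids quoting file
# paths inside the R code itself.
# ASSUMPTION: the template declares a `data_path` parameter in its YAML.
R_EXPR = (
    "args <- commandArgs(trailingOnly = TRUE); "
    "rmarkdown::render(input = args[1], "
    "params = list(data_path = args[2]), "
    "output_file = args[3], quiet = TRUE)"
)


def build_render_command(template, data_path, output_file):
    """Return the argument list that renders `template` with Rscript."""
    return ["Rscript", "-e", R_EXPR,
            str(template), str(data_path), str(output_file)]


def render_report(template, data_path, output_file, timeout=300):
    """Render the report; raises if R fails or takes longer than `timeout`."""
    cmd = build_render_command(template, data_path, output_file)
    subprocess.run(cmd, check=True, timeout=timeout)
    return Path(output_file)
```

A call would look like `render_report("vrp/reports/report1.Rmd", "case_1543.json", "/tmp/case_1543.html")`, where the json file holds the solved instance. Keeping the command construction separate from the execution makes it easy to unit-test without R installed.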
Example implementation
Taken from this tree. tsp is a problem, vrp is another problem.
I've added a vrp/reports directory below, where I envision the Rmarkdown templates living. These reports assume we have a data structure compliant with schemas/input.json and schemas/output.json, or a vrp.core.experiment.Experiment python object if it's done with python. Both are equivalent.
Likewise, we would have a tsp/reports directory somewhere inside tsp with the reports for the tsp problem, compliant with its schemas (tsp/schemas/input.json, ...).
├── tsp
│   ├── (...)
└── vrp
    ├── core
    │   ├── experiment.py
    │   ├── instance.py
    │   └── solution.py
    ├── reports
    │   ├── report1.Rmd
    │   └── report2.Rmd
    ├── data
    │   ├── input_test_1.json
    │   ├── input_test_1_small.json
    │   ├── input_test_2.json
    │   └── output_test_1.json
    ├── README.rst
    ├── schemas
    │   ├── input.json
    │   └── output.json
    └── solvers
        ├── modelClosestNeighbor.py
        ├── modelMIP.py
        ├── model_ortools.py
        └── model.py
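With this layout, the catalogue of reports for a problem could simply be discovered from the file system. The following is a rough sketch of that idea, not something we have implemented: `list_reports` and the `apps_root` argument are hypothetical names, and the demo at the end builds a throwaway directory only to show the expected behaviour.

```python
import tempfile
from pathlib import Path


def list_reports(apps_root, problem):
    """Map report names to template paths for one problem.

    Looks for *.Rmd files in <apps_root>/<problem>/reports and returns
    an empty dict when the problem has no reports directory.
    """
    reports_dir = Path(apps_root) / problem / "reports"
    if not reports_dir.is_dir():
        return {}
    return {path.stem: path for path in sorted(reports_dir.glob("*.Rmd"))}


# Demo on a temporary copy of the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "vrp" / "reports"
    demo_dir.mkdir(parents=True)
    for name in ("report2.Rmd", "report1.Rmd"):
        (demo_dir / name).write_text("---\ntitle: demo\n---\n")
    vrp_reports = sorted(list_reports(tmp, "vrp"))
    tsp_reports = list_reports(tmp, "tsp")
```

The REST API could then expose the keys of that mapping as the available reports for each problem.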
Some questions
- Should we aim at using rmarkdown, knowing that it would add an R dependency on the server-side? Or should we go with Quarto + python? We can always run an R function from python (as command line or reticulate).
- In case we go with R, is it possible to replicate in R the well-structured codebase we have in python, to help in the production of the rmarkdown files? In python we have modules, classes, type hints, etc. (see vrp/core/experiment.py above). In R I've always ended up creating several scripts, each one with several stateless functions. It works, but it has always felt a bit dirty.
- If we go for Quarto + python, is the functionality available in python as good as with R? I'm in love with ggplot, leaflet, knitr, tidyverse, magrittr, and I'm not at all convinced about using pandas, matplotlib, etc. Maybe plotly?
- Is it better to offer a report "on-demand" via our REST API? Or is it better to generate the document, store it on the server and let the user download it? Some documents compile really fast; others may not.
- How far can we go with the automatic html report? How close can we get to having a static webpage with links and a menu? I've used a collapsible TOC in html, which already helps a lot with navigation. I've checked the bookdown package and it seems promising with the single html option. Are there examples of extremely rich 1-file reports to be sent via email/chat?
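On the "on-demand" versus "generate and store" question, one middle ground I can imagine is rendering on the first request and serving the stored file afterwards. Below is a minimal sketch of that idea, with assumptions: `get_or_render` is a hypothetical helper, the cache is a plain directory, and `render` stands for whatever function actually compiles the document (the demo uses a fake one so it runs without R or Quarto).

```python
import tempfile
from pathlib import Path
from typing import Callable


def get_or_render(case_id: int, report: str, cache_dir: Path,
                  render: Callable[[Path], None]) -> Path:
    """Return the stored report for a case, compiling it only if missing."""
    target = Path(cache_dir) / f"case_{case_id}" / f"{report}.html"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        render(target)  # slow step: only runs on the first request
    return target


# Demo with a stand-in renderer that records how often it is called.
calls = []


def fake_render(target: Path) -> None:
    calls.append(target)
    target.write_text("<html>report</html>")


with tempfile.TemporaryDirectory() as tmp:
    first = get_or_render(1543, "report1", Path(tmp), fake_render)
    second = get_or_render(1543, "report1", Path(tmp), fake_render)
    served_from_cache = first == second and len(calls) == 1
```

For slow reports, the first call could be handed to the same asynchronous queue we already use for solving, with the endpoint returning the file once it exists. This sketch ignores invalidation when a case is re-solved.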
Thanks!
Franco