I'm new to plumber and would really appreciate your advice on the following: We want to use plumber and RStudio Connect to publish a predictive model through a REST API. For every request made, the REST service needs to log the input values and the prediction. What would be the best way to do that? Should we create a database connection and write a new record to a table for each request, or could we use a filter-based logger?
I would suggest monitoring the R process itself; you can then log plumber input by writing to the R console.
This is what we use internally (we deploy using Docker with Kubernetes). Logs are redirected to Stackdriver on GCP.
Dockerfile
FROM rstudio/r-base:4.1.0-focal
WORKDIR /src
RUN apt-get update && apt-get install -y --no-install-recommends \
    gdal-bin \
    git \
    libcurl4-openssl-dev \
    libgdal-dev \
    libgeos-dev \
    libicu-dev \
    libproj-dev \
    libsodium-dev \
    libssl-dev \
    libudunits2-dev \
    make \
    zlib1g-dev
RUN echo 'options(repos = c(REPO_NAME = "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"))' >> ~/.Rprofile
RUN Rscript -e "install.packages(c('curl','data.table','generics','globals','googleCloudStorageR','parsnip','remotes','sf','tidyr','yaml','future','hms','plumber', 'xml2'))"
COPY ./api/startup.R /etc
COPY ./api/plumber.R .
# Compile xgboost without OpenMP support to avoid nthreads problems
RUN apt-get -qq -y install cmake
RUN git clone --recursive https://github.com/dmlc/xgboost
WORKDIR /src/xgboost
RUN git submodule init
RUN git submodule update
RUN mkdir build
WORKDIR /src/xgboost/build
RUN cmake .. -DR_LIB=ON -DUSE_OPENMP=OFF
RUN make -j$(nproc)
RUN make install
WORKDIR /src
EXPOSE 8004
ENTRYPOINT ["R", "-f", "/etc/startup.R", "--slave"]
api/plumber.R
#* Health check
#* @get /
#* @serializer unboxedJSON
function() {
  list(status = "OK")
}
api/startup.R (logging happens here)
library(plumber)
pr <- plumb()
# Log the raw body of incoming POST requests (debug mode only).
postroute <- function(req) {
  if (req$REQUEST_METHOD == "POST") {
    cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - REQUEST - ", req$postBody, "\n", sep = "")
  }
}
# Log only the response status of POST requests.
postserializewithoutpayload <- function(req, res) {
  if (req$REQUEST_METHOD == "POST") {
    cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - RESPONSE - ", res$status, "\n", sep = "")
  }
}
# Log the response status and the serialized body of POST requests.
postserializewithpayload <- function(req, res) {
  if (req$REQUEST_METHOD == "POST") {
    cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - RESPONSE - ", res$status, " - BODY - ", res$body, "\n", sep = "")
  }
}
hooklist <- list(postserialize = postserializewithoutpayload)
debughooklist <- list(postserialize = postserializewithpayload, postroute = postroute)
# Payloads are logged only when the DBG_ENABLE environment variable is "TRUE".
if (identical(Sys.getenv("DBG_ENABLE", "FALSE"), "TRUE")) {
  pr$setDebug(TRUE)
  pr$registerHooks(debughooklist)
} else {
  pr$registerHooks(hooklist)
}
pr$run(host = "0.0.0.0", port = 8004)
There are of course other ways to achieve a similar result. Let me know if that gives you enough to get started.
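One of those other ways is the filter-based logger raised in the question. Below is a minimal sketch in base R, assuming a POST endpoint at /predict; the names `format_request` and `log_filter` are hypothetical and the request is a mock list rather than a real plumber object. Keep in mind that a filter runs before the endpoint, so it can see the request but not the prediction, which is why the hooks above are still needed for the response.

```r
# Build one log line from the fields of a plumber request object.
format_request <- function(req) {
  paste0("[", req$REQUEST_METHOD, req$PATH_INFO, "] - REQUEST - ", req$postBody)
}

# Hypothetical filter: write the line to the console, then pass the
# request on to the next filter or endpoint.
log_filter <- function(req, res) {
  cat(format_request(req), "\n", sep = "")
  plumber::forward()
}

# Quick check with a mock request (a plain list standing in for `req`).
mock_req <- list(
  REQUEST_METHOD = "POST",
  PATH_INFO = "/predict",
  postBody = '{"width":3.5}'  # assumed example payload
)
format_request(mock_req)
```

Such a filter could be attached with `pr_filter("logger", log_filter)` in the router pipeline, or declared in plumber.R with a `#* @filter logger` annotation.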
Thank you @meztez! Your reply was really helpful!
I've created two files, namely plumber.R
library(plumber)
data <- iris[, c("Sepal.Length", "Sepal.Width")]
names(data) <- c("length", "width")
model <- lm(length ~ width, data)
#* Predict sepal length
#* @param width
#* @get /predict
#* @serializer unboxedJSON
function(width) {
  new_obs <- tibble::tibble(width = as.numeric(width))
  predict(model, new_obs)
}
and entrypoint.R
library(plumber)
pr(file = "plumber.R") %>%
  pr_hook("postroute", function(req) {
    if (paste0(req$REQUEST_METHOD, req$PATH_INFO) == "GET/predict") {
      cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - REQUEST - ", req$postBody, "\n", sep = "")
    }
  }) %>%
  pr_hook("postserialize", function(req, res) {
    if (paste0(req$REQUEST_METHOD, req$PATH_INFO) == "GET/predict") {
      cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - RESPONSE - ", res$status, " - BODY - ", res$body, "\n", sep = "")
    }
  }) %>%
  pr_run()
The postserialize hook is working fine, but the postroute hook isn't. The value of the input variable width isn't included in the output:
[GET/predict] - REQUEST -
Could you please point out to me what I'm doing wrong?
There is no postBody with a GET request, since all parameters are passed after the ? in the URL. You might want to log something else, such as req$args or req$QUERY_STRING. See Routing & Input • plumber.
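To make that concrete, here is a small sketch of a postroute-style hook that logs the query string for GET requests and the body otherwise. The helper names `request_input` and `log_request` are hypothetical, and the request is a mock list with an assumed query string, not a real plumber object.

```r
# Pick the part of the request that actually carries the inputs:
# the query string for GET, the raw body for anything else.
request_input <- function(req) {
  if (identical(req$REQUEST_METHOD, "GET")) req$QUERY_STRING else req$postBody
}

# postroute-style hook that logs whichever field holds the inputs.
log_request <- function(req) {
  cat("[", req$REQUEST_METHOD, req$PATH_INFO, "] - REQUEST - ",
      request_input(req), "\n", sep = "")
}

# Mock request: a plain list standing in for plumber's `req` object.
mock_get <- list(
  REQUEST_METHOD = "GET",
  PATH_INFO = "/predict",
  QUERY_STRING = "?width=3.5",  # assumed example value
  postBody = ""
)
log_request(mock_get)
# prints: [GET/predict] - REQUEST - ?width=3.5
```

In the entrypoint.R above, `log_request` would take the place of the anonymous function passed to `pr_hook("postroute", ...)`.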
Thank you so much! Now it works, thanks to your suggestions and this example: Plumber Logging · R Views
plumber.R
library(plumber)
data <- iris[, c("Sepal.Length", "Sepal.Width")]
names(data) <- c("length", "width")
model <- lm(length ~ width, data)
#* Predict sepal length
#* @param width
#* @get /predict
#* @serializer unboxedJSON
function(width) {
  new_obs <- tibble::tibble(width = as.numeric(width))
  predict(model, new_obs)
}
entrypoint.R
library(plumber)
library(logger)
log_dir <- "logs"
if (!fs::dir_exists(log_dir)) fs::dir_create(log_dir)
log_appender(appender_tee(tempfile("plumber_", log_dir, ".log")))
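# Replace a missing or empty value with "-" so the log line stays well formed.
# The NULL check covers requests that omit the width parameter.
convert_empty <- function(string) {
  if (is.null(string) || identical(string, "")) {
    "-"
  } else {
    string
  }
}
pr(file = "plumber.R") %>%
  pr_hook("postserialize", function(req, res) {
    if (paste0(req$REQUEST_METHOD, req$PATH_INFO) == "GET/predict") {
      log_info('width = {convert_empty(req$argsQuery$width)} prediction = {convert_empty(res$body)}')
    }
  }) %>%
  pr_run()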
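If one record per request is needed, as in the original question, the same hook could call a small writer instead of (or in addition to) `log_info()`. This is only a sketch, assuming a flat CSV file is an acceptable stand-in for a database table; `log_record` is a hypothetical helper and the values passed to it are placeholders, not real model output.

```r
# Append one request/prediction record to a CSV file, writing the header
# only when the file does not exist yet.
log_record <- function(path, width, prediction) {
  record <- data.frame(
    time = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"),
    width = width,
    prediction = prediction,
    stringsAsFactors = FALSE
  )
  new_file <- !file.exists(path)
  utils::write.table(
    record, path,
    sep = ",", row.names = FALSE, qmethod = "double",
    col.names = new_file, append = !new_file
  )
  invisible(record)
}

# Usage with placeholder values and a temporary file.
log_path <- tempfile("predictions_", fileext = ".csv")
log_record(log_path, "3.5", "[5.7]")
log_record(log_path, "2.9", "[5.9]")
records <- read.csv(log_path, stringsAsFactors = FALSE)
nrow(records)
```

Inside the postserialize hook, the call would look like `log_record(path, convert_empty(req$argsQuery$width), convert_empty(res$body))`, with `path` pointing at a location the deployed process is allowed to write to.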