Response takes much longer than endpoint function to return result, serializer issue?

I have created an endpoint with plumber (plumber_1.1.0) that generates and returns a data.frame of numeric values (35,000 rows by 400 columns as built by the code below), which corresponds to a tab-delimited file of 243MB. When I query this endpoint, e.g. on the command line with wget or curl, it takes around 7 seconds until the download starts. However, the endpoint function itself finishes after around 1 second, as measured with system.time. I am wondering where the other 6 seconds come from. If they can be attributed to the serialization to TSV, that seems like a rather inefficient implementation.

plumber.R

#* @get /test
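function(res) {
  # Serialize the return value as tab-separated values
  res$serializer = serializer_tsv()

  # Time only the data generation
  time = system.time({
    result = data.frame(replicate(400, runif(35000, min=0, max=100)))
  })
  print(time)

  # Return the data frame so plumber serializes it
  result
}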

run-plumber.R

#!/usr/bin/env Rscript

library(plumber)
pr("plumber.R") %>% pr_run(port=4000, host="0.0.0.0")

You can test the serialization time by itself.

The implementation uses readr:

system.time({readr::format_tsv(result)})

You can swap in your own implementation with something like:

serializer_tsv <- function(..., type = "text/tab-separated-values; charset=UTF-8") {
  if (!requireNamespace("readr", quietly = TRUE)) {
    stop("`readr` must be installed for `serializer_tsv` to work")
  }

  serializer_content_type(type, function(val) {
    readr::format_tsv(val, ...)
  })
}

register_serializer("tsv",  serializer_tsv)
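
As a hedged illustration of what such a replacement could look like, here is a sketch that formats the frame with base R's write.table instead of readr. `format_tsv_base`, `serializer_tsv_base`, and the serializer name "tsv_base" are made-up names for this example, and nothing here implies it is faster; that would need measuring with system.time as above.

```r
# Hypothetical formatter: base R only, returns the TSV body as one string.
format_tsv_base <- function(val) {
  tmp <- tempfile(fileext = ".tsv")
  on.exit(unlink(tmp))
  utils::write.table(val, tmp, sep = "\t", quote = FALSE, row.names = FALSE)
  readChar(tmp, file.size(tmp), useBytes = TRUE)
}

# Register under a new, made-up name so the built-in "tsv" stays untouched.
# Skipped entirely when plumber is not installed.
if (requireNamespace("plumber", quietly = TRUE)) {
  serializer_tsv_base <- function(type = "text/tab-separated-values; charset=UTF-8") {
    plumber::serializer_content_type(type, function(val) format_tsv_base(val))
  }
  plumber::register_serializer("tsv_base", serializer_tsv_base)
}

# Quick check of the formatter on a tiny frame.
out <- format_tsv_base(data.frame(a = 1:2, b = c(0.5, 1.5)))
cat(out)
```

An endpoint should then be able to select it by name, for example with a `#* @serializer tsv_base` annotation.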

Seems like 7 seconds makes sense. I've tested a few other packages and nothing was significantly faster so far.

result = data.frame(replicate(400, runif(35000, min=0, max=100)))
system.time(a <- readr::format_tsv(result))
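
To see where the time goes, the stages can also be timed one by one on a smaller frame. The following is a minimal sketch, not plumber's internals: `time_stage` and `to_tsv` are hypothetical helpers, the base-R write.table fallback is only a stand-in for when readr is not installed, and the final charToRaw step is an assumed proxy for the work of turning the string into bytes for the response.

```r
# Hypothetical helper: run an expression, report elapsed seconds, return its value.
time_stage <- function(label, expr) {
  elapsed <- system.time(value <- expr)[["elapsed"]]
  message(sprintf("%-10s %.3f s", label, elapsed))
  value
}

# Hypothetical formatter: readr if installed, otherwise a base-R stand-in.
to_tsv <- function(df) {
  if (requireNamespace("readr", quietly = TRUE)) {
    return(readr::format_tsv(df))
  }
  lines <- utils::capture.output(
    utils::write.table(df, sep = "\t", quote = FALSE, row.names = FALSE)
  )
  paste(lines, collapse = "\n")
}

# Ten times fewer rows and columns than the original example, to keep it quick.
result <- time_stage("generate", data.frame(replicate(40, runif(3500, min = 0, max = 100))))
body   <- time_stage("serialize", to_tsv(result))
bytes  <- time_stage("to bytes", charToRaw(body))

message(sprintf("payload: %.1f MB", length(bytes) / 1e6))
```

Scaling the dimensions back up to 400 and 35000 reproduces the original payload; any gap between the sum of these stages and the delay seen by wget or curl would then be time spent outside the handler and formatter.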

Thank you @meztez. At least now it is clear what the main issue is. However, plumber seems to add some overhead, and I am not sure where it is coming from. On my server readr::format_tsv takes around 4.3 seconds, which, together with the roughly 1 second of data generation, does not add up to the 7 seconds in the above example. I have seen the same in other examples.
