Questions about R in production

I am not so sure about this (though I should say I haven't tried it yet). It seems to me that if you have an R API (e.g. using opencpu), you should be able to scale it horizontally as much as you need by throwing more hardware at it. Granted, at that point one might ask why you wouldn't reimplement your model/computations in something more performant (say, C#, Java, Go or whatever). That being said, you could probably make similar observations about Python or any other interpreted language, for that matter.

Well, these are quite different things that you are lumping together. For instance, glue offers much, much more functionality than a simple paste. Similarly, the purrr family of functions offers much more functionality than the basic lapply. I was also very skeptical of purrr at the beginning because it seemed to reinvent the wheel, but boy, was I wrong. purrr is an amazing package with a lot of added functionality.
I haven't used plumber myself, but it seems to follow a different, more "low threshold" philosophy for turning R functions into callable APIs. That being said, I agree that opencpu is a great tool which would benefit from more RStudio support; I think if more people worked on improving it (other than Jeroen Ooms), we would have a serious R contender to Python's Flask framework.
But on glue and purrr you're dead wrong, I'm afraid.
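
To make the glue and purrr point a bit more concrete, here is a minimal sketch, assuming both packages are installed. The variable names (`name`, `n`, `inputs`, `safe_sqrt`) are made up for illustration, and this only scratches the surface of what either package does.

```r
# Assumes the glue and purrr packages are installed.
library(glue)
library(purrr)

name <- "Ada"  # hypothetical example values
n <- 3

# Base R: stitch the pieces together by hand.
base_msg <- paste0("Hello ", name, ", you have ", n + 1, " messages")

# glue: arbitrary R expressions are evaluated inside the braces.
glue_msg <- glue("Hello {name}, you have {n + 1} messages")

inputs <- list(4, "oops", 9)

# Base R: lapply() always returns a list, and error handling
# has to be written out with tryCatch().
base_res <- lapply(inputs, function(x) {
  tryCatch(sqrt(x), error = function(e) NA_real_)
})

# purrr: possibly() wraps a function with a fallback value, and
# map_dbl() returns a double vector or fails if it cannot.
safe_sqrt <- possibly(sqrt, otherwise = NA_real_)
purrr_res <- map_dbl(inputs, safe_sqrt)
```

Both versions give the same answers here; the difference is that the purrr version states the return type up front and keeps the error handling out of the loop body.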

I guess one thing that is not clear in your complaint is what you mean by "speed". Base R is not the fastest thing, granted, but dplyr and (even more so) data.table are as fast as anything you can find in e.g. Python. It is true that R has some limitations (single-threadedness) that make it awkward to use for e.g. a web API, but I don't think you can tackle these limitations, or the slowness of base R, at the level of packages; they seem like things that you can only really tackle at the lowest levels of R's implementation (I might be wrong on this, though).

Sure, but what's the problem with that? That's not necessarily a problem with R. You would have to do that with Python too if you had a website with a lot of concurrent visitors (I admit Python would handle more users/traffic than R, but only up to a point).

Yeah, but R was developed with totally different concerns from C. With the same reasoning you could say: "I use C like I use C# and Java, not like R and Excel, and I haven't seen any data analysis/logistic regression implemented in C other than with thousands of lines of code that would take weeks to write, only to discover that maybe Bayesian models are better, and let's restart this C circle of hell from scratch..."

The point is, use every tool for the job it was designed for. R wasn't designed for fast, scalable web APIs, in the same way that C wasn't designed for fast data and model exploration.
I do agree that R is behind Python when it comes to "clout as a production-ready language" and not only as an analysis language, but I think that's just because most developers are more familiar with Python than with R. In any case, RStudio is doing a lot to help R gain more "production clout"; just look at sparklyr for an example.

R.
