R since 2008 — what's new?


I came across this 2008 paper co-written by one of the creators of R, Ross Ihaka. In it, the authors point to some problems with R and advocate moving to Lisp instead.

It has been a long time since this was written. My computer science is weak and my R history is weaker. For those who know: has R evolved over these ~14 years to address some of these concerns, or has it flourished in spite of them? Of course, some of the concerns had to do with expectations about how data analysis would change in the future, but the authors don't seem to have been wrong about the rise of big data, cloud computing, etc.

I'll mention some of the deficiencies of R raised in the article for those who don't want to read through:

  • Pass-by-value semantics in R have a huge memory/performance cost.
  • Vectorization comes with performance costs when scalar operations are needed, since you essentially must "un-vector" the object.
  • Use of C is often necessary for acceptable performance, which reduces the universe of R users who can write performant software and/or contribute to existing projects written in C.
  • Type declaration and checking don't exist in R, which can lead to non-robust software in production environments.
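
For anyone who wants to see what these points look like in practice, here is a rough sketch in base R. It is my own illustration, not code from the paper, and the function names (set_first, sum_loop, half) are made up for the example. It only shows the semantics being criticised, not how large the costs are.

```r
# 1. Pass-by-value: modifying an argument inside a function changes a copy,
#    so the caller's object is left untouched.
set_first <- function(v) {
  v[1] <- 0   # v is copied on modification; the caller's vector is not altered
  v
}
x <- c(1, 2, 3)
y <- set_first(x)

# 2. Vectorized vs. scalar code: both compute the same sum of squares,
#    but the loop works on one element at a time.
vals <- seq_len(1e5) / 1e5
sum_loop <- function(v) {
  total <- 0
  for (i in seq_along(v)) total <- total + v[i]^2
  total
}
loop_result <- sum_loop(vals)
vec_result  <- sum(vals^2)

# 3. No type declarations: any check has to be written by hand,
#    and it only fires when the function is actually called.
half <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1)
  n / 2
}
bad_call <- tryCatch(half("10"), error = function(e) "rejected at run time")
```

Wrapping sum_loop(vals) and sum(vals^2) in system.time() gives a feel for the gap between the two styles on your own machine.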

Would these still be fair criticisms today? Has R changed in some ways to address these things? Are more improvements related to these issues in development? Needless to say, most of these aforementioned criticisms come with significant benefits that I didn't bother writing out here but are acknowledged in that paper.

I don't have sufficient knowledge of R's history or its development since 2008 to address the points properly. The criticisms may be fair, but the use of R has exploded since 2008, so they have hardly hindered its growth. I remember one of the authors (it may well have been Ross Ihaka) making similar points in a conference video a few years ago, advocating that a new language be developed to take over from R. As I never heard of this again, I doubt it went anywhere beyond his PhD student's work.

There have been numerous improvements to base R since 2008, and packages like data.table work around the pass-by-value limitations and provide massive performance improvements. Plenty of packages are written in C (or use C++ via Rcpp), so maybe this isn't the restriction it was previously envisaged to be.
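
As a small illustration of the data.table point, here is a sketch that assumes the data.table package is installed. The helper names (add_double, add_double_df) are hypothetical. It contrasts data.table's := operator, which updates a table by reference, with the usual behaviour of a data.frame.

```r
library(data.table)

dt <- data.table(id = 1:3, value = c(10, 20, 30))

# := adds the column in place, so the caller's table changes
# even though nothing is assigned back.
add_double <- function(d) {
  d[, doubled := value * 2]
  invisible(d)
}
add_double(dt)

# The same idea with a plain data.frame modifies a local copy only.
df <- data.frame(id = 1:3, value = c(10, 20, 30))
add_double_df <- function(d) {
  d$doubled <- d$value * 2
  d
}
invisible(add_double_df(df))  # result discarded, so df is unchanged
```

After running this, dt has a doubled column and df does not, which is the by-reference behaviour that avoids copying large tables.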
