Best Practices for Optimizing R Performance on HPC Servers

Hi everyone,

I'm currently working on a project that involves running R on an HPC server, and I'm looking for advice on optimizing performance. Are there any best practices or specific configurations that can help improve computational efficiency in this environment?

Additionally, are there any R packages or tools that are particularly useful for optimizing performance on HPC systems?

Any tips or experiences you can share would be greatly appreciated!

Thanks in advance!

Hi @leoarthur, welcome to the forum!

The majority of R code runs single-threaded, as that's the easiest and most portable way to program. Packages that use C++/Rcpp (like much of the tidyverse) will be faster, but they still run mostly single-threaded.
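If your workload is "apply the same function to many independent pieces", base R's `parallel` package is the simplest way to spread it across cores on a single node. A minimal sketch (the `slow_square` function is just a hypothetical stand-in for real per-element work):

```r
library(parallel)

slow_square <- function(x) {
  Sys.sleep(0.01)  # stand-in for a real, slow per-element computation
  x^2
}

# Leave one core free for the OS / scheduler daemons
n_cores <- max(1L, detectCores() - 1L)

# A PSOCK cluster works on any platform; on Linux-only HPC nodes,
# mclapply() is a lighter-weight fork-based alternative
cl <- makeCluster(n_cores)
res <- parLapply(cl, 1:100, slow_square)
stopCluster(cl)

head(unlist(res))
```

On a multi-node cluster you'd typically combine this with the scheduler (e.g. one R process per node, cores allocated by your job script) rather than calling `detectCores()` blindly, since that reports the physical machine, not your allocation.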

Depending on your field of interest, there may be other packages that are specifically written to be parallel: packages calling Fortran/BLAS-type linear algebra code, packages like sparklyr (which runs against Apache Spark), and packages that can hand off computation to GPUs (such as tensorflow and the like). But that all depends on the use case.
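On the BLAS point: one cheap win on HPC systems is making sure R is linked against an optimized, multi-threaded BLAS (OpenBLAS or MKL) rather than the reference BLAS it ships with. A rough way to check from within R (assuming a recent R version for `La_library()`):

```r
# Report which LAPACK/BLAS shared library R is actually using
La_library()

# A dense matrix product is a quick smoke test: with a tuned BLAS this
# should use multiple cores and run much faster than with reference BLAS
m <- matrix(rnorm(2000 * 2000), nrow = 2000)
timing <- system.time(xtx <- crossprod(m))
print(timing)
```

If the library path points at R's bundled `libRblas`, ask your HPC admins whether an MKL- or OpenBLAS-linked R module is available; the speedup on linear-algebra-heavy code can be an order of magnitude with no code changes.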

Do you have a specific use case in mind that doesn't run as well as you'd like?