Missing OpenMP support for P3M macOS binaries (affects data.table, fixest, and likely others)

Background

Since R 4.3.0, CRAN has shipped an OpenMP runtime with its macOS distribution of R. This means that packages can support multithreading out of the box on Mac without users needing to configure ~/.R/Makevars and/or install from source (see OpenMP on macOS with Xcode tools). Several packages have already taken advantage of this by shipping configure scripts that detect the OpenMP runtime at build time and enable the appropriate compiler/linker flags. For example, data.table and fixest both do this.

The upshot is that running install.packages() from CRAN for these packages now gives macOS users multithreaded performance by default, which is obviously a big quality-of-life improvement.

Problem

Unlike CRAN, P3M's macOS binaries do not have OpenMP enabled. Users installing from P3M (silently) get single-threaded builds of these packages, losing the multithreading support that CRAN binaries provide. This obviously carries through to some of the popular reproducibility tools that use P3M as a target mirror (e.g., renv, rv).

MWE

# Install from P3M

install.packages("data.table", repos = "https://p3m.dev/cran/latest", type = "binary")
install.packages("fixest", repos = "https://p3m.dev/cran/latest", type = "binary")
data.table::getDTthreads()
#> 1
fixest::getFixest_nthreads()
#> 1

# Install from CRAN

install.packages("data.table", repos = "https://cloud.r-project.org", type = "binary")
install.packages("fixest", repos = "https://cloud.r-project.org", type = "binary")
data.table::getDTthreads()
#> 7
fixest::getFixest_nthreads()
#> 7

(Run on macOS arm64, R 4.5.3, but I've confirmed from other users that the problem persists on other macOS architectures.)
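For anyone who wants to check an existing library without reinstalling, here is a minimal sketch that wraps the two calls from the MWE. The helper name `thread_report` is my own invention, not part of either package. Note that a result of 1 is only suggestive: a single-core machine or thread-limiting environment variables could produce the same value.

```r
# Sketch: report the default thread count for each package, or NA if the
# package is not installed. `thread_report` is a hypothetical helper name.
thread_report <- function() {
  getters <- list(
    data.table = function() data.table::getDTthreads(),
    fixest     = function() fixest::getFixest_nthreads()
  )
  vapply(names(getters), function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) return(NA_integer_)
    as.integer(getters[[pkg]]())
  }, integer(1))
}

report <- thread_report()
print(report)

# Flag installed packages that default to a single thread
single <- names(report)[!is.na(report) & report == 1L]
if (length(single) > 0) {
  message("Defaulting to one thread: ", paste(single, collapse = ", "))
}
```

Running this once in a P3M-backed library and once in a CRAN-backed library should make the difference visible side by side.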

Root cause?

Both data.table and fixest use configure scripts to detect OpenMP availability at build time and set the appropriate compiler/linker flags. CRAN builds natively on macOS hardware where the OpenMP runtime is available, so detection succeeds.

In contrast, P3M cross-compiles macOS binaries from Linux using osxcross (IIUC per this Posit blog post). In this cross-compilation environment, the macOS OpenMP runtime is presumably not available, so the configure scripts fall back to single-threaded builds. Additionally, the runtime test approach used by both packages (compile, load, and execute a test shared library) cannot work in a cross-compilation context since you can't execute a macOS binary on Linux.
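One way to probe this hypothesis on an installed binary is to look at what the package's shared objects link against. The sketch below assumes that an OpenMP-enabled macOS build links dynamically against a library whose name contains "libomp", and that `otool` (from the Xcode command line tools) is on the PATH. Both are assumptions on my part, and the helper names are hypothetical.

```r
# Does any line of `otool -L` output mention the OpenMP runtime?
# Assumption: OpenMP-enabled builds link against a library named libomp.
links_openmp <- function(otool_lines) {
  any(grepl("libomp", otool_lines, fixed = TRUE))
}

# Hypothetical helper: inspect the shared objects of an installed package.
# Returns NA when the package, its libs/ directory, or otool is unavailable.
pkg_links_openmp <- function(pkg) {
  pkg_dir <- find.package(pkg, quiet = TRUE)
  if (length(pkg_dir) == 0 || !nzchar(Sys.which("otool"))) return(NA)
  so_files <- list.files(file.path(pkg_dir[1], "libs"), pattern = "\\.so$",
                         full.names = TRUE, recursive = TRUE)
  if (length(so_files) == 0) return(NA)
  out <- unlist(lapply(so_files, function(f) {
    system2("otool", c("-L", shQuote(f)), stdout = TRUE)
  }))
  links_openmp(out)
}

linkage <- sapply(c("data.table", "fixest"), pkg_links_openmp)
print(linkage)
```

If the assumption holds, a CRAN binary should report TRUE and a P3M binary FALSE. On non-macOS systems, or without `otool`, the helper simply returns NA.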

Impact

I haven't tried to gauge the full scope of affected packages, but data.table and fixest at least are among the most widely used R packages for data manipulation and econometrics. For both of these packages, OpenMP multithreading yields substantial performance gains, especially for large datasets. Moreover, popular reproducibility tools like renv and rv default to P3M as their package repository, so users silently get degraded single-threaded performance without realizing it. (This is where I got bitten.)

Thanks in advance and happy to help where I can!

@grantmcd thanks for the report! I've added this to our internal issue tracker, but can't say when we'll be able to fix it. It looks like it won't be a trivial fix.

The macOS builds were done before or around the time of R 4.3, so back then, CRAN also did not build with OpenMP on macOS. Nice to see that it has changed now.

Thanks @greg, much appreciated.
