I am running an xgboost model in tidymodels and trying to use doParallel to parallelize model training during cross-validation for hyperparameter tuning.
However, I run into an issue when I do this after using fixest. The reprex below reproduces the issue.
First, I set up a very minimal xgboost model, adapting what is done in Tune XGBoost with tidymodels and #TidyTuesday beach volleyball | Julia Silge.
Then, in the Parallel Section, I tune the xgboost parameters on 10 parallel cores using doParallel.
Note that I have commented out a VERY minimal fixest::feols call that is told to run on 10 threads.
As written (with the fixest call commented out), this runs as expected.
However, if I uncomment the fixest call and rerun everything (in a fresh R session), the 10 processes are started (I can see them in my task manager), but they just sit idle without actually running anything.
Note that if I set nthreads = 1 instead of nthreads = 10 in the fixest call, there is no issue.
library(tidymodels)
library(tidyverse)
# XGBoost Setup -------------------------------------------------------------
vb_train <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-19/vb_matches.csv', guess_max = 76000) %>%
  na.omit() %>%
  mutate_if(is.character, factor) %>%
  mutate(win = factor(rbinom(n=nrow(.), size=2, prob=.5))) %>%
  slice_sample(prop=.01)
#> Rows: 76756 Columns: 65
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (17): circuit, tournament, country, gender, w_player1, w_p1_country, w_...
#> dbl (42): year, match_num, w_p1_age, w_p1_hgt, w_p2_age, w_p2_hgt, l_p1_age...
#> date (5): date, w_p1_birthdate, w_p2_birthdate, l_p1_birthdate, l_p2_birthdate
#> time (1): duration
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
xgb_spec <- boost_tree(
  trees = 1000,
  tree_depth = tune(),
  learn_rate = tune()
) %>%
  set_engine("xgboost") %>%
  set_mode("classification")
xgb_grid <- grid_latin_hypercube(
  tree_depth(),
  learn_rate(),
  size = 2
)
xgb_wf <- workflow() %>%
  add_formula(win ~ .) %>%
  add_model(xgb_spec)
vb_folds <- vfold_cv(vb_train, v = 3)
# Parallel Section --------------------------------------------------------
set.seed(234)
#fixest::feols(hp~mpg, data=mtcars, nthreads = 10)
doParallel::registerDoParallel(cores = 10)
xgb_res <- tune_grid(
  xgb_wf,
  resamples = vb_folds,
  grid = xgb_grid,
  control = control_grid(parallel_over = "everything")
)
You might think this is just a fixest issue, but I am asking here because fixest doesn't seem to cause issues with other uses of the doParallel backend.
This is seen in the reprex below (with sessionInfo()), which runs without issue even with the same fixest call (with nthreads = 10) before using doParallel.
getPrimeNumbers <- function(n) {
  n <- as.integer(n)
  if(n > 1e6) stop("n too large")
  primes <- rep(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  for(i in last.prime:floor(sqrt(n)))
  {
    primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
    last.prime <- last.prime + min(which(primes[(last.prime+1):n]))
  }
  which(primes)
}
fixest::feols(hp~mpg, data=mtcars, nthreads = 10)
#> OLS estimation, Dep. Var.: hp
#> Observations: 32
#> Standard-errors: IID
#>              Estimate Std. Error  t value   Pr(>|t|)
#> (Intercept) 324.08231   27.43330 11.81347 8.2455e-13 ***
#> mpg          -8.82973    1.30959 -6.74239 1.7878e-07 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 42.5 Adj. R2: 0.589185
library(doParallel)
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: parallel
registerDoParallel(cores = 10)
result <- foreach(i=10:10000) %dopar% getPrimeNumbers(i)
sessionInfo()
#> R version 4.3.2 (2023-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 22.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/Chicago
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] parallel stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] doParallel_1.0.17 iterators_1.0.14 foreach_1.5.2
#>
#> loaded via a namespace (and not attached):
#> [1] nlme_3.1-162 cli_3.6.1 knitr_1.43
#> [4] rlang_1.1.1 xfun_0.40 fixest_0.11.2
#> [7] Formula_1.2-5 data.table_1.14.8 glue_1.6.2
#> [10] zoo_1.8-12 htmltools_0.5.6 rmarkdown_2.24
#> [13] grid_4.3.2 evaluate_0.21 fastmap_1.1.1
#> [16] yaml_2.3.7 lifecycle_1.0.3 numDeriv_2016.8-1.1
#> [19] compiler_4.3.2 codetools_0.2-19 fs_1.6.3
#> [22] sandwich_3.0-2 Rcpp_1.0.11 rstudioapi_0.15.0
#> [25] dreamerr_1.2.3 lattice_0.21-8 digest_0.6.33
#> [28] reprex_2.1.0 tools_4.3.2 withr_2.5.0
Created on 2024-01-17 with reprex v2.1.0
Ultimately, then, it seems that something about tidymodels' use of doParallel interacts badly with fixest's nthreads.
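To narrow this down, one thing that may be worth trying is an explicit PSOCK cluster instead of registerDoParallel(cores = 10). On Linux, the cores argument makes doParallel fork the current session, whereas a PSOCK cluster starts fresh R processes. The sketch below is minimal and has not been run against the reprex above; the worker count of 2 and the foreach sanity check are my own placeholders, and whether this avoids the hang is an assumption, not something I have confirmed.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

# Assumption: 2 workers keeps the sketch light; the reprex above uses 10.
n_workers <- 2

# Explicit PSOCK cluster: workers are fresh R processes, not forks of this session.
cl <- parallel::makePSOCKcluster(n_workers)
registerDoParallel(cl)

# Sanity check that the workers actually pick up and finish tasks.
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2
n_registered <- getDoParWorkers()

# ... the tune_grid() call from the first reprex would go here ...

# Shut the workers down and reset foreach to the sequential backend.
parallel::stopCluster(cl)
registerDoSEQ()
```

If the hang disappears with this backend, that would point toward forking as the relevant factor; if it persists, the cause is presumably elsewhere.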