Tidymodels, processing through CPU vs. GPU

I was only able to find the two posts below on this topic; there is barely any conversation. Is there any support or documentation for using a GPU with tidymodels?

When I try

xg_model <-
  boost_tree(
    # trees = 1000,
    # tree_depth = tune(), min_n = tune(),
    # loss_reduction = tune(),                     ## first three: model complexity
    # sample_size = tune(), mtry = tune(),         ## randomness
    # learn_rate = tune(),                         ## step size
  ) %>%
  set_engine("xgboost", tree_method = "gpu_hist") %>%
  set_mode("classification") %>%
  fit(Species ~ ., data = iris)

I get

Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
[09:00:18] amalgamation/../src/metric/../common/common.h:156: XGBoost version not compiled with GPU support.
Timing stopped at: 0 0 0

which implies to me that it would use the GPU if my xgboost had been compiled with GPU support.
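One way to test that inference without going through tidymodels is to ask xgboost directly for a one-round GPU fit and see whether it errors. This is only a rough sketch: `xgb_gpu_available()` is a hypothetical helper I made up, not part of xgboost or tidymodels, and it assumes the installed xgboost still accepts `tree_method = "gpu_hist"` and raises an R error (like the one above) on a CPU-only build. If a release falls back to CPU with only a logged warning, the helper would wrongly report TRUE.

```r
library(xgboost)

# Hypothetical helper (not an xgboost or tidymodels function):
# attempts a single boosting round on the GPU and reports whether it succeeded.
xgb_gpu_available <- function() {
  x <- as.matrix(iris[, 1:4])
  y <- as.integer(iris$Species == "setosa")   # simple 0/1 label
  dtrain <- xgb.DMatrix(data = x, label = y)

  tryCatch(
    {
      xgb.train(
        params = list(tree_method = "gpu_hist", objective = "binary:logistic"),
        data = dtrain,
        nrounds = 1,
        verbose = 0
      )
      TRUE
    },
    # Assumption: a CPU-only build raises an error for a GPU tree method
    error = function(e) FALSE
  )
}

gpu_ok <- xgb_gpu_available()
gpu_ok
```

If this returns FALSE, the tidymodels call above will fail in the same way, because the limitation is in the xgboost build.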



Do you happen to know if there is a list of methods within tidymodels that would use the GPU? It could be resampling methods, modelling, recipes, etc.

Sorry, I have no idea about that.
Another good place to ask might be the issues board of the tidymodels/tidymodels repository on GitHub.

You have to compile xgboost yourself if you want GPU support. CRAN doesn't build the package for that. See this thread.

tidymodels just calls the underlying function, so I believe it should work as long as the package itself can do it.
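To see that pass-through for yourself, parsnip's `translate()` prints the call template that a model spec will hand to the engine. A small sketch, assuming parsnip is installed; the spec and its argument values are only illustrative, and nothing is fitted here, so no GPU is needed:

```r
library(parsnip)

# Illustrative spec: tree_method is an engine argument, not a parsnip argument
spec <- boost_tree(trees = 100) %>%
  set_engine("xgboost", tree_method = "gpu_hist") %>%
  set_mode("classification")

# Prints the model fit template; engine arguments such as tree_method
# appear in it as they were given to set_engine()
translate(spec)
```

The engine argument shows up in the template as written, which matches the idea that whether the GPU gets used depends on the installed xgboost and not on tidymodels.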


My understanding is that the GPU really shines with deep learning models, or with XGBoost and other tree models (if they have GPU support), because that work splits into thousands of operations that can run in parallel. I don't think a GPU will provide any boost for data pre-processing or resampling.

It is now easier to use XGBoost with GPU support in R.
More info here: https://xgboost.readthedocs.io/en/latest/install.html#r

For me, it was worth the trouble to follow the links in this thread and install the GPU-aware version of xgboost. I used the pre-compiled Windows build. The GPU version runs in about 40% of the time of the multicore CPU version (68 s vs. 168 s in the timings below).

Processor: 11th Gen Intel i9-11950H @ 2.60GHz, 2611 MHz, 8 Cores, 16 Logical
GPU: Nvidia GeForce RTX 3080 (laptop).

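# Assumption: `cores` was not defined in the post; detectCores() is one way to set it
cores <- parallel::detectCores()

# CPU model: histogram tree method on all cores
xg_model_cpu <- parsnip::boost_tree(trees = 100, tree_depth = 50) %>%
  set_engine("xgboost", tree_method = "hist", nthread = cores)

# install instructions for gpu build of xgboost
# https://xgboost.readthedocs.io/en/latest/install.html#r
xg_model_gpu <- parsnip::boost_tree(trees = 100, tree_depth = 50) %>%
  set_engine("xgboost", tree_method = "gpu_hist")

# Both specs still need set_mode() before fitting; the mode and the
# workflows (wf_xg_cpu, wf_xg_gpu) were not shown in the post.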

> tic()
>   xg_fit_gpu <- fit(wf_xg_gpu,tweet_train)
> toc()  
68.03 sec elapsed
> tic()
>   xg_fit_cpu <- fit(wf_xg_cpu,tweet_train)
> toc()
167.77 sec elapsed
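
Since the workflows and the tweet data are not shown, here is a rough, self-contained sketch of the same kind of comparison for anyone who wants to try it on their own machine. Everything in it is an assumption for illustration: the simulated data, the `time_fit()` helper, the tree depth, and the thread count are all made up and are not the setup that produced the timings above. The GPU line is commented out because it only works with a GPU-enabled xgboost build.

```r
library(parsnip)
library(workflows)

# Simulated two-class data (illustrative only)
set.seed(1)
n <- 2000
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- factor(ifelse(sim$x1 + sim$x2^2 + rnorm(n) > 1, "yes", "no"))

# Hypothetical helper: fit a boosted-tree workflow with the given
# xgboost tree method and return the elapsed time in seconds
time_fit <- function(tree_method) {
  spec <- boost_tree(trees = 100, tree_depth = 6) %>%
    set_engine("xgboost", tree_method = tree_method, nthread = 2) %>%
    set_mode("classification")

  wf <- workflow() %>%
    add_formula(y ~ .) %>%
    add_model(spec)

  system.time(fit(wf, data = sim))[["elapsed"]]
}

cpu_secs <- time_fit("hist")
# gpu_secs <- time_fit("gpu_hist")   # requires a GPU-enabled xgboost build
cpu_secs
```

On a data set this small the GPU may well show no advantage; the gap reported above came from a much heavier fit (deeper trees, more data).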