tune_bayes doesn't recognize mtry when hyperparameter tuning random forest

Tobiascas · January 14, 2025, 8:09pm

Hello!

Why can't I get tune_bayes to work when trying to tune my random forest model? Please, what am I doing wrong?

The following is my code:

# ==============================================================================
# Specifying the RF model and setting up the workflow
# ==============================================================================

# defining the model specification
rf_spec <- rand_forest(trees = 4000,
                       mtry = tune(),
                       min_n = tune()) %>%
  set_engine("ranger", importance = "impurity", splitrule = "extratrees") %>%
  set_mode("regression")

# setting up the workflow
workflow_rf <- workflow() %>%
  add_recipe(recipe_selected) %>%
  add_model(rf_spec)

# ==============================================================================
# Exploring hyperparameter ranges for RF
# ==============================================================================

# checking where to set the hyperparameter limits
tune_res <- tune_grid(workflow_rf,
                      resamples = folds,
                      grid = 20)

# visualizing the performance of the randomly chosen parameters
tune_res %>%
  collect_metrics() %>%
  filter(.metric == "rsq") %>%
  select(mean, min_n, mtry) %>%
  pivot_longer(min_n:mtry,
               values_to = "value",
               names_to = "parameter") %>%
  ggplot(aes(value, mean, color = parameter)) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~parameter, scales = "free_x") +
  labs(x = NULL, y = "rsq")

# ==============================================================================
# Building the final model
# ==============================================================================

rf_grid <- grid_random(mtry(range = c(15, 30)),
                       min_n(range = c(15, 30)),
                       size = 10)

initial_tuning_results <- tune_grid(workflow_rf,
                                    resamples = folds,
                                    grid = rf_grid,
                                    metrics = metric_set(rmse, rsq))

# running the tuning using Bayesian optimization and block cross-validation
(tuning_results_rf <- tune_bayes(workflow_rf,
                                 resamples = folds,
                                 param_info = rf_set,
                                 iter = 30,
                                 initial = 5,
                                 control = control_bayes(save_pred = TRUE, verbose = TRUE),
                                 metrics = metric_set(rmse, rsq)))

Tobiascas · January 14, 2025, 8:11pm

Oh, and the error I am getting is:

i Gaussian process model
x Gaussian process model:
  Error in purrr::map2():
  ℹ In index: 1.
  Caused by error in .f():
  ! The parameter object contains unknowns.
Error in check_gp_failure():
! Gaussian process model was not fit.

Max · January 15, 2025, 11:23am

It is very hard to tell without the data (or a small reproducible example).

For mtry, there is no standard range for that parameter since it depends on the number of predictors in the data. Bayesian optimization needs that range to be able to work.
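As a small illustration of that point, here is a sketch using only dials. The built-in `mtcars` data stands in for the real data (that choice is an assumption for the example, not something from the original post). It shows that `mtry()` starts with an unknown upper limit and that `finalize()` can fill it in once the predictors are known.

```r
library(dials)

# mtry() ships with an unknown upper limit because it depends on the data
p <- mtry()
has_unknowns(p)

# Stand-in predictors (assumption): mtcars without the outcome column mpg
predictors <- mtcars[, setdiff(names(mtcars), "mpg")]

# finalize() sets the upper limit to the number of predictor columns
p_final <- finalize(p, x = predictors)
has_unknowns(p_final)
range_get(p_final)
```

Once the parameter has no unknowns left, it can be used to build the search space for Bayesian optimization.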

I think this is the issue because of the message “The parameter object contains unknowns.”

tidymodels will try to work out the upper limit of mtry from the data, but there are some cases where it can’t. If you use a recipe that tunes a parameter affecting the number of columns, it can’t know the number of predictors ahead of time and gives this error.
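When the range cannot be inferred, one option is to set it by hand and pass the result as `param_info`. The posted code passes `rf_set` without showing how it was built, so the following is only a guess at what could work. It is a minimal sketch that uses `mtcars`, a plain recipe, ordinary v-fold resamples, and small made-up ranges in place of the real data, `recipe_selected`, and the block cross-validation folds (all of these are assumptions).

```r
library(tidymodels)

set.seed(1)

# Stand-in resamples (assumption): 3-fold CV on mtcars
folds <- vfold_cv(mtcars, v = 3)

# Same structure as the posted spec, with fewer trees to keep the sketch fast
rf_spec <- rand_forest(trees = 100, mtry = tune(), min_n = tune()) %>%
  set_engine("ranger") %>%
  set_mode("regression")

# Stand-in recipe (assumption) instead of recipe_selected
workflow_rf <- workflow() %>%
  add_recipe(recipe(mpg ~ ., data = mtcars)) %>%
  add_model(rf_spec)

# Pull the tunable parameters and replace the unknown mtry range explicitly;
# the ranges here are illustrative for the 10 predictors in mtcars
rf_set <- extract_parameter_set_dials(workflow_rf) %>%
  update(mtry = mtry(range = c(1, 10)),
         min_n = min_n(range = c(2, 10)))

# With no unknowns left in rf_set, the Gaussian process has a defined space
tuning_results_rf <- tune_bayes(workflow_rf,
                                resamples = folds,
                                param_info = rf_set,
                                iter = 5,
                                initial = 5,
                                metrics = metric_set(rmse, rsq))
```

For the original workflow, the ranges used in `rf_grid` (15 to 30 for both parameters) could be put into `update()` in the same way, provided the data has at least 30 predictors after the recipe is applied.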

Can you show us recipe_selected? The name suggests that the problem is there.

Providing data and complete code (and version numbers) is the way to solve this. Otherwise, it’s like working with rumor and innuendo, not concrete information.
