Is there any other tool to extract parameter sets after tuning besides select_best()?

jpm92 · April 30, 2023, 11:09am

I've successfully tuned several models using {tidymodels} and {workflowsets}. However, when testing the validation dataset with tune::last_fit(), the parameters obtained by tune::select_best() don't behave well. This makes me want to manually test other sets of parameters on the validation set. I find tune::show_best() and tune::select_best() very limited for doing so, since they only consider one metric when choosing the parameters. I've managed to filter the tibbles with more complex logic involving several metrics using plain {dplyr}, but this is not optimal and is time-consuming, since I have to manually finalize each model every time I want to test one.

Is there a way to cherry-pick a set of parameters based on some id (for example, the tune_bayes() iteration number)?

It would also be really helpful if tune::select_best() could take more conditions to pick a model.

This is the classical process to get the "best" set of parameters (which unfortunately does not give the best model in my case, since I end up with, for example, a model with a very high roc_auc but a very bad spec).

models_tuned <- models %>% 
  workflow_map("tune_bayes",
               resamples = cv_folds,
               initial = 20,
               iter = 10,
               metrics = mm_metrics,
               verbose = TRUE)

best_results <- models_tuned %>% 
  extract_workflow_set_result(id = "norm_nnet") %>% 
  select_best(metric = "accuracy")

fitted_workflow <- models_tuned %>%
  extract_workflow(id = "norm_nnet") %>%
  finalize_workflow(best_results) %>% 
  last_fit(split = split_df,
           metrics = mm_metrics)

Max · April 30, 2023, 3:02pm

You can pass any parameters you want to finalize_workflow(). The parameters argument takes any tibble that includes values for the tuning parameters.
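As a rough illustration of that point, here is a minimal sketch that finalizes a workflow with a hand-written one-row tibble instead of the output of select_best(). The workflow, the formula, the parameter names (hidden_units, penalty), and the values are all assumptions made up for the example; they stand in for whatever the real "norm_nnet" workflow actually tunes.

```r
library(tidymodels)

# Hypothetical stand-in for the tuned workflow: two parameters marked with tune()
nnet_spec <- mlp(hidden_units = tune(), penalty = tune()) %>%
  set_engine("nnet") %>%
  set_mode("classification")

wf <- workflow() %>%
  add_formula(Species ~ .) %>%
  add_model(nnet_spec)

# Hand-picked values (made up for illustration); the column names must
# match the ids of the tune() placeholders in the workflow
chosen_params <- tibble(hidden_units = 5L, penalty = 0.01)

# Substitute the chosen values for the tune() placeholders
final_wf <- finalize_workflow(wf, chosen_params)
```

The finalized workflow can then be passed to last_fit() in the same way as one finalized with select_best().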

I do want to say that you are probably going to end up overfitting by taking this approach. It's unclear what is happening; the word "validation" makes sense but the code makes me think that you are going to repeatedly check against the test set. There's no code to suggest how split_df was made.

We do have an experimental package called desirability2 that uses a tool called desirability functions to do multi-metric optimization (also used here). There is an example on the package website.

jpm92 · April 30, 2023, 8:04pm

Thanks for your response, Max. I probably wasn't clear, so let me clarify my question.
My dataset is split manually into training and testing sets, which is due to the nature of the data.

I've used a workflow set with resampling of the training dataset (using k-fold CV) to tune the parameters of a bunch of models.

When I explore the resulting object with collect_metrics(), I can see that some models have very good and balanced metrics, while others have one very good metric but awful estimates for the rest. As a result, select_best() leads me to choose a model with a very high roc_auc but bad sens, or one with a very good sens but bad spec, etc.

That's why I'm interested in manually picking a set of results that shows a balanced set of metrics (for example, a set with roc_auc, sens, and spec all >= 0.8).
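One way this kind of multi-metric filter could be sketched is shown below. It runs on a small made-up tibble shaped like the long-format output of collect_metrics(); the parameter names, the .config labels, and every number are invented for illustration and are not results from the models above.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Made-up stand-in for collect_metrics() output: one row per
# candidate and metric. All names and values are hypothetical.
toy_metrics <- tribble(
  ~hidden_units, ~penalty, ~.config,                ~.iter, ~.metric,  ~mean,
  3L,            0.100,    "Preprocessor1_Model01", 0L,     "roc_auc", 0.95,
  3L,            0.100,    "Preprocessor1_Model01", 0L,     "sens",    0.97,
  3L,            0.100,    "Preprocessor1_Model01", 0L,     "spec",    0.41,
  5L,            0.010,    "Iter3",                 3L,     "roc_auc", 0.88,
  5L,            0.010,    "Iter3",                 3L,     "sens",    0.84,
  5L,            0.010,    "Iter3",                 3L,     "spec",    0.83,
  8L,            0.001,    "Iter7",                 7L,     "roc_auc", 0.86,
  8L,            0.001,    "Iter7",                 7L,     "sens",    0.81,
  8L,            0.001,    "Iter7",                 7L,     "spec",    0.85
)

balanced <- toy_metrics %>%
  # Real collect_metrics() output also carries these columns; drop them
  # so that each candidate collapses to a single row when pivoting
  select(-any_of(c(".estimator", "n", "std_err"))) %>%
  pivot_wider(names_from = .metric, values_from = mean) %>%
  filter(roc_auc >= 0.8, sens >= 0.8, spec >= 0.8) %>%
  arrange(desc(roc_auc))

# Keep the top candidate that passed every threshold
chosen_params <- balanced %>% slice(1)
```

With real results, the same pipeline would start from the collect_metrics() output of the extracted tuning result, and the one-row chosen_params tibble could then be passed to finalize_workflow(). A specific Bayesian iteration could be picked in the same way, for example with filter(.iter == 7) in place of the threshold filter.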

My intention is to then finalize the model and run last_fit to evaluate the performance with the test set.

I hope I'm being more clear this time.

Thanks for your support!
