When tuning workflows (recipe/model combinations) in a workflow set, how do the specified metric functions influence the tuning?
More detail:
- We can create a few different recipes and store them in a list ("recipes_list")
- We can specify a few different models and store them in a list ("model_specs")
- We can specify a list of metric functions (a metric set) to be used for tuning; for example:
metrics_tuning <- metric_set(
  yardstick::f_meas, yardstick::sensitivity, yardstick::specificity,
  yardstick::pr_auc, yardstick::roc_auc, yardstick::accuracy,
  yardstick::precision, yardstick::average_precision
)
- We can combine recipes and models into workflows and store them in a workflow set:
# configure the tuning
cv_folds <- vfold_cv(data_train, v = 8) # number of cross-validation folds
tune_grid_size <- 50 # number of hyperparameter combinations to try
# create a workflow_set
wf_set <- workflow_set(
  preproc = recipes_list, # add multiple recipes, as appropriate
  models = model_specs,   # add multiple models, as appropriate
  cross = TRUE            # create all combinations of recipes and models
)
- Then we can tune the workflows in the workflow set (using different approaches):
# tune the models and update the workflow_set with ALL tuned results
# (tune_race_anova() and control_race() come from the finetune package)
wf_set_tuned_results <- workflow_map(
  wf_set,
  fn = "tune_race_anova", # racing with repeated-measures ANOVA; a more efficient search
  verbose = TRUE,
  seed = 123,
  resamples = cv_folds,
  grid = tune_grid_size,
  metrics = metrics_tuning,
  control = control_race(
    verbose = TRUE,
    allow_par = TRUE,
    parallel_over = "everything",
    save_pred = TRUE,
    save_workflow = TRUE
  )
)
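Since my question hinges on the order of the metrics, here is a minimal sketch of how I check that order and look at the tuned results one metric at a time. It assumes that yardstick keeps the metric functions in a "metrics" attribute on the metric set, in the order supplied, and the choice of "roc_auc" as the ranking metric is only an illustration:

```r
library(yardstick)

# Same metric set as above, rebuilt here so the snippet runs on its own
metrics_tuning <- metric_set(
  f_meas, sensitivity, specificity,
  pr_auc, roc_auc, accuracy,
  precision, average_precision
)

# Names of the metrics in the order they were passed to metric_set()
# (assumption: they are stored in the "metrics" attribute of the set)
metric_names <- names(attr(metrics_tuning, "metrics"))
first_metric <- metric_names[1]
print(metric_names)

# If the tuned workflow set from above exists in this session, rank the
# workflows by one explicitly chosen metric instead of relying on a default
if (exists("wf_set_tuned_results")) {
  ranked <- workflowsets::rank_results(
    wf_set_tuned_results,
    rank_metric = "roc_auc", # illustrative choice, any metric in the set works
    select_best = TRUE
  )
  print(ranked)
}
```

This only shows the computed values per metric after the fact; it does not tell me which metric(s) actually drove the search.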
Given that context, what I'm trying to understand is how the list of metric functions influences the tuning. Does the tuning try to optimize all metrics at once? Does it try to optimize all metrics, but give more weight to those specified earlier in the list? Or does it use only the first metric to guide the tuning, with the other metrics in the list merely computed so that their values appear in the results?