Error While Finalizing Parameters of SVM

To create a tuning grid for a set of workflows (recipe/model combinations), I run a loop that extracts each workflow's dials parameter set and then finalizes any unknown parameter ranges. This works for every workflow except those involving the SVM model, where calling finalize() always results in an error:

#> Error in `map()`:
#> ℹ In index: 2.
#> Caused by error in `object$finalize()`:
#> ! The matrix version of the initialization data is not numeric.
#> Run `rlang::last_trace()` to see where the error occurred.

Note that all variables in the baked recipe are doubles, except for the outcome variable, which is a factor. Here's a reprex:

# Load required libraries
library(modeldata)
library(tidymodels)
library(dplyr)

# Load data
data(attrition)

# Create a recipe that ensures all predictors are numeric
base_recipe <- recipe(Attrition ~ ., data = attrition) %>%
  step_zv(all_predictors()) %>%
  step_naomit(all_predictors()) %>%
  step_corr(all_numeric_predictors(), threshold = 0.9) %>%
  step_YeoJohnson(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors())

# list the variables and data types
base_recipe %>% prep() %>% bake(new_data = NULL) %>% glimpse()
#> Rows: 1,470
#> Columns: 59
#> $ Age                              <dbl> 0.521960942, 1.275977863, 0.102055073…
#> $ DailyRate                        <dbl> 0.75903101, -1.33414327, 1.33990843, …
#> $ DistanceFromHome                 <dbl> -1.49357558, 0.24333238, -1.03086414,…
#> $ HourlyRate                       <dbl> 1.35416941, -0.21060357, 1.26266458, …
#> $ MonthlyIncome                    <dbl> 0.28586776, 0.05281527, -1.44713307, …
#> $ MonthlyRate                      <dbl> 0.74743488, 1.39681749, -1.88197043, …
#> $ NumCompaniesWorked               <dbl> 1.62077939, -0.57110745, 1.27090654, …
#> $ PercentSalaryHike                <dbl> -1.4884114, 1.6791185, 0.2010641, -1.…
#> $ StockOptionLevel                 <dbl> -0.9316973, 0.2419060, -0.9316973, -0…
#> $ TotalWorkingYears                <dbl> -0.24422094, 0.05247754, -0.41036000,…
#> $ TrainingTimesLastYear            <dbl> -2.5781989, 0.2173107, 0.2173107, 0.2…
#> $ YearsAtCompany                   <dbl> 0.13964671, 0.76240120, -2.22884098, …
#> $ YearsInCurrentRole               <dbl> 0.20549482, 0.88358757, -1.59589482, …
#> $ YearsSinceLastPromotion          <dbl> -1.09449009, 0.09682314, -1.09449009,…
#> $ YearsWithCurrManager             <dbl> 0.48998195, 0.90932557, -1.54963149, …
#> $ Attrition                        <fct> Yes, No, Yes, No, No, No, No, No, No,…
#> $ BusinessTravel_Travel_Frequently <dbl> -0.4816947, 2.0745914, -0.4816947, 2.…
#> $ BusinessTravel_Travel_Rarely     <dbl> 0.6396229, -1.5623576, 0.6396229, -1.…
#> $ Department_Research_Development  <dbl> -1.3735834, 0.7275275, 0.7275275, 0.7…
#> $ Department_Sales                 <dbl> 1.5147284, -0.6597352, -0.6597352, -0…
#> $ Education_1                      <dbl> -0.89138490, -1.86779013, -0.89138490…
#> $ Education_2                      <dbl> -0.04251052, 2.24372610, -0.04251052,…
#> $ Education_3                      <dbl> 1.6079970, -0.5447859, 1.6079970, -1.…
#> $ Education_4                      <dbl> -1.00681362, 0.07983544, -1.00681362,…
#> $ EducationField_Life_Sciences     <dbl> 1.1936384, 1.1936384, -0.8372047, 1.1…
#> $ EducationField_Marketing         <dbl> -0.3481364, -0.3481364, -0.3481364, -…
#> $ EducationField_Medical           <dbl> -0.678910, -0.678910, -0.678910, -0.6…
#> $ EducationField_Other             <dbl> -0.2429766, -0.2429766, 4.1128232, -0…
#> $ EducationField_Technical_Degree  <dbl> -0.3139866, -0.3139866, -0.3139866, -…
#> $ EnvironmentSatisfaction_1        <dbl> -0.6603060, 0.2545383, 1.1693826, 1.1…
#> $ EnvironmentSatisfaction_2        <dbl> -0.9928824, -0.9928824, 1.0064835, 1.…
#> $ EnvironmentSatisfaction_3        <dbl> 1.4469968, -1.2421123, 0.5506271, 0.5…
#> $ Gender_Male                      <dbl> -1.2243282, 0.8162188, 0.8162188, -1.…
#> $ JobInvolvement_1                 <dbl> 0.379543, -1.025818, -1.025818, 0.379…
#> $ JobInvolvement_2                 <dbl> -0.4271984, -0.4271984, -0.4271984, -…
#> $ JobInvolvement_3                 <dbl> -0.7783145, 1.5160483, 1.5160483, -0.…
#> $ JobRole_Human_Resources          <dbl> -0.1914326, -0.1914326, -0.1914326, -…
#> $ JobRole_Laboratory_Technician    <dbl> -0.4623065, -0.4623065, 2.1615955, -0…
#> $ JobRole_Manager                  <dbl> -0.2729664, -0.2729664, -0.2729664, -…
#> $ JobRole_Manufacturing_Director   <dbl> -0.3306955, -0.3306955, -0.3306955, -…
#> $ JobRole_Research_Director        <dbl> -0.2398224, -0.2398224, -0.2398224, -…
#> $ JobRole_Research_Scientist       <dbl> -0.4977039, 2.0078601, -0.4977039, 2.…
#> $ JobRole_Sales_Executive          <dbl> 1.8726493, -0.5336396, -0.5336396, -0…
#> $ JobRole_Sales_Representative     <dbl> -0.2445418, -0.2445418, -0.2445418, -…
#> $ JobSatisfaction_1                <dbl> 1.1528613, -0.6606284, 0.2461164, 0.2…
#> $ JobSatisfaction_2                <dbl> 0.9821324, -1.0175000, -1.0175000, -1…
#> $ JobSatisfaction_3                <dbl> 0.5496309, 1.4543985, -1.2599042, -1.…
#> $ MaritalStatus_Married            <dbl> -0.9186088, 1.0878621, -0.9186088, 1.…
#> $ MaritalStatus_Single             <dbl> 1.4581537, -0.6853322, 1.4581537, -0.…
#> $ OverTime_Yes                     <dbl> 1.5912040, -0.6280274, 1.5912040, 1.5…
#> $ PerformanceRating_1              <dbl> -0.426085, 2.345353, -0.426085, -0.42…
#> $ PerformanceRating_2              <dbl> -0.426085, 2.345353, -0.426085, -0.42…
#> $ PerformanceRating_3              <dbl> -0.426085, 2.345353, -0.426085, -0.42…
#> $ RelationshipSatisfaction_1       <dbl> -1.5836393, 1.1910327, -0.6587487, 0.…
#> $ RelationshipSatisfaction_2       <dbl> 1.037082, 1.037082, -0.963588, -0.963…
#> $ RelationshipSatisfaction_3       <dbl> -0.3486405, 0.5365090, 1.4216585, -1.…
#> $ WorkLifeBalance_1                <dbl> -2.4929720, 0.3379811, 0.3379811, 0.3…
#> $ WorkLifeBalance_2                <dbl> 2.3033457, -0.4338557, -0.4338557, -0…
#> $ WorkLifeBalance_3                <dbl> 0.02755972, -0.75153240, -0.75153240,…

The random forest works fine (as do others):

# define a random forest
rf <- rand_forest(
  mtry = tune(),
  trees = tune()
) %>%
  set_engine("ranger") %>%
  set_mode("classification")

# Create a workflow
rf_workflow <- workflow() %>%
  add_recipe(base_recipe) %>%
  add_model(rf)

# Extract and finalize the parameters
rf_params <- extract_parameter_set_dials(rf_workflow)
rf_params
#> Collection of 2 parameters for tuning
#> 
#>  identifier  type    object
#>        mtry  mtry nparam[?]
#>       trees trees nparam[+]
#> 
#> Model parameters needing finalization:
#>    # Randomly Selected Predictors ('mtry')
#> 
#> See `?dials::finalize` or `?dials::update.parameters` for more information.

# finalize the parameters
rf_params_finalized <- finalize(rf_params, attrition)
rf_params_finalized
#> Collection of 2 parameters for tuning
#> 
#>  identifier  type    object
#>        mtry  mtry nparam[+]
#>       trees trees nparam[+]

And now the SVM:

# Define the SVM model specification
svm <- svm_rbf(
  cost = tune(),
  rbf_sigma = tune()
) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

# Create a workflow
svm_workflow <- workflow() %>%
  add_recipe(base_recipe) %>%
  add_model(svm)

# Extract and finalize the parameters
svm_params <- extract_parameter_set_dials(svm_workflow)
svm_params
#> Collection of 2 parameters for tuning
#> 
#>  identifier      type    object
#>        cost      cost nparam[+]
#>   rbf_sigma rbf_sigma nparam[+]

svm_params_finalized <- finalize(svm_params, attrition)
#> Error in `map()`:
#> ℹ In index: 2.
#> Caused by error in `object$finalize()`:
#> ! The matrix version of the initialization data is not numeric.

The SVM parameter set doesn't need to be finalized since it has no unknown parameter ranges, but I'm not sure how to avoid the call, because finalize() runs inside a loop over all workflows. (It also seems that finalizing an already-complete parameter set shouldn't cause an error.)
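One possible workaround, as a minimal sketch rather than a confirmed fix: skip finalize() for parameter sets that have no unknown ranges. The error appears to come from the finalizer attached to rbf_sigma, which expects numeric data, while attrition still contains factor columns. The helper name finalize_if_needed is hypothetical (it is not part of dials), and the parameter sets and the iris predictors below are stand-ins for the objects extracted from the real workflows and the baked recipe output.

```r
library(dials)
library(purrr)

# Hypothetical helper (not part of dials): call finalize() only when the
# parameter set still contains unknown() ranges; otherwise return it as-is.
finalize_if_needed <- function(params, x) {
  needs_finalizing <- any(map_lgl(params$object, has_unknowns))
  if (needs_finalizing) finalize(params, x) else params
}

# Stand-ins for the parameter sets extracted from each workflow
param_sets <- list(
  rf  = parameters(mtry(), trees()),
  svm = parameters(cost(), rbf_sigma())
)

# Numeric predictors only (assumption: iris stands in for the baked data)
predictors <- iris[, 1:4]

# The loop: the random forest set is finalized, the SVM set is left untouched
finalized <- map(param_sets, finalize_if_needed, x = predictors)
finalized$rf
```

Passing only the baked, numeric predictor columns to finalize() instead of the raw attrition data may also avoid the error, and it would let mtry reflect the number of predictors after preprocessing; I have not verified that against the full loop.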

Any insights or suggestions?

Thank you!
