I’m testing the tidymodels package to do some modeling.
So far I have started with the basic workflow (recipe + spec), without cross-validation.
Below is a reprex with a sample data frame that has one endpoint column (a factor with levels 0 and 1) and 20 descriptor variables (genes, all numeric).
I want to fit an SVM with a radial kernel that uses all the genes to predict the endpoint. The formula would be endpoint ~ . with data = data.
However, I’m getting the following error:
> svm_fit <- fit(svm_wflow, data)
Error in eval_tidy(env$formula[[2]], env$data) : object '.' not found
Apparently it is not recognizing the . in the formula, but I'm not sure why or how to fix it.
library(tidymodels)
tidymodels_prefer()
# Create the endpoint variable
endpoint <- factor(rep(c(1, 0), each = 10))
# Create the gene columns
genes <- data.frame(matrix(rnorm(20 * 20), ncol = 20))
# Combine the endpoint and gene columns into a dataframe
data <- data.frame(endpoint, genes)
# Recipe
svm_linear_rec <-
  recipe(endpoint ~ ., data = data) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(endpoint)
# Spec
svm_spec <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")
# Workflow
svm_wflow <-
  workflow() %>%
  add_model(svm_spec) %>%
  add_recipe(svm_linear_rec)
# Fit model
svm_fit <- fit(svm_wflow, data)
You are getting that specific error because you are using step_dummy() on the outcome endpoint. When it is time to fit the model, the workflow tries to use endpoint as the outcome, but that variable doesn't exist anymore because step_dummy() replaced it with a dummy column.
When using {tidymodels} for classification, you shouldn't need to create dummy variables for the outcome. Leave it as a factor.
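As a minimal sketch of that fix, the recipe below simply drops step_dummy(endpoint) and keeps the outcome as a factor. The names svm_rec, svm_spec_fixed and svm_wflow_fixed, the seed, and the fixed values cost = 1 and rbf_sigma = 0.01 are assumptions made for illustration only (they are not from the question). They are there so that fit() can be called directly without tuning.

```r
library(tidymodels)
tidymodels_prefer()

set.seed(123)  # assumption: seed added only so the random data is reproducible

# Same simulated data as in the question
endpoint <- factor(rep(c(1, 0), each = 10))
genes <- data.frame(matrix(rnorm(20 * 20), ncol = 20))
data <- data.frame(endpoint, genes)

# Recipe without step_dummy() on the outcome; endpoint stays a factor
svm_rec <-
  recipe(endpoint ~ ., data = data) %>%
  step_normalize(all_numeric_predictors())

# Illustrative fixed hyperparameters (assumed values, not recommendations)
svm_spec_fixed <-
  svm_rbf(cost = 1, rbf_sigma = 0.01) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

svm_wflow_fixed <-
  workflow() %>%
  add_model(svm_spec_fixed) %>%
  add_recipe(svm_rec)

# Fit on the full data and predict the class for each row
svm_fit <- fit(svm_wflow_fixed, data)
preds <- predict(svm_fit, new_data = data)
```

With the outcome left untouched, the workflow can find endpoint at fit time, and predict() returns a tibble with a .pred_class column.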
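Also, because your model specification marks cost and rbf_sigma with tune(), you cannot fit it directly with fit(). To tune the parameters you need to use one of the tune_*() functions together with resamples. For more information, see the tidymodels article "Tune model parameters".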
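A rough sketch of what tuning could look like, using tune_grid() as one of the tune_*() functions. The 5-fold resampling, the grid size of 5, the accuracy metric, the seed, and the names svm_rec, folds, svm_res, best_params and final_fit are all assumptions chosen for illustration. With only 20 rows, the resampled estimates will be very noisy and some candidate fits may emit warnings.

```r
library(tidymodels)
tidymodels_prefer()

set.seed(123)  # assumption: seed added only for reproducibility

# Same simulated data as in the question
endpoint <- factor(rep(c(1, 0), each = 10))
genes <- data.frame(matrix(rnorm(20 * 20), ncol = 20))
data <- data.frame(endpoint, genes)

# Recipe without step_dummy() on the outcome
svm_rec <-
  recipe(endpoint ~ ., data = data) %>%
  step_normalize(all_numeric_predictors())

# Model specification with parameters marked for tuning, as in the question
svm_spec <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

svm_wflow <-
  workflow() %>%
  add_model(svm_spec) %>%
  add_recipe(svm_rec)

# Resamples are required for tuning (5 folds is an illustrative choice)
folds <- vfold_cv(data, v = 5)

# Evaluate 5 automatically generated candidate parameter combinations
svm_res <- tune_grid(
  svm_wflow,
  resamples = folds,
  grid = 5,
  metrics = metric_set(accuracy)
)

# Pick the best combination and refit the finalized workflow on all the data
best_params <- select_best(svm_res, metric = "accuracy")
final_fit <-
  svm_wflow %>%
  finalize_workflow(best_params) %>%
  fit(data)
```

Here finalize_workflow() replaces the tune() placeholders with the selected values, which is what makes the final fit() call valid.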