GAM models, workflow sets, and the "`fit()` must be used with GAM models (due to its use of formulas)." error

Hi!

I run into the "fit() must be used with GAM models (due to its use of formulas)." error when trying to fit a GAM inside a workflow set.

I have seen the community post "Error in `fit_xy()` with GAM model", which explains that we need to add "formula = gam_formula" inside the 'add_model()' call when defining a workflow.

However, I am not sure where I could do this when defining a workflow inside a workflow set.

Here's a reprex, slightly modified from the original post:

library(tidyverse, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(sf, quietly = TRUE)
library(tidymodels, quietly = TRUE)
library(EnvStats, quietly = TRUE)

dep_var <- "y"
treat_var <- "xt"
ID_vars <- c("sampleID", "replicateID")
quant_indep <- c("X1", "X2", "X3")
qual_indep <- c("N1", "N2")
all_indep_vars <- c(ID_vars, quant_indep, qual_indep)

# create artificial dataset

txt <- c("A", "B", "C")
samp_ids <- c("sample1", "sample2")
repl_ids <- c("rep1", "rep2", "rep3", "rep4", "rep5")

modl_df <- data.frame(
  y = EnvStats::rnormTrunc(250, mean = 0, sd = 0.7, min = -1.5, max = 1.5),
  sampleID = array(sample(samp_ids), 250),
  replicateID = array(sample(repl_ids), 250),
  xt = rnorm(250, mean = 0, sd = 1),
  x = matrix(sample(runif(5000, min = -0.5, max = 1.5), 750), 250, 3),
  c = matrix(sample(txt, 500, replace = TRUE), 250, 2)
)
modl_df <- modl_df %>% dplyr::rename(X1 = x.1, X2 = x.2, X3 = x.3, N1 = c.1, N2 = c.2)

my_split <- initial_split(modl_df)
train <- training(my_split)
test <- testing(my_split)

my_resamples <- vfold_cv(train, v = 3)

# Workflow for GAM modeling

gam_formula <- as.formula("y ~ xt + sampleID + replicateID + X1 + X2 + X3 + N1 + N2")

gam_recipe <- recipe(formula = gam_formula, data = train)

gam_model <- gen_additive_mod(select_features = TRUE, engine = "mgcv") %>%
  set_engine("mgcv") %>%
  set_mode("regression")

gam_wkflo <- workflow() %>%
  # add_model(gam_model) %>%
  add_model(gam_model, formula = gam_formula) %>%
  add_recipe(gam_recipe)

response_gam <- gam_wkflo %>% parsnip::fit(data = modl_df)

all_workflows <-
  workflow_set(
    preproc = list("recipe" = gam_recipe),
    models = list(my_gam_spec = gam_model)
  )

all_workflows <- all_workflows %>%
  workflow_map(resamples = my_resamples, grid = 4, verbose = TRUE)

i No tuning parameters. fit_resamples() will be attempted
i 1 of 1 resampling: recipe_my_gam_spec
→ A | error: fit() must be used with GAM models (due to its use of formulas).
There were issues with some computations A: x3
✖ 1 of 1 resampling: recipe_my_gam_spec failed with
Warning messages:
1: All models failed. Run show_notes(.Last.tune.result) for more information.
2: Unknown or uninitialised column: .notes.

EDIT: I think I found it. I can use update_workflow_model() to add the formula to the model spec.
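
For anyone landing here later, this is roughly what that looks like. It is a minimal sketch, not a definitive solution: the data set and the names toy_df, toy_formula, toy_recipe, gam_spec and toy_set are made up for illustration, and the s(X1) smooth term is my own assumption about what a GAM formula might contain. The workflow id "recipe_my_gam_spec" follows the "<preproc name>_<model name>" pattern that workflow_set() uses, as seen in the output above.

```r
library(tidymodels)  # attaches workflowsets, workflows, parsnip, recipes, rsample, tune

set.seed(1)

# Hypothetical toy data, only for illustration
toy_df <- data.frame(
  y  = rnorm(120),
  xt = rnorm(120),
  X1 = runif(120),
  N1 = factor(sample(c("A", "B", "C"), 120, replace = TRUE))
)

# The recipe declares the variables; the model formula describes the GAM itself
toy_recipe  <- recipe(y ~ xt + X1 + N1, data = toy_df)
toy_formula <- y ~ xt + s(X1) + N1  # s(X1) is an assumed smooth term

gam_spec <- gen_additive_mod() %>%
  set_engine("mgcv") %>%
  set_mode("regression")

toy_set <- workflow_set(
  preproc = list(recipe = toy_recipe),
  models  = list(my_gam_spec = gam_spec)
)

# Replace the model in the named workflow, this time supplying the model formula
toy_set <- update_workflow_model(
  toy_set,
  id      = "recipe_my_gam_spec",
  spec    = gam_spec,
  formula = toy_formula
)

# No tuning parameters here, so ask for fit_resamples() directly
toy_results <- workflow_map(
  toy_set,
  "fit_resamples",
  resamples = vfold_cv(toy_df, v = 3)
)
```

With more than one workflow in the set, update_workflow_model() would need to be called once per GAM workflow id, since it updates a single workflow at a time.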
