Hello,
I have run into this issue before when imputing with a recipe under parallel processing. I don't think I need to show the whole notebook to explain it.
I have the following recipe:
vars_median <- c("WarehouseToHome", "HourSpendOnApp",
                 "OrderAmountHikeFromlastYear", "CouponUsed")
vars_linear <- c("Tenure", "OrderCount", "DaySinceLastOrder")
varstoimputewith <- names(Echurn_train[5:20])
churn_recipe <- Echurn_train %>%
  recipe(Churn ~ .) %>%
  step_rm(CustomerID) %>%
  step_impute_median(all_of(vars_median)) %>%
  step_impute_linear(all_of(vars_linear),
                     impute_with = varstoimputewith) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_mutate_at(all_logical_predictors(), fn = as.factor) %>%
  step_dummy(all_factor_predictors())
But when I tune a model (such as LASSO or KNN) using parallel processing, I get an error because the variables "vars_median" and "vars_linear" are not passed to the workers.
I have previously worked around this (most likely based on some Google searches over a year ago) using
clusterEvalQ(cl, {
  vars_median <- c("WarehouseToHome", "HourSpendOnApp",
                   "OrderAmountHikeFromlastYear", "CouponUsed")
  vars_linear <- c("Tenure", "OrderCount", "DaySinceLastOrder")
})
But now I am getting a warning that I should be using the future package instead. With the future package, I am not aware of a way to pass variables directly to the workers.
I can still solve this by explicitly coding the variables I want imputed in the recipe (it's just messier), like this:
churn_recipe <- Echurn_train %>%
  recipe(Churn ~ .) %>%
  step_rm(CustomerID) %>%
  step_impute_median(
    all_of(c("WarehouseToHome", "HourSpendOnApp",
             "OrderAmountHikeFromlastYear", "CouponUsed"))) %>%
  step_impute_linear(
    all_of(c("Tenure", "OrderCount", "DaySinceLastOrder")),
    impute_with = varstoimputewith) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_mutate_at(all_logical_predictors(), fn = as.factor) %>%
  step_dummy(all_factor_predictors())
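A less messy variant of the explicit version might be to keep the vectors but inject their values into the selectors with rlang's `!!` operator, so that the stored step holds the column names themselves and not a reference to a global variable. The sketch below is only a minimal illustration under assumptions: `toy_train` is made-up data standing in for `Echurn_train`, and removing the variable before `prep()` is just a stand-in for a worker that never received it. Whether this behaves the same under actual parallel tuning is not verified here.

```r
library(recipes)

# Hypothetical toy data standing in for Echurn_train
toy_train <- data.frame(
  Churn = factor(c("yes", "no", "no", "yes", "no", "yes")),
  WarehouseToHome = c(10, NA, 14, 8, 30, 12),
  HourSpendOnApp = c(3, 2, NA, 4, 3, 2),
  Tenure = c(1, 5, 12, 3, 24, 7)
)

vars_median <- c("WarehouseToHome", "HourSpendOnApp")

# `!!` inlines the current value of vars_median when the step is defined,
# so the selector no longer looks the variable up by name later on
toy_recipe <- recipe(Churn ~ ., data = toy_train) %>%
  step_impute_median(all_of(!!vars_median))

# Mimic a worker session that has no copy of the variable
rm(vars_median)

# The recipe can still be prepped and applied without vars_median
toy_prepped <- prep(toy_recipe, training = toy_train)
toy_baked <- bake(toy_prepped, new_data = NULL)
print(colSums(is.na(toy_baked)))
```

Applied to the recipe above, this would mean writing `all_of(!!vars_median)` and `all_of(!!vars_linear)` in place of the hard-coded vectors.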
I am wondering whether there is an approach I am not considering, or whether a fix is on the way in the tune package.
Note, by the way, that for whatever reason "varstoimputewith" is passed as expected without any issue.
Thanks in advance.