I'm using the recipes package for data preprocessing and would like to know if there is any way to retrieve the levels that step_other() has collapsed into "other".
library(tidyverse)
library(tidymodels)
data("ames")
ames %>%
  summarise(lvl_neighb = n_distinct(Neighborhood),
            lvl_exterior = n_distinct(Exterior_1st))
rec_ames <-
  recipe(Sale_Price ~ ., data = ames) %>%
  step_other(c(Neighborhood, Exterior_1st), threshold = 0.01) %>%
  prep()
rec_ames %>%
  juice() %>%
  summarise(lvl_neighb = n_distinct(Neighborhood),
            lvl_exterior = n_distinct(Exterior_1st))
# levels that were kept in the dataset
rec_ames$steps[[1]]$objects$Neighborhood$keep
I would like to print a report showing the names of the levels that were collapsed (i.e. no longer used as separate levels) and the number of records in each. Has anyone ever done this?
Thank you so much!
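One possible approach, offered as a minimal sketch rather than a definitive answer: tidy() on the prepped step returns the retained levels (columns terms and retained), so the collapsed levels can be found by counting every level in the original data and dropping the retained ones with an anti_join(). The names kept, collapsed, level and n_rows are just illustrative choices, and note that a level with zero rows in the training data will not show up in this count.

```r
library(tidyverse)
library(tidymodels)
data("ames")

rec_ames <-
  recipe(Sale_Price ~ ., data = ames) %>%
  step_other(Neighborhood, Exterior_1st, threshold = 0.01) %>%
  prep()

# Levels retained by step_other(): one row per (terms, retained) pair
kept <- tidy(rec_ames, number = 1)

# Count every original level, then keep only those NOT retained by the step
collapsed <- ames %>%
  select(Neighborhood, Exterior_1st) %>%
  mutate(across(everything(), as.character)) %>%
  pivot_longer(everything(), names_to = "terms", values_to = "level") %>%
  count(terms, level, name = "n_rows") %>%
  anti_join(kept, by = c("terms", "level" = "retained")) %>%
  arrange(terms, desc(n_rows))

# Report: collapsed level names with their record counts
print(collapsed, n = Inf)

# Summary: number of collapsed levels and affected records per variable
collapsed %>%
  group_by(terms) %>%
  summarise(n_levels = n(), n_records = sum(n_rows))
```

The same level names could also be obtained for a single variable with setdiff(levels(ames$Neighborhood), rec_ames$steps[[1]]$objects$Neighborhood$keep), but the tidy() route avoids reaching into the step's internals.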