Hi,
I am trying to implement an ML scoring model using vetiver.
Basically, I have a script that trains a model and saves it to a model folder, where a different script picks it up to actually screen the raw data.
Below is some pseudocode:
library(vetiver)
library(tidymodels)
library(pins)
### Some code for data manipulation using recipes
# Here I fit my final XGB workflow (recipe + model), finalized with the tuned parameter values
mod_final <- final_xgb %>%
  fit(mydf)
# Then I wrap the fitted model as a versioned vetiver object
v <- mod_final %>%
  vetiver_model(model_name = "test-mod")
# I write the vetiver object to a folder called 'model' on a network drive
model_board <- board_folder(path = here::here('model'))
model_board %>% vetiver_pin_write(v)
Now I have a folder that contains the vetiver object at model/20221024T132328Z-8fbb5/test-mod.rds
The second part of the project involves scoring the data in a completely different script and in a completely different environment.
In general, without using vetiver, I know I can save the workflow as an RDS file, and when I read it back in and apply it to new data, it applies the transformations and then scores the data.
I am not able to work out how to do this with vetiver:
library(vetiver)
library(tidyverse)  # %>%, enframe(), filter(), mutate(), str_detect(), str_c(), bind_cols()
# Pull in our data to be screened from the DB..
mydf <- read_from_db()
# Now we pull in the vetiver object. A new version of the model is written each time we train,
# so we find the most recently created version folder and read the model from there.
all_paths <- list.dirs(path = here::here("model")) %>%
  enframe() %>%
  filter(str_detect(value, '[0-9]')) %>%
  mutate(modified_date = file.info(value)$ctime) %>%
  filter(modified_date == max(modified_date))
mod_path <- str_c(all_paths$value, "test-mod.rds", sep = '/')
eu_wf_model <- readRDS(mod_path)
# Pull the new data from the database
score_df <- pull_daily_data()
# Here we score the attributes, and this is where it breaks
report <- score_df %>%
  bind_cols(predict(eu_wf_model, score_df, type = "prob"))
It breaks on the last line with the error:
Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "list"
This is because the object read back with readRDS() is not a workflow but a plain list with the elements $model, $raw, $ptype and $required_pkgs, so predict() has no method for it.
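For reference, here is a minimal sketch of the alternative I suspect is intended: reading the pin back through the board with vetiver_pin_read() instead of calling readRDS() on the file directly, then predicting on the $model element. It uses a toy lm() on mtcars and a temporary board purely as stand-ins (both are assumptions for illustration); in my setup the board would be board_folder(here::here("model")), the pin name would be "test-mod", and I have not confirmed that the same pattern works with the XGB workflow.

# Toy round trip: write a vetiver model to a board, read it back, predict
library(vetiver)
library(pins)

# Stand-in board; in the real project this would be the 'model' folder board
board <- board_temp()

# Stand-in model; in the real project this would be the fitted workflow
fit <- lm(mpg ~ wt + hp, data = mtcars)
v <- vetiver_model(fit, "demo-mod")
vetiver_pin_write(board, v)

# Read the pin by name; with no version given, the latest version is used
v2 <- vetiver_pin_read(board, "demo-mod")

# The fitted model sits in v2$model, so predict() dispatches on its real class
preds <- predict(v2$model, head(mtcars))

If this holds for workflows too, the date-sorting of folders above would not be needed, and the package that defines the model class (tidymodels here) would presumably have to be loaded in the scoring script so that predict() finds the right method.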
My question is: Is there a way to use vetiver to apply a workflow to the new data?
Thank you for your time