Extracting the logistic regression model from this tuned workflow (in tidymodels)?

for-the-love-of-cod · October 19, 2023, 9:33pm

I am creating a logistic regression model using data from patients in an Intensive Care Unit. The model seeks to predict whether a patient is likely to live or die in the next 7 days based on their response to a certain treatment.

For this I am using the tidymodels suite in R. I have successfully trained and tuned an elastic net logistic regression model, but I want to see the specific models that have been created (i.e. which variables are in each model, the weighting it gives each variable, etc). I am very close, but I can't quite get the last step.

My workflow is as follows. Initial data splitting:

proning_initial_split28 <- 
  raw_proning_mortality %>% 
  initial_split(prop = 0.9, strata = mortality_28)

proning_modelTrain_28 <- 
  proning_initial_split28 %>% 
  training()

Creation of a k-fold object with 5 folds, repeated 5 times-

lr_v_fold <- 
  vfold_cv(data = proning_modelTrain_28,
           v = 5, 
           repeats = 5, 
           strata = mortality_28)

Creation of recipe for data processing-

lr_recipe <- 
  recipe(proning_modelTrain_28, formula = mortality_28 ~ .) %>% 
  step_rm(mortality_07) %>% 
  step_dummy(all_factor_predictors(), -mortality_28) %>% 
  step_impute_bag(all_predictors()) %>% 
  step_corr(all_predictors(), threshold = 0.9) %>% 
  step_zv(all_predictors()) %>% 
  prep()

Creation of model and tuning grid for model-

lr_model_01 <- 
  logistic_reg(mode = 'classification',
               engine = 'glmnet',
               penalty = tune(),
               mixture = tune()) %>% 
  set_args(maxit=1e+06)


lr_tuning_grid <- 
  grid_max_entropy(penalty(),
                   mixture(),
                   iter = 2000)

Creation of final workflow to bring it all together. I have passed the control_grid() instructions to both save the predictions made and to save the model which is generated at each step-

lr_workflow <- 
  workflow() %>% 
  add_model(lr_model_01) %>% 
  add_recipe(lr_recipe) %>% 
  tune_grid(resamples = lr_v_fold,
            grid = lr_tuning_grid,
            metrics = metric_set(sens, spec, ppv, npv, roc_auc),
            control = control_grid(save_pred = TRUE, 
                                   extract = extract_fit_parsnip))

However, I cannot access the models within this final object. lr_workflow$.extracts seems to contain the models, but getting much deeper is difficult.

For item [[25]], I can get partway in with lr_workflow$.extracts[[25]]$.extracts[1], but the output I get is as follows-

[[1]]
parsnip model object


Call:  glmnet::glmnet(x = maybe_matrix(x), y = y, family = "binomial",      alpha = ~0.0505244575906545, maxit = ~1e+06) 

    Df  %Dev Lambda
1    0  0.00 4.4620
2    1  0.16 4.0660
3    1  0.33 3.7050
(continues for a total 100 rows)

How can I get a better breakdown of any of the logistic regression models I have trained? By this I mean a breakdown that outlines the coefficients that apply to each variable, deviance, residuals, AIC, etc.

nirgrahamuk · October 24, 2023, 12:43pm

Most generally, there's a convention in R that a model with coefficients should have an associated coef() method as a way to access those coefficients.
A simple example:

 x <- 1:5; coef(lm(c(1:3, 7, 6) ~ x))
#(Intercept)           x 
#       -0.7         1.5
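
The same convention carries over to an ordinary (unpenalised) logistic regression fitted with glm(), which also covers the deviance, residuals and AIC asked about. This is only a minimal sketch on the built-in mtcars data, with am, wt and hp picked purely for illustration; it is not the model from the question.

```r
# Base R only: a plain logistic regression on a built-in data set.
# The outcome (am) and predictors (wt, hp) are illustrative choices.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

coef(fit)                                 # named vector of coefficients
summary(fit)$coefficients                 # estimates, std. errors, z values, p-values
deviance(fit)                             # residual deviance
AIC(fit)                                  # Akaike information criterion
head(residuals(fit, type = "deviance"))   # deviance residuals for the first rows
```

These accessors are defined for glm objects; a penalised glmnet fit does not support all of them in the same way.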

The generic R documentation can be found with

 ?coef
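
For a glmnet fit like the one printed in the question, the Df / %Dev / Lambda table is the whole regularisation path, so coef() needs to be told which penalty to report via its s argument. Below is a rough sketch on simulated data; the data, the alpha value and s = 0.01 are all made up for illustration.

```r
library(glmnet)

# Simulated data, purely illustrative: 5 predictors, binary outcome
set.seed(123)
n <- 200
x <- matrix(rnorm(n * 5), ncol = 5,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - x[, 2]))

# alpha is what tidymodels calls mixture; 0.05 is an arbitrary value here
path_fit <- glmnet(x, y, family = "binomial", alpha = 0.05)

print(path_fit)   # the Df / %Dev / Lambda table, one row per penalty value

# Coefficients at a single penalty; s = 0.01 is an arbitrary choice
coefs <- coef(path_fit, s = 0.01)
coefs
```

With the object from the question, the analogous call would presumably be something like coef(extract_fit_engine(lr_workflow$.extracts[[25]]$.extracts[[1]]), s = 0.01). Note the double brackets in [[1]]: single brackets return a list wrapping the model rather than the model itself. I have not run this against that object, so treat it as an assumption.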

Furthermore, tidymodels includes broom, which has tidy-friendly ways to access model info; glance() comes to mind.
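
In a tuning setting, one common route is to pick the best penalty/mixture combination, refit a single finalised workflow, and then call tidy() and glance() on that fit. The sketch below is self-contained and uses the two_class_dat example data from modeldata rather than the ICU data, with a small grid and a formula preprocessor instead of the recipe in the question, so every object name here is illustrative.

```r
library(tidymodels)

set.seed(2023)
data(two_class_dat, package = "modeldata")   # example data: Class ~ A + B

folds <- vfold_cv(two_class_dat, v = 5, strata = Class)

spec <- logistic_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

wf <- workflow() %>%
  add_formula(Class ~ .) %>%
  add_model(spec)

# Small grid (5 candidates) just to keep the example quick
res <- tune_grid(wf, resamples = folds, grid = 5,
                 metrics = metric_set(roc_auc))

# Choose the best candidate and refit one model on the full data
best <- select_best(res, metric = "roc_auc")
final_fit <- wf %>%
  finalize_workflow(best) %>%
  fit(data = two_class_dat)

# Coefficients at the selected penalty, as a tibble
coefs <- final_fit %>% extract_fit_parsnip() %>% tidy()
coefs

# One-row, fit-level summary of the underlying glmnet object
glance(extract_fit_engine(final_fit))
```

As far as I know, glance() on a glmnet fit reports a few fit-level quantities such as the null deviance and number of observations, not the AIC-style summary that a glm fit gives.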
