tidymodels logistic regression with successes and failures

I have a situation where I need to use the cbind(successes, failures) form of logistic regression. Reading through the recipe() documentation, I thought I could use recipe(successes + failures ~ .), but that doesn't seem to work. Below is an example using the Challeng data from the alr4 package.

library(alr4)
library(tidyverse)    
library(tidymodels)

tidymodels_prefer(quiet = TRUE) # avoid conflicts

data("Challeng")

challenge <- Challeng %>% 
  rownames_to_column(var = "date") %>% 
  mutate(date = mdy(date),
         success = n - fail,
         at_least_one_fail = if_else(fail > 0, 1, 0))

recipe_chall <- recipe(fail + success ~ temp,
                     challenge)


log_mod <-
  # Define logistic
  logistic_reg() %>%
  # Set engine to "glm"
  set_engine('glm') %>% 
  # Set mode - not necessary
  set_mode("classification")


log_wf_chall <- 
  # Set up the workflow
  workflow() %>% 
  # Add the recipe
  add_recipe(recipe_chall) %>% 
  # Add the modeling
  add_model(log_mod)


log_fit_chall <- log_wf_chall %>% 
  fit(challenge)

Fitting the workflow gives this error:
Error in check_outcome():
! For a classification model, the outcome should be a factor, not a tbl_df.
Run rlang::last_trace() to see where the error occurred.

You can't currently do this in tidymodels, but there is an open issue about it here. I recommend adding a bit about your use case there, or at the least giving the issue a :+1:. I definitely think support for this should be added! :wink:
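
In the meantime, one possible workaround is to fit the model outside tidymodels with base R's glm(), which accepts the two-column cbind(successes, failures) response directly. The sketch below is only an illustration: the data frame `toy` and its values are made up for the example (they are not the Challeng data), with column names chosen to mirror the ones in the question. It also shows the equivalent "proportion plus weights" form, which gives the same coefficients.

```r
# Hypothetical toy data: one row per group, with the number of failures
# out of n trials at each temperature. Values are invented for illustration.
toy <- data.frame(
  temp = c(50, 55, 60, 65, 70, 75, 80),
  fail = c(4, 3, 3, 2, 1, 1, 0),
  n    = rep(6, 7)
)

# Form 1: two-column response, cbind(successes, failures).
# Here a "success" in glm's sense is a failure event, so the model
# describes the probability of failure as a function of temperature.
fit_cbind <- glm(cbind(fail, n - fail) ~ temp,
                 family = binomial,
                 data = toy)

# Form 2: observed proportion as the response, number of trials as weights.
fit_prop <- glm(fail / n ~ temp,
                weights = n,
                family = binomial,
                data = toy)

# Both forms describe the same binomial likelihood,
# so the estimated coefficients agree.
all.equal(coef(fit_cbind), coef(fit_prop))

# Predicted probability of failure for new temperatures
predict(fit_cbind, newdata = data.frame(temp = c(45, 60)), type = "response")
```

With the `challenge` data frame from the question, the first form would presumably be glm(cbind(fail, success) ~ temp, family = binomial, data = challenge), since `success` was already computed as n - fail.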

Thanks, Julia! I'll add a comment there. And thanks for replying so quickly - hope you didn't feel pressure to do that just because I said you're so good about doing that at the Hangout last week :slight_smile:
