I'm a little stuck here and would really appreciate any help solving this!
I have a data set with three columns/variables that I would like to use to create a new categorical variable (preferably using tidyverse or dplyr):
code1 <- c("E1003", "30024", "E41202", "60034")
code2 <- c("X3323", "A1234", "7972", "5555")
code3 <- c("Z2232", "A1234", "E41202", "9999")
df <- data.frame(code1, code2, code3)
Looking across code1, code2, and code3, I would like to create a new categorical variable (cat_var) based on the following conditions:
cat_var = "group 1" if code1, code2, or code3 starts with "E100", "A123", or "99"
cat_var = "group 2" if code1, code2, or code3 starts with "79", "E41", or "300"
cat_var = "group 3" if code1, code2, or code3 starts with "Z2", "55", or "X33"
I tried the script below, but it didn't work:
all_columns = c("code1 ", "code2 ", "code3")
df_new <- df %>%
  mutate(cat_var = case_when(
    starts_with(all_columns, c("E100", "A123", "99")) ~ "group 1",
    starts_with(all_columns, c("79", "E41", "300")) ~ "group 2",
    starts_with(all_columns, c("Z2", "55", "X33")) ~ "group 3"
  ))
I get the following error:
"Error in mutate():
! Problem while computing cat_var = case_when(...).
Caused by error in peek_vars():
! starts_with() must be used within a selecting function.
i See https://tidyselect.r-lib.org/reference/faq-selection-context.html.
Run rlang::last_error() to see where the error occurred."
Thanks in advance!
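One possible direction, offered as a sketch rather than a definitive answer: starts_with() is a tidyselect helper that picks columns by name, so it cannot test the values inside a column, which is what the error message is pointing at. The example below instead checks the values with a regular expression and combines the three columns with if_any() (this assumes dplyr 1.0.4 or later). The helper starts_with_any() is a hypothetical name introduced only for this illustration. Because a row can match more than one group, the sketch also assumes that group 1 takes priority over group 2, and group 2 over group 3; the conditions above do not state a priority, so reorder the case_when() branches if a different rule is intended.

```r
library(dplyr)

code1 <- c("E1003", "30024", "E41202", "60034")
code2 <- c("X3323", "A1234", "7972", "5555")
code3 <- c("Z2232", "A1234", "E41202", "9999")
df <- data.frame(code1, code2, code3)

# Hypothetical helper (not part of dplyr): TRUE where a value starts with
# any of the given prefixes. Assumes the prefixes contain no regex
# metacharacters, which holds for the codes listed in the question.
starts_with_any <- function(x, prefixes) {
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  grepl(pattern, x)
}

df_new <- df %>%
  mutate(cat_var = case_when(
    # case_when() returns the first branch that matches, so the branch
    # order encodes the assumed priority: group 1, then 2, then 3.
    if_any(code1:code3, ~ starts_with_any(.x, c("E100", "A123", "99"))) ~ "group 1",
    if_any(code1:code3, ~ starts_with_any(.x, c("79", "E41", "300"))) ~ "group 2",
    if_any(code1:code3, ~ starts_with_any(.x, c("Z2", "55", "X33"))) ~ "group 3"
  ))

print(df_new)
```

Rows where none of the three columns matches any prefix would get NA in cat_var, since case_when() has no fallback branch here. Note also that all_columns in the attempt above contains "code1 " and "code2 " with trailing spaces, which would not match the real column names if that vector were used for selection.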