Loop or better way for multiple mutate and case_when criteria

jasongeslois · September 24, 2021, 3:53pm

I know there must be a better way, probably some kind of loop, but what is the best way to do the mutate/case_when step below without writing so many mutate steps? Essentially, I'm trying to add 168 columns to the "new" dataset, two for each "type": one that is TRUE where "group1" is "A", and another that is TRUE where "group2" is "C". There are 84 unique types.

library(tidyverse)

set.seed(42)  
n <- 84
datA <- data.frame(id=1:n, 
                  type=factor(paste("type", 1:n)),
                  group1=sample(rep(LETTERS[1:2], n/2)),
                  group2=sample(rep(LETTERS[3:4], n/2)))
datB <- data.frame(id=1:n, 
                   type=factor(paste("type", 1:n)),
                   group1=sample(rep(LETTERS[1:2], n/2)),
                   group2=sample(rep(LETTERS[3:4], n/2)))
dat <- rbind(datA, datB)

new <- dat %>%
    mutate(type1Group1 = case_when(
        (type == "type 1" & group1 == "A") ~ TRUE,
    )) %>%
    mutate(type1Group2 = case_when(
        (type == "type 1" & group2 == "C") ~ TRUE,
    ))%>%
    mutate(type2Group1 = case_when(
        (type == "type 2" & group1 == "A") ~ TRUE,
    ))%>%
    mutate(type2Group2 = case_when(
        (type == "type 2" & group2 == "C") ~ TRUE,
    ))

technocrat · September 25, 2021, 11:57pm

The only shortcoming of the code is that case_when() returns NA for every row that matches none of its conditions. You can avoid that with ifelse(), which returns FALSE for those rows instead:

suppressPackageStartupMessages({
  library(dplyr)
})

set.seed(42)  
n <- 84
datA <- data.frame(id=1:n, 
                   type=factor(paste("type", 1:n)),
                   group1=sample(rep(LETTERS[1:2], n/2)),
                   group2=sample(rep(LETTERS[3:4], n/2)))
datB <- data.frame(id=1:n, 
                   type=factor(paste("type", 1:n)),
                   group1=sample(rep(LETTERS[1:2], n/2)),
                   group2=sample(rep(LETTERS[3:4], n/2)))
dat <- rbind(datA, datB)
new <- dat %>%
  mutate(type1Group1 = ifelse(type == "type 1" & group1 == "A",TRUE,FALSE))
  # repeat for other tests          
head(new)
#>   id   type group1 group2 type1Group1
#> 1  1 type 1      A      D        TRUE
#> 2  2 type 2      A      D       FALSE
#> 3  3 type 3      A      D       FALSE
#> 4  4 type 4      B      C       FALSE
#> 5  5 type 5      B      C       FALSE
#> 6  6 type 6      B      C       FALSE
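The "repeat for other tests" step could be written as a loop rather than by hand. The following is only a minimal sketch, assuming the same column layout as `dat` in the question; the helper `make_dat()`, the loop variable `t`, and the `paste0()` column naming are illustrative choices and not from the thread. Because `==` and `&` already return logicals, the `ifelse()` wrapper is not strictly needed as long as the columns contain no missing values.

```r
# Rebuild a data frame with the same columns as `dat` in the question.
# make_dat() is a hypothetical helper used only to keep this block short.
set.seed(42)
n <- 84
make_dat <- function() {
  data.frame(id = 1:n,
             type = factor(paste("type", 1:n)),
             group1 = sample(rep(LETTERS[1:2], n / 2)),
             group2 = sample(rep(LETTERS[3:4], n / 2)))
}
dat <- rbind(make_dat(), make_dat())

new <- dat
for (t in seq_len(n)) {
  # TRUE for the rows that belong to this type
  is_type <- new$type == paste("type", t)
  # The comparisons are already TRUE/FALSE, so no ifelse() or case_when() is needed.
  new[[paste0("type", t, "Group1")]] <- is_type & new$group1 == "A"
  new[[paste0("type", t, "Group2")]] <- is_type & new$group2 == "C"
}

dim(new)  # 168 rows; 4 original columns plus 168 new ones = 172 columns
```

This stays in base R, so it does not depend on any package, at the cost of modifying `new` column by column instead of in one `mutate()` call.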

nirgrahamuk · September 26, 2021, 10:54am



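library(tidyverse)  # for %>%, mutate(), map() and tibble(); `dat` and `n` come from the original post
library(rlang)      # parse_expr()
library(glue)

# Build the two case_when() calls for type number t as code strings, parse them
# into expressions, and name each one after the column it will create.
casemaker <- function(t){
  c1 <- glue("case_when(type=='type {t}' & group1=='A' ~ TRUE, 
            TRUE ~ FALSE)")
  c2 <- glue("case_when(type=='type {t}' & group2=='C' ~ TRUE, 
            TRUE ~ FALSE)")
  list(parse_expr(c1),
       parse_expr(c2)) %>% set_names(nm = c(glue("type{t}Group1"),
                                            glue("type{t}Group2")))
}

# One flat, named list of 2 * n expressions (the outer parentheses print it)
(case_statements <- map(1:n,
                        casemaker) %>% flatten)

# Splice all the expressions into a single mutate() call
new <- dat %>% mutate(!!!(case_statements)) %>% tibble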
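The same kind of named expression list can also be built without pasting code into strings. What follows is a minimal sketch, not the method used above: it swaps `rlang::expr()` with `!!` in for `glue()` plus `parse_expr()`, and uses base `unlist(recursive = FALSE)` to flatten. The helper name `make_cols()` and the four-row stand-in for `dat` are assumptions made so the block runs on its own.

```r
library(dplyr)
library(purrr)
library(rlang)

# Small stand-in for `dat` with the same column layout as in the question.
n <- 4
dat <- data.frame(id = 1:n,
                  type = factor(paste("type", 1:n)),
                  group1 = c("A", "B", "A", "B"),
                  group2 = c("C", "C", "D", "D"))

# Hypothetical helper: returns two unevaluated expressions for type number t,
# named after the columns they will create. `!!label` injects the type label
# as a constant, so each expression keeps its own value of t.
make_cols <- function(t) {
  label <- paste("type", t)
  set_names(
    list(expr(group1 == "A" & type == !!label),
         expr(group2 == "C" & type == !!label)),
    paste0("type", t, c("Group1", "Group2"))
  )
}

# One flat, named list of expressions, spliced into a single mutate() call.
col_exprs <- unlist(map(seq_len(n), make_cols), recursive = FALSE)
new <- dat %>% mutate(!!!col_exprs) %>% as_tibble()
```

With the full `dat` from the question, dropping the stand-in data frame and keeping `n <- 84` should produce the 168 columns in the same way. The conditions are plain logical comparisons, so rows that do not match come out as FALSE rather than NA, provided `type`, `group1` and `group2` have no missing values.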

jasongeslois · September 26, 2021, 11:37pm

That did the trick. Thanks so much.
