Use `fct_unify()` in dplyr pipeline

heather · February 17, 2019, 2:00pm

I am working on an example where I want to set the levels of two factors to be the same.

A tidy-ish solution is as follows:

library(dplyr)
library(purrr)
example_data <- tibble(letter1 = LETTERS[1:5],
                       letter2 = LETTERS[2:6])
example_data
#> # A tibble: 5 x 2
#>   letter1 letter2
#>   <chr>   <chr>  
#> 1 A       B      
#> 2 B       C      
#> 3 C       D      
#> 4 D       E      
#> 5 E       F

tidied <- example_data %>%
  mutate(letter1 = factor(letter1, sort(unique(c(letter1, letter2)))),
         letter2 = factor(letter2, levels(letter1)))
map(tidied, levels)
#> $letter1
#> [1] "A" "B" "C" "D" "E" "F"
#> 
#> $letter2
#> [1] "A" "B" "C" "D" "E" "F"

I saw that forcats has a fct_unify() function specifically designed to unify the levels of factors. However, I can't see how to integrate this with a dplyr pipeline since it requires a list of factors as input. I tried

library(forcats)
example_data %>%
  mutate(letter1 = factor(letter1),
         letter2 = factor(letter2)) %>%
  mutate_at(vars(letter1, letter2), fct_unify)
#> Error: `fs` must be a list

but that's no good, as mutate_at() still works column-wise: fct_unify() is handed one factor at a time rather than a list of factors. Something like

fct_tidied <- example_data %>%
    mutate(letter1 = factor(letter1),
           letter2 = factor(letter2))
fct_tidied[c("letter1", "letter2")] <- 
  fct_unify(fct_tidied[c("letter1", "letter2")])
map(fct_tidied, levels)
#> $letter1
#> [1] "A" "B" "C" "D" "E" "F"
#> 
#> $letter2
#> [1] "A" "B" "C" "D" "E" "F"

would work, but it requires a step outside the tidying pipeline, which isn't so tidy and is therefore not much better than my original approach. Is there a way I can be super tidy and use fct_unify() in my dplyr pipeline? Note that my real example has additional columns in the data frame that I want to carry along.

heather · February 18, 2019, 3:32pm

I've settled on this variation of my last idea for the moment, which at least handles the conversion to factors in a single (piped) step:

example_data[c("letter1", "letter2")] <- 
  example_data %>%
  transmute(letter1 = factor(letter1),
            letter2 = factor(letter2)) %>%
  fct_unify()
map(example_data, levels)
#> $letter1
#> [1] "A" "B" "C" "D" "E" "F"
#> 
#> $letter2
#> [1] "A" "B" "C" "D" "E" "F"
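
For anyone who wants to keep everything inside one pipe, the same assignment can be wrapped in a small function. This is only a minimal sketch: `unify_factor_cols()` is a hypothetical helper name, not something provided by forcats or dplyr, and the `id` column is added here purely to show that other columns are carried along untouched.

```r
library(dplyr)
library(forcats)

# Hypothetical helper (not part of forcats or dplyr): converts the named
# columns to factors and gives them a shared set of levels.
unify_factor_cols <- function(data, cols) {
  # data[cols] is a list of columns, which is the input fct_unify() expects
  data[cols] <- fct_unify(lapply(data[cols], factor))
  data
}

# Same data as above, plus an extra column that should pass through unchanged
example_data <- tibble(id = 1:5,
                       letter1 = LETTERS[1:5],
                       letter2 = LETTERS[2:6])

result <- example_data %>%
  unify_factor_cols(c("letter1", "letter2"))

levels(result$letter1)
#> [1] "A" "B" "C" "D" "E" "F"
levels(result$letter2)
#> [1] "A" "B" "C" "D" "E" "F"
```

The helper does the same out-of-pipeline assignment as before, just hidden in a function, so it can sit between other dplyr verbs.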
