I am working on an example where I want to set the levels of two factors to be the same.
A tidy-ish solution is as follows:
library(dplyr)
library(purrr)
example_data <- tibble(letter1 = LETTERS[1:5],
                       letter2 = LETTERS[2:6])
example_data
#> # A tibble: 5 x 2
#>   letter1 letter2
#>   <chr>   <chr>
#> 1 A       B
#> 2 B       C
#> 3 C       D
#> 4 D       E
#> 5 E       F
tidied <- example_data %>%
  mutate(letter1 = factor(letter1, sort(unique(c(letter1, letter2)))),
         letter2 = factor(letter2, levels(letter1)))
map(tidied, levels)
#> $letter1
#> [1] "A" "B" "C" "D" "E" "F"
#>
#> $letter2
#> [1] "A" "B" "C" "D" "E" "F"
I saw that forcats has a fct_unify() function specifically designed to unify the levels of factors. However, I can't see how to integrate it into a dplyr pipeline, since it requires a list of factors as input. I tried
library(forcats)
example_data %>%
  mutate(letter1 = factor(letter1),
         letter2 = factor(letter2)) %>%
  mutate_at(vars(letter1, letter2), fct_unify)
#> Error: `fs` must be a list
but that's no good, as mutate_at() still works column-wise and hands each factor to fct_unify() on its own rather than as a list. Something like
fct_tidied <- example_data %>%
  mutate(letter1 = factor(letter1),
         letter2 = factor(letter2))
fct_tidied[c("letter1", "letter2")] <-
  fct_unify(fct_tidied[c("letter1", "letter2")])
map(fct_tidied, levels)
#> $letter1
#> [1] "A" "B" "C" "D" "E" "F"
#>
#> $letter2
#> [1] "A" "B" "C" "D" "E" "F"
would work, but it requires a step outside the pipeline, which isn't so tidy and therefore not much better than my original approach. Is there a way I can be super tidy and use fct_unify() in my dplyr pipeline? Note that my real data frame has additional columns that I want to carry along.
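For reference, the closest I can get so far is to wrap the subset-assignment step in a small function so that it can at least sit at the end of the pipe. This is only a minimal sketch: unify_factor_cols() is a hypothetical helper I made up, not something from forcats or dplyr, and the id column is an assumed stand-in for the extra columns in my real data.

```r
library(dplyr)
library(forcats)

# Hypothetical helper (not part of forcats or dplyr): unify the levels of the
# named factor columns and return the whole data frame so it can be piped.
unify_factor_cols <- function(data, cols) {
  # A data frame is a list of columns, so this subset satisfies fct_unify()'s
  # requirement that `fs` be a list of factors.
  data[cols] <- fct_unify(data[cols])
  data
}

# `id` is an assumed extra column standing in for the rest of the real data.
example_data <- tibble(id = 1:5,
                       letter1 = LETTERS[1:5],
                       letter2 = LETTERS[2:6])

unified <- example_data %>%
  mutate(letter1 = factor(letter1),
         letter2 = factor(letter2)) %>%
  unify_factor_cols(c("letter1", "letter2"))

levels(unified$letter1)
levels(unified$letter2)
names(unified)
```

Both columns end up with the same levels and the other columns are carried along untouched, but this is still the same assignment trick hidden inside a function, so I'd prefer something that works directly inside mutate().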