Continuing the discussion from Use `fct_unify()` in dplyr pipeline:
The original post ends with a list of unified factor variables. The last step, using the unified factors in a dplyr pipeline, was missing, which left me with an unsatisfied feeling. The reprex below completes the entire pipeline.
library(dplyr)  # provides tibble(), transmute(), mutate(); `|>` needs R >= 4.1

# original example data
example_data <-
  tibble(
    letter1 = LETTERS[1:5],
    letter2 = LETTERS[2:6]) |>
  transmute(
    letter1 = factor(letter1),
    letter2 = factor(letter2))
# Unify both factors using mutate(). `letter1` and `letter1_uf` hold the same
# values and differ only in their factor levels. The same goes for
# `letter2` and `letter2_uf`.
df <-
  example_data |>
  mutate(
    letter1_uf = (list(letter1, letter2) |> forcats::fct_unify())[[1]],
    letter2_uf = (list(letter1, letter2) |> forcats::fct_unify())[[2]]
  )
str(df)
This carries some overhead because the unification is computed twice, which raises the question of whether it should be part of the pipeline at all.
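One way to avoid the duplicate call is to run `fct_unify()` once, before the pipeline, and only attach the results inside `mutate()`. The following is a minimal sketch rather than a recommended pattern; the names `unified` and `df_once` are my own and do not appear in the original post.

```r
library(dplyr)

# Same example data as above, rebuilt so this snippet runs on its own.
example_data <- tibble(
  letter1 = factor(LETTERS[1:5]),
  letter2 = factor(LETTERS[2:6])
)

# Unify once, outside the pipeline. `unified` (hypothetical name) is a list
# with one factor per input, all sharing the same set of levels.
unified <- forcats::fct_unify(list(example_data$letter1, example_data$letter2))

# Attach the already-unified factors; nothing is recomputed here.
df_once <- example_data |>
  mutate(
    letter1_uf = unified[[1]],
    letter2_uf = unified[[2]]
  )

str(df_once)
```

The trade-off is that the unification step now lives outside the pipeline and has to reference `example_data` by name, so it no longer reads as a single chain.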