Hello, all.
I'm trying to create a function to automatically collapse labelled 4-point survey scales to binary variables and relabel them. Here's the test data I'm working with:
library(dplyr)

data <- bind_cols(favA = sample(c(1:4, 98, 99, 200), 10, replace = TRUE),
                  favB = sample(c(1:4, 98, 99, 200), 10, replace = TRUE),
                  appC = sample(c(1:4, 98, 99, 200), 10, replace = TRUE)
)
I want to select a subset of the variables, recode them to make sure they are standard, and apply labels; then recode them again, collapsing the 4-point scale to a 2-point scale, and apply labels to the new variables. I took the syntax that I use to do this outside a function and tried to translate it into a function. I came up with this:
bicode <- function(df, vars) {
  df %>%
    mutate(across({{vars}},
                  .fns = list(
                    ~dplyr::recode(., `1` = 1L, `2` = 2L, `3` = 3L, `4` = 4L, `98` = 99L, `99` = 99L, .default = NA_integer_),
                    ~haven::labelled(., labels = c("Very unfavorable" = 1L, "Somewhat unfavorable" = 2L, "Somewhat favorable" = 3L, "Very favorable" = 4L, "DK/NA" = 99L))),
                  .names = "{.col}")) %>%
    mutate(across({{vars}},
                  .fns = list(~dplyr::recode(., `1` = 1L, `2` = 1L, `3` = 2L, `4` = 2L, `98` = 99L, `99` = 99L, .default = NA_integer_),
                              ~haven::labelled(., labels = c("Unfavorable" = 1L, "Favorable" = 2L, "DK/NA" = 99L))),
                  .names = "{.col}2"))
}
When I run any of the following calls, I get the error shown below them, telling me that I have duplicate column names:
data %>% bicode(vars(favA))
data %>% bicode(vars(favA:favB))
data %>% bicode(starts_with('fav'))
Error: Problem with `mutate()` input `..1`.
ℹ `..1 = across(...)`.
x Names must be unique.
x These names are duplicated:
* "favA2" at locations 1 and 2.
* "favB2" at locations 3 and 4.
* "favA_12" at locations 5 and 6.
* "favA_22" at locations 7 and 8.
* "favB_12" at locations 9 and 10.
* ...
Run `rlang::last_error()` to see where the error occurred.
What do I need to do to a) replace the original columns with the first transformation and b) return new, unique columns with the second? Ideally, the output would look like this:
# A tibble: 10 x 5
   favA      favB      appC  favA2     favB2
   <dbl+lbl> <dbl+lbl> <int> <dbl+lbl> <dbl+lbl>
 1 3 [Some…  200       2     2 [Favo…  NA
 2 4 [Very…  99 [DK/…  98    2 [Favo…  99 [DK/…
 3 1 [Very…  1 [Ver…   1     1 [Unfa…  1 [Unfa…
 4 99 [DK/N… 3 [Som…   99    99 [DK/…  2 [Favo…
 5 3 [Some…  4 [Ver…   2     2 [Favo…  2 [Favo…
 6 99 [DK/N… NA        200   99 [DK/…  NA
 7 2 [Some…  4 [Ver…   1     1 [Unfa…  2 [Favo…
 8 4 [Very…  4 [Ver…   2     2 [Favo…  2 [Favo…
 9 99 [DK/N… 1 [Ver…   99    99 [DK/…  1 [Unfa…
10 99 [DK/N… 2 [Som…   99    99 [DK/…  1 [Unfa…
Any help would be much appreciated!
UPDATE: The first answer below helped me clarify my question:
I don't understand how to get the first mutate() call to return the original variables rather than a new variable such as var1. I thought that by adding .names = "{.col}", the new variables would replace the old ones. The second mutate() would then select the same variables that were just mutated and turn them into new variables named "var2", because .names = "{.col}2".
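For reference, here is a minimal sketch of one direction that might achieve this. It rests on the understanding that a list of functions passed to across() is applied function by function to the original column (the functions are not chained), and every function needs its own output name, so two functions sharing `.names = "{.col}"` collide. The helpers `to_scale4()` and `to_scale2()` and the name `bicode2()` are hypothetical, introduced only for illustration; each helper does the recode and the labelling in a single step so that across() only applies one function per column. `haven::zap_labels()` is used so that recode() sees a plain vector in the second pass.

```r
library(dplyr)
library(haven)

# Hypothetical helper: standardise the 4-point scale and label it in one step.
to_scale4 <- function(x) {
  labelled(
    recode(zap_labels(x), `1` = 1L, `2` = 2L, `3` = 3L, `4` = 4L,
           `98` = 99L, `99` = 99L, .default = NA_integer_),
    labels = c("Very unfavorable" = 1L, "Somewhat unfavorable" = 2L,
               "Somewhat favorable" = 3L, "Very favorable" = 4L, "DK/NA" = 99L)
  )
}

# Hypothetical helper: collapse to a 2-point scale and label it in one step.
to_scale2 <- function(x) {
  labelled(
    recode(zap_labels(x), `1` = 1L, `2` = 1L, `3` = 2L, `4` = 2L,
           `98` = 99L, `99` = 99L, .default = NA_integer_),
    labels = c("Unfavorable" = 1L, "Favorable" = 2L, "DK/NA" = 99L)
  )
}

# Hypothetical variant of bicode(): a sketch, not a verified fix.
bicode2 <- function(df, vars) {
  df %>%
    # A single function with the default .names overwrites the selected columns.
    mutate(across({{ vars }}, to_scale4)) %>%
    # A single function with .names = "{.col}2" appends new columns.
    mutate(across({{ vars }}, to_scale2, .names = "{.col}2"))
}

# Same shape as the test data above.
data <- tibble(favA = sample(c(1:4, 98, 99, 200), 10, replace = TRUE),
               favB = sample(c(1:4, 98, 99, 200), 10, replace = TRUE),
               appC = sample(c(1:4, 98, 99, 200), 10, replace = TRUE))

data %>% bicode2(starts_with("fav"))
```

In this sketch the selection is passed directly (for example `favA:favB` or `starts_with("fav")`) rather than wrapped in vars(), since vars() belongs to the older scoped verbs such as mutate_at().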
Thanks,
David