Combining `rename` with `any_of` or `all_of` when there are multiple matching names

I am using `any_of()` together with `rename()` to dynamically rename columns in hundreds of data frames as part of a data cleaning pipeline. My code works as long as a data frame contains at most one column that maps to a given new name in my renaming schema. When two or more columns map to the same new name, the code errors because the resulting data frame would have two columns with the same name. See the minimal example below.

My desired behavior would be either to choose the first matching column and rename it, or to behave like the `*_join()` family, which appends ".x" or ".y" (or a user-supplied suffix) to the duplicated column names, rather than failing outright.

library(tidyverse)

df_a <- tibble(
  testing = 1
)

df_b <- tibble(
  tester = 1
)

df_c <- tibble(
  testing = 1,
  tester = 1
)

rename_vars <- c(
  "test" = "testing",
  "test" = "tester"
)

rename(df_a, any_of(rename_vars))
#> # A tibble: 1 × 1
#>    test
#>   <dbl>
#> 1     1
rename(df_b, any_of(rename_vars))
#> # A tibble: 1 × 1
#>    test
#>   <dbl>
#> 1     1
rename(df_c, any_of(rename_vars))
#> Error in `rename()`:
#> ! Names must be unique.
#> ✖ These names are duplicated:
#>   * "test" at locations 1 and 2.
#> Backtrace:
#>      ▆
#>   1. ├─dplyr::rename(df_c, any_of(rename_vars))
#>   2. └─dplyr:::rename.data.frame(df_c, any_of(rename_vars))
#>   3.   └─tidyselect::eval_rename(expr(c(...)), .data)
#>   4.     └─tidyselect:::rename_impl(...)
#>   5.       └─tidyselect:::eval_select_impl(...)
#>   6.         ├─tidyselect:::with_subscript_errors(...)
#>   7.         │ └─rlang::try_fetch(...)
#>   8.         │   └─base::withCallingHandlers(...)
#>   9.         └─tidyselect:::vars_select_eval(...)
#>  10.           └─tidyselect:::ensure_named(...)
#>  11.             └─vctrs::vec_as_names(names(pos), repair = "check_unique", call = call)
#>  12.               └─vctrs (local) `<fn>`()
#>  13.                 └─vctrs:::validate_unique(names = names, arg = arg, call = call)
#>  14.                   └─vctrs:::stop_names_must_be_unique(names, arg, call = call)
#>  15.                     └─vctrs:::stop_names(...)
#>  16.                       └─vctrs:::stop_vctrs(...)
#>  17.                         └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2024-02-26 with reprex v2.0.2

You can do this with `dplyr::rename_with()` and a custom function, as shown below. This solution takes a `left_join()`-esque approach: when more than one column matches, it appends "." and an index as a suffix. If only one match is found, no suffix is applied:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

df_a <- tibble(testing = 1,
               apples = 1)

df_c <- tibble(testing = 1,
               tester = 1,
               apples = 1)

vars_to_rename <- c("testing", "tester")

# New function to facilitate renaming with dplyr::rename_with
# If more than one match is found, append ".index" suffix to new_name
## cols: character vector of column names to rename
## new_name: string value for the new column name
rename_col <- function(cols, new_name) {
  if (length(cols) > 1) {
    paste(new_name, seq_along(cols), sep = ".")
  } else {
    new_name
  }
}


# no suffix appended to new column names
rename_with(.data = df_a,
            .fn = \(x) rename_col(x, "test"),
            .cols = any_of(vars_to_rename)
)
#> # A tibble: 1 × 2
#>    test apples
#>   <dbl>  <dbl>
#> 1     1      1

# suffixes applied
rename_with(.data = df_c,
            .fn = \(x) rename_col(x, "test"),
            .cols = any_of(vars_to_rename)
)
#> # A tibble: 1 × 3
#>   test.1 test.2 apples
#>    <dbl>  <dbl>  <dbl>
#> 1      1      1      1

Created on 2024-03-01 with reprex v2.1.0
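
For the other behavior asked about in the question (keep only the first matching column), and to work directly from the original named vector, something along these lines might do. This is only a sketch: `rename_lookup()` and its `multiple` and `sep` arguments are hypothetical names, not part of dplyr, and "first" here means first in the order of the lookup vector, not first by column position.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): rename columns from a lookup vector
# of the form c(new_name = "old_name"), tolerating repeated new names.
rename_lookup <- function(data, lookup, multiple = c("first", "suffix"), sep = ".") {
  multiple <- match.arg(multiple)

  # Keep only the lookup entries whose old name exists in the data
  present <- lookup[lookup %in% names(data)]
  if (length(present) == 0) {
    return(data)
  }

  new_names <- names(present)
  if (multiple == "first") {
    # Keep the first entry (in lookup order) for each new name
    present <- present[!duplicated(new_names)]
  } else {
    # Number every new name that occurs more than once: test.1, test.2, ...
    repeated <- new_names %in% new_names[duplicated(new_names)]
    position <- ave(seq_along(new_names), new_names, FUN = seq_along)
    new_names[repeated] <- paste(new_names[repeated], position[repeated], sep = sep)
    names(present) <- new_names
  }

  # rename() accepts a named character vector wrapped in all_of()
  rename(data, all_of(present))
}

df_c <- tibble(testing = 1, tester = 2, apples = 3)
rename_vars <- c("test" = "testing", "test" = "tester")

names(rename_lookup(df_c, rename_vars, multiple = "first"))
#> [1] "test"   "tester" "apples"
names(rename_lookup(df_c, rename_vars, multiple = "suffix"))
#> [1] "test.1" "test.2" "apples"
```

Note that with `multiple = "first"` the non-chosen columns (here `tester`) keep their original names, and the call will still error if the data frame already contains a column with the new name.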
