str_to_upper() for specific words in a few columns

Hello,

Some of the words in a few columns of my dataset are in CAPS, some in Title Case and some in lower case. For the most part I need title case, so I converted the columns using str_to_title(). However, there are some words that need to stay in CAPS. How can I convert only those specific words to upper case? If I pass all the relevant strings, as in to_upper(Abc, Li), it doesn't work.

So I used recoding. However, my original data is huge, and recoding is both error-prone and time consuming. Is there a better way to do this? The output should match test_data, but I would like a more efficient method than recoding every value by hand.

library(datapasta)
library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
  Variable.A = c("ABC", "Cargo", "CdA - PROCESS", "CARGO MAIN", "Def"),
  Variable.B = c("VAR", "Var", "Abc - DEF", "Abc - Def", "Test LL")
) %>%
  rename("Variable A" = 1, "Variable B" = 2)

# Converting All to Titles
data <- df %>%
  mutate(across(c(`Variable A`, `Variable B`), ~str_to_title(.)))

# Converting only a few values to CAPS. This is the output I am looking for,
# but I want a better method that can be used on a large dataset

test_data <- data %>%
  mutate(`Variable A` = recode(`Variable A`,
                               "Abc" = "ABC",
                               "Cda - Process" = "CDA - Process")) %>%
  mutate(`Variable B` = recode(`Variable B`,
                               "Abc - Def" = "ABC - Def",
                               "Test Ll" = "Test LL"))

Thanks for your help!

Do you have a vector of those words? Otherwise, from the example, there appears to be no unambiguous way to get from the initial data to the result shown. Also, is it necessary to preserve the hyphens?

You could use str_replace() for the specific strings like "ABC" and "LL".
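
A minimal sketch of that idea, assuming you keep a vector of the words that must stay capitalised. The name caps_words and its contents are hypothetical, chosen to match the sample data. After str_to_title() each of those words appears in its title-cased form, so str_replace_all() with a named vector (names are regex patterns, values are replacements) can put them back in CAPS word by word:

```r
library(stringr)

# Hypothetical list of words that must end up in capitals
caps_words <- c("ABC", "CDA", "LL")

# Names: title-cased form wrapped in word boundaries, e.g. "\\bAbc\\b"
# Values: the capitalised replacement, e.g. "ABC"
patterns <- setNames(caps_words, paste0("\\b", str_to_title(caps_words), "\\b"))

x <- c("ABC", "CdA - PROCESS", "Abc - DEF", "Test LL", "Cargo")

# Title-case everything first, then restore the listed words
fixed <- str_replace_all(str_to_title(x), patterns)
fixed
```

The word boundaries keep a short entry such as "Ll" from matching inside a longer word, but any standalone word that title-cases to one of the patterns will be capitalised, so the list needs to be chosen with that in mind.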

I would do something along these lines:

library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
  Variable.A = c("ABC", "Cargo", "CdA - PROCESS", "CARGO MAIN", "Def"),
  Variable.B = c("VAR", "Var", "Abc - DEF", "Abc - Def", "Test LL")
) %>%
  rename("Variable A" = 1, "Variable B" = 2)

# Converting All to Titles
data <- df %>%
  mutate(across(c(`Variable A`, `Variable B`),
                ~str_to_title(.)))


my_translations <- c(
  "Abc" = "ABC",
  "Cda - Process" = "CDA - Process",
  "Abc - Def" = "ABC - Def",
  "Test Ll" = "Test LL"
)

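# Replace whole-string matches using a named lookup vector
# (names = current value, values = replacement); other values are left as-is
my_replace <- function(var, tl = my_translations) {
  hit <- var %in% names(tl)  # use the tl argument, not the global vector
  var[hit] <- tl[var[hit]]
  var
}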
data |> mutate(across(everything(),
                      my_replace))
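
As a variation, the same named vector can drive the recode() call from the original question, so the mapping is written once and applied to every column. This is only a sketch: recode_caps is a hypothetical helper name, the data is rebuilt with tibble() to keep the example self-contained, and recode() is marked as superseded in recent dplyr versions, although it still works.

```r
library(dplyr)
library(stringr)

my_translations <- c(
  "Abc" = "ABC",
  "Cda - Process" = "CDA - Process",
  "Abc - Def" = "ABC - Def",
  "Test Ll" = "Test LL"
)

df <- tibble(
  `Variable A` = c("ABC", "Cargo", "CdA - PROCESS", "CARGO MAIN", "Def"),
  `Variable B` = c("VAR", "Var", "Abc - DEF", "Abc - Def", "Test LL")
)

# Hypothetical helper: title-case first, then recode whole values from the
# lookup vector; !!! splices the named vector into recode()'s arguments
recode_caps <- function(x) recode(str_to_title(x), !!!my_translations)

result <- df %>%
  mutate(across(everything(), recode_caps))
result
```

Like my_replace(), this only changes values that match a name in my_translations exactly, so every full string that needs fixing has to be listed.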

This is great!
Thanks @nirgrahamuk!

Thank you all!
