@scottyd22's tidyverse approach works. Because I'm a geriatric, it's too hard for me, though. There are too many discrete operations smushed together and too much syntax to remember. So I'd do it in base R:
d <- data.frame(country = c("US", "UK", "Malasia", "Albania", "Poland"),
                abbr = c("SP num1", "SP num1", "MSP num2", "ASD num1", "ASD num3"))
# first pass--keep only the letters S, D and P; this drops the first letter
# of a 3-letter prefix and also the trailing " numN"
d$abbr <- gsub("[^SDP]","",d$abbr)
# second pass--expand abbreviation
d$abbr <- gsub("SP","state police",d$abbr)
d$abbr <- gsub("SD","state duma",d$abbr)
# if only some countries are to be prefixed
to_prefix <- which(d$country %in% c("Malasia","Albania"))
d[to_prefix,"abbr"] <- paste(d[to_prefix,"country"],d[to_prefix,"abbr"])
# otherwise, if all countries
# d$abbr <- paste(d$country,d$abbr)
# the desired output for Poland, asd num3, is inconsistent but could be
# handled similarly
# to lowercase everything
d$country <- tolower(d$country)
d$abbr <- tolower(d$abbr)
d
#>   country                 abbr
#> 1      us         state police
#> 2      uk         state police
#> 3 malasia malasia state police
#> 4 albania   albania state duma
#> 5  poland           state duma
Hello! Thank you for your answer. Unfortunately, I realised that my data sometimes includes abbreviations in the middle of the string, like "first ASD num1". Do you have any ideas on how to handle that?
How many of these do you have?
If you have 10, I would prefer one approach; if you had 1000, I might prefer another.
And if you have hundreds, could there be more with different 'forms' that you haven't noticed yet, which you would come back to ask about as well?
I'd like to know more about the context before attempting a solution.
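In the meantime, here is a minimal sketch of one way it could go for a small, known set of abbreviations. It assumes each abbreviation is a standalone uppercase token, optionally preceded by one extra capital letter (as in "MSP" or "ASD"). The names `lookup` and `expand_abbr` are made up for illustration, and only "SP" and "SD" come from this thread. Unlike the first pass above, it keeps the surrounding text ("first", "num1") instead of discarding it.

```r
# Hypothetical lookup table: names are abbreviations, values are expansions.
# Only SP and SD appear in the thread; add more entries as they turn up.
lookup <- c(SP = "state police", SD = "state duma")

# Replace each abbreviation wherever it occurs as a whole token.
expand_abbr <- function(x, lookup) {
  for (key in names(lookup)) {
    # \\b ... \\b restricts the match to a whole word;
    # [A-Z]? swallows one optional leading capital (the M in MSP, the A in ASD)
    pattern <- paste0("\\b[A-Z]?", key, "\\b")
    x <- gsub(pattern, lookup[[key]], x, perl = TRUE)
  }
  x
}

x <- c("SP num1", "first ASD num1", "MSP num2", "second SD")
expand_abbr(x, lookup)
#> [1] "state police num1"     "first state duma num1" "state police num2"
#> [4] "second state duma"
```

With a handful of abbreviations, a named vector like this is easy to maintain by hand. With hundreds, or with forms not yet seen, the lookup would probably be better kept in its own file, followed by a check for leftover uppercase tokens afterwards.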