I have a vector of strings like below:
x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",
"IGT.1", "LSI.1", "NSM.2", "PLL.1", "SUN.1", "DTV.2",
"ALTR.1", "NVLS.1", "AGN.2", "CPWR.1", "LIFE.3", "FTI.1",
"SE.7", "MMI.3", "ABC", "AS.")
I need to trim these strings. For the strings ending with a dot and number (.1 for example), remove the dot and number. For strings ending with a dot, remove the dot. For the strings ending with a dot and letter (.B for example), do not change them. For other strings, do not change them.
I am looking at the stri_trim_right function from the package stringi, but could not figure out how to do it.
What I would try is to first split each string at the dot and then put it back together according to the rules you describe.
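A minimal sketch of that split-and-reassemble idea, using only base R. The helper name `trim_suffix` is hypothetical (it is not from this thread), and the short input vector is just a sample of the original data.

```r
# Hypothetical helper: drop a trailing ".<digits>" or a bare trailing "."
trim_suffix <- function(s) {
  parts <- strsplit(s, ".", fixed = TRUE)[[1]]  # split at literal dots
  n <- length(parts)
  # No dot at all, or the string ends with a dot (strsplit drops the empty
  # last piece): re-joining the pieces removes only that trailing dot.
  if (n < 2 || endsWith(s, ".")) {
    return(paste(parts, collapse = "."))
  }
  # Last piece is all digits: drop it together with its dot.
  if (grepl("^[0-9]+$", parts[n])) {
    parts <- parts[-n]
  }
  paste(parts, collapse = ".")
}

x <- c("AA.3", "BRK.B", "ABC", "AS.")
result <- vapply(x, trim_suffix, character(1), USE.NAMES = FALSE)
print(result)
#> [1] "AA"    "BRK.B" "ABC"   "AS"
```

This works one string at a time, so it is likely to be slower on large vectors than a single vectorised regex call.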
This is a job for regular expressions.
library(stringr)
x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",
"IGT.1", "LSI.1", "NSM.2", "PLL.1", "SUN.1", "DTV.2",
"ALTR.1", "NVLS.1", "AGN.2", "CPWR.1", "LIFE.3", "FTI.1",
"SE.7", "MMI.3", "ABC", "AS.")
str_remove(x, "\\.\\d?$")
#> [1] "AA" "BEAM" "BRK.B" "BF.B" "CB" "DD" "IGT" "LSI" "NSM"
#> [10] "PLL" "SUN" "DTV" "ALTR" "NVLS" "AGN" "CPWR" "LIFE" "FTI"
#> [19] "SE" "MMI" "ABC" "AS"
Created on 2021-01-01 by the reprex package (v0.3.0.9001)
Thanks so much. My data is large and I prefer to use the package stringi, which is much faster than stringr. Do you know how to do it in stringi? Thanks.
stringr uses stringi as its backend, so it shouldn't make much of a difference, but you can use this function:
library(stringi)
x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",
"IGT.1", "LSI.1", "NSM.2", "PLL.1", "SUN.1", "DTV.2",
"ALTR.1", "NVLS.1", "AGN.2", "CPWR.1", "LIFE.3", "FTI.1",
"SE.7", "MMI.3", "ABC", "AS.")
stri_replace(x, "", regex = "\\.\\d?$")
#> [1] "AA" "BEAM" "BRK.B" "BF.B" "CB" "DD" "IGT" "LSI" "NSM"
#> [10] "PLL" "SUN" "DTV" "ALTR" "NVLS" "AGN" "CPWR" "LIFE" "FTI"
#> [19] "SE" "MMI" "ABC" "AS"
Created on 2021-01-01 by the reprex package (v0.3.0.9001)
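As a small aside, `stri_replace()` is a generic wrapper whose `mode` argument defaults to `"first"`, so the call above should be equivalent to calling `stri_replace_first_regex()` directly. The sketch below checks that on a short sample of the data; the variable names are illustrative only.

```r
library(stringi)

x <- c("AA.3", "BRK.B", "ABC", "AS.")

# Generic wrapper, as used above (mode defaults to "first")
via_wrapper <- stri_replace(x, "", regex = "\\.\\d?$")

# The same replacement spelled out with the specific function
via_direct <- stri_replace_first_regex(x, "\\.\\d?$", "")

print(identical(via_wrapper, via_direct))
#> [1] TRUE
print(via_direct)
#> [1] "AA"    "BRK.B" "ABC"   "AS"
```

Because the pattern is anchored with `$`, it can match at most once per string, so replacing the first match is enough here.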
Thanks so much! Are there any learning materials that can help me understand what these regexes mean?
A general overview
And a more complete resource
Regular expressions are an extremely powerful tool for manipulating text and data. They are now standard features in a wide range of languages and popular tools, including Perl, Python, Ruby, … - Selection from Mastering Regular Expressions, 3rd...
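For the specific pattern used in this thread, a piece-by-piece breakdown may also help. The sketch below uses base R's `sub()` instead of stringr or stringi so that it runs without extra packages. That substitution, the `demo` vector, and the multi-digit case `"XY.12"` are assumptions added for illustration.

```r
# Pattern pieces (each backslash is doubled inside an R string):
#   \\.   a literal dot (an unescaped . would match any character)
#   \\d?  an optional single digit
#   $     the end of the string
pattern <- "\\.\\d?$"

demo <- c("AA.3", "BRK.B", "ABC", "AS.", "XY.12")
trimmed <- sub(pattern, "", demo)
print(trimmed)
#> [1] "AA"    "BRK.B" "ABC"   "AS"    "XY.12"
# "XY.12" is left unchanged: \\d? allows at most one digit after the dot.

# Variant: \\d* allows any number of digits, so ".12" is removed too.
trimmed_multi <- sub("\\.\\d*$", "", demo)
print(trimmed_multi)
#> [1] "AA"    "BRK.B" "ABC"   "AS"    "XY"
```

If the real data can contain suffixes with more than one digit, the `\\d*` variant is probably the safer choice. For the sample vector in this thread both patterns give the same result.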
Thanks so much! I will read the first link.