I would appreciate help with constructing a regex pattern. I have a character vector such as

vector <- c("r01052",

I would like to create two columns. The first should contain the first five characters, or the first six characters if the sixth character is a letter. In this case, I would get:

first_column <- c("r0105",

The second should contain the last character, or the last two characters if the last character is a letter. In this case, I would get:

second_column <- c("2",

Could you help me construct regex patterns for str_extract? Note that the vector above is only an example: the individual letters and digits vary, and the only constant is the pattern of letters and digits.

Hi @Jakub_Komarek

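I used the same regex for both columns: str_extract keeps the match for the first column, and str_remove drops the match to leave the second column. The pattern matches the first five characters plus an optional letter.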

library(stringr)

vector <- c("r01052",

first_column <- str_extract(vector, "^\\w{5}[a-zA-Z]?")
second_column <- str_remove(vector, "^\\w{5}[a-zA-Z]?")

Hope it helps.
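For anyone who wants to try this quickly, here is a minimal sketch of the same approach. The `codes` vector is a hypothetical sample (it is not the original poster's full vector); it just covers the four letter/no-letter combinations described in the question.

```r
library(stringr)

# Hypothetical sample input: one value per letter/no-letter combination
codes <- c("r01052", "r0105a2", "r01052a", "r0105a2a")

# Five word characters from the start, then an optional letter
pattern <- "^\\w{5}[a-zA-Z]?"

# Keep the matched prefix for the first column
first_column <- str_extract(codes, pattern)

# Remove the matched prefix; whatever is left is the second column
second_column <- str_remove(codes, pattern)

first_column
#> [1] "r0105"  "r0105a" "r0105"  "r0105a"
second_column
#> [1] "2"  "2"  "2a" "2a"
```

Note that `\\w` also matches letters and underscores, so this pattern does not check that the prefix really is one letter followed by four digits. It relies on the input already having the expected shape.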

Another option is to capture both parts in one pass with str_match and two named groups:


library(stringr)

vector <- c("r01052",

str_match(vector, "(?<first>^[a-z]\\d{4}[a-z]?)(?<second>\\d[a-z]?$)")[,2:3]
#>      first    second
#> [1,] "r0105"  "2"   
#> [2,] "r0105a" "2"   
#> [3,] "r0105"  "2a"  
#> [4,] "r0105a" "2a"

Created on 2022-03-19 by the reprex package (v2.0.1)
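Since the question asks for two columns, the match matrix can be turned into a data frame. This is only a sketch under a few assumptions: the `codes` vector is a hypothetical sample, the groups are unnamed and picked by position, and the column names `first_column` and `second_column` are taken from the question.

```r
library(stringr)

# Hypothetical sample input: one value per letter/no-letter combination
codes <- c("r01052", "r0105a2", "r01052a", "r0105a2a")

# Group 1: letter, four digits, optional letter
# Group 2: one digit, optional letter, then end of string
m <- str_match(codes, "^([a-z]\\d{4}[a-z]?)(\\d[a-z]?)$")

# Column 1 of the matrix is the full match; columns 2 and 3 are the groups
result <- data.frame(
  first_column = m[, 2],
  second_column = m[, 3],
  stringsAsFactors = FALSE
)

result
```

Because this pattern is anchored at both ends and spells out the letter/digit layout, a string that does not fit the layout gives NA in both columns rather than a partial match.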


