Why doesn't my regular expression match?

Hi community, I want to get the words in which the same letter appears 3 times, excluding the letter e; any other letter is fine. For example "appropriate", because it has 3 "p" letters.

But when I run the code, it returns character(0).

stringr::words
pattern_1 <- "(?i)([a-df-z])\\1{2}" # `character(0)`
pattern_2 <- "([a-df-zA-DF-Z])\\1{2}" # `character(0)`
pattern_3 <- "(.)\\1{2,}" # `character(0)`
matches <- stringr::str_subset(stringr::words, pattern_1)

The pattern below returns the words with a letter repeated 2 times in a row, but when I modify it to look for letters repeated 3 times it does not work. That seems strange to me.


pattern_3 <- "(.)\\1{1,}"
matches <- stringr::str_subset(stringr::words, pattern_3) # includes `appropriate`

[1] "accept"      "account"     "across"      "add"         "address"    
[6] "affect"      "afford"      "afternoon"   "agree"       "all"        
[11] "allow"       "apparent"    "appear"      "apply"       "appoint"    
[16] "approach"    "appropriate" "arrange"     "associate"   "assume" 

Thanks!

stringr::words isn't very big, so it simply doesn't contain any word with the same letter 3 times in a row. A larger dictionary does:

# System word list; this path exists on many Unix-like systems
dict <- readLines("/usr/share/dict/words")
pattern <- "(.)\\1{2}"
stringr::str_subset(dict, pattern)

gives

[1] "bossship"         "demigoddessship"  "goddessship"     
[4] "headmistressship" "patronessship"    "wallless"        
[7] "whenceeer"
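
To make the quantifier explicit, here is a minimal sketch comparing `\\1{1}` and `\\1{2}`. The backreference counts copies in addition to the captured character, and the copies must be adjacent. The vector `test_words` is a hypothetical set of words picked for illustration, not part of the original post.

```r
library(stringr)

# Hypothetical test words chosen for illustration
test_words <- c("add", "appropriate", "wallless", "bossship")

# (.)\\1{1}: a character followed by ONE adjacent copy (2 in a row)
two_in_a_row <- str_detect(test_words, "(.)\\1{1}")

# (.)\\1{2}: a character followed by TWO adjacent copies (3 in a row)
three_in_a_row <- str_detect(test_words, "(.)\\1{2}")

# Side-by-side view of which words satisfy each pattern
data.frame(word = test_words, two_in_a_row, three_in_a_row)
```

"appropriate" passes the first test because of "pp", but fails the second because its third "p" is not adjacent to the other two.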

If you are looking for words with three occurrences of one letter, possibly with other letters in between, your regular expression has to include patterns for the letters that are skipped over. Try
"(?i)([a-df-z]).*\\1.*\\1"

library(stringr)
str_match("appropriate",pattern = "(?i)([a-df-z]).*\\1.*\\1")
#>      [,1]    [,2]
#> [1,] "pprop" "p"
str_match("sassafras",pattern = "(?i)([a-df-z]).*\\1.*\\1")
#>      [,1]        [,2]
#> [1,] "sassafras" "s"
str_match("nonaligned",pattern = "(?i)([a-df-z]).*\\1.*\\1")
#>      [,1]       [,2]
#> [1,] "nonalign" "n"

Created on 2024-08-17 with reprex v2.0.2
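
As a rough sketch of how this could be applied back to the original question, the same pattern can be passed to `str_subset()` over `stringr::words`. The variable name `pattern_nonadjacent` is made up for this example.

```r
library(stringr)

# Same pattern as in the reprex above: a letter other than "e" that
# occurs at least three times anywhere in the word (case-insensitive)
pattern_nonadjacent <- "(?i)([a-df-z]).*\\1.*\\1"

matches <- str_subset(stringr::words, pattern_nonadjacent)

# The example word from the question should be among the matches
"appropriate" %in% matches

# Peek at the first few matching words
head(matches)
```

Note that the pattern only requires at least three occurrences, so a word with four or more of the same letter would match as well.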
