Creating regex for multiple values

Hi,
I have a dataset with the following values for one of the organisms:
Carbapenem Resistant Escherichia Coli (Organism)
Carbapenem Resistant Escherichia Coli
Escherichia Coli, Carbapenem Resistant
Escherichia Coli: Carbapenem Resistant Enterobacteriaceae (Cre)
Escherichia Coli (Organism)
Carbapenem Resistant E. Coli
Carbapenemase-Producing Escherichia Coli
Escherichia Coli (Cre)
Escherichia Coli
I'm working on a dynamic regex that matches all of these values, and possibly other values that may be added to the data later on.
I tried: (?:Carbapenem[ -]?Resistant[ -]?)?(?:Escherichia[ -.]?Coli|E[ .-]?Coli)(?:[ -]?(?:\(Organism\)|Carbapenem[ -]?(?:Resistant|ase[ -]?Producing)|Cre|Enterobacteriaceae))?|Carbapenem[ -]?Resistant[ -]?E[ .-]?Coli
And this: (Carbapenem[ -]?Resistant[ -]?)?(E[ .-]?)?Coli(Carbapenem[ -]?Resistant[ -]?(Enterobacteriaceae[ -]?)?(\(Cre\))?)?
Both miss some matches.
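
One way to see which values an attempt misses is to check whether the pattern covers each value in full, rather than only finding a fragment such as "Coli" somewhere inside it. The following is a minimal sketch in base R. `full_match` is a hypothetical helper written for this illustration (it is not part of any package), and `attempt2` is the second pattern above with its backslashes doubled so that it is a valid R string.

```r
# Values copied from the list above.
organisms <- c(
  "Carbapenem Resistant Escherichia Coli (Organism)",
  "Carbapenem Resistant Escherichia Coli",
  "Escherichia Coli, Carbapenem Resistant",
  "Escherichia Coli: Carbapenem Resistant Enterobacteriaceae (Cre)",
  "Escherichia Coli (Organism)",
  "Carbapenem Resistant E. Coli",
  "Carbapenemase-Producing Escherichia Coli",
  "Escherichia Coli (Cre)",
  "Escherichia Coli"
)

# Hypothetical helper: TRUE only when the pattern consumes the whole value.
# Wrapping in (?:...) keeps top-level alternations inside the anchors.
full_match <- function(pattern, x) {
  grepl(paste0("^(?:", pattern, ")$"), x, perl = TRUE)
}

# Second attempt from the post, escaped for an R string literal.
attempt2 <- paste0(
  "(Carbapenem[ -]?Resistant[ -]?)?(E[ .-]?)?Coli",
  "(Carbapenem[ -]?Resistant[ -]?(Enterobacteriaceae[ -]?)?(\\(Cre\\))?)?"
)

# Values that the pattern does not cover in full.
missed <- organisms[!full_match(attempt2, organisms)]
print(missed)
```

Under this check the second attempt covers none of the nine values in full: `(E[ .-]?)?` allows only "E" plus a single separator before "Coli", so neither "Escherichia Coli" nor "E. Coli" (a dot and then a space) fits.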
This worked but I was wondering if there is a more concise way to do it: (?:[ -]?(?:\(Organism\)|Carbapenem[ -]?(?:Resistant|ase[ -]?Producing)|Cre|Enterobacteriaceae|.*))?
Can you please help?
Thank you

Hi, this is not more concise, but it appears to match everything listed and may be easier to update with future additions:

> library(stringr)
> # f is assumed to be a single string containing all of the values listed in the question
> regexPattern <- "Carbapenem Resistant (E\\. Coli|Escherichia Coli( \\(Organism\\)|))|Carbapenemase-Producing Escherichia Coli|Escherichia Coli(, Carbapenem Resistant|: Carbapenem Resistant Enterobacteriaceae \\(Cre\\)| \\(Organism\\)| \\(Cre\\)|)"
> matches <- str_extract_all(f, regexPattern)
> matches
[[1]]
[1] "Carbapenem Resistant Escherichia Coli (Organism)"               
[2] "Carbapenem Resistant Escherichia Coli"                          
[3] "Escherichia Coli, Carbapenem Resistant"                         
[4] "Escherichia Coli: Carbapenem Resistant Enterobacteriaceae (Cre)"
[5] "Escherichia Coli (Organism)"                                    
[6] "Carbapenem Resistant E. Coli"                                   
[7] "Carbapenemase-Producing Escherichia Coli"                       
[8] "Escherichia Coli (Cre)"                                         
[9] "Escherichia Coli"
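
If easier updating is the main goal, one possible variation is to assemble the pattern from named pieces, so that a new wording only needs one more alternative in the relevant piece. This is a sketch in base R, not the pattern used above: the names `prefix`, `name`, and `suffix` are hypothetical, and combining them this way is slightly more permissive (for example, it would also accept "Carbapenemase-Producing E. Coli (Cre)").

```r
# Values copied from the question.
organisms <- c(
  "Carbapenem Resistant Escherichia Coli (Organism)",
  "Carbapenem Resistant Escherichia Coli",
  "Escherichia Coli, Carbapenem Resistant",
  "Escherichia Coli: Carbapenem Resistant Enterobacteriaceae (Cre)",
  "Escherichia Coli (Organism)",
  "Carbapenem Resistant E. Coli",
  "Carbapenemase-Producing Escherichia Coli",
  "Escherichia Coli (Cre)",
  "Escherichia Coli"
)

# Hypothetical building blocks; add new alternatives here as the data grows.
prefix <- "(?:Carbapenem Resistant |Carbapenemase-Producing )"
name   <- "(?:Escherichia|E\\.) Coli"
suffix <- paste0(
  "(?:, Carbapenem Resistant",
  "|: Carbapenem Resistant Enterobacteriaceae \\(Cre\\)",
  "| \\(Organism\\)",
  "| \\(Cre\\))"
)

# Optional prefix, required organism name, optional suffix, anchored at both ends.
pattern <- paste0("^", prefix, "?", name, suffix, "?$")

# One logical flag per value; anything FALSE is not yet covered.
matched <- grepl(pattern, organisms, perl = TRUE)
print(organisms[!matched])
```

Because the pattern is anchored with `^` and `$`, it is meant for a vector with one value per element (for example a data frame column) rather than one long string containing every value.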
