em_y
April 8, 2023, 11:34am
1
Hi,
I am currently using the following code to extract the "phyla" part of a long label into a new column of the dataset ITS_counts:
ITS_counts3 <- ITS_counts |> mutate(Phyla = str_extract(taxonomy, "(?<=;p__).+;c"))
This isolates the part of the taxonomy column that I want, but leaves the ;c on the end, which I want to get rid of. How would I do this? Thanks.
FJCC
April 8, 2023, 12:16pm
2
The regular expression [^;]+ means "one or more characters that are not a semicolon", so the match stops just before the next semicolon and the trailing ;c is never captured.
library(stringr)
taxonomy <- "sdfljfsldj;p__ThePhylum;c__lskdflsdjf"
str_extract(taxonomy, "(?<=;p__)[^;]+")
#> [1] "ThePhylum"
Created on 2023-04-08 with reprex v2.0.2
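To apply the same pattern inside mutate(), something along these lines should work. This is only a sketch: the small ITS_counts tibble and its taxonomy strings are made up for illustration, and Phyla_alt is a hypothetical extra column showing an alternative pattern (a lazy match followed by a lookahead for ;c) that stays closer to the original attempt.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for ITS_counts; the real data will look different
ITS_counts <- tibble(
  taxonomy = c(
    "k__Fungi;p__Ascomycota;c__Sordariomycetes",
    "k__Fungi;p__Basidiomycota;c__Agaricomycetes",
    "k__Fungi"
  )
)

ITS_counts3 <- ITS_counts |>
  mutate(
    # Everything after ";p__" up to, but not including, the next semicolon
    Phyla = str_extract(taxonomy, "(?<=;p__)[^;]+"),
    # Alternative: shortest match after ";p__" that is followed by ";c"
    Phyla_alt = str_extract(taxonomy, "(?<=;p__).+?(?=;c)")
  )

ITS_counts3
```

Rows with no ;p__ entry get NA in both columns. The lookahead version also returns NA when the phylum is the last field and no ;c follows, so [^;]+ is probably the safer choice for labels of varying depth.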
em_y
April 8, 2023, 1:39pm
3
That has worked, thank you!