Modify taxonomy

Hello everyone,

I have a taxonomy table that I intend to merge with the OTU table using tools in the RStudio environment. However, before doing this, I would like to modify the taxonomies that contain "unclassified," "uncultured," and "NA" in columns of different ranks by adding the previous taxonomy to them. This way, the taxonomy would look like: Proteobacteria_uncultured; Bacteroidota_uncultured... To achieve this, I am trying to use some tools from the dplyr package, but only the Species column is being modified. The columns with the other taxonomic classifications are not being changed. The code I am using is structured as follows:

# Modify "unclassified" or "uncultured" taxonomies
modified_data <- processed_data %>%
  mutate(across(Kingdom:Species, ~ ifelse(. %in% c("unclassified", "uncultured", "NA"),
                                        paste(lag(.), ., sep = "_"), .)))

What could be going wrong with the code above? Has anyone done a similar modification before and could assist me?

Thanks so much,
Jessy

Hi @jessybbio, welcome!

To get good help, it is important to post a reproducible example of your data so that the community can better understand the problem.
See this guide:

Reprex

Hello @M_AcostaCH!

Thank you for your attention!

I apologize for the lack of clarity in my question. I'm not sure if the example below is sufficient to explain the problem. Feel free to ask if you need more details.

I have an OTU table in which many classifications at the class, genus, and species levels have been assigned the label "uncultured". I would like to append the previous (higher-rank) taxonomic classification to this generic label because it makes a significant difference in the microbial composition analysis.
Currently, my table looks like this:

head(data)
#> Kingdom Phylum Class Order Family Genus Species
#> Bacteria Armatimonadota uncultured uncultured uncultured uncultured uncultured_bacterium
#> Bacteria Desulfobacterota uncultured uncultured uncultured uncultured uncultured_Dongia
#> Bacteria Desulfobacterota uncultured uncultured uncultured uncultured uncultured_delta

Basically, I would add the previous taxonomic classification to the "uncultured" taxonomy. So, with the modification I desire, the taxonomy of the OTU table would look as follows:

head(data)
#> Kingdom Phylum Class Order Family Genus Species
#> Bacteria Armatimonadota uncultured_Armatimonadota uncultured_Armatimonadota uncultured_Armatimonadota uncultured_Armatimonadota uncultured_bacterium
#> Bacteria Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Dongia
#> Bacteria Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_delta

To do this, I used the following code:

# Modify "uncultured" taxonomies
modified_data <- data %>%
  mutate(across(Kingdom:Species, ~ ifelse(. %in% c( "uncultured"),
                                        paste(lag(.), ., sep = "_"), .)))

However, it hasn't worked entirely. Many of the cells with the classification "uncultured" are not changed. Can you assist me with this?

I appreciate your attention,
Jéssica

Hi, can you post the output of dput(head(data)) instead?
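
In case it helps, here is a minimal sketch of what that looks like. `dput()` prints R code that rebuilds the object, so anyone can paste it into their session. The `example_data` data frame and its two rows are made up for illustration; substitute your own table.

```r
# Hypothetical stand-in for the real taxonomy table
example_data <- data.frame(
  Kingdom = c("Bacteria", "Bacteria"),
  Phylum  = c("Armatimonadota", "Desulfobacterota"),
  Class   = c("uncultured", "uncultured"),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable definition of the first rows
dput(head(example_data))

# Capture the same text and rebuild the object from it,
# which is what a reader of the post would do
dput_text <- capture.output(dput(head(example_data)))
rebuilt <- eval(parse(text = dput_text))
```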

What error did you get? I've added a reprex for you.


library(tidyverse)

# the reprex
data <- tibble::tribble(
  ~Kingdom,            ~Phylum,       ~Class,       ~Order,      ~Family,       ~Genus,               ~Species,
  "Bacteria",   "Armatimonadota", "uncultured", "uncultured", "uncultured", "uncultured", "uncultured_bacterium",
  "Bacteria", "Desulfobacterota", "uncultured", "uncultured", "uncultured", "uncultured",    "uncultured_Dongia",
  "Bacteria", "Desulfobacterota", "uncultured", "uncultured", "uncultured", "uncultured",     "uncultured_delta"
)


data %>%
  mutate(across(Kingdom:Species,
                ~if_else(c("unclassified", "uncultured", "NA") %in% .x,
                         paste(lag(.x), ., sep = "_"),
                         .x)))


# # A tibble: 3 x 7
# Kingdom  Phylum           Class                 Order                 Family                Genus                 Species             
# <chr>    <chr>            <chr>                 <chr>                 <chr>                 <chr>                 <chr>               
# 1 Bacteria Armatimonadota   uncultured            uncultured            uncultured            uncultured            uncultured_bacterium
# 2 Bacteria Desulfobacterota uncultured_uncultured uncultured_uncultured uncultured_uncultured uncultured_uncultured uncultured_Dongia   
# 3 Bacteria Desulfobacterota uncultured            uncultured            uncultured            uncultured            uncultured_delta    
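
Two things appear to contribute to the output above, and the small sketch below isolates them. First, `c(...) %in% .x` asks whether each placeholder occurs somewhere in the column, not whether each cell is a placeholder. Second, `lag()` shifts values down within a column, so it returns the value from the previous row rather than from the previous taxonomic rank. The vector names here are illustrative only.

```r
library(dplyr)

class_col    <- c("uncultured", "uncultured", "uncultured")
phylum_col   <- c("Armatimonadota", "Desulfobacterota", "Desulfobacterota")
placeholders <- c("unclassified", "uncultured", "NA")

# One result per placeholder (does it occur anywhere in the column?)
reversed_test <- placeholders %in% class_col   # FALSE TRUE FALSE

# One result per cell (is this cell a placeholder?)
per_cell_test <- class_col %in% placeholders   # TRUE TRUE TRUE

# lag() moves values down one row; the first element becomes NA
shifted <- dplyr::lag(phylum_col)              # NA "Armatimonadota" "Desulfobacterota"
```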

Hello @William,

The error I've been experiencing is that names marked with "0" and "uncultured" are not being modified according to the previous taxonomy assigned. In the first row, for example, the taxonomic classifications of "Class," "Order," "Family," and "Genus" that are labeled as "uncultured" are the ones I'm trying to add the known taxonomy to, which in this case is Phylum: "Armatimonadota." Therefore, the desired result is:

 ~Kingdom,            ~Phylum,       ~Class,       ~Order,      ~Family,       ~Genus,               ~Species,
  "Bacteria",   "Armatimonadota", "uncultured_Armatimonadota", "uncultured_Armatimonadota", "uncultured_Armatimonadota", "uncultured_Armatimonadota", "uncultured_bacterium",

However, I'm not able to make the modification with the script described previously.

Thanks for the help,

Jessy

So like this? The first line could be changed:

data %>%
  mutate(across(Class:Genus,
                # Append the Phylum from the same row to each placeholder cell.
                # No lag() here: lag() would take the value from the previous row.
                ~if_else(.x %in% c("unclassified", "uncultured", "NA"),
                         paste(.x, Phylum, sep = "_"),
                         .x)))

# # A tibble: 3 x 7
# Kingdom  Phylum           Class                       Order                       Family                      Genus                       Species             
# <chr>    <chr>            <chr>                       <chr>                       <chr>                       <chr>                       <chr>               
# 1 Bacteria Armatimonadota   uncultured_Armatimonadota   uncultured_Armatimonadota   uncultured_Armatimonadota   uncultured_Armatimonadota   uncultured_bacterium
# 2 Bacteria Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Dongia   
# 3 Bacteria Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_Desulfobacterota uncultured_delta
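
The snippet above hard-codes Phylum as the rank to append. If some rows are classified further down (say, to Class), a more general variant might be useful. The following is only a sketch in base R, assuming the rank columns are character and ordered from Kingdom to Species. It walks the ranks left to right and appends the nearest informative name from the same row. The helper name `label_with_parent` and the second example row are hypothetical, not from the thread.

```r
# Ranks in order, and the labels treated as uninformative
ranks        <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")
placeholders <- c("unclassified", "uncultured", "NA")

# Hypothetical helper: label each placeholder with the nearest known rank
label_with_parent <- function(df, ranks, placeholders) {
  # Per row, the most recent informative name seen so far
  last_known <- df[[ranks[1]]]
  for (rank in ranks[-1]) {
    current <- df[[rank]]
    is_placeholder <- is.na(current) | current %in% placeholders
    label <- ifelse(is.na(current), "NA", current)
    df[[rank]] <- ifelse(is_placeholder,
                         paste(label, last_known, sep = "_"),
                         current)
    # Only informative names update the running parent
    last_known <- ifelse(is_placeholder, last_known, current)
  }
  df
}

taxonomy <- data.frame(
  Kingdom = c("Bacteria", "Bacteria"),
  Phylum  = c("Armatimonadota", "Proteobacteria"),
  Class   = c("uncultured", "Gammaproteobacteria"),  # second row is made up
  Order   = c("uncultured", "uncultured"),
  Family  = c("uncultured", "uncultured"),
  Genus   = c("uncultured", "uncultured"),
  Species = c("uncultured_bacterium", "uncultured_bacterium"),
  stringsAsFactors = FALSE
)

fixed <- label_with_parent(taxonomy, ranks, placeholders)
print(fixed)
```

Values such as "uncultured_bacterium" are left untouched because they do not exactly match a placeholder; add them to `placeholders` if they should be relabelled too.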

Hello @williaml

The code works!

Thanks so much!

Jessy
