Hi
My data looks like this:
data.frame(
  stringsAsFactors = FALSE,
  species = c("Bubasis agape agape",
              "Bubasis agape", "Bubasis agape", "Bubasis agape",
              "Bubasis agape ruby", "Bubasis agape ruby", "Bubasis agape ruby",
              "Bubasis agape")
)
Some of these species names have a subspecies name attached as well. Is it possible to write code that removes the third word from each value in a particular column?
Thank you!!
Hi!
A tidyverse solution would be:
library(tidyverse)
test <- data.frame(
  stringsAsFactors = FALSE,
  species = c("Bubasis agape agape",
              "Bubasis agape", "Bubasis agape", "Bubasis agape",
              "Bubasis agape ruby", "Bubasis agape ruby", "Bubasis agape ruby",
              "Bubasis agape")
)
test %>%
  as_tibble() %>%
  mutate(
    species_clean = map_chr(
      str_split(species, pattern = "\\s+"),
      ~ str_flatten(.x[1:2], " ")))
#> # A tibble: 8 × 2
#>   species             species_clean
#>   <chr>               <chr>
#> 1 Bubasis agape agape Bubasis agape
#> 2 Bubasis agape       Bubasis agape
#> 3 Bubasis agape       Bubasis agape
#> 4 Bubasis agape       Bubasis agape
#> 5 Bubasis agape ruby  Bubasis agape
#> 6 Bubasis agape ruby  Bubasis agape
#> 7 Bubasis agape ruby  Bubasis agape
#> 8 Bubasis agape       Bubasis agape
Created on 2022-03-01 by the reprex package (v2.0.1)
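In case it helps to see what the pipeline does to a single value, here is a minimal sketch that walks through the same steps on one string. The variable names (x, parts, result) are just for illustration and are not part of the solution above.

```r
library(stringr)

x <- "Bubasis agape ruby"

# str_split() returns a list with one character vector per input string;
# [[1]] pulls out the vector of words for our single string
parts <- str_split(x, pattern = "\\s+")[[1]]

# Keep the first two words and glue them back together with a space
result <- str_flatten(parts[1:2], " ")
result
#> [1] "Bubasis agape"
```

One thing to keep in mind: if a value contains only one word, parts[1:2] will contain an NA, so it may be worth checking the data for such rows first.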
Perfect, thank you. May I ask a follow-up question? If I instead wanted to replace the current column with the new one you created (species_clean), how would I do that?
You could name the new column species instead of species_clean in the mutate statement. This will overwrite the old column.
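A minimal sketch of that change, using a shortened version of the example data. The name test_clean is only an assumption for illustration; the result needs to be assigned to something (or back to test), otherwise the original data frame stays unchanged.

```r
library(tidyverse)

test <- data.frame(
  stringsAsFactors = FALSE,
  species = c("Bubasis agape agape", "Bubasis agape", "Bubasis agape ruby")
)

# Assigning to the existing name `species` inside mutate() replaces
# that column instead of adding a new one
test_clean <- test %>%
  as_tibble() %>%
  mutate(
    species = map_chr(
      str_split(species, pattern = "\\s+"),
      ~ str_flatten(.x[1:2], " ")
    )
  )
```

After this, test_clean has a single species column holding only the first two words of each name.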