I need to split a column in a data frame whose values look like this: "0.70.7", "0.20.2", "10.610.6". Each value is a number written twice with no separator, so the final result should be "0.7", "0.2", "10.6". There is no character or space I can use as a delimiter. I tried:
df <- df %>% separate(`column 00-24`, into = c("Data", "Duplicate"), sep = "", remove = TRUE, convert = FALSE, extra = "merge")
But the result is not what I'm looking for. I also tried:
df$column_clean <- gsub("(\\..*?)\\.", "\\1", df$`column 00-24`)
But this also gives incorrect output.
begin <- c("0.70.7", "0.20.2", "10.610.6")
target <- c("0.7", "0.2","10.6")
(repl <- gsub("(\\d+\\.\\d)(\\d+\\.\\d+)", "\\1", begin))
#> [1] "0.7" "0.2" "10.6"
identical(target,repl)
#> [1] TRUE
Created on 2023-09-16 with reprex v2.0.2
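If the number of decimal places can vary, the pattern above (which expects exactly one digit after the first decimal point) may not split in the right place. A backreference can instead match any string made of two identical halves. This is only a sketch, assuming every value really is an exact duplicate; `begin2` is a made-up vector for illustration, not data from the question:

```r
# hypothetical input: mixed decimal places, plus one value that is not a repeat
begin2 <- c("0.70.7", "0.250.25", "10.610.6", "5")

# ^(.+)\\1$ matches some text followed immediately by an exact copy of itself;
# values that do not fit the pattern are returned unchanged by sub()
repl2 <- sub("^(.+)\\1$", "\\1", begin2, perl = TRUE)
repl2
#> [1] "0.7"  "0.25" "10.6" "5"
```

Note that a genuine value which happens to be a repeat, such as "11", would also be halved, so this only makes sense if you know the column is duplicated throughout.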
If your strings are always exact repeats, then the simplest approach is probably to cut them in half:
(mydata <- data.frame( a1 = c("0.70.7", "0.20.2", "10.610.6")))
# in base R
mydata$s <- substr(mydata$a1,
                   start = 1,
                   stop = nchar(mydata$a1) / 2)
mydata
# or tidyverse
library(tidyverse)
(mydata <- mutate(mydata,
                  s = str_sub(a1, 1, nchar(a1) / 2)))
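Cutting in half silently truncates any value that is not actually a repeat. If that is a concern, one option is to compare the two halves first and only keep the first half where they agree. A minimal base R sketch, where the last input value is a hypothetical non-repeated entry added for illustration:

```r
# hypothetical input: the last value is not a repeat
a1 <- c("0.70.7", "0.20.2", "10.610.6", "0.75")

half   <- nchar(a1) %/% 2                    # length of the first half
first  <- substr(a1, 1, half)
second <- substr(a1, half + 1, nchar(a1))

# keep the first half only where both halves are identical;
# otherwise leave the original value untouched
s <- ifelse(first == second, first, a1)
s
#> [1] "0.7"  "0.2"  "10.6" "0.75"
```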