Separate column in two

I need to split a column in a data frame whose values have this format: "0.70.7", "0.20.2", "10.610.6". The final result should be "0.7", "0.2", "10.6", but there is no character or space that I can use as a separator. I tried:

df <-df %>% separate(`column 00-24`, into = c("Data", "Duplicate"), sep = "", remove = TRUE, convert = FALSE, extra = "merge")

But the result is not what I'm looking for. I also tried:

df$column_clean <- gsub("(\\..*?)\\.", "\\1", df$`column 00-24`)

But again the output is incorrect.

begin <- c("0.70.7", "0.20.2", "10.610.6")
target <- c("0.7", "0.2","10.6")
(repl <- gsub("(\\d+\\.\\d)(\\d+\\.\\d+)", "\\1", begin))
#> [1] "0.7"  "0.2"  "10.6"
identical(target,repl)
#> [1] TRUE

Created on 2023-09-16 with reprex v2.0.2
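
The pattern above assumes exactly one digit after the decimal point in the first number. If that might not hold, a backreference can instead require the second half to be an exact copy of the first. This is only a sketch: the name `dedup` and the extra value "3.14" are made up for illustration, and `perl = TRUE` is used so the backreference inside the pattern is handled by the PCRE engine.

```r
# Sample values from the question, plus one non-repeated value
# (an assumption added here to show what happens when nothing matches).
begin <- c("0.70.7", "0.20.2", "10.610.6", "3.14")

# ^(.+)\\1$ matches a string that is some text followed by an exact copy of
# itself; the replacement keeps only the first copy. Strings that are not
# exact repeats do not match and are returned unchanged.
dedup <- sub("^(.+)\\1$", "\\1", begin, perl = TRUE)
dedup
```

Because the pattern is anchored at both ends, it only fires when the whole string is two identical halves, so values that are not duplicated are left alone.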

If your strings are always exact repeats, then the simplest approach is probably to cut them in half:


(mydata <- data.frame( a1 =  c("0.70.7", "0.20.2", "10.610.6")))

# in base R 

mydata$s = substr(mydata$a1,
                  start = 1,
                  stop = nchar(mydata$a1)/2)
mydata

# or tidyverse
library(tidyverse)
(mydata <- mutate(mydata,
                  s = str_sub(a1,1,nchar(a1)/2)))
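
Cutting in half silently gives a wrong answer if a value is not an exact repeat, so it may be worth checking first. A minimal base R sketch, assuming you want NA for anything that is not two identical halves (the names `half`, `first`, `second`, `is_repeat`, and `s_checked`, and the value "1.52.5", are made up for illustration):

```r
# Values from the question, plus one that is NOT an exact repeat (assumed example)
a1 <- c("0.70.7", "0.20.2", "10.610.6", "1.52.5")

half   <- nchar(a1) %/% 2                  # integer length of the first half
first  <- substr(a1, 1, half)              # first half of each string
second <- substr(a1, half + 1, nchar(a1))  # remainder of each string

# A value is an exact repeat only if its length is even and both halves agree
is_repeat <- nchar(a1) %% 2 == 0 & first == second

# Keep the first half for exact repeats, NA otherwise
s_checked <- ifelse(is_repeat, first, NA_character_)
s_checked
```

Any NA in the result points at a row worth inspecting by hand before trusting the split.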