Separate column with () characters

Hi all.
I have this data frame:

cities <- data.frame(
  stringsAsFactors = FALSE,
       check.names = FALSE,
             Month = c("September", "October", "November", "December"),
  `City.(Country)` = c("Paris(France)",
                       "Madrid(Spain)","London(UK)","Berlin(Germany)")
)

I am trying to split the second column in two. I can do this when there is a single separating character, with separate(df, col, into = c("df1", "df2"), sep = " "), but in this case, with the () characters, I always get some kind of error.

Any idea how I can create the two columns, City and Country?

Regards.

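Here are four examples of using separate() to split the second column. With the default sep, which matches any run of non-alphanumeric characters, both ( and ) act as separators, so the trailing ) yields an empty third piece and a warning; extra = "drop" suppresses that warning. Alternatively, you can set sep = "\\(" and remove the trailing ) afterwards.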

library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
cities <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Month = c("September", "October", "November", "December"),
  `City.(Country)` = c("Paris(France)",
                       "Madrid(Spain)","London(UK)","Berlin(Germany)")
)
separate(cities, `City.(Country)`, into = c("City", "Country")) #warning due to trailing )
#> Warning: Expected 2 pieces. Additional pieces discarded in 4 rows [1, 2, 3, 4].
#>       Month   City Country
#> 1 September  Paris  France
#> 2   October Madrid   Spain
#> 3  November London      UK
#> 4  December Berlin Germany
separate(cities, `City.(Country)`, into = c("City", "Country"), extra = "drop")
#>       Month   City Country
#> 1 September  Paris  France
#> 2   October Madrid   Spain
#> 3  November London      UK
#> 4  December Berlin Germany

separate(cities, `City.(Country)`, into = c("City", "Country"), sep = "\\(")  #leaves trailing )
#>       Month   City  Country
#> 1 September  Paris  France)
#> 2   October Madrid   Spain)
#> 3  November London      UK)
#> 4  December Berlin Germany)
separate(cities, `City.(Country)`, into = c("City", "Country"), sep = "\\(") |> 
  mutate(Country = sub("\\)", "", Country))
#>       Month   City Country
#> 1 September  Paris  France
#> 2   October Madrid   Spain
#> 3  November London      UK
#> 4  December Berlin Germany

Created on 2021-10-31 by the reprex package (v2.0.1)
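
One more option, offered only as a sketch and not something from the reprex above: tidyr::extract() takes a regular expression with capturing groups, so the two pieces can be pulled out in one step without cleaning up the trailing ) afterwards. The regex below is my own assumption about the format (text, then an opening parenthesis, then text, then a closing parenthesis at the end of the string); adjust it if your real data looks different.

```r
library(tidyr)

cities <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Month = c("September", "October", "November", "December"),
  `City.(Country)` = c("Paris(France)",
                       "Madrid(Spain)", "London(UK)", "Berlin(Germany)")
)

# Group 1 captures everything before the last "(", group 2 captures
# everything between that "(" and the closing ")" at the end of the string.
result <- tidyr::extract(
  cities,
  `City.(Country)`,
  into = c("City", "Country"),
  regex = "^(.*)\\((.*)\\)$"
)
result
#>       Month   City Country
#> 1 September  Paris  France
#> 2   October Madrid   Spain
#> 3  November London      UK
#> 4  December Berlin Germany
```

Because the split relies on the parentheses only, this should also cope with city names that contain spaces, which the default sep of separate() would split on.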

Thanks FJCC for all these possibilities, they all work fine.
