Is there a way to assign a column_name using as_tibble?

marcelo_carvalho · August 10, 2023, 1:03am

Hello,
I would like your help.
How can I create a column with the as_tibble function and give that column a name? Below is code that uses as.data.frame with col.names and produces the expected result.

library(tidyverse)

df <- tribble(
  ~concat,
  "S:AMP,R:AUG,R:CFZ,S:CFZ       ,        R:TMP"
)


df %>% 
  str_squish() %>% 
  str_replace_all(" ","") %>% 
  str_split(",") %>% 
  as.data.frame(col.names = "concat") %>% #here I'm trying to use as_tibble
  separate_wider_delim(
    cols = "concat", 
    delim = ":", 
    names = c("type","product"))

arangaca · August 10, 2023, 9:47am

Simply use tibble instead of as_tibble.

df %>%
  # no need to squish and remove spaces
  # you can simply split using a regex
  str_split(" *, *") %>%
  unlist() %>%
  tibble(concat = .) %>% 
  separate_wider_delim(
    cols = "concat", 
    delim = ":", 
    names = c("type","product"))

Alternatively, you can use as_tibble_col if you want to keep a similar syntax as as.data.frame:

df %>%
  str_split(" *, *") %>%
  unlist() %>%
  as_tibble_col(column_name = "concat")
  ...

nirgrahamuk · August 10, 2023, 11:33am

You start out with concat being a relevant name, switch to a world of string/character objects, and then put the concat name back in so that you can refer to it in separate_wider_delim, which results in the name disappearing from the result anyway.
I think the name was only really needed when initiating the process.

How about:

library(tidyverse)

df <- tribble(
  ~concat,
  "S:AMP,R:AUG,R:CFZ,S:CFZ       ,        R:TMP",
  "S:AMP,R:AUG,R:CFZ,S:XXX       ,        T:TMP"
)

map_dfr(df$concat, ~ {
  str_squish(.x) %>%
    str_replace_all(" ", "") %>%
    str_split(",")
}[[1]] %>% data.frame()) %>%
  separate_wider_delim(
    cols = 1,
    delim = ":",
    names = c("type", "product")
  )

P.S. I extended the initial data to make it more interesting/challenging. In doing so, I noticed that your original code gave undesirable results with more than one row, but I suppose you may not have to deal with more rows.
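For the multi-row case, here is a minimal sketch of another route that stays inside the data frame the whole time, so the concat name never has to be put back. It assumes tidyr 1.3.0 or later (the release that also provides separate_wider_delim); the name `result` is only for illustration.

```r
library(tidyverse)

df <- tribble(
  ~concat,
  "S:AMP,R:AUG,R:CFZ,S:CFZ       ,        R:TMP",
  "S:AMP,R:AUG,R:CFZ,S:XXX       ,        T:TMP"
)

result <- df %>%
  # drop the stray spaces around the commas
  mutate(concat = str_remove_all(concat, " ")) %>%
  # one row per comma-separated item, for every input row
  separate_longer_delim(concat, delim = ",") %>%
  # split each "type:product" item into two columns
  separate_wider_delim(concat, delim = ":", names = c("type", "product"))

result
```

Because every step works on the column inside df, this should behave the same with one row or many.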

marcelo_carvalho · August 10, 2023, 2:49pm

The code you made is much cleaner and easier to read. Before this, I tried to use as_tibble_col, but it returned a list. I learned a lot from you today. Thank you, I really appreciate it.
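A likely reason for the list result, sketched below on a shortened, made-up input string: str_split returns a list (one character vector per input string), so passing it straight to as_tibble_col gives a single-row list-column. Flattening with unlist first gives an ordinary character column. The names `parts`, `as_list_col` and `as_chr_col` are only for illustration.

```r
library(tidyverse)

# str_split() returns a list with one character vector per input string
parts <- str_split("S:AMP,R:AUG , R:TMP", " *, *")

# passing the list directly gives a one-row tibble with a list-column
as_list_col <- as_tibble_col(parts, column_name = "concat")

# flattening first gives a character column with one row per item
as_chr_col <- as_tibble_col(unlist(parts), column_name = "concat")

as_list_col
as_chr_col
```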

marcelo_carvalho · August 10, 2023, 3:04pm

You anticipated a problem that could come up, because you realized this is a database and new data points could arrive.
You added complexity, solved it, and communicated it brilliantly. I learned a lot from you in this post. You have my utmost gratitude.

technocrat · August 11, 2023, 12:34am

# convert input string to vector of characters
input <- "S:AMP,R:AUG,R:CFZ,S:CFZ       ,        R:TMP" |>
  # remove spaces
  gsub(" ", "", x = _) |>
  # replace : with ,
  gsub(":", ",", x = _) |>
  # split along ,
  strsplit(x = _, ",") |>
  # convert resulting list to a vector
  unlist()
# construct data frame
output <- data.frame(
  # use every other element of vector as first field (beginning with first)
  type = input[seq_along(input) %% 2 == 1],
  # use every other element of vector as second field (beginning with second)
  product = input[seq_along(input) %% 2 == 0]
)
output
#>   type product
#> 1    S     AMP
#> 2    R     AUG
#> 3    R     CFZ
#> 4    S     CFZ
#> 5    R     TMP

Created on 2023-08-10 with reprex v2.0.2
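A possible variation on the same base R idea, as a sketch only: instead of turning every ":" into "," and relying on odd/even positions, split on commas first and then split each pair on ":". A pair without a ":" should then show up as NA in product rather than shifting everything after it. The names `pairs` and `fields` are only for illustration.

```r
input <- "S:AMP,R:AUG,R:CFZ,S:CFZ       ,        R:TMP"

# split on commas, tolerating spaces on either side
pairs <- strsplit(input, " *, *")[[1]]
# split each "type:product" pair into its two parts
fields <- strsplit(pairs, ":", fixed = TRUE)

# take the first part of each pair as type and the second as product
output <- data.frame(
  type = vapply(fields, `[`, character(1), 1),
  product = vapply(fields, `[`, character(1), 2)
)
output
```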