Why does a function used inside tidyverse mutate() sometimes not work as expected?

Hi,
in this simple test I just want to extract the part of my string that comes before the first '|' character.
But when I apply the function to a simple data frame, every row gets the same value as the first row.

library(tidyverse)

reformat_string <- function(string) {
     str_split(string, pattern = '\\|')[[1]][1]
}

test <- reformat_string("A|1")
test

df <- data.frame(x = c("A|B|0", "D|C|9", "F|A|5"))
df

df %>% 
  mutate(y = reformat_string( x ))

Where am I going wrong?

When you run mutate(), the entire column x is passed to reformat_string() at once. It is not processed row by row. Does the following make sense?

library(tidyverse)

df <- data.frame(x = c("A|B|0", "D|C|9", "F|A|5"))

reformat_string <- function(string) {
  str_split(string, pattern = '\\|')[[1]][1]
}

str_split(df$x, pattern = '\\|')
#> [[1]]
#> [1] "A" "B" "0"
#> 
#> [[2]]
#> [1] "D" "C" "9"
#> 
#> [[3]]
#> [1] "F" "A" "5"
# So str_split(string, pattern = '\\|')[[1]][1] looks only at the first
# list element and returns the single value "A", which mutate() then
# recycles to fill every row.

reformat_string2 <- function(string) {
  tmp <- str_split(string, pattern = '\\|')
  map_chr(tmp, \(VEC) VEC[1])
}
reformat_string2(df$x)
#> [1] "A" "D" "F"


df %>% 
  mutate(y = reformat_string2( x ))
#>       x y
#> 1 A|B|0 A
#> 2 D|C|9 D
#> 3 F|A|5 F

Created on 2024-02-17 with reprex v2.0.2
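
If the only goal is the text before the first '|', the helper may not be needed at all, because some string functions already work on a whole column at once. The following is a minimal sketch rather than a definitive answer: it assumes stringr 1.5.0 or later for str_split_i(), and the column names y_split, y_regex and y_base are made up for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(x = c("A|B|0", "D|C|9", "F|A|5"))

out <- df %>%
  mutate(
    # first piece of each split string (assumes stringr >= 1.5.0)
    y_split = str_split_i(x, pattern = "\\|", i = 1),
    # everything up to, but not including, the first '|'
    y_regex = str_extract(x, "^[^|]*"),
    # base R alternative: drop the first '|' and everything after it
    y_base  = sub("\\|.*$", "", x)
  )

out
```

All three columns should hold the same values as the y column produced by reformat_string2() above. Which variant reads best is a matter of taste.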

Alternatively, keep your original function and change the last bit to the following, so that mutate() works on one row at a time:

df %>% rowwise() %>% mutate(y = reformat_string( x ))
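
For completeness, here is the rowwise() approach as a self-contained sketch. It reuses the reformat_string() function from the question; the trailing ungroup() is my own addition (an assumption about what is wanted afterwards), so that later steps in a pipeline go back to operating on whole columns.

```r
library(dplyr)
library(stringr)

# Original, non-vectorized helper from the question
reformat_string <- function(string) {
  str_split(string, pattern = "\\|")[[1]][1]
}

df <- data.frame(x = c("A|B|0", "D|C|9", "F|A|5"))

res <- df %>%
  rowwise() %>%                        # treat each row as its own group
  mutate(y = reformat_string(x)) %>%   # x is a single string in each call
  ungroup()                            # drop the row-wise grouping again

res$y
```

With rowwise(), the function is called once per row, so on a large data frame this is likely to be slower than a vectorized version such as reformat_string2().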

Thanks FJCC for these explanations.
I understand now why I was wrong.

That is another way to solve my problem.
Thanks a lot, Paul.
