I'm trying to compare two string columns in a tibble, row by row. For each row, I want to know whether any character of the first string appears in the second. I wrote a small function to do this, and it should return a single value per row. It works with rowwise(), but how can I vectorise the function so I don't have to use rowwise()?
I've seen Dean Attali's use of Vectorize(): https://deanattali.com/blog/mutate-non-vectorized/
That seems like cheating. Is there a best practice for vectorizing functions?
library(tidyverse)
# Split string_1 into single characters, test each one against string_2,
# and return TRUE if any of them match
string_compare <- function(string_1, string_2) {
  map_lgl(unlist(str_split(string_1, "")), function(x) grepl(x, string_2)) %>% any()
}
data_tbl <- tibble(NP = c("A", "AB", "BC", "AB"), value = c("A", "C", "A", "BC"))
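# errors out: with whole columns as input, grepl(x, string_2) returns one value
# per row, but map_lgl() requires each result to be length 1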
data_tbl %>% mutate(comparison = string_compare(NP, value))
# individual comparisons
string_compare("A", "A")
string_compare("AB", "C")
string_compare("BC", "A")
string_compare("AB", "BC")
# rowwise works
data_tbl %>% rowwise() %>% mutate(comparison = string_compare(NP, value))
# Vectorize() works, but seems like cheating
string_compare_v <- Vectorize(string_compare)
data_tbl %>% mutate(comparison = string_compare_v(NP, value))
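For comparison, here is a minimal sketch of the same elementwise idea written out explicitly with purrr::map2_lgl(), which walks NP and value in parallel and calls the scalar function once per pair. Vectorize() is a wrapper around mapply(), so I'd assume this is roughly the same per-row iteration, just spelled out. The name comparison_tbl is only for illustration.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# Same scalar function as above: TRUE if any character of string_1 occurs in string_2
string_compare <- function(string_1, string_2) {
  any(map_lgl(unlist(str_split(string_1, "")), function(x) grepl(x, string_2)))
}

data_tbl <- tibble(NP = c("A", "AB", "BC", "AB"), value = c("A", "C", "A", "BC"))

# map2_lgl() iterates over both columns in parallel and returns a logical vector,
# one element per row, so no rowwise() is needed
comparison_tbl <- data_tbl %>%
  mutate(comparison = map2_lgl(NP, value, string_compare))

comparison_tbl$comparison
# TRUE FALSE FALSE TRUE
```

This still loops over the rows under the hood, so it is not "truly" vectorized either; it just makes the iteration visible inside mutate().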