Mutate when a predicate is true? Am I overthinking my use case?

brodriguesco · June 14, 2022, 1:27pm

Hi there,

I’m trying to construct a new variable, but only if certain conditions are TRUE, and this new variable
should pull its value from another variable at a specific position. It’s quite tricky to explain, but
I have prepared some toy data and a working script in this repo:

Here is the script:

library(tidyverse)

df <- readr::read_csv("https://raw.githubusercontent.com/b-rodrigues/mutate_when_predicate/main/example_df.csv")

# Flag the (person, col_a, col_c) groups where to_fill is not constant
ids <- df %>%
  group_by(person, col_a, col_c) %>%
  summarise(needs_correction = n_distinct(to_fill)) %>%
  filter(needs_correction > 1) %>%
  mutate(needs_correction = TRUE)

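# Join the flag back; groups that are not flagged get NA in needs_correction
full_join(df, ids) %>%
  group_by(person, col_a, col_c) %>%
  # For flagged groups, take the to_fill value from the row where col_b is "W"
  mutate(new_loc = ifelse(all(needs_correction),
                          pull(
                            filter(cur_data(),
                                   col_b == "W"),
                            to_fill),
                          NA_character_)) %>%
  ungroup() %>%
  # Fall back to the original to_fill wherever no correction was made
  mutate(new_loc2 = coalesce(new_loc, to_fill))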

The use case is as follows:

For each person in the data, I need to fill the column called to_fill. However, I want to do so only:

1 - where to_fill has more than one unique value by col_a and col_c (that is, for a given individual grouped by col_a and col_c, to_fill is not constant), BUT

2 - only if it is empty where col_b is "S",

3 - and do so by groups formed by col_a and col_c.

After running the code, new_loc2 is the required solution: values of to_fill where col_b equals "S" are replaced by the values where col_b equals "W", and everything else is left as it is.
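
To make condition 1 concrete, here is a minimal sketch on a made-up tibble. The `toy` data and its values are hypothetical (this is not the example_df.csv from the repo); only the column names are taken from the script above. It counts the distinct to_fill values per group, which is what decides whether a group needs correcting. Note that n_distinct() counts NA as a value by default, so a group with one filled row and one empty row counts as non-constant.

```r
library(dplyr)

# Hypothetical toy data, NOT the real example_df.csv
toy <- tibble::tribble(
  ~person, ~col_a, ~col_c, ~col_b, ~to_fill,
  "p1",    "a1",   "c1",   "W",    "Paris",
  "p1",    "a1",   "c1",   "S",    NA,
  "p1",    "a2",   "c1",   "W",    "Rome",
  "p1",    "a2",   "c1",   "S",    "Rome"
)

# Count distinct to_fill values per (person, col_a, col_c) group
counts <- toy %>%
  group_by(person, col_a, col_c) %>%
  summarise(n_values = n_distinct(to_fill), .groups = "drop")

counts
```

In this toy data the (p1, a1, c1) group has two distinct values ("Paris" and NA) and so would need correcting, while (p1, a2, c1) has one and would be left alone.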

Am I overthinking this? Is there a simpler solution?

smingerson · June 14, 2022, 1:54pm

This gives the same result as your script, except that needs_correction is FALSE instead of NA for the groups that don't need correcting. It relies on the fact that you can subset one column by another inside a data-masking function such as mutate().

Edit: You could even use n_distinct(to_fill) > 1 directly in the if_else() predicate.

res <- df %>% 
  group_by(person, col_a, col_c) %>% 
  mutate(needs_correction = n_distinct(to_fill) > 1,
         new_loc = if_else(needs_correction, to_fill[col_b == "W"][1], 
                           NA_character_)
         ) %>% 
  ungroup() %>% 
  mutate(new_loc2 = coalesce(new_loc, to_fill))
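
As a rough sketch of the variant mentioned in the edit above, the predicate can go straight into if_else() without the intermediate needs_correction column. The `toy` tibble here is hypothetical stand-in data, not the real example_df.csv; the column names are the ones used in the thread.

```r
library(dplyr)

# Hypothetical toy data, NOT the real example_df.csv
toy <- tibble::tribble(
  ~person, ~col_a, ~col_c, ~col_b, ~to_fill,
  "p1",    "a1",   "c1",   "W",    "Paris",
  "p1",    "a1",   "c1",   "S",    NA,
  "p1",    "a2",   "c1",   "W",    "Rome",
  "p1",    "a2",   "c1",   "S",    "Rome"
)

res <- toy %>%
  group_by(person, col_a, col_c) %>%
  # Each argument of if_else() is length 1 within a group; mutate() recycles
  # the single result to every row of the group.
  # The trailing [1] yields NA when a group has no row with col_b == "W".
  mutate(new_loc = if_else(n_distinct(to_fill) > 1,
                           to_fill[col_b == "W"][1],
                           NA_character_)) %>%
  ungroup() %>%
  # Keep the original to_fill wherever no correction applies
  mutate(new_loc2 = coalesce(new_loc, to_fill))

res
```

On this toy data the empty "S" row in the (p1, a1, c1) group picks up "Paris" from the "W" row, and the constant (p1, a2, c1) group is unchanged.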

brodriguesco · June 15, 2022, 5:42am

Very nice, I knew I was overthinking things! I didn’t realize you could subset columns like that inside if_else().
