It is surely not the shortest solution but it is one approach that works: Create a table with the value you want for each ID then join to the first table
ID <- c("A", "A", "A", "B", "B", "c")
Value <- c("blue", "blue", "green", "red", "orange", NA)
library(tidyverse)
df <- tibble(ID, Value)
# create a replacement table
replacement <- df %>%
# create ordered factor as we want first after
mutate(Value = fct_inorder(Value)) %>%
# group_by all columns
group_by_all() %>%
# count the occurrence in last column and sort
tally(sort = TRUE) %>%
# keep only the first row in each remaining group
# (the max because sorted)
slice(1) %>%
ungroup() %>%
select(ID, Value_output = Value)
#> Warning: le package 'bindrcpp' a été compilé avec la version R 3.4.4
df %>%
inner_join(replacement)
#> Joining, by = "ID"
#> # A tibble: 6 x 3
#> ID Value Value_output
#> <chr> <chr> <fct>
#> 1 A blue blue
#> 2 A blue blue
#> 3 A green blue
#> 4 B red red
#> 5 B orange red
#> 6 c <NA> <NA>
This was very helpful for me as I had a very similar question... Could you maybe also give me a hint on how to do it if I have to group by ID and year? Thank you for your help.
Are you asking about if you have two grouping variables, ID and year? Or are you asking if you have the same scenario as above, except the two variables are ID and year (where you'd want the most frequent year to be the replacement value)?