I'm not 100% sure what you mean by "retrieve the max process", but you will likely get what you want by replacing filter(process = max(process)) with max(process). filter() turns a larger data frame into a smaller one, and I don't think you actually want to replace multiple rows of the process column with embedded data frames.
but I get 10003 for all of them (this command replaces every value greater than 10000 with 10003). That is not what I want. Instead, I would like to return only the following:
"For each group" means a group_by(). You can then use max() to find the maximum within each group, and filter for rows where process is 1 OR the group maximum:
txtdt <- "group_id process
111000820 1
111000820 1
111000820 10001
111000820 10003
111000820 10003
111000821 1
111000821 1
111000821 10001
111000821 10002
111000821 10002"
example <- read.table(text=txtdt, header=TRUE)
library(dplyr)
example %>%
  group_by(group_id) %>%
  filter(process == max(process) | process == 1)
You may also want to add an ungroup() at the end, depending on what comes next.
Because the filter comes after the group_by(), max(process) is evaluated within each group, and the OR in the filter keeps both kinds of rows you want: those where process is 1 and those at the group maximum.
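To make the effect of the grouping concrete, here is a minimal sketch that runs the same filter with and without group_by() on the sample data. The names result and ungrouped are only illustrative, and the data frame is built with data.frame() rather than read.table() purely for compactness.

```r
library(dplyr)

# Same sample data as above, built directly as a data frame
example <- data.frame(
  group_id = rep(c(111000820, 111000821), each = 5),
  process  = c(1, 1, 10001, 10003, 10003,
               1, 1, 10001, 10002, 10002)
)

# Grouped: max(process) is computed separately for each group_id
result <- example %>%
  group_by(group_id) %>%
  filter(process == max(process) | process == 1) %>%
  ungroup()  # drop the grouping so later steps work on the whole table

# Not grouped: max(process) is the overall maximum (10003),
# so the 10002 rows of group 111000821 are dropped
ungrouped <- example %>%
  filter(process == max(process) | process == 1)

nrow(result)     # 8
nrow(ungrouped)  # 6
```

With the grouping, each group keeps its rows equal to 1 plus the rows at its own maximum (10003 for the first group, 10002 for the second). Without it, the second group keeps only its rows equal to 1.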