> df_toy
# A tibble: 15 × 3
week completion cat
<int> <dttm> <chr>
1 2 2023-01-25 17:05:48 0000
2 2 2023-02-08 17:07:45 0000
3 2 2023-02-14 19:04:45 0000
4 3 2023-02-01 15:29:50 0000
5 3 2023-02-08 17:08:44 0000
6 3 2023-02-14 19:05:45 0000
7 4 2023-02-08 17:09:48 0000
8 4 2023-02-14 03:12:57 0000
9 4 2023-02-14 19:06:46 0000
10 8 2023-03-08 18:21:53 0000
11 8 2023-03-13 21:04:04 0000
12 4 2023-02-08 19:00:47 PPPP
13 4 2023-02-09 13:47:45 PPPP
14 4 2023-02-14 21:30:45 PPPP
15 8 2023-03-08 20:57:59 PPPP
Given the above tibble, how do I keep the maximum of completion
when week
is in the vector weeks_special
and cat == "0000"
; similarly, how can I keep the minimum of completion
when week
is not in the vector weeks_special
but cat == "0000"
?. For other cat
I want to keep all the observations. I want to achieve this by week
.
Suppose weeks_special <- c(4, 8)
, then my desired output is:
# A tibble: 8 × 3
week completion cat
<int> <dttm> <chr>
1 2 2023-01-25 17:05:48 0000
2 3 2023-02-01 15:29:50 0000
3 4 2023-02-14 19:06:46 0000
4 8 2023-03-13 21:04:04 0000
5 4 2023-02-08 19:00:47 PPPP
6 4 2023-02-09 13:47:45 PPPP
7 4 2023-02-14 21:30:45 PPPP
8 8 2023-03-08 20:57:59 PPPP
library(tidyverse)
# Toy data----
## vector====
weeks_special <- c(4, 8)
## df====
df_toy <- structure(list(week = c(2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 8L,
8L, 4L, 4L, 4L, 8L), completion = structure(c(1674666348, 1675876065,
1676401485, 1675265390, 1675876124, 1676401545, 1675876188, 1676344377,
1676401606, 1678299713, 1678741444, 1675882847, 1675950465, 1676410245,
1678309079), tzone = "", class = c("POSIXct", "POSIXt")), cat = c("0000",
"0000", "0000", "0000", "0000", "0000", "0000", "0000", "0000",
"0000", "0000", "PPPP", "PPPP", "PPPP", "PPPP")), row.names = c(NA,
-15L), class = c("tbl_df", "tbl", "data.frame"))