Hi everyone,
I have a little problem that I simply cannot solve. I have a data frame where I want to compare two columns within each group. When the two columns match, I want to extract all rows that belong to the same group in a fourth column. This sounds super strange, so I tried to create an example. It is not the best, but I think it shows the point more or less :)
set.seed(2)
dr = seq(as.Date("2020-01-01"),as.Date("2021-01-01"), by = "day")
df = data.frame(
  date.x = dr,
  date.y = sample(dr, size = length(dr), replace = TRUE)
)
g1 = rep(seq(1,length(dr), by = 10), each = 10)
g2 = rep(seq(1,length(dr), by = 5), each = 5)
df["g1"] = g1[1:nrow(df)]
df["g2"] = g2[1:nrow(df)]
       date.x     date.y g1 g2
1  2020-01-01 2020-12-06  1  1
2  2020-01-02 2020-07-16  1  1
3  2020-01-03 2020-09-18  1  1
4  2020-01-04 2020-09-29  1  1
5  2020-01-05 2020-12-14  1  1
6  2020-01-06 2020-07-22  1  6
7  2020-01-07 2020-10-23  1  6
8  2020-01-08 2020-06-26  1  6
9  2020-01-09 2020-03-15  1  6
10 2020-01-10 2020-05-10  1  6
So I want to compare date.x and date.y within each group of g1. When they match, I want to look at the column g2 and extract all rows that have the same g2 value.
So I received the tip about doing it like this:
library(dplyr)

df %>%
  group_by(g1, g2) %>%
  filter(
    g2 == g2[date.x == date.y]
  )
But this does not work. Maybe someone has a tip here:) Thanks a lot already!
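For anyone who wants to try this, here is a minimal sketch of one direction that might work. As far as I can tell, the tip fails because g2[date.x == date.y] has length zero in every group without a match, and filter() needs a condition of length 1 or the group size. The sketch groups by g1 only and uses %in% instead of ==, so a group without a match just returns no rows. The toy data frame is hypothetical (it is not the df from above) and is built so that exactly one row has matching dates.

```r
library(dplyr)

# Hypothetical toy data: 12 days, two g1 groups of 6 rows, four g2 groups of 3 rows
toy <- data.frame(
  date.x = as.Date("2020-01-01") + 0:11,
  g1 = rep(c(1, 7), each = 6),
  g2 = rep(c(1, 4, 7, 10), each = 3)
)

# Shift date.y by one day so nothing matches, then force a single match in row 5
toy$date.y <- toy$date.x + 1
toy$date.y[5] <- toy$date.x[5]

result <- toy %>%
  group_by(g1) %>%
  # Keep rows whose g2 value occurs among the rows where the dates match.
  # %in% always returns one TRUE/FALSE per row, even when nothing matches.
  filter(g2 %in% g2[date.x == date.y]) %>%
  ungroup()

# Rows 4 to 6 of toy are kept (g1 == 1, g2 == 4); the g1 == 7 group has no match
print(result)
```

Grouping by both g1 and g2 and filtering with any(date.x == date.y) should give the same rows here; which of the two reads better is a matter of taste.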