I have this data frame:
head(df)
# A tibble: 6 × 3
anio id_mujer fecha_muestra
<chr> <dbl> <dttm>
1 2015 4807 2015-06-26 00:00:00
2 2017 4807 2017-06-02 00:00:00
3 2018 4807 2018-11-07 00:00:00
4 2018 8029 2018-03-23 00:00:00
5 2019 8029 2019-09-06 00:00:00
6 2021 8029 2021-04-23 00:00:00
Each woman, identified by id_mujer, has 3 observations.
I want to find the women for whom 'fecha_muestra' in the third observation is 90 days or less after 'fecha_muestra' in the second observation.
df %>%
  group_by(id_mujer) %>%
  mutate(distancia = difftime(lead(fecha_muestra, 2), lead(fecha_muestra, 1), units = "days"),
         igual = difftime(lead(fecha_muestra, 2), lead(fecha_muestra, 1), units = "days") <= 90) %>%
  filter(any(igual) == T) %>%
  relocate(igual, fecha_muestra) %>%
  view()
Three women meet this condition. The problem is that when I look at one of them, the TRUE appears in the first observation instead of in the observation it corresponds to.
How can I solve this?
EDIT:
Here is an example.
df %>%
filter(id_mujer==79051)
# A tibble: 3 × 3
anio id_mujer fecha_muestra
<chr> <dbl> <dttm>
1 2016 79051 2016-07-19 00:00:00
2 2017 79051 2017-10-23 00:00:00
3 2017 79051 2017-09-12 00:00:00
As we can see, the dates of observations 2 and 3 are less than 90 days apart (41 days, with the third sample dated before the second).
But when I run the calculation, R puts the result in row 1.
df %>%
  group_by(id_mujer) %>%
  mutate(distancia = difftime(lead(fecha_muestra, 2),
                              lead(fecha_muestra, 1),
                              units = "days"),
         igual = difftime(lead(fecha_muestra, 2),
                          lead(fecha_muestra, 1),
                          units = "days") <= 90) %>%
  filter(any(igual) == T) %>%
  relocate(igual, fecha_muestra) %>%
  filter(id_mujer == 79051)
# A tibble: 3 × 5
# Groups: id_mujer [1]
igual fecha_muestra anio id_mujer distancia
<lgl> <dttm> <chr> <dbl> <drtn>
1 TRUE 2016-07-19 00:00:00 2016 79051 -41 days
2 NA 2017-10-23 00:00:00 2017 79051 NA days
3 NA 2017-09-12 00:00:00 2017 79051 NA days
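One way to read this output: on row 1, lead(fecha_muestra, 2) is the third date and lead(fecha_muestra, 1) is the second, so the difference between observations 2 and 3 is computed but stored on row 1. Rows 2 and 3 have nothing two positions ahead, so they get NA. Below is a minimal sketch of one possible alternative that uses lag() so the value lands on row 3. It assumes every woman has exactly 3 rows, already in the intended order; the names ejemplo and resultado are made up for illustration, and the rows are copied from the example above.

```r
library(dplyr)

# Hypothetical reproduction of the three rows shown for id_mujer 79051
ejemplo <- tibble(
  anio = c("2016", "2017", "2017"),
  id_mujer = c(79051, 79051, 79051),
  fecha_muestra = as.POSIXct(c("2016-07-19", "2017-10-23", "2017-09-12"), tz = "UTC")
)

resultado <- ejemplo %>%
  group_by(id_mujer) %>%
  mutate(
    # Current row minus previous row, kept only on the third row of each woman
    distancia = if_else(
      row_number() == 3,
      as.numeric(difftime(fecha_muestra, lag(fecha_muestra), units = "days")),
      NA_real_
    ),
    # NA on rows 1 and 2, TRUE/FALSE on row 3
    igual = distancia <= 90
  ) %>%
  ungroup()

resultado$distancia  # NA NA -41
resultado$igual      # NA NA TRUE
```

As in the original code, a negative difference such as -41 also satisfies <= 90. If the order of the two dates should not matter, wrapping the difference in abs() would be one option.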