I'm assuming you want to test the date in a given row against the date immediately prior; if I'm wrong in that assumption, the rest of my reply is not relevant.
First, put all of your dates in order. Then decide what interval is acceptable. I used `%m-%` from lubridate to subtract a calendar month from a date, but that may not fit your needs (e.g. if you define a month as 30 days). So essentially, you compute the date one month earlier, then use `dplyr::lag` to compare it against the date in the prior row.
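To show why the choice of "month" matters, here is a small sketch comparing a calendar-month subtraction with a fixed 30-day subtraction. The date is just an illustrative value I picked, not one from your data.

```r
library(lubridate)

d <- as.Date("2021-03-31")

# %m-% rolls back to the last valid day when the target month is shorter
cal_month <- d %m-% months(1)
cal_month
#> [1] "2021-02-28"

# A fixed 30-day "month" lands on a different day
fixed_30 <- d - days(30)
fixed_30
#> [1] "2021-03-01"

# Plain subtraction of a month period gives NA here, because
# February 31st does not exist
plain <- d - months(1)
plain
#> [1] NA
```

Which of these you want depends on how you define "less than a month" for your data.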
Hope this helps.
library(dplyr)
library(lubridate)
# Your code ##################
# I would like to filter obs if date is repeated less than a month by an identifier
data <- data.frame(
  stringsAsFactors = FALSE,
  identifer = c("a", "a", "a", "b", "b", "b", "c", "c", "d", "d", "d", "d"),
  id = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 4L),
  date = c("2020-01-25", "2020-02-20", "2020-03-25",
           "2021-02-10", "2021-03-09", "2021-03-11",
           "2021-03-15", "2021-04-16",
           "2021-03-17", "2021-04-17", "2021-04-30", "2021-05-18")
)
#
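data_intervals <- data %>%
  arrange(date) %>%  # ISO-formatted date strings sort chronologically
  mutate(
    date = as.Date(date),
    # TRUE when the date minus one calendar month is on or after the prior
    # row's date, i.e. the gap to the prior date is at least one month
    # (despite the column name); the first row compares against NA
    interval_less_than_one_month = date %m-% months(1) >= lag(date)
  ) %>%
  # filter() keeps only TRUE rows, so NA and FALSE rows are dropped
  filter(
    interval_less_than_one_month
  )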
data_intervals
#>   identifer id       date interval_less_than_one_month
#> 1         a  3 2020-03-25                         TRUE
#> 2         b  1 2021-02-10                         TRUE
Created on 2022-03-28 by the reprex package (v1.0.0)
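If the comparison should instead happen within each `identifer` (the comment in your code suggests it might), a grouped variant could look like the sketch below. This is only my guess at what you need; the names `data_by_group` and `gap_at_least_one_month` are ones I made up for illustration.

```r
library(dplyr)
library(lubridate)

# Same toy data as in the question
data <- data.frame(
  stringsAsFactors = FALSE,
  identifer = c("a", "a", "a", "b", "b", "b", "c", "c", "d", "d", "d", "d"),
  id = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 4L),
  date = c("2020-01-25", "2020-02-20", "2020-03-25",
           "2021-02-10", "2021-03-09", "2021-03-11",
           "2021-03-15", "2021-04-16",
           "2021-03-17", "2021-04-17", "2021-04-30", "2021-05-18")
)

data_by_group <- data %>%
  mutate(date = as.Date(date)) %>%
  group_by(identifer) %>%
  # Sort dates within each identifier so lag() sees the prior date
  # of the same identifier only
  arrange(date, .by_group = TRUE) %>%
  mutate(
    # TRUE when the gap to the prior date in the same group is at least
    # one calendar month; NA for the first row of each group
    gap_at_least_one_month = date %m-% months(1) >= lag(date)
  ) %>%
  ungroup()

data_by_group
```

Here the first row of every group gets `NA` because it has no prior date. You can then keep or drop rows with `filter()`, e.g. `filter(!gap_at_least_one_month)` to see the rows that repeat within a month (note that `filter()` also drops the `NA` rows).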