Hi guys,
I have this unusual (or maybe typical?) task.
For one of my projects, months start not on the 1st but on the 25th day of each month (so each one runs from the 25th to the 24th).
I am trying to recode my date variable into these custom months, given this data:
data <- data.frame(
  Date = c("2024-11-11", "2024-11-15", "2024-11-24",
           "2024-11-25", "2024-12-01", "2024-12-15", "2024-12-24",
           "2024-12-25", "2024-12-30", "2025-01-10", "2025-01-15", "2025-01-24")
)
data
The month and year name should be taken from the last date in each range, I guess. As a result we should have:
3 days in Nov 2024
4 days in Dec 2024
5 days in Jan 2025
Is it easy to do? Is it possible to do that without sorting the data? (This code will be used in DisplayR.)
Here's a custom function that takes the dates and classifies them into months based on a given start day, which defaults to the 25th.
It works element-wise, so the data does not need to be sorted. It returns a date vector, which can be cast as a yearmon object using the {zoo} package.
data <- data.frame(
  Date = c("2024-11-11", "2024-11-15", "2024-11-24",
           "2024-11-25", "2024-12-01", "2024-12-15", "2024-12-24",
           "2024-12-25", "2024-12-30", "2025-01-10", "2025-01-15", "2025-01-24")
)
#' Custom month classification
#'
#' Classify a character vector of dates into custom months that start on
#' `month_start_day` and are named after the calendar month in which they end.
#'
#' @param dates Character vector of dates in format "YYYY-MM-DD"
#' @param month_start_day Integer day of the month on which each custom month starts (default = 25)
#'
#' @returns Date vector holding day `month_start_day` of the calendar month that
#'   names each custom month; only its year and month are meaningful
custom_month <- function(dates, month_start_day = 25) {
  # parse the inputs as date types
  dates <- lubridate::ymd(dates)
  # add one month to dates (will take month and year from this)
  add_month <- lubridate::add_with_rollback(dates, months(1))
  # candidate outputs: day `month_start_day` of the next and of the current calendar month
  next_month <- lubridate::make_date(
    lubridate::year(add_month),
    lubridate::month(add_month),
    month_start_day
  )
  this_month <- lubridate::make_date(
    lubridate::year(dates),
    lubridate::month(dates),
    month_start_day
  )
  # dates on or after the start day are labelled with the next calendar month
  dates_return <- dplyr::case_when(
    lubridate::day(dates) >= month_start_day ~ next_month,
    .default = this_month
  )
  # return the result
  return(dates_return)
}
# classify the dates in 'data'
data_classified <-
  data |>
  dplyr::mutate(
    month_start_date = custom_month(Date),     # date types
    month = zoo::as.yearmon(month_start_date)  # zoo::yearmon object, e.g. 'Nov 2024'
  )
# count
data_classified |> dplyr::count(month)
#>      month n
#> 1 Nov 2024 3
#> 2 Dec 2024 4
#> 3 Jan 2025 5
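In case {lubridate}, {dplyr} and {zoo} are not available where the code runs (I have not checked what DisplayR provides), the same idea can be sketched in base R. This is only a rough alternative, not a drop-in replacement: `custom_month_label()` is a made-up name, and it returns plain "YYYY-MM" strings instead of a yearmon object so that the labels do not depend on the locale's month names.

```r
# Hypothetical base-R helper (not part of the function above): returns "YYYY-MM"
# labels, naming each custom month after the calendar month in which it ends.
custom_month_label <- function(dates, month_start_day = 25) {
  d  <- as.Date(dates)   # expects "YYYY-MM-DD" strings or Date values
  lt <- as.POSIXlt(d)    # exposes year, month (0-11) and day-of-month fields
  # running month index; shift forward by one month on or after the start day
  idx <- (lt$year + 1900L) * 12L + lt$mon + (lt$mday >= month_start_day)
  out <- sprintf("%04d-%02d", as.integer(idx %/% 12L), as.integer(idx %% 12L + 1L))
  out[is.na(d)] <- NA_character_  # keep missing dates missing
  out
}

data <- data.frame(
  Date = c("2024-11-11", "2024-11-15", "2024-11-24",
           "2024-11-25", "2024-12-01", "2024-12-15", "2024-12-24",
           "2024-12-25", "2024-12-30", "2025-01-10", "2025-01-15", "2025-01-24")
)

# each row is classified independently, so the row order does not matter
data$month <- custom_month_label(data$Date)
counts <- table(data$month)
# counts: 2024-11 = 3, 2024-12 = 4, 2025-01 = 5
```

The month index (`year * 12 + month`) handles the December to January rollover without a special case, which is the part that `add_with_rollback()` takes care of in the {lubridate} version.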