I've got a problem with a report I'm creating: my charts have a datetime x axis, and I am getting unwanted labels on it. My data ends in Dec 2023, but a Jan 2024 label is being created, which I believe is unwanted behaviour and confusing to my readers. I'm using date_breaks = "3 months" in my report.
I've created a reprex (below) to illustrate the issue using synthetic data. In the first three examples, data is constrained within a single calendar year, and the x axis is labelled in what I would consider a reasonable and accurate way.
In the second half of the reprex, everything is set up the same, except that there are now 24 months of data. The x axis labelling now behaves unpredictably, in my opinion: it starts going Feb-May-Aug-Nov instead of Mar-Jun-Sep-Dec, and if I then reduce the period for the date breaks, I start getting labels for months that are outside the date range of my data (this is the issue I am having in my actual report).
The last example shows my attempt to use the limits argument to constrain the x axis labels. This is "successful" in that it constrains the labels, but it then loses months of data in a way that I do not understand.
library(ggplot2)
library(lubridate, warn.conflicts = FALSE)
# Synthetic data: 3000 random timestamps within 2021, counted per month
t <- as.POSIXct("2021-01-01")
e1 <- as.POSIXct("2021-12-31 23:59:59")
n1 <- as.numeric(lubridate::as.duration(e1 - t))
s1 <- sample(seq.int(n1), 3000L)
d1 <- tibble::tibble(dttm = t + s1) |>
  dplyr::mutate(mon = lubridate::floor_date(dttm, "month")) |>
  dplyr::summarise(total = dplyr::n(), .by = "mon")
d1 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "3 months",
    date_labels = "%b %y"
  ) +
  labs(
    title = "x axis labelled Mar-Jun-Sep-Dec as expected",
    x = NULL,
    y = NULL
  )
d1 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "2 months",
    date_labels = "%b %y"
  ) +
  labs(title = "x axis labelled as expected", x = NULL, y = NULL) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5))
d1 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "1 month",
    date_labels = "%b %y"
  ) +
  labs(title = "x axis labelled as expected", x = NULL, y = NULL) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5))
# Same setup, but 6000 random timestamps spanning 24 months (2021 and 2022)
e2 <- as.POSIXct("2022-12-31 23:59:59")
n2 <- as.numeric(lubridate::as.duration(e2 - t))
s2 <- sample(seq.int(n2), 6000L)
d2 <- tibble::tibble(dttm = t + s2) |>
  dplyr::mutate(mon = lubridate::floor_date(dttm, "month")) |>
  dplyr::summarise(total = dplyr::n(), .by = "mon")
d2 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "3 months",
    date_labels = "%b %y"
  ) +
  labs(title = "x axis labelled with Feb-May-Aug-Nov", x = NULL, y = NULL)
d2 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "2 months",
    date_labels = "%b %y"
  ) +
  labs(
    title = "x axis includes Jan 23 (outside the dataset)",
    x = NULL,
    y = NULL
  ) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5))
d2 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "1 month",
    date_labels = "%b %y"
  ) +
  labs(
    title = "x axis includes Dec 20 and Jan 23 (outside the dataset)",
    x = NULL,
    y = NULL
  ) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5))
d2 |>
  ggplot2::ggplot(aes(x = mon, y = total)) +
  ggplot2::geom_line() +
  ggplot2::scale_x_datetime(
    date_breaks = "1 month",
    date_labels = "%b %y",
    limits = \(x) c(x[[1]] + month(1), x[[2]] - month(1))
  ) +
  labs(
    title = "correcting the limits works but loses 4 months of data",
    x = NULL,
    y = NULL
  ) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 0.5))
#> Warning: Removed 4 rows containing missing values (`geom_line()`).
Created on 2024-01-30 with reprex v2.1.0