Hi there. I'm trying to summarize a very large dataframe so that I can see the total quantity for each type at some specific hours of the day, by date. I want to use the result for a comparison exercise later, to see how today's quantity at 16:00 (by type, and also overall) compares to the previous day's quantity at 16:00, and so on for the other hours.
The structure of the dataframe is like this:
df <- structure(
  list(
    date = structure(c(19034, 19034, 19034, 19034, 19034, 19034, 19034, 19034, 19034, 19034), class = "Date"),
    type = c("UKS", "USD", "UKS", "UKS", "USD", "USD", "UKS", "USD", "UKZ", "UKY"),
    time = structure(c(28793, 32403, 36003, 43203, 46803, 50404, 50408, 54011, 54014, 58815), units = "secs", class = c("hms", "difftime")),
    quantity = c(0.003, 0.007, 0.002, 0.001, 0.03, 0.123, 0.017, 0.019, 0.012, 0.01),
    cumvol = c(0.003, 0.01, 0.012, 0.013, 0.043, 0.166, 0.183, 0.202, 0.214, 0.224)
  ),
  class = "data.frame", row.names = c(NA, -10L)
)
My attempt so far is below, but I'm getting an error at the group_by(date) line.
library(tidyverse)
library(lubridate)
library(hms)
time_check <- c(13, 16, 18)
df %>%
  mutate(time_hr = hour(time), .after = time) %>%
  filter(time_hr %in% time_check) %>%
  group_by(date, type, time_hr) %>%
  summarize(cat_total = sum(quantity)) %>%
  group_by(date) %>%
  mutate(date_total = sum(cat_total)) %>%
  ungroup()
The summarize step seems to drop the date column, and the group_by(date) that follows it then throws this error:
Error: Must group by variables found in `.data`.
* Column `date` is not found.
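One thing I have not ruled out, and this is only an assumption on my part, is that another attached package (plyr, for example) is masking dplyr's summarize. A masked version could return a single row without the grouping columns, which would match the error. Below is a minimal sketch of the same pipeline with every verb namespaced explicitly, run on a cut-down copy of the sample data (df_small and result are names made up for this example):

```r
library(dplyr)
library(lubridate)
library(hms)

# Cut-down copy of rows 5, 8 and 10 of the sample data; df_small is a made-up name
df_small <- data.frame(
  date = as.Date(c(19034, 19034, 19034), origin = "1970-01-01"),
  type = c("USD", "USD", "UKY"),
  time = hms::hms(seconds = c(46803, 54011, 58815)),
  quantity = c(0.03, 0.019, 0.01)
)

time_check <- c(13, 16, 18)

result <- df_small %>%
  dplyr::mutate(time_hr = lubridate::hour(time), .after = time) %>%
  dplyr::filter(time_hr %in% time_check) %>%
  dplyr::group_by(date, type, time_hr) %>%
  # .groups = "drop" returns an ungrouped result that still has date, type and time_hr
  dplyr::summarise(cat_total = sum(quantity), .groups = "drop") %>%
  dplyr::group_by(date) %>%
  dplyr::mutate(date_total = sum(cat_total)) %>%
  dplyr::ungroup()

print(result)
```

If this version runs cleanly in the same session where my original code fails, that would point to masking rather than to the pipeline itself. Calling conflicts() should list which function names are masked.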
Does anyone have advice on how to fix this?