I have a dataset that was recorded at 15-minute intervals (non-continuous).
How can I aggregate this dataset to 30-min, 45-min, 1-hr and 2-hr intervals?
value <- c(1.869, 1.855, 1.855, 1.855, 1.855, 1.855, 1.855, 1.848, 1.848, 1.848, 1.848, 1.848, 1.848, 1.849)
date <- c("2013-08-28 08:00:00 UTC", "2013-08-28 08:15:00 UTC",
"2013-08-28 08:45:00 UTC", "2013-08-28 09:15:00 UTC",
"2013-09-28 09:45:00 UTC", "2013-08-28 10:15:00 UTC",
"2013-08-28 10:30:00 UTC", "2013-08-28 10:45:00 UTC",
"2013-08-28 11:00:00 UTC", "2013-08-28 11:45:00 UTC",
"2013-08-28 12:00:00 UTC", "2013-08-29 00:15:00 UTC",
"2013-08-29 01:00:00 UTC", "2013-08-29 01:15:00 UTC")
data <- data.frame(date=date,value=value)
Thank you..!!!
Hi, something like this? This is for 30 mins, but it is the same for the other times.
library(tidyverse)
library(lubridate)
data %>%
  mutate(date = ymd_hms(date),
         time_30min = ceiling_date(date, "30 minutes")) %>%
  group_by(time_30min) %>%
  summarise(value = sum(value))
# A tibble: 11 × 2
time_30min value
<dttm> <dbl>
1 2013-08-28 08:00:00 1.87
2 2013-08-28 08:30:00 1.86
3 2013-08-28 09:00:00 1.86
4 2013-08-28 09:30:00 1.86
5 2013-08-28 10:30:00 3.71
6 2013-08-28 11:00:00 3.70
7 2013-08-28 12:00:00 3.70
8 2013-08-29 00:30:00 1.85
9 2013-08-29 01:00:00 1.85
10 2013-08-29 01:30:00 1.85
11 2013-09-28 10:00:00 1.86
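Since the same pattern applies to the other window sizes, it can be wrapped in a small function and run once per unit. The following is a minimal sketch, not part of the original answer: `aggregate_by` and `bin_end` are hypothetical names introduced here, and only units that divide the hour or day evenly ("30 minutes", "1 hour", "2 hours") are included.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
value <- c(1.869, 1.855, 1.855, 1.855, 1.855, 1.855, 1.855,
           1.848, 1.848, 1.848, 1.848, 1.848, 1.848, 1.849)
date <- c("2013-08-28 08:00:00 UTC", "2013-08-28 08:15:00 UTC",
          "2013-08-28 08:45:00 UTC", "2013-08-28 09:15:00 UTC",
          "2013-09-28 09:45:00 UTC", "2013-08-28 10:15:00 UTC",
          "2013-08-28 10:30:00 UTC", "2013-08-28 10:45:00 UTC",
          "2013-08-28 11:00:00 UTC", "2013-08-28 11:45:00 UTC",
          "2013-08-28 12:00:00 UTC", "2013-08-29 00:15:00 UTC",
          "2013-08-29 01:00:00 UTC", "2013-08-29 01:15:00 UTC")
data <- data.frame(date = date, value = value)

# Hypothetical helper: sum `value` within windows labelled by their end time
aggregate_by <- function(df, unit) {
  df %>%
    mutate(date = ymd_hms(date),
           bin_end = ceiling_date(date, unit)) %>%
    group_by(bin_end) %>%
    summarise(value = sum(value))
}

# One result table per unit, stored in a named list
units <- c("30 minutes", "1 hour", "2 hours")
results <- lapply(units, function(u) aggregate_by(data, u))
names(results) <- units

results[["1 hour"]]
```

Each element of `results` is a tibble with one row per non-empty window, so windows without any readings simply do not appear.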
Thank you so much, @williaml.
The code worked for the other time scales too, except for 45-min.
For 45 minutes I got this:
data %>%
  mutate(date = ymd_hms(date),
         time_45min = ceiling_date(date, "45 minutes")) %>%
  group_by(time_45min) %>%
  summarise(value = sum(value))
# A tibble: 11 × 2
time_45min value
<dttm> <dbl>
1 2013-08-28 08:00:00 1.87
2 2013-08-28 08:45:00 3.71
3 2013-08-28 09:45:00 1.86
4 2013-08-28 10:45:00 5.56
5 2013-08-28 11:00:00 1.85
6 2013-08-28 11:45:00 1.85
7 2013-08-28 12:00:00 1.85
8 2013-08-29 00:45:00 1.85
9 2013-08-29 01:00:00 1.85
10 2013-08-29 01:45:00 1.85
11 2013-09-28 09:45:00 1.86
I was expecting something like this:
time_45min value
<dttm> <dbl>
1 2013-08-28 08:00:00 xx
2 2013-08-28 08:45:00 xx
3 2013-08-28 09:30:00 xxx
4 2013-08-28 10:15:00 xxx
...
Sorry, what were you expecting the output to be? It is unclear from your post.
I was expecting the output rows to be spaced 45 minutes apart:
time_45min value
<dttm> <dbl>
1 2013-08-28 08:00:00 xx
2 2013-08-28 08:45:00 xx
3 2013-08-28 09:30:00 xxx
4 2013-08-28 10:15:00 xxx
Here, the time difference between consecutive rows is 45 minutes.
Your script's result does land on 45-minute marks, but the gap between some rows is 1 hour,
such as
time_45min value
<dttm> <dbl>
1 2013-08-28 08:00:00 1.87
2 2013-08-28 08:45:00 3.71
3 2013-08-28 09:45:00 1.86
4 2013-08-28 10:45:00 5.56
Hope this clears it up.
Thank you
I see. Not sure about that one.
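Judging from the 45-minute output earlier in the thread, `ceiling_date()` appears to place its 45-minute boundaries within each clock hour (at :00 and :45) rather than counting 45-minute steps continuously from a fixed origin. The short check below is only a sketch under that reading; `stamps` and `bins` are illustrative names, and the four timestamps are taken from the sample data.

```r
library(lubridate)

# Four timestamps from the question's data
stamps <- ymd_hms(c("2013-08-28 08:15:00", "2013-08-28 08:45:00",
                    "2013-08-28 09:15:00", "2013-08-28 10:15:00"))

# Round each one up to the next 45-minute boundary
bins <- ceiling_date(stamps, "45 minutes")
bins

# Gap between consecutive distinct bin labels
diff(unique(bins))
```

If the gaps come out as one hour rather than 45 minutes, that matches the spacing seen in the output above and explains why the rows are not evenly 45 minutes apart.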
I think this might be what you're looking for.
library(tidyverse)
library(lubridate)
data %>%
  mutate(date = ymd_hms(date),
         # minutes elapsed since the first timestamp
         min_since_start = interval(min(date), date) %>% time_length('minute'),
         # index of the 45-minute window each row falls in
         period = floor(min_since_start / 45),
         # start time of that window
         period_start = min(date) + minutes(period * 45)) %>%
  group_by(period_start) %>%
  summarise(value = sum(value))
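The same idea can be written as a function of the window width, so that it also covers widths other than 45 minutes. This is a rough sketch rather than the answer's own code: `aggregate_from_start`, `origin`, `elapsed_min` and `width_min` are names made up for illustration, and the windows are assumed to start at the earliest timestamp in the data.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
value <- c(1.869, 1.855, 1.855, 1.855, 1.855, 1.855, 1.855,
           1.848, 1.848, 1.848, 1.848, 1.848, 1.848, 1.849)
date <- c("2013-08-28 08:00:00 UTC", "2013-08-28 08:15:00 UTC",
          "2013-08-28 08:45:00 UTC", "2013-08-28 09:15:00 UTC",
          "2013-09-28 09:45:00 UTC", "2013-08-28 10:15:00 UTC",
          "2013-08-28 10:30:00 UTC", "2013-08-28 10:45:00 UTC",
          "2013-08-28 11:00:00 UTC", "2013-08-28 11:45:00 UTC",
          "2013-08-28 12:00:00 UTC", "2013-08-29 00:15:00 UTC",
          "2013-08-29 01:00:00 UTC", "2013-08-29 01:15:00 UTC")
data <- data.frame(date = date, value = value)

# Hypothetical helper: fixed-width windows counted from the first timestamp
aggregate_from_start <- function(df, width_min) {
  df %>%
    mutate(date = ymd_hms(date),
           origin = min(date),
           # minutes elapsed since the earliest reading
           elapsed_min = as.numeric(difftime(date, origin, units = "mins")),
           # POSIXct + numeric adds seconds, hence the factor of 60
           period_start = origin + 60 * width_min * floor(elapsed_min / width_min)) %>%
    group_by(period_start) %>%
    summarise(value = sum(value))
}

out_45 <- aggregate_from_start(data, 45)
out_45
```

Every `period_start` is a whole number of 45-minute steps after the first reading. Because the data are non-continuous, windows that contain no readings are absent from the result rather than shown with a zero.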