How to average multiple rows into hourly data

mhab · June 2, 2021, 9:44pm

Hi all,

I need your help averaging my data, which is recorded at intervals of several minutes, into a single value per hour. An example of the dataset is below.

Example of the dataset

datetime	Flux
31/08/2018 12:55	2.88
31/08/2018 13:06	2.66
31/08/2018 13:16	2.72
31/08/2018 13:27	2.85
31/08/2018 13:38	2.8
31/08/2018 13:48	2.81
31/08/2018 13:59	2.52
31/08/2018 14:09	2.56
31/08/2018 14:20	2.56
31/08/2018 14:30	2.47
31/08/2018 14:41	2.32
31/08/2018 14:52	2.74
31/08/2018 15:02	2.76
31/08/2018 15:13	2.8
31/08/2018 15:23	2.76
31/08/2018 15:34	2.69
31/08/2018 15:44	2.86
31/08/2018 15:55	2.65
31/08/2018 16:05	2.47
31/08/2018 16:16	2.67
31/08/2018 16:26	2.48
31/08/2018 16:37	2.67
31/08/2018 16:47	2.53
31/08/2018 16:58	2.58

My expected output will be:

Datetime Flux
31/08/2018 12 5
31/08/2018 13 6
31/08/2018 14 5

Thank you.

FJCC · June 2, 2021, 10:56pm

Your desired output seems to show the number of rows in each hour rather than the average of Flux, so the code below computes both.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
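# Read the example data (the path is local to this machine)
DF <- read.csv("~/R/Play/Dummy.csv")
# Parse the day/month/year hour:minute strings, then derive the grouping columns
DF <- DF %>% 
  mutate(datetime = dmy_hm(datetime), 
         Date = as.Date(datetime), 
         Hour = hour(datetime))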
head(DF)
#>              datetime Flux       Date Hour
#> 1 2018-08-31 12:55:00 2.88 2018-08-31   12
#> 2 2018-08-31 13:06:00 2.66 2018-08-31   13
#> 3 2018-08-31 13:16:00 2.72 2018-08-31   13
#> 4 2018-08-31 13:27:00 2.85 2018-08-31   13
#> 5 2018-08-31 13:38:00 2.80 2018-08-31   13
#> 6 2018-08-31 13:48:00 2.81 2018-08-31   13
Averages <- DF %>% group_by(Date, Hour) %>% 
  summarize(Avg = mean(Flux), Count = n())
#> `summarise()` regrouping output by 'Date' (override with `.groups` argument)
Averages
#> # A tibble: 5 x 4
#> # Groups:   Date [1]
#>   Date        Hour   Avg Count
#>   <date>     <int> <dbl> <int>
#> 1 2018-08-31    12  2.88     1
#> 2 2018-08-31    13  2.73     6
#> 3 2018-08-31    14  2.53     5
#> 4 2018-08-31    15  2.75     6
#> 5 2018-08-31    16  2.57     6

Created on 2021-06-02 by the reprex package (v0.3.0)
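
For anyone without the CSV file, here is a minimal self-contained sketch of the same idea. It is a variation, not the code above: it types in the first eight rows of the example data by hand and bins each timestamp with lubridate's floor_date() instead of separate Date and Hour columns. The names Hourly and the reuse of Hour for the binned timestamp are only illustrative, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(lubridate)

# First eight rows of the example data from the question, entered by hand
DF <- data.frame(
  datetime = c("31/08/2018 12:55", "31/08/2018 13:06", "31/08/2018 13:16",
               "31/08/2018 13:27", "31/08/2018 13:38", "31/08/2018 13:48",
               "31/08/2018 13:59", "31/08/2018 14:09"),
  Flux = c(2.88, 2.66, 2.72, 2.85, 2.80, 2.81, 2.52, 2.56)
)

Hourly <- DF %>%
  # Parse the strings, then round each timestamp down to the start of its hour
  mutate(datetime = dmy_hm(datetime),
         Hour = floor_date(datetime, unit = "hour")) %>%
  group_by(Hour) %>%
  # One row per hour: mean Flux and number of readings in that hour
  summarize(Avg = mean(Flux), Count = n(), .groups = "drop")

Hourly
```

Because Hour stays a full date-time, the result can be plotted or joined on a single column, and data spanning several days needs no extra Date column. The 13:00 group here contains the same six readings as in the output above, so its average should again round to 2.73.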

mhab · June 3, 2021, 10:15am

Hi,

Thank you for your help and for sharing the code.
It was really helpful and solved my problem.
Have a nice day.
Cheers.

MHA
