generate hourly data from minutes

I am trying to generate the hourly data from every 60 min data

I am using

magee$new_date <- as.POSIXct(as.character(paste(magee$Date, magee$Time)), format="%Y-%m-%d %H:%M:%S")

magee %>%
  group_by(Datetime = floor_date(new_date, "hour")) %>%
  summarise(BC6_h = mean(BC6_micro, na.rm = TRUE))

But, I am not getting the data at each hour,

new_date is in the data file looks like

2022-01-01 00:00:00
2022-01-01 00:01:00
2022-01-01 00:02:00
...
...
2022-01-01 23:59:00
2022-01-02 00:00:00
...
...

Data can be found

https://www.dropbox.com/t/PabKIfOnoxrn2iDG

I want to create the data like

2022-01-01 00:00:00
2022-01-01 01:00:00
2022-01-01 02:00:00
2022-01-01 03:00:00
2022-01-01 04:00:00
2022-01-01 05:00:00
2022-01-01 06:00:00
2022-01-01 07:00:00
2022-01-01 08:00:00
2022-01-01 09:00:00
2022-01-01 10:00:00
2022-01-01 11:00:00
2022-01-01 12:00:00
2022-01-01 13:00:00
2022-01-01 14:00:00
2022-01-01 15:00:00
2022-01-01 16:00:00
2022-01-01 17:00:00
2022-01-01 18:00:00
2022-01-01 19:00:00
2022-01-01 20:00:00
2022-01-01 21:00:00
2022-01-01 22:00:00
2022-01-01 23:00:00
2022-02-01 00:00:00
2022-02-01 01:00:00
2022-02-01 02:00:00
2022-02-01 03:00:00
2022-02-01 04:00:00
2022-02-01 05:00:00
2022-02-01 06:00:00
2022-02-01 07:00:00
2022-02-01 08:00:00
2022-02-01 09:00:00
2022-02-01 10:00:00
2022-02-01 11:00:00
2022-02-01 12:00:00
2022-02-01 13:00:00
2022-02-01 14:00:00
2022-02-01 15:00:00
2022-02-01 16:00:00
2022-02-01 17:00:00
2022-02-01 18:00:00
2022-02-01 19:00:00
2022-02-01 20:00:00
2022-02-01 21:00:00
2022-02-01 22:00:00
2022-02-01 23:00:00

Thanks

What exactly are you after?

Because your original dataset looks like this (nothing like what your example):

# A tibble: 29,423 × 7
   Date       Time     Day   BC6   BC1 BC6_micro BC1_micro
   <date>     <time> <dbl> <dbl> <dbl>     <dbl>     <dbl>
 1 2022-01-01 00'00"     1  1593  1613     1.59       1.61
 2 2022-01-01 01'00"     1  1573  1571     1.57       1.57
 3 2022-01-01 02'00"     1  1645  1632     1.64       1.63
 4 2022-01-01 03'00"     1  2425  2270     2.42       2.27
 5 2022-01-01 04'00"     1  3131  3039     3.13       3.04
 6 2022-01-01 05'00"     1  3443  3399     3.44       3.40
 7 2022-01-01 06'00"     1  2696  2776     2.70       2.78
 8 2022-01-01 07'00"     1  2025  2230     2.02       2.23
 9 2022-01-01 08'00"     1  1045  1564     1.04       1.56
10 2022-01-01 09'00"     1   742  1325     0.742      1.32
# … with 29,413 more rows

This is summarised by hour :

library(tidyverse)
library(lubridate)

magee %>% 
  mutate(new_date = ymd_hms(as.character(paste(Date, Time))),
         date_time = floor_date(new_date, "hour")) %>% 
  group_by(date_time) %>% 
  summarise(BC6_h = mean(BC6_micro, na.rm = TRUE))

# A tibble: 491 × 2
   date_time            BC6_h
   <dttm>               <dbl>
 1 2022-01-01 00:00:00 1.02  
 2 2022-01-01 01:00:00 0.120 
 3 2022-01-01 02:00:00 0.136 
 4 2022-01-01 03:00:00 0.0369
 5 2022-01-01 04:00:00 0.0328
 6 2022-01-01 05:00:00 0.0312
 7 2022-01-01 06:00:00 0.0511
 8 2022-01-01 07:00:00 0.0589
 9 2022-01-01 08:00:00 0.0672
10 2022-01-01 09:00:00 0.285 

Next time, provide a reproducible dataset:

2 Likes

Yes! I was facing the same issue. In the file, the data is in every minute, but after uploading in R, it was showing like as you mentioned in your response.

that's why I tried to generate the new_date via

magee$new_date <- as.POSIXct(as.character(paste(magee$Date, magee$Time)), format="%Y-%m-%d %H:%M:%S")

maybe the date was not correct, so I changed the date more accurate way!

Corrected data can be found here

https://www.dropbox.com/t/rGiVvmj51ZcboljN

So, This time the data shows like, after applying

magee$date <- as.POSIXct(as.character(paste(magee$date)), format="%Y-%m-%d %H:%M:%S")
  date                  BC6   BC1
  <dttm>              <dbl> <dbl>
1 2022-01-01 00:00:00  1.59  1.61
2 2022-01-01 00:01:00  1.57  1.57
3 2022-01-01 00:02:00  1.64  1.63
4 2022-01-01 00:03:00  2.42  2.27
5 2022-01-01 00:04:00  3.13  3.04
6 2022-01-01 00:05:00  3.44  3.40

Then I tried


magee %>%
  mutate(date_time = floor_date(date, "hour"))  %>% 
  group_by(date_time) %>% 
  summarise(BC6_h = mean(BC6, na.rm = TRUE))

But it is showing the total mean of the data, not each hour!

    BC6_h
1 1.31546

I hope I am clear this time!

Yes, but my code shows the mean by hour. Isn't that what you're after?

Yes! That's what I was looking for.
Thanks for spending time on my query.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.