Good afternoon,
I'm using this code to study rainfall datasets.
I have a list of rainfall records (format: YYYY-MM-DD HH:MM:SS xx.xx) at irregular time steps, and I'm merging it with an empty dataset to create a continuous dataset with one record per minute (covering many years). After that I look for the maximum over different window lengths (5, 10, 15, 20, 30 and 60 minutes) using purrr.
Anyway, the code is still too slow (to give an idea, the vectors contain more than 5.5 million elements).
Here's the first part of the code, which is also the bottleneck (the purrr calls take 95% of the computational time):
How could I improve its speed?
%%%%%%%%%%%%%%%%
library(dplyr)
library(data.table)
library(xts)
library(rio)
library(stats4)
library(MASS)
library(gumbel)
library(ismev)
library(readr)
library(stringi)
library(fExtremes)
library(evd)
start_date <- as.POSIXct(paste("2010/01/01", "00:00:00"), format = "%Y/%m/%d %H:%M:%S")
end_date <- as.POSIXct(paste("2010/01/02", "00:00:01"), format = "%Y/%m/%d %H:%M:%S")
tot_date <- seq.POSIXt(start_date, end_date, by = "1 min")
zero_output <- data.frame(tot_date)
head(zero_output)
date <- c("2010-01-01 00:00:00","2010-01-01 01:03:00","2010-01-01 05:15:00","2010-01-01 06:22:00","2010-01-01 12:35:00","2010-01-01 12:36:00","2010-01-01 12:37:00","2010-01-01 15:28:00","2010-01-01 20:00:00","2010-01-01 23:00:00" )
rain <- c(0.2,0.5,0.6,0.4,0.5,1.2,8.5,4.5,12,15)
rio_5min <- data.frame(date,rain)
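# Format explicitly: as.character() on POSIXct may drop "00:00:00" at midnight in recent R versions
zero_output$tot_date <- format(zero_output$tot_date, "%Y-%m-%d %H:%M:%S")
rio_5min$date <- as.character(rio_5min$date)
AD <- left_join(zero_output, rio_5min, by = c("tot_date" = "date"))
AD <- data.frame(AD)
# Minutes without a record get zero rainfall
AD$rain[is.na(AD$rain)] <- 0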
options(digits =3)
fiv <- vector()
ten <- vector()
fift <- vector()
twe <- vector()
midhou <- vector()
hou <- vector()
# Forward-looking mean over each window; windows running past the end give NA, set to 0
fiv <- purrr::map_dbl(seq_along(AD$rain),
                      ~{
                        sum(AD$rain[.:(. + 4)]) / 5
                      })
fiv[is.na(fiv)] <- 0
ten <- purrr::map_dbl(seq_along(AD$rain),
                      ~{
                        sum(AD$rain[.:(. + 9)]) / 10
                      })
ten[is.na(ten)] <- 0
fift <- purrr::map_dbl(seq_along(AD$rain),
                       ~{
                         sum(AD$rain[.:(. + 14)]) / 15
                       })
fift[is.na(fift)] <- 0
twe <- purrr::map_dbl(seq_along(AD$rain),
                      ~{
                        sum(AD$rain[.:(. + 19)]) / 20
                      })
twe[is.na(twe)] <- 0
midhou <- purrr::map_dbl(seq_along(AD$rain),
                         ~{
                           sum(AD$rain[.:(. + 29)]) / 30
                         })
midhou[is.na(midhou)] <- 0
hou <- purrr::map_dbl(seq_along(AD$rain),
                      ~{
                        sum(AD$rain[.:(. + 59)]) / 60
                      })
hou[is.na(hou)] <- 0
#fiv <- AD$rain
# Scale the per-minute means to per-hour values
fiv <- fiv * 60
ten <- ten * 60
fift <- fift * 60
twe <- twe * 60
midhou <- midhou * 60
hou <- hou * 60
....
....
....
%%%%%%%%%%%%%%%%
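For comparison, here is a minimal sketch of how the six purrr::map_dbl calls above might be replaced by a vectorised version built on base R's cumsum. The helper name roll_forward_mean and the windows vector are hypothetical and only for illustration. The sketch assumes the input has no NA values (as is the case for AD$rain after the zero-fill) and, like the code above, returns 0 wherever the window would run past the end of the series. It has not been benchmarked on the full 5.5 million element data.

```r
# Forward-looking mean of x over k consecutive elements, starting at each index.
# Hypothetical helper: assumes x is numeric with no NA values.
roll_forward_mean <- function(x, k) {
  n <- length(x)
  out <- numeric(n)                 # incomplete windows at the end stay 0
  if (n < k) return(out)
  cs <- c(0, cumsum(x))             # cs[i + 1] holds sum(x[1:i])
  idx <- seq_len(n - k + 1)         # start positions with a complete window
  out[idx] <- (cs[idx + k] - cs[idx]) / k
  out
}

# Small example using the rainfall values from the post
rain <- c(0.2, 0.5, 0.6, 0.4, 0.5, 1.2, 8.5, 4.5, 12, 15)
windows <- c(fiv = 5, ten = 10, fift = 15, twe = 20, midhou = 30, hou = 60)

# One vector per window length, scaled by 60 as in the original code
intensity <- lapply(windows, function(k) roll_forward_mean(rain, k) * 60)
```

Each window then costs one pass over the data instead of one sum() per element. Differences of cumulative sums can accumulate small floating-point rounding errors on very long series; if that matters, data.table::frollmean(x, k, align = "left") computes the same forward-looking mean but leaves NA (rather than 0) in the incomplete windows at the end.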
Thank you,
Andrea