Leap Year / Time Series / Forecasting

lherming · July 2, 2020, 11:35am

Hello everyone,

I am trying to produce a daily forecast. For this I would like to define a time series and split it into a training and a test set. Unfortunately, 2016 is a leap year, so I cannot simply set the frequency to 365. Leaving one day out is also out of the question, because the weekdays have to be analyzed afterwards. I already tried a frequency of 365.25, but then the leap year would fall on 2018, which is not correct. The data has an annual cycle rather than a weekly one. I would be grateful for any suggestions. My code so far:

inds <- seq(as.Date("2014-09-10"), as.Date("2017-12-31"), by = "day") # Create a daily Date object

historic_demand <- ts(data$Demand,
                      start = c(2014, as.numeric(format(inds[1], "%j"))),
                      frequency = 365)

training_set <- window(historic_demand,
                       start = c(2014, as.numeric(format(inds[1], "%j"))),
                       end = c(2016, 365))

test_set <- window(historic_demand,
                   start = c(2017, 1),
                   end = c(2017, 365))

Kind regards
Lukas
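
For readers following along, the drift caused by `frequency = 365` can be made visible with a small base R check. This is only a sketch: it replaces the real `data$Demand` with the observation numbers themselves (an assumption for illustration, since the original data is not shown), so that each window can be mapped back to calendar dates. The names `obs_index`, `train_idx` and `test_idx` are hypothetical.

```r
# Same daily date range as in the question
inds <- seq(as.Date("2014-09-10"), as.Date("2017-12-31"), by = "day")

# Stand-in for data$Demand: each value is simply its own position in inds
obs_index <- ts(seq_along(inds),
                start = c(2014, as.numeric(format(inds[1], "%j"))),
                frequency = 365)

# The same windows as in the question
train_idx <- window(obs_index, end = c(2016, 365))
test_idx  <- window(obs_index, start = c(2017, 1), end = c(2017, 365))

# Map the window boundaries back to calendar dates
last_train_date <- inds[max(train_idx)]
first_test_date <- inds[min(test_idx)]
last_test_date  <- inds[max(test_idx)]

last_train_date  # 2016-12-30, because 2016 has 366 days
first_test_date  # 2016-12-31, so "2017" starts one day early
last_test_date   # 2017-12-30, the final observation is pushed out of the window
```

With a fixed frequency of 365, every position after 29 February 2016 is shifted by one day relative to the calendar, which is why the weekday analysis would no longer line up.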

robjhyndman · July 2, 2020, 10:55pm

Daily data is easier to handle with tsibble objects than with ts objects, because the dates are preserved explicitly. Here is an example of fitting a model to data like yours using a dynamic harmonic regression model to handle the annual seasonality. Many other models can be handled using the fable package. See OTexts.com/fpp3 for a textbook introduction to using tsibble and fable.

library(tidyverse)
library(lubridate)
library(tsibble)
library(fable)

# Create some fake data with same dates
historic_demand <- tibble(
  inds = seq(as.Date("2014-09-10"), as.Date("2017-12-31"), by = "day")
  ) %>%
  mutate(
    Demand = rnorm(length(inds))
  ) %>%
  as_tsibble(index=inds)

# Split into training and test sets
training_set <- historic_demand %>% filter(year(inds) <= 2016)
test_set <- historic_demand %>% filter(year(inds) >= 2017)

# Fit dynamic harmonic regression model with annual seasonality
fit <- training_set %>%
  model(dhr = ARIMA(Demand ~ fourier(period="1 year", K=5)))

# Forecasts of the test set
fc <- fit %>% forecast(test_set)

# Compute accuracy statistics for forecasts
fc %>% accuracy(historic_demand)
#> # A tibble: 1 x 9
#>   .model .type      ME  RMSE   MAE   MPE  MAPE  MASE   ACF1
#>   <chr>  <chr>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
#> 1 dhr    Test  -0.0538 0.985 0.786  76.7  146. 0.666 0.0759

Created on 2020-07-03 by the reprex package (v0.3.0)
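
To illustrate why harmonic terms are not troubled by the leap year, the sketch below builds K = 5 sine/cosine pairs by hand in base R. It assumes an average year length of 365.25 days; that value is an assumption made here for illustration, not something stated above about how fable interprets `"1 year"`. The names `period_days` and `harmonics` are hypothetical, and this reproduces only the regressors, not the ARIMA error model.

```r
# Same daily date range as in the example above
inds <- seq(as.Date("2014-09-10"), as.Date("2017-12-31"), by = "day")

period_days <- 365.25  # assumed average year length (not taken from fable)
K <- 5                 # number of sine/cosine pairs, matching K=5 above

# Time measured in days since the first observation
t <- as.numeric(inds - inds[1])

# One sine and one cosine column for each harmonic k = 1, ..., K
harmonics <- do.call(cbind, lapply(seq_len(K), function(k) {
  cbind(sin(2 * pi * k * t / period_days),
        cos(2 * pi * k * t / period_days))
}))
colnames(harmonics) <- paste0(rep(c("S", "C"), times = K),
                              rep(seq_len(K), each = 2))

dim(harmonics)  # one row per day, two columns per harmonic
```

Because the regressors depend only on elapsed days and a possibly non-integer period, no day has to be dropped and 29 February is treated like any other date. A matrix like this could, for instance, be passed as regressors to `lm()` or as `xreg` to `arima()`.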

jlacko · July 16, 2020, 5:57am

Thanks for sharing the link; I was not aware of the book (and now have some reading to do...)
