Below is an example of a series I have (scenario 1): an itinerary that runs on Tuesdays and Thursdays, but not on every one of them (e.g. some trips are cancelled due to bad weather).
ticket_semiweekly <- data.frame(stringsAsFactors=FALSE,
  ticket = c(277, 178, 255, 368, 267, 373, 100, 120, 190, 337, 392),
  tripdate = c("2014-12-16", "2014-12-18", "2014-12-23", "2014-12-30",
               "2015-01-06", "2015-01-08", "2015-01-15", "2015-01-20",
               "2015-01-22", "2015-01-27", "2015-01-29"),
  day = c("Tuesday", "Thursday", "Tuesday", "Tuesday", "Tuesday",
          "Thursday", "Thursday", "Tuesday", "Thursday", "Tuesday",
          "Thursday")
)
# Below I try to declare my series as a "ts" object, but I am not sure how to do it:
ts(ticket_semiweekly$ticket,start=c(2014, 12, 16),freq=2*52)
#> Time Series:
#> Start = c(2014, 12)
#> End = c(2014, 22)
#> Frequency = 104
#> [1] 277 178 255 368 267 373 100 120 190 337 392
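A note on the output above: `ts()` appears to use only the first two elements of `start` (the period and the position within it), so the day `16` is ignored and the series is reported as starting at `c(2014, 12)`. One possible way to make the gaps explicit, sketched here with base R only and not necessarily the best approach, is to build the full schedule of Tuesdays and Thursdays and merge the observed trips onto it, so that cancelled trips become `NA`. The names `obs`, `all_days`, `schedule`, `full` and `ticket_ts` are illustrative, and `frequency = 2` (two trips per week, indexed by week since the first trip) is an assumption.

```r
# Observed trips from scenario 1 (same values as ticket_semiweekly above)
obs <- data.frame(
  tripdate = as.Date(c("2014-12-16", "2014-12-18", "2014-12-23", "2014-12-30",
                       "2015-01-06", "2015-01-08", "2015-01-15", "2015-01-20",
                       "2015-01-22", "2015-01-27", "2015-01-29")),
  ticket = c(277, 178, 255, 368, 267, 373, 100, 120, 190, 337, 392)
)

# Every calendar day in the observed range, keeping only Tuesdays and Thursdays.
# format(x, "%u") is the ISO weekday number (1 = Monday), independent of locale.
all_days <- seq(min(obs$tripdate), max(obs$tripdate), by = "day")
schedule <- data.frame(
  tripdate = all_days[format(all_days, "%u") %in% c("2", "4")]
)

# Left join: scheduled days without a recorded trip get ticket = NA
full <- merge(schedule, obs, by = "tripdate", all.x = TRUE)

# Assumption: two observations per week, with weeks counted from the first trip
ticket_ts <- ts(full$ticket, start = c(1, 1), frequency = 2)
```

With this data the schedule has 14 slots, and the three Tuesdays/Thursdays without a trip (2014-12-25, 2015-01-01, 2015-01-13) show up as `NA`. How those `NA` values should then be handled (imputed, modelled as missing, etc.) remains an open choice.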
The second example below refers to a series (scenario 2) where data are available daily, but only for 3 months each year. The NA rows here are placeholders: they mean that I also have data for the remaining days up to the end of each month. But again, there may be gaps (cancelled or unscheduled trips).
data.frame(
  ticket = c(277, 178, 255, 368, 267, NA, 100, 120, 190, 337, 392, 200, NA,
             300, 290, 260, 370, 290, NA, NA, 120, 150, 210, 347, 395, 219,
             NA, 200, 205, 200, 390, 400, 240, NA, 340, 200, 285, 400, 300,
             260, NA, 140, 160),
  tripdate = c("2014-07-01", "2014-07-02", "2014-07-03", "2014-07-04",
               "2014-07-05", NA, "2014-07-31", "2014-08-01", "2014-08-02",
               "2014-08-03", "2014-08-04", "2014-08-05", NA, "2014-08-30",
               "2014-09-01", "2014-09-02", "2014-09-03", "2014-09-04", "2014-09-05",
               NA, "2014-09-30", "2015-07-01", "2015-07-02", "2015-07-03",
               "2015-07-04", "2015-07-05", NA, "2015-07-31", "2015-08-01",
               "2015-08-02", "2015-08-03", "2015-08-04", "2015-08-05", NA,
               "2015-08-30", "2015-09-01", "2014-08-05", "2015-09-03", "2015-09-04",
               "2015-09-01", NA, "2015-09-30", "2016-07-01")
)
#> ticket tripdate
#> 1 277 2014-07-01
#> 2 178 2014-07-02
#> 3 255 2014-07-03
#> 4 368 2014-07-04
#> 5 267 2014-07-05
#> 6 NA <NA>
#> 7 100 2014-07-31
#> 8 120 2014-08-01
#> 9 190 2014-08-02
#> 10 337 2014-08-03
#> 11 392 2014-08-04
#> 12 200 2014-08-05
#> 13 NA <NA>
#> 14 300 2014-08-30
#> 15 290 2014-09-01
#> 16 260 2014-09-02
#> 17 370 2014-09-03
#> 18 290 2014-09-04
#> 19 NA 2014-09-05
#> 20 NA <NA>
#> 21 120 2014-09-30
#> 22 150 2015-07-01
#> 23 210 2015-07-02
#> 24 347 2015-07-03
#> 25 395 2015-07-04
#> 26 219 2015-07-05
#> 27 NA <NA>
#> 28 200 2015-07-31
#> 29 205 2015-08-01
#> 30 200 2015-08-02
#> 31 390 2015-08-03
#> 32 400 2015-08-04
#> 33 240 2015-08-05
#> 34 NA <NA>
#> 35 340 2015-08-30
#> 36 200 2015-09-01
#> 37 285 2014-08-05
#> 38 400 2015-09-03
#> 39 300 2015-09-04
#> 40 260 2015-09-01
#> 41 NA <NA>
#> 42 140 2015-09-30
#> 43 160 2016-07-01
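For scenario 2, a similar base R sketch could lay the observations onto a complete daily grid covering July to September of each year, so that missing days are explicit `NA` values. It uses only a few rows copied from the data above. Treating the three months as one contiguous "season" of 92 days (31 + 31 + 30), and therefore using `frequency = 92`, is an assumption that ignores the nine-month break between seasons. The names `obs`, `years`, `season`, `full` and `ticket_ts` are illustrative.

```r
# A few rows copied from the scenario 2 data above (placeholder NA rows dropped)
obs <- data.frame(
  tripdate = as.Date(c("2014-07-01", "2014-07-02", "2014-07-03",
                       "2014-07-04", "2014-07-05",
                       "2015-07-01", "2015-07-02", "2015-07-03",
                       "2015-07-04", "2015-07-05")),
  ticket = c(277, 178, 255, 368, 267, 150, 210, 347, 395, 219)
)

# All days from 1 July to 30 September for each year present in the data
years <- unique(format(obs$tripdate, "%Y"))
season <- do.call(c, lapply(years, function(y) {
  seq(as.Date(paste0(y, "-07-01")), as.Date(paste0(y, "-09-30")), by = "day")
}))

# Left join: days of the season without a recorded trip get ticket = NA
full <- merge(data.frame(tripdate = season), obs, by = "tripdate", all.x = TRUE)

# Assumption: one cycle = one 92-day season, seasons placed back to back
ticket_ts <- ts(full$ticket, start = c(2014, 1), frequency = 92)
```

Duplicated dates in the real data would need to be resolved before the merge, since each duplicate would otherwise produce an extra row in `full`.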
In both cases, my problems are how to account for the gaps and how to declare the series as a time series.
The goal is to forecast the coming Tuesdays and Thursdays (scenario 1), or the days (or weeks) of those months next year (scenario 2).
I hope it is clear now, despite the lengthy message.