Decomposition of time series yields error

VannyF · October 27, 2020, 5:12pm

Hi all,

I am very new to R and I would like to use the decompose() function from the stats package as it is used here: Time Series Analysis
I have this code:
decomposedTimeSeries<-decompose(ts(generationData$Price[1:500]))
Unfortunately I get an error stating:

Error in decompose(ts(generationData$Price[1:500])) :
Zeitreihe hat keine oder weniger als 2 Perioden
Translated: Time Series has no or less than 2 Periods

I do not know why this problem occurs. The time series used, 'ts(generationData$Price[1:500])', is just a conventional time series. Does anyone have an idea what the problem might be and how I can solve it? I'd appreciate every comment.

GreyMerchant · October 27, 2020, 5:36pm

Hello,

Welcome to the forum! Your error occurs because classical time series decomposition with decompose() needs a frequency greater than 1 and at least 2 full periods of observations before it can extract a trend, seasonality, and remainder.

You'll see in my silly example that the first one fails, as we only have 7 measured points and the frequency is 4 (so we would need a minimum of 8). Changing the frequency to 3 yields a decomposition.

TS <- ts(1:7, frequency = 4)
decompose(TS)
#> Error in decompose(TS): time series has no or less than 2 periods


TS <- ts(1:7, frequency = 3)
decompose(TS)
#> $x
#> Time Series:
#> Start = c(1, 1) 
#> End = c(3, 1) 
#> Frequency = 3 
#> [1] 1 2 3 4 5 6 7
#> 
#> $seasonal
#> Time Series:
#> Start = c(1, 1) 
#> End = c(3, 1) 
#> Frequency = 3 
#> [1] -1.110223e-16  0.000000e+00  1.110223e-16 -1.110223e-16  0.000000e+00
#> [6]  1.110223e-16 -1.110223e-16
#> 
#> $trend
#> Time Series:
#> Start = c(1, 1) 
#> End = c(3, 1) 
#> Frequency = 3 
#> [1] NA  2  3  4  5  6 NA
#> 
#> $random
#> Time Series:
#> Start = c(1, 1) 
#> End = c(3, 1) 
#> Frequency = 3 
#> [1]           NA 2.220446e-16 4.440892e-16 0.000000e+00 0.000000e+00
#> [6] 0.000000e+00           NA
#> 
#> $figure
#> [1] -1.110223e-16  0.000000e+00  1.110223e-16
#> 
#> $type
#> [1] "additive"
#> 
#> attr(,"class")
#> [1] "decomposed.ts"

Created on 2020-10-27 by the reprex package (v0.3.0)

VannyF · October 27, 2020, 5:57pm

Thanks for your answer GreyMerchant,
my time series does have more than 2 entries. Basically, I have a time series and I would like to decompose it. In my example the time series has 300,000 entries, but I only want to decompose a shortened series from time slot 1 to time slot 500. So I have 500 observations for that period, and it should be possible to decompose it. I think the frequency is 1 hour in my example (although I do not see how this should affect the decomposition). Even when I increase the sample to 50,000 data points I still get the same error. 50,000 data points should be more than enough to see a trend (if there is one).

GreyMerchant · October 27, 2020, 7:44pm

The frequency is really important and not arbitrary! So the decomposition needs to know when it is basically hitting the same "point" for a second time as it is treated as a "cycle". So in my example of the frequency of 4 we could say I measured every quarter once and therefore I would need at least 8 observations to have the minimum. If you think of it, I can have the 4 on top of the 4 and I will have a comparable rating for Q1 - 2019 vs Q1 - 2020 etc etc.
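To make the idea of hitting the same "point" a second time concrete, here is a small sketch with made-up quarterly values (the numbers and the 2019 start are assumptions for illustration only). cycle() reports the position of each observation within its period, which is the information a seasonal decomposition relies on.

```r
# Eight made-up quarterly observations starting in Q1 2019 (illustrative only).
TS <- ts(c(10, 12, 15, 11, 11, 13, 16, 12), start = c(2019, 1), frequency = 4)

# Position of every observation within its year.
pos <- as.integer(cycle(TS))
print(pos)

# Average per quarter across the two years. This comparison only exists
# because every quarter was observed at least twice.
quarter_means <- tapply(as.numeric(TS), pos, mean)
print(quarter_means)
```

With only seven values, the fourth quarter would have a single observation and nothing to be compared with.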

Thus, you need to specify a meaningful frequency. If I am looking at measuring my heart rate every hour and reporting it then my frequency would be 24 and I would need at least 24 x 2 = 48 to be able to decompose it. I could have had 70 "observations" which is not a perfect 3 days worth of measures but it still satisfies the 2 or more sets. Does it make more sense to you now?
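The 24 x 2 rule can be checked with a short sketch. The data here are synthetic random numbers standing in for hourly measurements, and can_decompose() is a hypothetical helper written only for this illustration. It also shows that a series built with ts() and no frequency argument (frequency 1) is rejected no matter how long it is.

```r
set.seed(1)
x <- rnorm(500)  # synthetic stand-in for 500 hourly values

# Hypothetical helper: TRUE if decompose() runs, FALSE if it raises an error.
can_decompose <- function(series) {
  tryCatch({
    decompose(series)
    TRUE
  }, error = function(e) FALSE)
}

res <- c(
  default_frequency = can_decompose(ts(x)),                        # frequency defaults to 1
  hours_47          = can_decompose(ts(x[1:47], frequency = 24)),  # one value short of two days
  hours_48          = can_decompose(ts(x[1:48], frequency = 24)),  # exactly two full days
  hours_70          = can_decompose(ts(x[1:70], frequency = 24))   # two full days plus a partial one
)
print(res)
```

Only the last two calls succeed: the length of the series does not help as long as the frequency is 1.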

On a side note, I can't see exactly how your data looks, which makes this more difficult too. If you want me to look at it I suggest a reprex: FAQ: How to do a minimal reproducible example (reprex) for beginners

VannyF · October 28, 2020, 8:34am

Thanks GreyMerchant for your answer and effort,

basically I have to admit that unfortunately I do not understand it all. Of course the frequency is important for any real-world implications, but as a first step I just want to decompose a time series. Why do I need 24 x 2 data points in your example? Further, as stated before, my time series is quite long, and even when I use the command with 10 times more data points I still get this error message. I even tried it with 100k data points, which should be more than enough to see a trend, but it did not work. Not every time series has regular cycles. In this case the cycle component should just be zero, but a trend is still observable.

Anyways here you can see the data (200 time steps):

ts(generationData$Price[0:200])
Time Series:
Start = 1 
End = 200 
Frequency = 1 
  [1]  10.07  -4.08  -9.91  -7.41 -12.55 -17.25 -15.07  -4.93  -6.33  -4.93   0.45   0.12  -0.02
 [14]   0.00  -0.03   1.97   9.06   0.07  -4.97  -6.98 -24.93  -4.87 -28.93 -33.57 -45.92 -48.29
 [27] -44.99 -48.93 -29.91  -0.01  37.43  48.06  50.74  47.57  43.94  40.97  44.95  49.64  53.67
 [40]  56.01  56.95  62.08  62.11  57.99  55.64  55.13  50.76  42.91  45.22  45.63  44.00  43.88
 [53]  45.92  51.07  52.77  62.89  60.03  58.19  62.99  63.52  64.67  65.24  67.76  68.41  69.55
 [66]  67.28  69.46  68.38  61.72  53.72  49.98  50.73  47.11  47.07  46.94  47.00  46.91  49.59
 [79]  55.32  55.78  55.52  55.23  53.58  51.74  51.60  51.41  51.69  52.59  54.66  54.10  51.89
 [92]  46.58  45.43  43.96  31.41  26.90  25.12  24.12  22.04  18.37  22.09  23.35  28.76  36.63
[105]  40.46  45.85  49.80  51.36  51.74  51.92  53.22  56.62  56.92  61.64  59.44  52.75  51.90
[118]  51.38  49.96  50.29  47.72  48.38  48.02  44.23  47.17  48.19  49.11  50.44  53.40  56.55
[131]  60.02  60.22  55.00  52.39  51.57  54.71  63.43  67.37  67.20  66.03  55.36  58.59  53.70
[144]  46.03  47.98  47.84  46.11  46.08  47.62  55.77  68.61  74.15  74.93  73.59  71.23  68.79
[157]  66.75  62.47  53.25  53.26  53.42  47.91  42.05  40.96  32.04  20.82   1.84  17.94  20.91
[170]   7.78  14.33  18.56  18.57  35.81  43.87  46.93  43.88  43.85  46.74  43.94  43.21  43.81
[183]  45.60  35.21  45.64  45.63  37.94  39.53  35.97  29.72  22.55  20.04   7.24   3.43  10.04
[196]  14.20  25.41  36.98  43.89  50.98

GreyMerchant · October 28, 2020, 8:46am

I can't use the data in this format. Please see how to do a reprex, and then I can have a look tonight.

VannyF · October 28, 2020, 1:04pm

Thanks GreyMerchant for your answer,

unfortunately the datapasta does not work. I installed it and tried the following without success:

> library(datapasta)
Warning message:
package ‘datapasta’ was built under R version 3.6.3
> datapasta::dpasta(ts(generationData$Price[0:200]))
> datapasta::dpasta(ts(generationData$Price[0:200]))
> reprox<-datapasta::dpasta(ts(generationData$Price[0:200]))
> head(ts(generationData$Price[0:200]))
[1]  10.07  -4.08  -9.91  -7.41 -12.55 -17.25
> test<-tribble:tribble
Error: object 'tribble' not found
> tribble_paste()
Could not paste clipboard as tibble. Text could not be parsed as table.
NULL
> dpasta(ts(generationData$Price[0:200]))
> dpasta(ts(generationData$Price[0:200]))
> reprox<-dpasta(ts(generationData$Price[0:200]))

So I just copied the 200 entries of the time series into this post. The sample rate is 1 hour, starting on January 1st, 2019, at 0:00.

Data vector:

10.07
-4.08
-9.91
-7.41
-12.55
-17.25
-15.07
-4.93
-6.33
-4.93
0.45
0.12
-0.02
0
-0.03
1.97
9.06
0.07
-4.97
-6.98
-24.93
-4.87
-28.93
-33.57
-45.92
-48.29
-44.99
-48.93
-29.91
-0.01
37.43
48.06
50.74
47.57
43.94
40.97
44.95
49.64
53.67
56.01
56.95
62.08
62.11
57.99
55.64
55.13
50.76
42.91
45.22
45.63
44
43.88
45.92
51.07
52.77
62.89
60.03
58.19
62.99
63.52
64.67
65.24
67.76
68.41
69.55
67.28
69.46
68.38
61.72
53.72
49.98
50.73
47.11
47.07
46.94
47
46.91
49.59
55.32
55.78
55.52
55.23
53.58
51.74
51.6
51.41
51.69
52.59
54.66
54.1
51.89
46.58
45.43
43.96
31.41
26.9
25.12
24.12
22.04
18.37
22.09
23.35
28.76
36.63
40.46
45.85
49.8
51.36
51.74
51.92
53.22
56.62
56.92
61.64
59.44
52.75
51.9
51.38
49.96
50.29
47.72
48.38
48.02
44.23
47.17
48.19
49.11
50.44
53.4
56.55
60.02
60.22
55
52.39
51.57
54.71
63.43
67.37
67.2
66.03
55.36
58.59
53.7
46.03
47.98
47.84
46.11
46.08
47.62
55.77
68.61
74.15
74.93
73.59
71.23
68.79
66.75
62.47
53.25
53.26
53.42
47.91
42.05
40.96
32.04
20.82
1.84
17.94
20.91
7.78
14.33
18.56
18.57
35.81
43.87
46.93
43.88
43.85
46.74
43.94
43.21
43.81
45.6
35.21
45.64
45.63
37.94
39.53
35.97
29.72
22.55
20.04
7.24
3.43
10.04
14.2
25.41
36.98
43.89
50.98
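When datapasta does not cooperate, base R's dput() is a common way to turn a vector into code that can be pasted into a post. A minimal sketch, using only the first five values from the list above; the variable names are placeholders:

```r
# First five prices from the list above, used as a small stand-in.
x <- c(10.07, -4.08, -9.91, -7.41, -12.55)

# dput() prints R code that recreates the object; that text can be pasted into a post.
dput(x)

# deparse() produces the same kind of code as a string, so the round trip can be checked.
txt <- paste(deparse(x), collapse = "")
restored <- eval(parse(text = txt))
print(isTRUE(all.equal(restored, x)))
```

For the real data, the equivalent call would presumably be something like dput(generationData$Price[1:200]).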

GreyMerchant · October 29, 2020, 9:43am

Hello,

See below. I used the data you provided and ran it once with frequency = 2 and once with frequency = 10. As you can see, the decomposition changes with the chosen "cycle". Hopefully this helps. You can explore the rest of the results by inspecting the output object.

library(tidyverse)

df <- c(10.07, -4.08, -9.91, -7.41, -12.55, -17.25, -15.07, -4.93, -6.33, -4.93, 0.45, 0.12, -0.02, 0, -0.03, 1.97, 9.06, 0.07, -4.97, -6.98, -24.93, -4.87, -28.93, -33.57, -45.92, -48.29, -44.99, -48.93, -29.91, -0.01, 37.43, 48.06, 50.74, 47.57, 43.94, 40.97, 44.95, 49.64, 53.67, 56.01, 56.95, 62.08, 62.11, 57.99, 55.64, 55.13, 50.76, 42.91, 45.22, 45.63, 44, 43.88, 45.92, 51.07, 52.77, 62.89, 60.03, 58.19, 62.99, 63.52, 64.67, 65.24, 67.76, 68.41, 69.55, 67.28, 69.46, 68.38, 61.72, 53.72, 49.98, 50.73, 47.11, 47.07, 46.94, 47, 46.91, 49.59, 55.32, 55.78, 55.52, 55.23, 53.58, 51.74, 51.6, 51.41, 51.69, 52.59, 54.66, 54.1, 51.89, 46.58, 45.43, 43.96, 31.41, 26.9, 25.12, 24.12, 22.04, 18.37, 22.09, 23.35, 28.76, 36.63, 40.46, 45.85, 49.8, 51.36, 51.74, 51.92, 53.22, 56.62, 56.92, 61.64, 59.44, 52.75, 51.9, 51.38, 49.96, 50.29, 47.72, 48.38, 48.02, 44.23, 47.17, 48.19, 49.11, 50.44, 53.4, 56.55, 60.02, 60.22, 55, 52.39, 51.57, 54.71, 63.43, 67.37, 67.2, 66.03, 55.36, 58.59, 53.7, 46.03, 47.98, 47.84, 46.11, 46.08, 47.62, 55.77, 68.61, 74.15, 74.93, 73.59, 71.23, 68.79, 66.75, 62.47, 53.25, 53.26, 53.42, 47.91, 42.05, 40.96, 32.04, 20.82, 1.84, 17.94, 20.91, 7.78, 14.33, 18.56, 18.57, 35.81, 43.87, 46.93, 43.88, 43.85, 46.74, 43.94, 43.21, 43.81, 45.6, 35.21, 45.64, 45.63, 37.94, 39.53, 35.97, 29.72, 22.55, 20.04, 7.24, 3.43, 10.04, 14.2, 25.41, 36.98, 43.89, 50.98) %>%  as.data.frame()


TS <- ts(df, frequency = 2)

output <- decompose(TS)


plot.ts(TS)

plot(output)


# Second run: reuse df from the first run, now with frequency = 10.

TS <- ts(df, frequency = 10)

output <- decompose(TS)


plot.ts(TS)

plot(output)

Created on 2020-10-29 by the reprex package (v0.3.0)
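As a rough sketch of what "looking at output" can mean, the object returned by decompose() is a list whose components can be inspected directly. The series below is synthetic (a random walk with an arbitrarily chosen frequency of 10), not the price data from this thread.

```r
set.seed(42)
x <- ts(cumsum(rnorm(60)), frequency = 10)  # synthetic series, frequency chosen arbitrarily
output <- decompose(x)

# Components of the result: x, seasonal, trend, random, figure, type.
print(names(output))

# One seasonal value per position in the cycle.
print(length(output$figure))

# In the additive model, trend + seasonal + random reproduces the input
# wherever the moving-average trend is defined (it is NA at both ends).
recon <- output$trend + output$seasonal + output$random
ok <- !is.na(recon)
print(max(abs(as.numeric(recon)[ok] - as.numeric(x)[ok])))
```

The NA values at both ends of the trend and random components come from the centered moving average, which has no full window there.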

VannyF · October 29, 2020, 1:30pm

Thanks GreyMerchant for your help and effort,

I still do not understand at all why I have to specify a frequency. Basically there is no frequency or any real cycle in the time series I have provided, so the frequency should be 1. In the decomposition plots I can see a seasonal component. This is quite artificial and just not true. The seasonal component should be 0, and R should decompose the time series on that basis.

GreyMerchant · October 29, 2020, 2:08pm

Because time is treated as cyclical in this approach (e.g. night always follows day, and colder months follow warmer ones). If you truly have no meaningful frequency, you probably need to look at other methods for this kind of decomposition. I am not sure exactly what a flat seasonality would imply for the ability to extract a trend and randomness.

Have a read here: https://stats.stackexchange.com/questions/159428/forecasting-with-no-seasonality
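One simple possibility that does not need a frequency is to estimate a trend with a centered moving average and treat the rest as a remainder. This is only a sketch of that idea, not what decompose() does and not a recommendation from the linked answer; the data are synthetic and the window width k is an arbitrary assumption. Note that stats::filter is called with its namespace because loading the tidyverse masks filter() with dplyr's version.

```r
set.seed(7)
x <- cumsum(rnorm(200))  # synthetic series without a built-in cycle

# Centered moving average over k points as a trend estimate (k is an arbitrary choice).
k <- 25
trend <- stats::filter(x, rep(1 / k, k), sides = 2)

# Whatever the trend does not explain is treated as the remainder.
remainder <- x - trend

# The first and last (k - 1) / 2 values are NA because the window does not fit there.
print(sum(is.na(trend)))
```

Whether the remainder actually looks like white noise depends on the data and on k, so it would still need to be checked, for example with acf().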

VannyF · October 29, 2020, 5:48pm

Thanks GreyMerchant for your answer and effort, I really appreciate it.

Is it not possible to just decompose a time series in R without any cycles or frequency? I mean, the decomposition you showed (with the conventional decompose() function) is kind of senseless regarding the cycles. You first specify a frequency, and R then plots (and decomposes) a regular cycle based on your specified frequency. This is not useful at all and pointless. How can one profit from that?
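The behaviour described here can be reproduced on synthetic data. The sketch below assumes a random walk, which has no true cycle, and an arbitrarily chosen frequency of 10. decompose() still returns a seasonal component, because that component is simply the centered average of the detrended values at each position in the assumed cycle.

```r
set.seed(123)
x <- ts(cumsum(rnorm(200)), frequency = 10)  # random walk; frequency = 10 is arbitrary
d <- decompose(x)

# The seasonal component is the same ten values repeated over and over.
s <- as.numeric(d$seasonal)
repeats <- all(abs(s[1:10] - s[11:20]) < 1e-12)
print(repeats)

# Those ten values are per-position averages, centered to sum to zero,
# so they are generally non-zero even when no real cycle exists.
print(d$figure)
centered <- abs(sum(d$figure)) < 1e-8
print(centered)
```

So the size of the seasonal component relative to the trend and remainder, rather than its mere presence, is what indicates whether the chosen frequency is meaningful.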

GreyMerchant · October 29, 2020, 9:51pm

This is going to be my final response on this post.

Time series decomposition specifically means decomposing a series into a trend, seasonality, and remainder like this. A lot of work in time series analysis has found this approach adequate for analyzing and understanding time-sensitive data. You clearly don't want to analyse your data in this way, so select a more appropriate technique for your problem.

I think you're very naive if you want to brand this technique as useless and pointless. A LOT of people use time series and have decomposition as a first step before building their prediction models. To build a good time series model and profit from it is sort of a joke...that is not how any of this works. Even if you had good stock market data, you can't simply add it into a model and start getting accurate predictions or even have certainty, as a lot of models cannot predict what individual players and actors will do depending on their reactions, goals, behaviours, etc.

I think you'd be far better off learning more about these techniques, their theoretical underpinnings, and their use cases.

VannyF · October 30, 2020, 8:59am

Thanks for your answer and help GreyMerchant,

basically I was not questioning the decomposition itself. I was (and I am still) questioning the approach of defining a frequency before the decomposition (although there are no real cycles) and then letting a decomposition function extract cycles based on this pre-specified frequency. I am still of the opinion that this is pointless.

VannyF · November 2, 2020, 2:14pm

Does anyone else have an opinion on the approach of predefining a frequency before the decomposition (although there are no real cycles) and then letting a decomposition function extract cycles based on this pre-specified frequency? As mentioned above, I personally think that this is pointless. So basically my question is whether there are methods that do not need a predefined frequency (if the time series has no cycles) to decompose the time series into a trend and white noise. I'd appreciate any further comments.

VannyF · November 5, 2020, 5:42pm

Does anyone have a remark regarding my last two comments? I'd really appreciate it as I am quite confused at the moment.

VannyF · November 16, 2020, 10:03am

Does it really make sense to specify a frequency for a time series that does not have cycles and then decompose this time series?

VannyF · November 17, 2020, 4:32pm

Are my questions in the last comments unclear or why is nobody replying? If so please tell me what is not clear and I'd try to explain it.

VannyF · November 23, 2020, 8:23am

What do you think about this??????????
