I need help with plotting this time series

Samridhi · June 1, 2022, 1:04pm

library(readxl)
Index1 <- read_excel("C:/Users/Samridhi/Desktop/Index1.xlsx",
sheet = "in", col_types = c("text"))
View(Index1)
Index1TS <- ts(Index1, frequency = 4, start=c(2016,1))
IndexTS
plot.ts("Index1TS")

plot.ts("Index1TS")
Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values
In addition: Warning messages:
1: In xy.coords(x, NULL, log = log, setLab = FALSE) :
NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

startz · June 1, 2022, 1:43pm

Probably want to leave the quotes off the argument.

Samridhi · June 1, 2022, 2:11pm

I can't plot this time series perfectly. With your suggestion I can see the graph but it is a straight line. Could you possibly help with the full coding of this time series. Like from importing in excel to plotting. I can then check my code and see what is the issue

dvetsch75 · June 1, 2022, 3:02pm

The Problems

There are a few things going on here that are giving you problems:

For context, I am using this as a proxy for your data:

index1 <- data.frame(
    'date_price_index' = paste(
        seq.Date(
            from = as.Date('2016-01-01'),
            by = 'quarter',
            length.out = 24
        ),
        sapply(
            1:24,
            function(x) {
                returns_range <- seq(
                    from = -0.001,
                    to = 0.01,
                    length.out = 1000
                )
                111 * (1 + sample(returns_range, size = 1))^x
            }
        ),
        sep = ';'
    )
)
index1
#>               date_price_index
#> 1  2016-01-01;112.006111111111
#> 2  2016-04-01;112.307159184184
#> 3  2016-07-01;113.389950465601
#> 4  2016-10-01;112.425480537166
#> 5  2017-01-01;111.601298700539
#> 6   2017-04-01;116.71557001979
#> 7  2017-07-01;116.362484627963
#> 8  2017-10-01;118.054473986592
#> 9  2018-01-01;110.562767358832
#> 10 2018-04-01;117.562782686146
#> 11 2018-07-01;123.705592755229
#> 12 2018-10-01;115.529226680025
#> 13 2019-01-01;121.544176890339
#> 14 2019-04-01;114.000345177666
#> 15 2019-07-01;111.040006727427
#> 16 2019-10-01;118.818536055432
#> 17 2020-01-01;125.988584972734
#> 18 2020-04-01;113.336921133243
#> 19 2020-07-01;126.138732856221
#> 20 2020-10-01;134.528549634179
#> 21 2021-01-01;113.547430684347
#> 22  2021-04-01;113.64268185571
#> 23 2021-07-01;131.021882107549
#> 24 2021-10-01;124.069172217448

^{Created on 2022-06-01 by the reprex package (v1.0.0)}

First, I am doubtful that you want col_types = c("text") in read_excel. Assuming you are using readxl::read_excel, you probably want either col_types = c('Date', 'double') or, even better yet, just leave that argument out. read_excel will guess the type for you and judging by your data, it will probably be able to guess correctly.

Second, look at what feeding your dataframe to ts ends up creating:

ts(index1, frequency = 4, start = c(2016, 1))
#>      Qtr1 Qtr2 Qtr3 Qtr4
#> 2016    1    2    3    4
#> 2017    5    6    7    8
#> 2018    9   10   11   12
#> 2019   13   14   15   16
#> 2020   17   18   19   20
#> 2021   21   22   23   24

That's why your plot ends up looking like a straight line.

The Solution

I would recommend starting with tidyverse tools before getting into base R tools, since base R tools are syntactically more similar to traditional statistical languages, whereas tidyverse tools tend to be closer to natural language.

So to do what you are trying to do with tidy tools, I would do something like this:


# Load tidyr for separating by delimiter

library(tidyr)
library(ggplot2)
library(dplyr)
library(scales)

index1_sep <- index1 %>% 
    separate(
        col = 'date_price_index',
        sep = ';',
        into = c('date', 'price_index')
    ) %>% 
    mutate(
        date = as.Date(date),
        price_index = as.numeric(price_index)
    )

head(index1_sep)
#>         date price_index
#> 1 2016-01-01    111.9719
#> 2 2016-04-01    112.1425
#> 3 2016-07-01    111.0557
#> 4 2016-10-01    114.0333
#> 5 2017-01-01    110.5618
#> 6 2017-04-01    114.1397

ggplot(index1_sep, aes(x = date, y = price_index, group = T)) +
    geom_line() +
    scale_x_date(
        date_breaks = '3 months',
        label = date_format(format = '%m-%Y')
    )

Samridhi · June 2, 2022, 10:20am

Thank you for your help. There is another issue that I am facing now.

The code
EstFund_LT <- read.csv("C:\Users\Samridhi\Desktop\New folder\Lithuania Building Prices.csv", sep= ";")
EstFund_LT

EstFund_LT_TS <- ts(EstFund_LT, frequency = 4, start=c(2016,1))
EstFund_LT_TS

plot.ts(EstFund_LT_TS, aes(x = date, y = price_index, group = T)) +
geom_line() +
scale_x_date(
date_breaks = '3 months',
label = date_format(format = '%m-%Y'))

The data
EstFund_LT
DATE Price.index
1 2016-01-01 111.0524
2 2016-04-01 113.2230
3 2016-07-01 117.9778
4 2016-10-01 118.7229
5 2017-01-01 119.0881
6 2017-04-01 120.5646
7 2017-07-01 122.7022
8 2017-10-01 121.7658
9 2018-01-01 124.1870
10 2018-04-01 126.1828
11 2018-07-01 127.7915
12 2018-10-01 127.6438
13 2019-01-01 131.1385
14 2019-04-01 131.0206
15 2019-07-01 132.7060
16 2019-10-01 133.2458
17 2020-01-01 135.9200
18 2020-04-01 139.0751
19 2020-07-01 139.7961
20 2020-10-01 145.0123
21 2021-01-01 150.9321
22 2021-04-01 152.6017
23 2021-07-01 157.6510
24 2021-10-01 159.0404

EstFund_LT_TS
DATE Price.index
2016 Q1 1 111.0524
2016 Q2 2 113.2230
2016 Q3 3 117.9778
2016 Q4 4 118.7229
2017 Q1 5 119.0881
2017 Q2 6 120.5646
2017 Q3 7 122.7022
2017 Q4 8 121.7658
2018 Q1 9 124.1870
2018 Q2 10 126.1828
2018 Q3 11 127.7915
2018 Q4 12 127.6438
2019 Q1 13 131.1385
2019 Q2 14 131.0206
2019 Q3 15 132.7060
2019 Q4 16 133.2458
2020 Q1 17 135.9200
2020 Q2 18 139.0751
2020 Q3 19 139.7961
2020 Q4 20 145.0123
2021 Q1 21 150.9321
2021 Q2 22 152.6017
2021 Q3 23 157.6510
2021 Q4 24 159.0404
The Time series is splitting date and including month and year on x axis and date on y axis.

The plot

I want to fix the data.frame and plot and have date on x axis properly and price index on y axis.

Samridhi · June 2, 2022, 1:05pm

The code
EstFund_LT <- read.csv("C:\Users\Samridhi\Desktop\New folder\Lithuania Building Prices.csv", sep= ";")
EstFund_LT

EstFund_LT_TS <- ts(EstFund_LT, frequency = 4, start=c(2016,1))
EstFund_LT_TS

plot.ts(EstFund_LT_TS, aes(x = date, y = price_index, group = T)) +
geom_line() +
scale_x_date(
date_breaks = '3 months',
label = date_format(format = '%m-%Y'))

The data
EstFund_LT
DATE Price.index
1 2016-01-01 111.0524
2 2016-04-01 113.2230
3 2016-07-01 117.9778
4 2016-10-01 118.7229
5 2017-01-01 119.0881
6 2017-04-01 120.5646
7 2017-07-01 122.7022
8 2017-10-01 121.7658
9 2018-01-01 124.1870
10 2018-04-01 126.1828
11 2018-07-01 127.7915
12 2018-10-01 127.6438
13 2019-01-01 131.1385
14 2019-04-01 131.0206
15 2019-07-01 132.7060
16 2019-10-01 133.2458
17 2020-01-01 135.9200
18 2020-04-01 139.0751
19 2020-07-01 139.7961
20 2020-10-01 145.0123
21 2021-01-01 150.9321
22 2021-04-01 152.6017
23 2021-07-01 157.6510
24 2021-10-01 159.0404

EstFund_LT_TS
DATE Price.index
2016 Q1 1 111.0524
2016 Q2 2 113.2230
2016 Q3 3 117.9778
2016 Q4 4 118.7229
2017 Q1 5 119.0881
2017 Q2 6 120.5646
2017 Q3 7 122.7022
2017 Q4 8 121.7658
2018 Q1 9 124.1870
2018 Q2 10 126.1828
2018 Q3 11 127.7915
2018 Q4 12 127.6438
2019 Q1 13 131.1385
2019 Q2 14 131.0206
2019 Q3 15 132.7060
2019 Q4 16 133.2458
2020 Q1 17 135.9200
2020 Q2 18 139.0751
2020 Q3 19 139.7961
2020 Q4 20 145.0123
2021 Q1 21 150.9321
2021 Q2 22 152.6017
2021 Q3 23 157.6510
2021 Q4 24 159.0404
The Time series is splitting date and including month and year on x axis and date on y axis.

The plot

I want to fix the data.frame and plot and have date on x axis properly and price index on y axis.

dvetsch75 · June 7, 2022, 8:26pm

Try using ggplot2::ggplot instead of plot.ts. Here's a good link where you can learn how to use ggplot.

williaml · June 8, 2022, 6:50am

Also, please provide reproducible examples next time. We don't have your dataset.

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

A minimal reproducible example consists of the following items: A minimal dataset, necessary to reproduce the issue The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages. Let's quickly go over each one of these with examples: Minimal Dataset (Sample Data) You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame head(iris) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 1 5.1 3.5 1.4 0.…

system · June 29, 2022, 6:50am

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.