Rmarkdown geom_line + geom_point problem

Hi,

I am new to R and I have a problem with my script. I want to create a plot with points and a line, but I cannot get geom_line to work.
The code is quite messy, but I cannot modify too much of it.

data_vector <- list(c('10-10-2018', '25-10-2018', '9-11-2018', '10-10-2019', '25-10-2019', '9-11-2019'))


df1 <- data.frame(dates =  as.Date(unlist(data_vector[1]), format = '%d-%m'))


chart1 <- ggplot(data = df1, aes(x = dates, y = as.numeric(list(21.900,22.700,23.400,NA,NA,NA))), group = 1)+
        geom_line(size = 0.5) + geom_point(size = 2) + theme_bw()

chart1

Result:

If I replace NA values with some random values, everything works fine.

I'm sure this is some silly problem, but I cannot solve it. I have already spent a lot of time on this issue.

What is not working? Or rather, what do you think is not working? You get a warning telling you that three NAs can't be plotted. Did you expect them to be plotted? Or am I missing some other error?

Secondly, is that code a simplified representation of your real code, or is it your real code?

I'm not loving the list, unlist. But I'm very much not loving the y= section. Usually you'd put the data all in one "frame" or a tibble before plotting. Then if you don't want warnings, filter the NAs off...

I've cleaned up your code a little, but you can revert to the original version if you wish.
The important difference is the format argument in as.Date(). With format = '%d-%m', the year digits were dropped and replaced by the current year. As a result, you had 3 distinct values for x and, for each of them, 2 values for y (a number and NA). Because you put all of them in a single group, what you asked ggplot to do is draw a line from (x1, y1) to (x1, "I don't know where") to (x2, y2), and so on.

suppressMessages(library(tidyverse))

data_vector <- c('10-10-2018', '25-10-2018', '9-11-2018', '10-10-2019', '25-10-2019', '9-11-2019')


df1 <- data.frame(dates =  as.Date(data_vector, format = '%d-%m-%Y'),
                  y = as.numeric(c(21.900,22.700,23.400,NA,NA,NA)))

chart1 <- ggplot(data = df1, aes(x = dates, y = y,  group = 1))+
  geom_line(size = 0.5) + geom_point(size = 2) + theme_bw()


chart1
#> Warning: Removed 3 row(s) containing missing values (geom_path).
#> Warning: Removed 3 rows containing missing values (geom_point).
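
To make the effect of the format string concrete, here is a small base R sketch (the names dates_chr, no_year and with_year are only illustrative). It relies on the documented behaviour that as.Date() ignores trailing characters beyond the format and fills in an unspecified year with the current one.

```r
# The six date strings from the question
dates_chr <- c('10-10-2018', '25-10-2018', '9-11-2018',
               '10-10-2019', '25-10-2019', '9-11-2019')

# Original format: the trailing "-2018" / "-2019" is ignored and the
# current year is filled in, so the dates collapse into three values
no_year <- as.Date(dates_chr, format = '%d-%m')

# Corrected format: the year is parsed, so all six dates stay distinct
with_year <- as.Date(dates_chr, format = '%d-%m-%Y')

length(unique(no_year))    # 3 distinct x values
length(unique(with_year))  # 6 distinct x values
```

With only three distinct x values, every x carries both a number and an NA, which is the situation described above.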

Here is an example of how the line can look depending on whether the NAs are consecutive or not:

suppressMessages(library(tidyverse))

df <- data.frame(x = 1:10,
                 y = c(1, 2, NA, 4, 5, 6, NA, 8, NA, 10))

ggplot(df, aes(x = x,
               y = y,
               group = 1)) +
  geom_point() + 
  geom_line()
#> Warning: Removed 3 rows containing missing values (geom_point).

Normally I would put the data in a single data.frame, but unfortunately I cannot do that. (Long story short, the full code is going to be generated as a string in a loop and then executed via eval(parse(text)).) This is the part I cannot change. I also left out the year on purpose. In the end I want several lines on the same plot, with (not entirely true, but) each line from a different year. You can see that the real data in this case is only for 2018. For 2019 there are only NAs:
10-10-2018 - 21.900
25-10-2018 - 22.700
9-11-2018 - 23.400
10-10-2019 - NA
25-10-2019 - NA
9-11-2019 - NA

There will be a different list of values for 2019, etc.

For example, there will be dates:
('10-10-2018', '25-10-2018', '9-11-2018', '10-10-2019', '25-10-2019', '9-11-2019', '12-10-2020', '27-10-2020', '11-11-2020')
and different values (each for a different year, but visible on the same plot):
c(21.900,22.700,23.400,NA,NA,NA,NA,NA,NA)
c(NA,NA,NA,21.100,21.000,21.200,NA,NA,NA)
c(NA,NA,NA,NA,NA,NA,22.100,22.200,22.300)

That is why the order of the data is important and why I did not get rid of the NA values.

So, 3 lines, each drawn only for the existing values but with the correct day-month.

To be honest, I didn't get why you can't make a data frame from your strings before creating the plot.

This is an example of how you can make a line plot while keeping the NAs (but you need an extra grouping variable):

suppressMessages(library(tidyverse))
suppressMessages(library(lubridate))

data_vector <- c('10-10-2018', '25-10-2018', '9-11-2018', '10-10-2019', '25-10-2019', '9-11-2019', '12-10-2020', '27-10-2020', '11-11-2020')
dates <- as.Date(data_vector, format = '%d-%m-%Y')

y1 <- c(21.900,22.700,23.400,NA,NA,NA,NA,NA,NA)
y2 <- c(NA,NA,NA,21.100,21.000,21.200,NA,NA,NA)
y3 <- c(NA,NA,NA,NA,NA,NA,22.100,22.200,22.300)

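# Stack the three series into long vectors: one date and one group id per value
x <- rep(dates, times = 3)
y <- c(y1, y2, y3)
gr <- rep(1:3, each = 9)

# update(x, year = 1) puts every date in the same year so the series overlap
ggplot(data = NULL, aes(x = update(x, year = 1),
                        y = y,
                        group = interaction(gr, lubridate::year(x)))) +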
  geom_point() + 
  geom_line()
#> Warning: Removed 18 rows containing missing values (geom_point).
#> Warning: Removed 18 row(s) containing missing values (geom_path).
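
If a data frame can be built after all, the same plot could be sketched as below. This is only one possible variant, not the approach used above: the names df, year and day_month are made up for illustration, base R's format() stands in for lubridate::update(), and 2000 is an arbitrary placeholder year (a leap year, so 29 February would still be valid). The NAs are filtered out before plotting, so no warnings are expected.

```r
library(ggplot2)

dates <- as.Date(c('10-10-2018', '25-10-2018', '9-11-2018',
                   '10-10-2019', '25-10-2019', '9-11-2019',
                   '12-10-2020', '27-10-2020', '11-11-2020'),
                 format = '%d-%m-%Y')

y1 <- c(21.900, 22.700, 23.400, NA, NA, NA, NA, NA, NA)
y2 <- c(NA, NA, NA, 21.100, 21.000, 21.200, NA, NA, NA)
y3 <- c(NA, NA, NA, NA, NA, NA, 22.100, 22.200, 22.300)

# Long format: one row per (date, value) pair, then drop the NA rows
df <- data.frame(date = rep(dates, times = 3), y = c(y1, y2, y3))
df <- df[!is.na(df$y), ]

# Year used for grouping/colour; day and month moved onto a placeholder year
df$year <- format(df$date, '%Y')
df$day_month <- as.Date(format(df$date, '2000-%m-%d'))

p <- ggplot(df, aes(x = day_month, y = y, colour = year, group = year)) +
  geom_line() +
  geom_point() +
  scale_x_date(date_labels = '%d %b') +  # hide the placeholder year on the axis
  theme_bw()

p
```

Grouping by the real year keeps each series as its own line, while the shared placeholder year lines the series up by day and month.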

Thank you very much for the help. This is what I was looking for. I also found another solution: using geom_path() instead of geom_line(). Anyway, thanks a lot.
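
A likely reason geom_path() helps here, sketched under assumptions: geom_line() connects points in order of x, while geom_path() connects them in the order they appear in the data, and an NA in the middle of a line breaks it. The data below imitate the original situation (three x values, each appearing once with a number and once with NA). The fixed year 2000 and the names df, y_sorted, p_line and p_path are only illustrative.

```r
library(ggplot2)

# Three dates repeated twice: first with real values, then with NAs
df <- data.frame(
  x = rep(as.Date(c('2000-10-10', '2000-10-25', '2000-11-09')), times = 2),
  y = c(21.9, 22.7, 23.4, NA, NA, NA)
)

# Sorted by x (the order geom_line() uses), every value is followed by an NA,
# so no two valid points are adjacent and no segment can be drawn
y_sorted <- df$y[order(df$x)]
y_sorted

# In data order (the order geom_path() uses), the three values come first
# and the NAs only trail at the end
df$y

p_line <- ggplot(df, aes(x = x, y = y, group = 1)) + geom_line() + geom_point()
p_path <- ggplot(df, aes(x = x, y = y, group = 1)) + geom_path() + geom_point()
```

Printing p_line and p_path side by side should show points only in the first plot and points joined by a line in the second.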
