Plotting over an existing plot with ggplot2

bhorsley · March 5, 2018, 5:27am

Hello,

I have some historical sports data plotted as a boxplot using ggplot2. I'd like to read in another .csv file with some new data and plot those data points on top of the existing boxplot. How can I achieve this?

My code currently reads as:

plot <- ggplot(data = quarterone, aes(x = Player.Name, y = Average.Distance)) +
  geom_boxplot() +
  coord_flip()

Basically, I want to plot the current Average.Distance values from the new dataset against the historical data, to see where each new data point lies within the historical distribution. Once again, any assistance will be greatly appreciated.

Thank you.

mishabalyasin · March 5, 2018, 11:15am

Can you post a reprex with some made up data (or real, if it is not sensitive) so that it is clear what you are trying to achieve?

FAQ: What's a reproducible example (`reprex`) and how do I create one?

But as general advice, you can create a new data frame that contains both datasets. Before combining them, mark each row with the dataset it came from (look into dplyr::bind_rows, whose .id argument supports exactly this). Then plot your data and split it by this new column using color, fill, or whichever aesthetic is most suitable. Will this approach work?
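
Here is a rough sketch of the combined data frame idea. The numbers are made up, new_data is a hypothetical name for whatever you read from the second .csv file, and I'm assuming it has the same Player.Name and Average.Distance columns as quarterone:

```r
library(dplyr)
library(ggplot2)

# Made-up data standing in for the two .csv files (not real values)
set.seed(1)
quarterone <- data.frame(
  Player.Name = rep(c("Player A", "Player B"), each = 10),
  Average.Distance = c(rnorm(10, 100, 5), rnorm(10, 110, 5))
)
new_data <- data.frame(
  Player.Name = c("Player A", "Player B"),
  Average.Distance = c(104, 118)
)

# .id adds a "source" column filled with the names given in the list
combined <- bind_rows(
  list(historical = quarterone, current = new_data),
  .id = "source"
)

# Split the boxplots by the new column
p <- ggplot(combined, aes(x = Player.Name, y = Average.Distance, fill = source)) +
  geom_boxplot() +
  coord_flip()
p
```

Keep in mind that if the new file has only one value per player, the "current" box collapses to a single line, so this works best when the new data has several observations per player.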

Another approach is to pass the new data directly to an individual layer through that layer's data argument. For example, you could add something like + geom_point(data = new_data, color = "red") to your existing plot.
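
A minimal sketch of the per-layer data approach, again with made-up numbers and a hypothetical new_data data frame standing in for your second .csv file. The point layer inherits the x and y mappings from ggplot(), so this assumes new_data uses the same column names as quarterone:

```r
library(ggplot2)

# Made-up data standing in for the two .csv files (not real values)
set.seed(1)
quarterone <- data.frame(
  Player.Name = rep(c("Player A", "Player B"), each = 10),
  Average.Distance = c(rnorm(10, 100, 5), rnorm(10, 110, 5))
)
new_data <- data.frame(
  Player.Name = c("Player A", "Player B"),
  Average.Distance = c(104, 118)
)

# Boxplots come from the historical data; the points come from new_data only
p <- ggplot(data = quarterone, aes(x = Player.Name, y = Average.Distance)) +
  geom_boxplot() +
  geom_point(data = new_data, color = "red", size = 3) +
  coord_flip()
p
```

Each red point should then sit on top of the matching player's box, which shows where the current value falls within the historical distribution.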