I have some historical sports data plotted as a boxplot using ggplot2. I want to read in another .csv file with some new data and plot those data points on top of the existing boxplot. How can I achieve this?
Basically, I want to plot the current Average.Distance value from the new dataset against the historical data to see where that data point lies within the distribution. Once again, any assistance will be greatly appreciated.
Can you post a reprex with some made up data (or real, if it is not sensitive) so that it is clear what you are trying to achieve?
But as general advice, you can create a new data frame that contains both datasets. Before combining them, mark which rows came from which dataset (look into dplyr::bind_rows; its .id argument supports this type of thing). Then you just plot your data and split it by this new column with color, fill, or whatever is most suitable. Will this approach work?
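Here is a rough sketch of that first idea. The data are made up: only the Average.Distance column name comes from your post, while the Position column, the values, and the object names are assumptions for illustration. In practice new_data would come from read.csv() on your own file.

```r
library(dplyr)
library(ggplot2)

set.seed(1)

# Hypothetical historical data: 20 observations per (made-up) position
historical <- data.frame(
  Position = rep(c("Forward", "Back"), each = 20),
  Average.Distance = c(rnorm(20, mean = 110, sd = 8),
                       rnorm(20, mean = 120, sd = 8))
)

# Hypothetical new data; stands in for read.csv() on the new file
new_data <- data.frame(
  Position = c("Forward", "Back"),
  Average.Distance = c(118, 112)
)

# .id adds a "source" column filled with the names of the list elements
combined <- bind_rows(
  list(historical = historical, new = new_data),
  .id = "source"
)

# Split the boxplots by the new marker column
p_combined <- ggplot(combined,
                     aes(x = Position, y = Average.Distance, fill = source)) +
  geom_boxplot()
```

One caveat: if the new dataset has only one observation per group, its "box" collapses to a flat line, so this works best when the new data contain several rows per group.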
Another approach is to add the new data directly to the plot using the data argument that each layer accepts. For example, something like + geom_point(data = new_data, color = "red").
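A minimal sketch of this second approach, again with made-up data (the Position column and all values are assumptions; only Average.Distance is taken from the question). The point layer inherits the plot's aesthetics, so new_data needs the same column names as the historical data.

```r
library(ggplot2)

set.seed(1)

# Hypothetical historical data: 20 observations per (made-up) position
historical <- data.frame(
  Position = rep(c("Forward", "Back"), each = 20),
  Average.Distance = c(rnorm(20, mean = 110, sd = 8),
                       rnorm(20, mean = 120, sd = 8))
)

# Hypothetical new data; stands in for read.csv() on the new file
new_data <- data.frame(
  Position = c("Forward", "Back"),
  Average.Distance = c(118, 112)
)

# Boxplots come from the historical data; the points come from new_data
p <- ggplot(historical, aes(x = Position, y = Average.Distance)) +
  geom_boxplot() +
  geom_point(data = new_data, color = "red", size = 3)
```

Printing p should show each new value as a red point over the matching historical box, which makes it easy to see where it falls in the distribution.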