This can also be done with a single call to stat_summary, though it takes a bit of extra work to set it up to use colour and linetype aesthetics. To use stat_summary, I've created a simple function to return the mean, median, and mode. I've also created a helper function to calculate the mode. The function uses a kernel density estimate to estimate the mode and it returns only one mode. With some extra work the various functions and the plot code could be adjusted to account for multiple modes.
This is not an answer, but my viewpoints on some of the things I noticed in this thread. Please note that I mean no disrespect to anyone.
This is a wrong R code. If you want to execute multiple lines in a single line, use ; instead of ,.
However, as jcblum pointed out in a DM, though it is possible to use of ;'s to put multiple statements in one line, this is not encouraged. Check out the links below:
For a continuous variable, mode is defined as the value which corresponds to the maximum probability density. mpg is not a discrete variable, and the concept of frequency doesn't make much sense here.
This will not be able to handle multiple modes. which.max will always return index corresponding to the first maximum, and hence we'll always get a single answer. However, I cannot suggest a better alternative, as even if I try d$x[d$y == max(d$y)], there is no guarantee that we'll get multiple answers.
Illustration
set.seed(seed = 37206)
ModeJoels = function(d, adj=0.75) {
d = density(d, adjust=adj)
d = data.frame(x=d$x, y=d$y)
d$x[which.max(d$y)]
} # can't find multiple modes
ModeYarnabrina <- function(dataset, adjustChoice = 0.75)
{
with(data = density(x = dataset, adjust = adjustChoice), expr = {x[y == max(y)]})
} # no guarantee to find multiple modes (and indeed probability of finding multiple modes is very small)
p <- sample(x = 0:1, size = 1000, replace = TRUE)
q <- rnorm(n = 1000, mean = 5)
r <- rnorm(n = 1000, mean = -5)
x <- ifelse(test = p, yes = q, no = r)
ModeJoels(d = x)
#> [1] -5.102445
ModeYarnabrina(dataset = x)
#> [1] -5.102445
hist(x = x)
abline(v = -5.102445, col = "red")
One more edit (and hopefully the final one)
First of all, I apologise to criticise joels's post. He already mentioned about finding only one mode, and I missed that altogether. Sorry!
Second, I was searching online to find multiple modes, and the most confusing part of that is mode is not clearly defined for distributions with multiple modes. Here's a small excerpt from the Wikipedia page:
If we assume this definition, then perhaps one can use quantmod::findPeaks or ggpmisc::stat_peaks functions on the estimated density. The main idea of these functions is to consider the derivative and find points where sign of the derivative changes to negative from positive. This is not foolproof, but it usually works. I've provided links to the source codes and a related thread from Cross Validated below:
I mentioned in my answer that the function will return only one mode, although for a kernel density estimate there's little chance of there being more than one mode (if mode is understood to be the globally most likely value(s), though there may be many local maxima). With continuous data, the location of the mode(s) will depend on the details of the mode calculation method (e.g., the bandwidth and kernel in the case of a kernel density estimate).