How does ggplot2 determine bins' starting and ending points in a histogram?

I'm trying to understand how geom_histogram() determines binning.

When I run the following code, ggplot(df, aes(body_mass_g)) + geom_histogram(bins = 40), how does ggplot decide the starting and ending points for each bin?

When there are only a few data points in df,

library(ggplot2)
mydf <- data.frame(Sales = c(0, 5, 12, 19, 26, 29, 41, 82, 111, 400))
ggplot(mydf, aes(x = Sales)) + geom_histogram(bins = 5)

it seems to apply binwidth = (max - min) / (bins - 1), with the first bin spanning [min - binwidth/2, min + binwidth/2). In this case binwidth is (400 - 0) / 4 = 100, so the first bin's interval is [-50, 50), and 0, 5, 12, 19, 26, 29, 41 (7 values) are included.
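To make that arithmetic concrete, here is a minimal base-R sketch that rebuilds the bin edges by hand for the Sales values. It only encodes the rule guessed above, not ggplot2's internal code, and it assumes left-closed intervals [a, b) as written in this post. The names sales, breaks and counts are just for illustration.

```r
# Sales values from the example above
sales <- c(0, 5, 12, 19, 26, 29, 41, 82, 111, 400)
bins <- 5

# Rule guessed in this post: width = (max - min) / (bins - 1)
binwidth <- (max(sales) - min(sales)) / (bins - 1)

# Edges start half a bin width below the minimum,
# so the first and last bins are centred on min and max
breaks <- min(sales) - binwidth / 2 + binwidth * (0:bins)

# Count values per bin; right = FALSE gives left-closed intervals [a, b)
# (an assumption taken from the post, not from ggplot2 itself)
counts <- table(cut(sales, breaks = breaks, right = FALSE))

print(binwidth)
print(breaks)
print(counts)
```

To compare these hand-computed edges with what ggplot2 actually used, store the plot in a variable (say p, a name chosen here) and inspect ggplot_build(p)$data[[1]], which has xmin, xmax and count columns for each bin. Which side of each bin is closed is controlled by the closed argument of geom_histogram(), so counts can differ from this sketch when a value falls exactly on an edge.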

However, the same process doesn't seem to apply when there are many data points.
