Let's look at the help:

In `?geom_histogram`:

Computed variables

These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.

- `after_stat(count)`: number of points in bin.
- `after_stat(density)`: density of points in bin, scaled to integrate to 1.
- `after_stat(ncount)`: count, scaled to a maximum of 1.
- `after_stat(ndensity)`: density, scaled to a maximum of 1.
- `after_stat(width)`: widths of bins.

In `?geom_density`:

Computed variables

These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.

- `after_stat(density)`: density estimate.
- `after_stat(count)`: density * number of points - useful for stacked density plots.
- `after_stat(scaled)`: density estimate, scaled to maximum of 1.
- `after_stat(n)`: number of points.
- `after_stat(ndensity)`: alias for `scaled`, to mirror the syntax of `stat_bin()`.

So you want both layers to be on the same scale. In your code, `geom_histogram()` maps `y` to the cumulative sum of `..count..`, which is the number of points in each bin, so the last value of that cumulative sum is the total number of points (the number of rows in `data`).

We can make that clearer with example data:

```
library(ggplot2)
set.seed(123)
data <- data.frame(Service = rpois(100, lambda = 1000))
head(data)
#>   Service
#> 1     982
#> 2    1037
#> 3     946
#> 4    1004
#> 5    1054
#> 6    1014
num_bins <- ceiling(1 + log2(nrow(data)))
gg <- ggplot(data, aes(x=Service)) +
  geom_histogram(aes(y=cumsum(..count..)),
                 bins=num_bins, fill="skyblue", color="black")
layer_data(gg) |>
  dplyr::select(x, y, count, density, ncount, ndensity)
#> Warning: The dot-dot notation (`..count..`) was deprecated in ggplot2 3.4.0.
#> ℹ Please use `after_stat(count)` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
#>           x   y count     density     ncount   ndensity
#> 1  941.1429   2     2 0.001147541 0.07142857 0.07142857
#> 2  958.5714   8     6 0.003442623 0.21428571 0.21428571
#> 3  976.0000  26    18 0.010327869 0.64285714 0.64285714
#> 4  993.4286  52    26 0.014918033 0.92857143 0.92857143
#> 5 1010.8571  80    28 0.016065574 1.00000000 1.00000000
#> 6 1028.2857  90    10 0.005737705 0.35714286 0.35714286
#> 7 1045.7143  97     7 0.004016393 0.25000000 0.25000000
#> 8 1063.1429 100     3 0.001721311 0.10714286 0.10714286
```

Created on 2023-12-26 with reprex v2.0.2

So as you can see here, the maximum of `y` is `100`, which is the number of rows in `data`, because `count` contains the number of rows in a bin.
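The definitions quoted from the help can also be checked numerically against the layer data. The following is a minimal sketch, assuming the same simulated `data` as above. The names `d` and `width` are introduced here purely for illustration, and the bin width is recovered from the `xmin` and `xmax` columns that `layer_data()` returns for the histogram layer.

```r
library(ggplot2)

set.seed(123)
data <- data.frame(Service = rpois(100, lambda = 1000))
num_bins <- ceiling(1 + log2(nrow(data)))

gg <- ggplot(data, aes(x = Service)) +
  geom_histogram(aes(y = after_stat(cumsum(count))), bins = num_bins)

d <- layer_data(gg)
width <- d$xmax - d$xmin  # bin widths, recovered from the bin edges

# The cumulative count ends at the number of rows in `data`
max(d$y) == nrow(data)

# `density` integrates to 1 over the bins
all.equal(sum(d$density * width), 1)

# `ncount` and `ndensity` are the count and density rescaled to a maximum of 1
all.equal(d$ncount, d$count / max(d$count))
all.equal(d$ndensity, d$density / max(d$density))
```

Each of the four checks should evaluate to `TRUE` if the computed variables behave as the help page describes.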

If you want the same scale for `geom_density()`, its maximum therefore needs to be the number of points, `..n..`. The `..scaled..` density has a maximum of 1, so multiplying it by `..n..` gives a curve that peaks at the number of points and stays on the same scale as the histogram:

```
ggplot(data, aes(x=Service)) +
  geom_histogram(aes(y=cumsum(..count..)),
                 bins=num_bins, fill="skyblue", color="black") +
  geom_density(aes(y=..scaled..*..n..), color="red")
```
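
Since the dot-dot notation is deprecated as of ggplot2 3.4.0 (see the warning in the reprex output above), the same plot can be written with `after_stat()`. This is a sketch under the assumption that your data resembles the simulated `data` used in this answer. The object names `p`, `hist_layer` and `dens_layer` are illustrative only.

```r
library(ggplot2)

set.seed(123)
data <- data.frame(Service = rpois(100, lambda = 1000))
num_bins <- ceiling(1 + log2(nrow(data)))

p <- ggplot(data, aes(x = Service)) +
  # cumulative histogram: y is the running total of the bin counts
  geom_histogram(aes(y = after_stat(cumsum(count))),
                 bins = num_bins, fill = "skyblue", color = "black") +
  # density rescaled so that its peak equals the number of points
  geom_density(aes(y = after_stat(scaled * n)), color = "red")

# Inspect the computed y values of each layer
hist_layer <- layer_data(p, 1)
dens_layer <- layer_data(p, 2)

# Both maxima should match the number of rows in `data`
max(hist_layer$y)
max(dens_layer$y)
```

Printing `p` draws the plot. Comparing the two maxima is a quick way to confirm that both layers share the same upper bound before you look at the figure.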