Hello, I have made this graph and I would like to add two horizontal lines that cross the whole graph: one for the overall mean and one for the overall median.

```
library(dplyr)    # provides %>%
library(ggplot2)
library(ggeasy)   # provides easy_rotate_labels()

DATA %>%
  ggplot(aes(x = CNT, y = PVTOTAL)) +
  geom_boxplot() +
  easy_rotate_labels()
```

Here is a simple example of what I think you want.

```
DF <- data.frame(Name = rep(c("A","B","C"), 30), Value = rnorm(90))
head(DF)
#>   Name      Value
#> 1    A  0.5969224
#> 2    B  1.1423760
#> 3    C  0.1074961
#> 4    A -1.5284152
#> 5    B -0.7479784
#> 6    C -0.3417590
DF_Stats <- data.frame(Stat = c("Mean","Median"),
                       Stat_Value = c(mean(DF$Value), median(DF$Value)))
DF_Stats
#>     Stat Stat_Value
#> 1   Mean -0.2168492
#> 2 Median -0.3358145
library(ggplot2)
ggplot(DF, aes(Name, Value)) + geom_boxplot() +
  geom_hline(data = DF_Stats,
             mapping = aes(yintercept = Stat_Value, color = Stat))
```

<sup>Created on 2023-05-19 with reprex v2.0.2</sup>

Hi @FJCC,

Is it possible to add labels showing the actual values of the mean and median? It is difficult to work out their values from the plot.

Is it also possible to add the mean of each group in yellow, so that we have it in addition to the grand mean?

Here is one version of displaying the values of the global mean and median and adding a data point to each boxplot showing the mean.

```
DF <- data.frame(Name = rep(c("A","B","C"), 30), Value = rnorm(90))
head(DF)
#>   Name       Value
#> 1    A -1.05411137
#> 2    B -0.67697338
#> 3    C -0.80538918
#> 4    A  0.34424211
#> 5    B -0.05592012
#> 6    C -0.41565611
DF_Stats <- data.frame(Stat = c("Mean","Median"),
                       Stat_Value = c(mean(DF$Value), median(DF$Value)))
DF_Stats
#>     Stat Stat_Value
#> 1   Mean -0.1115978
#> 2 Median -0.1412545
library(ggplot2)
ggplot(DF, aes(Name, Value)) + geom_boxplot() +
  geom_hline(data = DF_Stats,
             mapping = aes(yintercept = Stat_Value, color = Stat)) +
  stat_summary(fun = mean, geom = "point") +
  geom_text(aes(x = c(1.5, 2.5), y = 1, label = round(Stat_Value, 3), color = Stat),
            data = DF_Stats, show.legend = FALSE)
```

<sup>Created on 2023-05-20 with reprex v2.0.2</sup>

Thank you. When I change `geom = "point"` to `geom = "line"` in `stat_summary()`, nothing is displayed.

Actually, the following warning is displayed:

"`geom_line()`: Each group consists of only one observation. Do you need to adjust the group aesthetic?"

Why is that?

Because the x axis is categorical, ggplot assumes grouping by that variable. Since there is only one value in each group once `stat_summary()` has computed the mean, a line cannot be drawn. This can be fixed by including `group = 1` in the `aes()`.

```
ggplot(DF, aes(Name, Value)) + geom_boxplot() +
  geom_hline(data = DF_Stats,
             mapping = aes(yintercept = Stat_Value, color = Stat)) +
  stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
  geom_text(aes(x = c(1.5, 2.5), y = 1, label = round(Stat_Value, 3), color = Stat),
            data = DF_Stats, show.legend = FALSE)
```

How come the median inside each box (the black line) can be drawn as a line, but the mean cannot, and it only works when the geom is set to "point"?

I do not understand "Each group consists of only one observation", as there are 3 groups (A, B, C) in the Name variable.

What am I missing?

My desired result:

The mean can be a line; it is just more work. You can use `geom_linerange()`. The first category on the x axis is at position 1, the second is at 2, and so on. I used this fact to add an X column to the Means data frame.

There may be a better way to do this; this approach is very manual.

```
library(ggplot2)
library(dplyr)
DF <- data.frame(Name = rep(c("A","B","C"), 30), Value = rnorm(90))
head(DF)
DF_Stats <- data.frame(Stat = c("Mean","Median"),
                       Stat_Value = c(mean(DF$Value), median(DF$Value)))
# One mean per group; X is the numeric position of each category on the x axis
Means <- DF |> group_by(Name) |> summarize(Mean = mean(Value)) |>
  mutate(X = 1:3)
ggplot(DF, aes(Name, Value)) + geom_boxplot(width = 0.8) +
  geom_hline(data = DF_Stats,
             mapping = aes(yintercept = Stat_Value, color = Stat)) +
  # Horizontal segment spanning the box width (0.8) at each group mean;
  # linewidth replaces size for lines as of ggplot2 3.4.0
  geom_linerange(data = Means,
                 mapping = aes(x = Name, y = Mean, xmin = X - 0.4, xmax = X + 0.4),
                 color = "blue", linewidth = 1) +
  geom_text(aes(x = c(1.5, 2.5), y = 1, label = round(Stat_Value, 3), color = Stat),
            data = DF_Stats, show.legend = FALSE)
```
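As a possibly less manual alternative, here is a minimal sketch that lets `stat_summary()` compute the group means and draws each one as a flat error bar, so no X column is needed. It assumes ggplot2 3.4.0 or later (for `linewidth`; `after_stat()` needs 3.3.0). The seed and the object name `p` are only for illustration, and I have not compared the result closely against the `geom_linerange()` version.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
DF <- data.frame(Name = rep(c("A", "B", "C"), 30), Value = rnorm(90))

p <- ggplot(DF, aes(Name, Value)) +
  geom_boxplot(width = 0.8) +
  # stat_summary() computes one mean per category; mapping both ymin and
  # ymax to that mean collapses the error bar into one horizontal segment
  stat_summary(fun = mean, geom = "errorbar",
               mapping = aes(ymin = after_stat(y), ymax = after_stat(y)),
               width = 0.8, color = "blue", linewidth = 1)
p
```

Setting `width = 0.8` in both layers keeps the mean segment the same width as the box.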

My comment "Each group consists of only one observation" refers to the data after stat_summary has calculated the mean. There is only one mean value at each x position.
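To make that concrete, here is a small base R sketch of the kind of summary `stat_summary(fun = mean)` hands to the geom. It is only an illustration of the idea, not what ggplot2 runs internally, and the name `Group_Means` and the seed are assumptions of mine.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
DF <- data.frame(Name = rep(c("A", "B", "C"), 30), Value = rnorm(90))

# Collapse the 90 observations to one mean per category
Group_Means <- aggregate(Value ~ Name, data = DF, FUN = mean)
Group_Means

# Each category now holds exactly one row, so a line drawn within a
# category has a single point and nothing to connect it to
table(Group_Means$Name)
```

With `group = 1`, all three rows are treated as one group, which is why `geom = "line"` can then connect them.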

Thank you for patiently explaining this to me. I did not know about geom_linerange(), so I have learnt something new. Greatly appreciated.
