Add regression line in geom_boxplot

Hi all!
I have plotted several boxplots on the same figure, with the mean of the values (red dots) shown in the middle of each boxplot. I want to add a regression line with "geom_abline", but it does not appear. How can I resolve this?

Thanks

Can you post a reproducible example with your data and code? I can accomplish this using the following code.

library("ggplot2")

set.seed(123)

dat <- data.frame(
  x = rep(1:5, 10),
  y = runif(n = 50)
)

coefs <- coef(lm(y ~ x, data = dat))

ggplot(dat, aes(x, y, group = x)) +
  geom_boxplot() +
  geom_abline(intercept = coefs[1], slope = coefs[2])

Created on 2018-05-29 by the reprex package (v0.2.0).
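
As an aside, here is a minimal sketch of an alternative that lets ggplot2 fit and draw the line itself, so no coefficients have to be passed by hand. It reuses the simulated `dat` from above; the `group = 1` override is my own assumption about what is wanted here (one line fitted across all boxes rather than one per box).

```r
library("ggplot2")

set.seed(123)

# Same simulated data as in the example above
dat <- data.frame(
  x = rep(1:5, 10),
  y = runif(n = 50)
)

p <- ggplot(dat, aes(x, y, group = x)) +
  geom_boxplot() +
  # group = 1 overrides the per-box grouping, so a single linear fit
  # is drawn through all points; se = FALSE hides the confidence band
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x, se = FALSE)

p
```

Because the line comes from the same data as the boxplots, it always falls inside the plotted range. `geom_abline` remains the better fit when the slope and intercept come from somewhere else.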

Thanks a lot for your response,
Here is my code:
ggplot(boxplot) +
  aes(x = boxplot$Years, y = boxplot$NO2, group = boxplot$Years) +
  geom_boxplot(fill = "white", color = "black", width = 0.8) +
  ggtitle("Banizoumbou") +
  scale_y_continuous(breaks = seq(0, 7, 1)) +
  scale_x_continuous(breaks = seq(1998, 2002, 1)) +
  xlab("Years") +
  ylab("Concentrations (ppb)") +
  theme_bw() +
  theme(axis.title = element_text(size = 9)) +
  stat_summary(fun.y = mean, color = "red", geom = "point") +
  theme(axis.text = element_text(size = 7, angle = 0)) +
  geom_abline(intercept = 1, slope = -0.2, color = "red")

Sorry, I can't upload the file because it's a .xlsx file, but I will give you a sample:
File name: boxplot

Years NO2
1998 0.8
1998 0.4
1998 1.0
1999 1.3
1999 0.7
1999 0.4
2000 2.6
2000 2.9
2000 2.2
2001 0.9
2001 0.8
2001 0.9
2002 3.6
2002 5.4
2002 4.2

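The line you're trying to add isn't appearing because it falls outside the plotted range: with intercept = 1 and slope = -0.2, the line sits near y = -399 for x between 1998 and 2002, while your NO2 values only span 0.4 to 5.4. If you use the fitted coefficients, as I did in my example, it should appear: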

library("ggplot2")

boxplot <- read.table(text = "Years NO2
1998 0.8
1998 0.4
1998 1.0
1999 1.3
1999 0.7
1999 0.4
2000 2.6
2000 2.9
2000 2.2
2001 0.9
2001 0.8
2001 0.9
2002 3.6
2002 5.4
2002 4.2", header = TRUE)

## View coefficients, which will become the slope and intercept
(coefs <- coef(lm(NO2 ~ Years, data = boxplot)))
#> (Intercept)       Years 
#>   -1478.127       0.740

ggplot(boxplot, aes(Years, NO2, group = Years)) +
  geom_boxplot(fill="white", color="black", width= 0.8) +
  ggtitle("Banizoumbou") +
  scale_y_continuous(breaks = seq(0 , 7, 1)) +
  scale_x_continuous(breaks = seq(1998, 2002, 1)) +
  xlab("Years") +
  ylab("Concentrations (ppb)") +
  theme_bw() +
  theme(axis.title = element_text(size = 9)) +
  # fun.y was current in 2018; ggplot2 >= 3.3.0 deprecates it in favour of fun
  stat_summary(fun.y = mean, color="red", geom = "point") +
  theme(axis.text= element_text(size= 7, angle = 0)) +
  geom_abline(intercept = coefs[1], slope = coefs[2], color = "red")

Created on 2018-05-29 by the reprex package (v0.2.0).

I will try this with my entire data set and let you know if it works.

Thanks a lot!

One more question, please: what should I do if I have a particular "slope" and "intercept"?

Sincerely

You can replace coefs[1] and coefs[2] with the values, e.g.

geom_abline(intercept = -1478.127, slope = 0.740, color = "red")

There was nothing technically wrong with the code you originally wrote; the slope and intercept values just placed the line outside the area of the plot.
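
If you want to check in advance whether a given slope and intercept will be visible, a rough sketch like the one below may help. It evaluates the line at the two ends of the x-axis and compares the result with the range of the data. `line_y` is a hypothetical helper I made up for illustration, not a ggplot2 function, and the data are the sample values posted above.

```r
# Sample data from this thread
boxplot <- data.frame(
  Years = rep(1998:2002, each = 3),
  NO2 = c(0.8, 0.4, 1.0, 1.3, 0.7, 0.4, 2.6, 2.9, 2.2,
          0.9, 0.8, 0.9, 3.6, 5.4, 4.2)
)

# Hypothetical helper: height of the line y = intercept + slope * x
line_y <- function(intercept, slope, x) intercept + slope * x

x_range <- range(boxplot$Years)  # 1998 and 2002
y_range <- range(boxplot$NO2)    # 0.4 and 5.4

# Original attempt: the line lies far below the data at both ends
y_original <- line_y(1, -0.2, x_range)
y_original
#> [1] -398.6 -399.4

# Fitted coefficients: the line stays within the range of the data
y_fitted <- line_y(-1478.127, 0.740, x_range)

# TRUE if any part of the line overlaps the y-range of the data
overlaps <- function(y_line, y_range) {
  max(y_line) >= y_range[1] && min(y_line) <= y_range[2]
}
overlaps(y_original, y_range)
overlaps(y_fitted, y_range)
```

The intercept is the value of the line at x = 0, not at the first year on the axis, which is why a slope and intercept that look reasonable can still end up hundreds of units away from the data.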

OK, thank you very much!