How to create a boxplot with two boxes on top of each other, with overlapping portions.

Dear all,

I wish to obtain a ggplot boxplot like the following using the data example below:

I have looked for a way to make two boxes overlap with one another in a boxplot, but I haven't succeeded.

Would anyone have advice on how to achieve this in ggplot? Would it be possible to set the transparency of the box on the top so that we can see the overlapped portion?

The data set contains data from four conditions: 1a, 1b, 2a, 2b.
There is also a reference value from each of these conditions (this is the mean of the general population).

Boxes 1a and 1b should overlap, and so should boxes 2a and 2b. The reference values ("1a_ref", etc.) should be shown next to the corresponding boxplot.

I would appreciate hearing from you about how to realize this in ggplot.
Thank you!
Dan

library(datapasta)
library(reprex)
library(dplyr)

data <- tibble::tribble(
                ~`1a`,       ~`1b`,       ~`2a`,       ~`2b`, ~`1a_ref`, ~`1b_ref`, ~`2a_ref`, ~`2b_ref`,
          46.03612316, 6.379308681, 84.85844059, 18.44469465,       50L,       25L,       75L,       35L,
           39.1412256, 25.62868338, 60.96337514, 44.97792706,        NA,        NA,        NA,        NA,
          56.44069604, 56.97870173, 63.12430571, 35.97211387,        NA,        NA,        NA,        NA,
          77.15643048, 18.25548149, 84.94856789,  64.5323931,        NA,        NA,        NA,        NA,
          27.28007676, 22.10646457, 66.98601135, 7.845562822,        NA,        NA,        NA,        NA,
          48.92300837,           0, 76.57414273, 72.67011555,        NA,        NA,        NA,        NA,
          43.18410599,           0, 80.46783389, 20.42183561,        NA,        NA,        NA,        NA,
          52.15594668, 18.82445379, 74.32756203,  35.2970116,        NA,        NA,        NA,        NA,
          56.85943369,           0, 80.60287961, 7.787770153,        NA,        NA,        NA,        NA,
          89.29983706, 14.30199567, 75.13883255, 42.53271463,        NA,        NA,        NA,        NA
          )
head(data)
#> # A tibble: 6 x 8
#>    `1a`  `1b`  `2a`  `2b` `1a_ref` `1b_ref` `2a_ref` `2b_ref`
#>   <dbl> <dbl> <dbl> <dbl>    <int>    <int>    <int>    <int>
#> 1  46.0  6.38  84.9 18.4        50       25       75       35
#> 2  39.1 25.6   61.0 45.0        NA       NA       NA       NA
#> 3  56.4 57.0   63.1 36.0        NA       NA       NA       NA
#> 4  77.2 18.3   84.9 64.5        NA       NA       NA       NA
#> 5  27.3 22.1   67.0  7.85       NA       NA       NA       NA
#> 6  48.9  0     76.6 72.7        NA       NA       NA       NA

Created on 2023-10-06 with reprex v2.0.2

Consider this example:

library(tidyverse)
# reshape iris to long format to get some example data
liris <- pivot_longer(iris,
             cols = ends_with("Length"))


p <- ggplot(liris, aes(x=name, y=value,color=Species))

# by default, boxes that share an x value are dodged (drawn side by side)
p + geom_boxplot()

# instead, keep every box at its own x position so the boxes overlap
p + geom_boxplot(position = position_identity())
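
The same idea could be carried over to the data in the question roughly as follows. This is only a sketch under some assumptions: it uses the six rows printed by head(data) rather than the full table, the helper columns `group` and `variant` are names made up here, and drawing the reference values as points nudged to the right is just one possible reading of "next to the boxplot". The `alpha` argument makes the fills semi-transparent so the overlapping portion stays visible.

```r
library(ggplot2)
library(tidyr)
library(dplyr)

# Stand-in for the question's data: the six rows shown by head(data)
data <- tibble::tribble(
  ~`1a`, ~`1b`, ~`2a`, ~`2b`, ~`1a_ref`, ~`1b_ref`, ~`2a_ref`, ~`2b_ref`,
   46.0,  6.38,  84.9,  18.4,       50L,       25L,       75L,       35L,
   39.1,  25.6,  61.0,  45.0,        NA,        NA,        NA,        NA,
   56.4,  57.0,  63.1,  36.0,        NA,        NA,        NA,        NA,
   77.2,  18.3,  84.9,  64.5,        NA,        NA,        NA,        NA,
   27.3,  22.1,  67.0,  7.85,        NA,        NA,        NA,        NA,
   48.9,     0,  76.6,  72.7,        NA,        NA,        NA,        NA
)

# Measurements in long format; "1a" is split into group "1" and variant "a"
# (group and variant are hypothetical helper columns)
measurements <- data %>%
  select(-ends_with("_ref")) %>%
  pivot_longer(everything(), names_to = "condition", values_to = "value") %>%
  mutate(group = substr(condition, 1, 1),
         variant = substr(condition, 2, 2))

# Reference values in the same shape, one row per condition
refs <- data %>%
  select(ends_with("_ref")) %>%
  pivot_longer(everything(), names_to = "condition", values_to = "value",
               values_drop_na = TRUE) %>%
  mutate(condition = sub("_ref$", "", condition),
         group = substr(condition, 1, 1),
         variant = substr(condition, 2, 2))

p <- ggplot(measurements, aes(x = group, y = value, fill = variant)) +
  # position_identity() stacks a and b at the same x; alpha shows the overlap
  geom_boxplot(position = position_identity(), alpha = 0.5, width = 0.5) +
  # reference values drawn as points just to the right of each pair of boxes
  geom_point(data = refs, aes(shape = variant),
             position = position_nudge(x = 0.4), size = 3)

p
```

Lower `alpha` values make the box drawn on top more see-through; 0.5 is an arbitrary starting point.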

Hi Nir,
This does the trick! Thank you for your help.
Dan
