R-Boxplot and Legend

I have been struggling with two persistent problems in a multi-panel boxplot figure. I would really appreciate any guidance.


FIGURE LAYOUT:

  • 25-cell grid (5 columns × 5 rows)
  • Each cell = one boxplot panel
  • Cell 5 (top right) = custom legend
  • X-axis: 3 groups, with 6 color/shape combinations per group (dodged)
  • X-axis labels only on bottom row
  • Y-axis labels only — no panel titles

PROBLEM 1 — SIGNIFICANCE LETTERS NOT APPEARING

I want two types of letters:

  • UPPERCASE (A, B) for main effect from two-way ANOVA — only when p < 0.05
  • lowercase (a, b) for pairwise comparison within each x-group — only when significant

What I tried:

  • multcomp::cld() — crashes silently because one factor has only 2 levels
  • emmeans + multcomp — requires multcompView which caused install errors
  • TukeyHSD() + manual ifelse() — letters calculate correctly in console
    but do not appear on the figure because the join between the
    stats table and the ymax position table fails silently

What is the most reliable way to calculate and place significance letters
on dodged boxplots when one factor has only 2 levels?
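For the two-level case, one possible approach is sketched below in base R, so it needs neither multcomp nor multcompView. With only two levels the letter display is trivial: "a"/"b" when the comparison is significant, otherwise both "a". The toy data, the column names (group, location, value) and the use of a Welch t-test per x-group are all assumptions for illustration, not the poster's actual design.

```r
# Hypothetical toy data: 3 x-groups, one 2-level factor, 10 replicates per cell
set.seed(1)
dat <- expand.grid(rep = 1:10,
                   location = c("IR", "NI"),
                   group = c("G1", "G2", "G3"),
                   stringsAsFactors = FALSE)
# Build in a clear difference between locations in G1 only
dat$value <- rnorm(nrow(dat),
                   mean = ifelse(dat$location == "IR" & dat$group == "G1", 15, 10))

# p-value of the 2-level comparison within each x-group (assumed test: t.test)
pvals <- sapply(split(dat, dat$group), function(d) {
  t.test(value ~ location, data = d)$p.value
})

# Two levels only: significant -> "a"/"b", not significant -> "a"/"a"
levs <- sort(unique(dat$location))
letters_tbl <- do.call(rbind, lapply(names(pvals), function(g) {
  data.frame(group = g,
             location = levs,
             label = if (pvals[[g]] < 0.05) c("a", "b") else c("a", "a"),
             stringsAsFactors = FALSE)
}))

# Label height: a little above the highest point of each box
ypos <- aggregate(value ~ group + location, data = dat, FUN = max)
ypos$y <- ypos$value + 0.05 * diff(range(dat$value))

# all.x = TRUE keeps unmatched rows, so a failed join shows up as NA labels
lab_tbl <- merge(ypos, letters_tbl, by = c("group", "location"), all.x = TRUE)
stopifnot(!anyNA(lab_tbl$label))
```

In ggplot2, a table like lab_tbl could then be passed to geom_text() with the same grouping variable in the group aesthetic and position_dodge(width = 0.75), which usually matches the default boxplot dodge.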


PROBLEM 2 — LEGEND TOO LARGE

I am building a custom legend inside one grid cell using ggplot + geom_point + geom_text + theme_void. The legend is always too large and takes up too much space relative to the data panels.

What I have tried:

  • Reducing symbol size, text size, and margins
  • Adjusting ylim and xlim
  • wrap_plots with rel_widths
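
One thing worth checking first: rel_widths is an argument of cowplot::plot_grid(), while patchwork::wrap_plots() takes widths and heights. As a possible alternative to a hand-built legend panel, the sketch below lets patchwork collect the ordinary ggplot legend into a placeholder cell with guide_area(). It swaps in a different technique from the geom_point + geom_text legend described above, and the mtcars panels, the make_panel() helper and the theme sizes are placeholders, not the real figure.

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in for one data panel; the real figure has 24 of these
make_panel <- function() {
  ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(am))) +
    geom_boxplot() +
    labs(x = NULL, fill = "am")
}
panels <- replicate(24, make_panel(), simplify = FALSE)

# Reserve cell 5 (top right of a 5-column grid) for the collected legend
cells <- c(panels[1:4], list(guide_area()), panels[5:24])

# guides = "collect" merges identical legends and draws them in guide_area();
# the theme settings (assumed values) shrink keys and text to fit the cell
fig <- wrap_plots(cells, ncol = 5, guides = "collect") &
  theme(legend.key.size = unit(0.4, "cm"),
        legend.text = element_text(size = 7),
        legend.title = element_text(size = 8))
```

Because the legend is a real ggplot guide here, its size is controlled through the legend.* theme elements rather than through xlim, ylim or point sizes.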

Hello MZ3333!

It is extremely hard to suggest fixes when you don't provide any sample code. Could you please attach a reproducible example, ideally with a sample dataset, so that we can run it and identify the problem more easily? We could then also see which package you use to assemble the panels (patchwork, or simply facet_grid() from ggplot2?) and how you extract the results of your ANOVA.

However, I have a suggestion for making your plots more informative, especially as it seems that you are using the ANOVA and the plot in an academic context. I really like the package ggstatsplot, which creates beautiful, publication-ready plots. In my example code I used the nice sample dataset penguins.

library(ggstatsplot) # install it first if needed

grouped_ggbetweenstats(penguins, x = species, y = bill_len, grouping.var = sex, pairwise.display = "significant", type = "parametric")

The output then looks pretty nice like this:


library(dplyr)

# One row per box: label height plus the two kinds of letters
dat_kbg_sig <- dat_kbg %>%
  group_by(year, treatment, location, name) %>%
  summarise(y = max(value, na.rm = TRUE) + 20,  # y position above the box
            .groups = "drop") %>%
  mutate(
    label = ifelse(location == "IR", "a", "b"),  # lowercase
    lbl_trt = ifelse(treatment == "A", "A",       # uppercase
                     ifelse(treatment == "B", "B", "AB"))
  )

Thank you so much, but I have many long variables.
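
On the join that fails silently: a common cause is that the key columns in the two tables do not hold exactly the same values, for example a factor in one and a character in the other, or stray whitespace. Note also that the row names of a TukeyHSD() result are pairs such as "B-A", not single levels, so they cannot be matched to a per-box table directly. The base R sketch below shows one way to check for such mismatches before plotting. The tables stats and ypos and their contents are hypothetical.

```r
# Hypothetical tables: `stats` holds the letters, `ypos` the label heights
stats <- data.frame(treatment = c("A", "B ", "C"),   # note the trailing space in "B "
                    label = c("a", "b", "ab"),
                    stringsAsFactors = FALSE)
ypos <- data.frame(treatment = factor(c("A", "B", "C")),
                   y = c(12, 15, 11))

# Compare the keys as plain strings; anything listed here will not match
key_stats <- as.character(stats$treatment)
key_ypos  <- as.character(ypos$treatment)
no_letter   <- setdiff(key_ypos, key_stats)   # positions without a letter
no_position <- setdiff(key_stats, key_ypos)   # letters without a position
print(no_letter)
print(no_position)

# Normalise both keys to trimmed character, then join;
# all.x = TRUE keeps every position row so gaps appear as NA
stats$treatment <- trimws(key_stats)
ypos$treatment  <- key_ypos
joined <- merge(ypos, stats, by = "treatment", all.x = TRUE)
stopifnot(!anyNA(joined$label))
```

The same check works with several key columns by pasting them together into one string per row before calling setdiff().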

Please copy the code and paste it here between

```

```

This gives us formatted code that we can copy, paste and run. People here often do not have the time to type out code by hand in order to test it and find the problem.

A handy way to supply data is to use the dput() function. Do dput(mydata), where "mydata" is the name of your dataset. For really large datasets, dput(head(mydata, 100)) will probably do. Paste it here between

```

```
You may also find this helpful.
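
As a small illustration of the dput() advice above, the sketch below prints a tiny made-up data frame as text and then rebuilds it from that text, which is what a reader of the thread would do after copying it. The data frame contents are invented for the example.

```r
# Hypothetical small dataset standing in for "mydata"
mydata <- data.frame(treatment = c("A", "B", "A"),
                     value = c(1.5, 2.25, 3),
                     stringsAsFactors = FALSE)

# dput() prints R code that recreates the object; capture it as text
txt <- capture.output(dput(mydata))
cat(txt, sep = "\n")

# Pasting that text back into R rebuilds the same object
rebuilt <- eval(parse(text = txt))
stopifnot(identical(rebuilt, mydata))
```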