Hi! I need to overlay two histograms in the same plot. Both show the same variable, but each is filtered by a different criterion, and the two subsets overlap (some observations meet both criteria). I also need the y axis to show % of total instead of raw frequency. Sample data for clarity:

height <- c(10, 12, 11, 15, 9, 18)
female <- c(1, 0, 1, 0, 0, 1)
treat <- c(0, 1, 1, 0, 1, 1)

So I want to plot the heights of females and the heights of treated people, with the y axis showing % of total instead of a simple count/frequency. Tips on a clear legend, title, and formatting would be massively appreciated!

Sorry if this is not super clear, I am a little confused by this. Thank you a ton!

Hi @prch

You mean something like this?



# load packages (tibble, tidyr, dplyr, ggplot2)
library(tidyverse)

# your data in a data frame
df1 <- tibble(
  height = c(10, 12, 11, 15, 9, 18),
  female = c(1, 0, 1, 0, 0, 1),
  treat = c(0, 1, 1, 0, 1, 1)
)

#transform to longer format for easier plotting
df1_long <- pivot_longer(df1, 
                        cols = c(2:3),
                        names_to = "group", 
                        values_to = "count")

#group for group totals and recombine
df1_long %>%
    group_by(group) %>%
    summarize(n_tot=sum(count)) %>%
    ungroup() %>% #remove grouping information
    right_join(df1_long, ., by = "group") %>% #add group totals back to each row
    mutate(perc = count/n_tot*100) %>%
    ggplot(aes(x = height, y=perc, fill=group)) +
        geom_col(alpha=0.4, position = "identity" ) +
        theme_classic() +
        scale_y_continuous(name = "% of group total" ) +
        ggtitle("Comparison of relative heights for two different groups" )

I think this is sort of what I am looking for, but the treat and female variables are not mutually exclusive, so I think that's causing an issue.

The issue is not fully clear to me. Do you mean the percentage of the overall total instead of the group (marginal) total? What is causing problems?
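
In case it helps while you clarify, here is a minimal sketch of one way to handle the overlap, assuming you want each histogram scaled to its own group total. After pivoting to long format, a person who is both female and treated simply appears once in each group, so the overlap is not a problem for plotting. The name `df1_members` and `binwidth = 2` are my own arbitrary choices, not something from your data.

```r
library(tidyverse)

df1 <- tibble(
  height = c(10, 12, 11, 15, 9, 18),
  female = c(1, 0, 1, 0, 0, 1),
  treat  = c(0, 1, 1, 0, 1, 1)
)

# One row per (person, group) membership.
# A person who belongs to both groups appears twice, once per group.
df1_members <- df1 %>%
  pivot_longer(cols = c(female, treat),
               names_to = "group",
               values_to = "member") %>%
  filter(member == 1)

# Overlaid histograms: density * width is each bin's share of its own group,
# so the bars of each group sum to 100%.
p <- ggplot(df1_members, aes(x = height, fill = group)) +
  geom_histogram(aes(y = after_stat(density * width)),
                 binwidth = 2,            # arbitrary choice, adjust to your data
                 position = "identity",   # overlay instead of stacking
                 alpha = 0.4) +
  scale_y_continuous(name = "% of group total",
                     labels = scales::label_percent()) +
  scale_fill_discrete(name = "Group", labels = c("Female", "Treated")) +
  labs(x = "Height",
       title = "Height distribution: females vs. treated") +
  theme_classic()

print(p)
```

If you want % of all people instead, I think swapping the y aesthetic for `after_stat(count / nrow(df1))` should do it, although the bars of each group then no longer sum to 100%.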


