GGplot and Data Format

Hello,

I have an Excel datasheet from which I want to create a stacked histogram of percentages. I think I have the data set up in Excel the way it needs to be to create the plot, but I am running into an issue.

# A tibble: 6 × 5
  IssueArea           Era1  Era2  Era3  Era4
1 attorneys              0     1     1     3
2 civil rights           0    11    41    37
3 criminal procedure     0    10    11     9
4 due process            7    54    14    18
5 economic activity     35   468    46    20
6 federal taxation       0     3     1     0

.....and so on.

I want to first compute the percentages and then stack the percentages by IssueArea in one histogram.

When I type str (dataset name) it shows:
str(breadth_collapsed_davia)
tibble [14 × 5] (S3: tbl_df/tbl/data.frame)
 $ IssueArea: chr [1:14] "attorneys" "civil rights" "criminal procedure" "due process" ...
 $ Era1     : num [1:14] 0 0 0 7 35 0 7 2 1 17 ...
 $ Era2     : num [1:14] 1 11 10 54 468 3 19 3 4 197 ...
 $ Era3     : num [1:14] 1 41 11 14 46 1 8 38 0 23 ...
 $ Era4     : num [1:14] 3 37 9 18 20 0 5 35 0 18 ...

However, when I try to move to calculating the percents in each era (my time variable), I get the following error:

summarise(n = sum(Era1)) %>%
  mutate(percentage = n / sum(n))

Error: object 'Era1' not found

I'm fairly sure this is an easy fix, but I only use R for ggplot, so I'm not sure exactly what steps I need to take to read the data in correctly.

Did you pipe the tibble into summarise?
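
To illustrate what I mean: `object 'Era1' not found` is typically what you get when `summarise()` is called with no data frame in front of it, so R goes looking for a standalone object called `Era1`. Here is a minimal sketch using three rows of your posted data (the names `era1_total` and `era1_share` are just made up for the example):

```r
library(dplyr)
library(tibble)

# A few rows copied from the posted data so the snippet runs on its own
breadth_collapsed_davia <- tribble(
  ~IssueArea, ~Era1, ~Era2,
  "attorneys", 0, 1,
  "due process", 7, 54,
  "economic activity", 35, 468
)

# Piping the tibble in lets summarise() find the Era1 column
era1_total <- breadth_collapsed_davia %>%
  summarise(n = sum(Era1))

# summarise() collapses everything to one row, so per-issue shares
# need mutate(), which keeps one row per IssueArea
era1_share <- breadth_collapsed_davia %>%
  mutate(percentage = Era1 / sum(Era1))
```

Note that chaining `mutate(percentage = n / sum(n))` after an ungrouped `summarise()` would just divide the single total by itself, which is probably not the percentage you are after.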

I'm not quite clear on what exactly you're looking for, but I'm guessing something like:


library(tidyverse)

breadth_collapsed_davia <- tribble(
  ~IssueArea, ~Era1, ~Era2, ~Era3, ~Era4,
  "attorneys", 0, 1, 1, 3,
  "civil rights", 0, 11, 41, 37,
  "criminal procedure", 0, 10, 11, 9,
  "due process", 7, 54, 14, 18,
  "economic activity", 35, 468, 46, 20,
  "federal taxation", 0, 3, 1, 0
)

breadth_collapsed_davia %>%
  mutate(across(Era1:Era4, ~ .x / sum(.x))) %>%
  pivot_longer(cols = -c(IssueArea), names_to = "era", values_to = "pct") %>%
  ggplot() +
  geom_col(aes(x = era, y = pct, fill = IssueArea))
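
If it helps, here is a rough variation on the same pipeline that stores the long-format data first, checks that the shares within each era add up to 1, and shows the y axis as percentages. The names `breadth_subset`, `breadth_long`, `era_totals` and `p`, and the axis titles, are my own placeholders, and I only copied three rows so the shares here won't match your full data:

```r
library(tidyverse)

# Three rows copied from the posted data so the snippet runs on its own
breadth_subset <- tribble(
  ~IssueArea, ~Era1, ~Era2, ~Era3, ~Era4,
  "civil rights", 0, 11, 41, 37,
  "due process", 7, 54, 14, 18,
  "economic activity", 35, 468, 46, 20
)

# Convert counts to within-era shares, then reshape to one row per
# IssueArea/era combination
breadth_long <- breadth_subset %>%
  mutate(across(Era1:Era4, ~ .x / sum(.x))) %>%
  pivot_longer(cols = -IssueArea, names_to = "era", values_to = "pct")

# Sanity check: the shares within each era should add up to 1
era_totals <- breadth_long %>%
  group_by(era) %>%
  summarise(total = sum(pct))

# Same stacked columns, with the y axis formatted as percentages
p <- ggplot(breadth_long) +
  geom_col(aes(x = era, y = pct, fill = IssueArea)) +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(x = "Era", y = "Share of era total", fill = "Issue area")
```

Running `print(p)` (or just `p` at the console) draws the plot.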

Yes, this is what I needed. Thank you!

Quick question: if I wanted to add the pct to each block of color, how would I do that? I assume it's a label code, but I'm unsure where to put it in the code.

Then you could add:


+
  geom_text(
    aes(
      x = era,
      y = pct,
      group = IssueArea,
      # don't show very small percentages
      label = ifelse(pct > .02, scales::label_percent()(pct), "")
    ),
    position = position_stack(vjust = .5)
  )
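
Since you asked where it goes: the `geom_text()` layer is chained onto the end of the plot, after `geom_col()`. Below is a sketch of the whole thing in one piece, again with only three rows of your data so it runs on its own (`breadth_subset` and `p_labeled` are placeholder names I picked):

```r
library(tidyverse)

# Three rows copied from the posted data so the snippet runs on its own
breadth_subset <- tribble(
  ~IssueArea, ~Era1, ~Era2, ~Era3, ~Era4,
  "civil rights", 0, 11, 41, 37,
  "due process", 7, 54, 14, 18,
  "economic activity", 35, 468, 46, 20
)

p_labeled <- breadth_subset %>%
  mutate(across(Era1:Era4, ~ .x / sum(.x))) %>%
  pivot_longer(cols = -IssueArea, names_to = "era", values_to = "pct") %>%
  ggplot() +
  geom_col(aes(x = era, y = pct, fill = IssueArea)) +
  # the label layer is added after the columns so the text is drawn on top
  geom_text(
    aes(
      x = era,
      y = pct,
      # group by IssueArea so the labels stack the same way as the columns
      group = IssueArea,
      # don't show very small percentages
      label = ifelse(pct > .02, scales::label_percent()(pct), "")
    ),
    # vjust = .5 centers each label inside its block
    position = position_stack(vjust = .5)
  )
```

Typing `p_labeled` at the console draws the plot; with your full 14-row tibble you would pipe `breadth_collapsed_davia` in instead of `breadth_subset`.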