I have a column called Publisher in a data frame where each row represents a published paper. It's a factor. I used this code to produce a frequency table:
table(data$Publisher) %>%
sort(decreasing=TRUE)
I got a frequency table like this:
BMC                28
PloS               18
Springer Nature     9
Elsevier            5
BMJ                 3
(I've omitted the many other values with a frequency of less than 3.)
I'm trying to regroup all the publishers with a frequency of less than 3 into "Other", so that publishers with only a few papers show up as a single "Other" slice in a pie chart. Otherwise the pie chart would be too crowded to be meaningful. But I'm stuck on recoding Publisher. Can anybody help?
You can use fct_lump() or fct_other() from the forcats package.
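Here is a rough sketch of how that could look. The data below is made up to mirror the counts in the question ("Wiley" and "Sage" are invented rare publishers), and it uses fct_lump_min(), a sibling of fct_lump() that I'm assuming is available in your installed forcats (0.5.0 or later). Treat it as a starting point rather than the one right way to do it.

```r
library(forcats)

# Hypothetical data mirroring the counts in the question,
# plus a few made-up rare publishers ("Wiley", "Sage")
publisher <- factor(c(
  rep("BMC", 28), rep("PloS", 18), rep("Springer Nature", 9),
  rep("Elsevier", 5), rep("BMJ", 3),
  "Wiley", "Wiley", "Sage"
))

# Option 1: lump every level that appears fewer than 3 times into "Other"
lumped <- fct_lump_min(publisher, min = 3, other_level = "Other")

# Option 2: name the levels to keep explicitly; everything else becomes "Other"
kept <- fct_other(
  publisher,
  keep = c("BMC", "PloS", "Springer Nature", "Elsevier", "BMJ"),
  other_level = "Other"
)

# Frequency table of the regrouped factor, largest group first
counts <- sort(table(lumped), decreasing = TRUE)
print(counts)

# Base R pie chart of the regrouped counts
pie(counts)
```

With your own data you would replace publisher with data$Publisher. Option 1 is handy when the cutoff is a count, while option 2 gives you full control over which publishers stay separate.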
If you need more specific help, please turn this into a self-contained REPRoducible EXample (reprex). A reprex makes it much easier for others to understand your issue and figure out how to help.
If you've never heard of a reprex before, you might want to start by reading this FAQ:
Thanks so much for your help! I'd never heard of forcats, but it sounds like exactly what I need! I'll try it out.
Thanks also for the info about reproducible examples. I'll make one if I still can't figure this out.