I'm trying to plot something in ggplot that is a bit tricky (to me). I have a subsetted data frame, and I'm characterizing each row by one or more of 10 "classes" (we'll call them identities); a row can have any combination of the 10. I want to display a value for each row (another column in the data frame), grouped by identity. The result would be something like a clustered bar graph in which each cluster/identity holds a different number of rows, and the same row can appear in several clusters. I tried creating a list of strings for each row and storing it in a new column at the end of the data frame, to use as my mapping and filling variable. But R only recognizes the first string in each row's list.
# New column with each row containing a list of strings describing the row (Gene)
CellType$subtype = character(length = length(CellType$GeneID))
CellType$subtype[which(CellType$GeneID=="Gfra2")]= c("Tyrosine Hydroxylase", "Non-peptidergic 2")
CellType$subtype[which(CellType$GeneID=="Mrgpra3")]= c("Non-peptidergic 2")
CellType$subtype[which(CellType$GeneID=="Mrgprd")]= c("Non-peptidergic 1")
CellType$subtype[which(CellType$GeneID=="Sst")]= c("Non-peptidergic 3")
CellType$subtype[which(CellType$GeneID=="Piezo2")]= c("Tyrosine Hydroxylase")
CellType$subtype[which(CellType$GeneID=="Ldhb")]= c("Neurofilament 1", "Neurofilament 2", "Neurofilament 3", "Neurofilament 4", "Neurofilament 5")
CellType$subtype[which(CellType$GeneID=="Cacna1h")]= c("Neurofilament 1", "Neurofilament 2")
CellType$subtype[which(CellType$GeneID=="Necab2")]= c("Neurofilament 2")
CellType$subtype[which(CellType$GeneID=="Fam19a1")]= c("Neurofilament 3", "Peptidergic 2")
Sub_pop_cluster = ggplot(CellType, aes(x = CellType$subtype, y = CellType$log2.FC.)) + geom_bar(aes(fill = CellType$GeneID), position = "dodge", stat = "identity")
So when R plots this, it recognizes gene Gfra2 (the first example) only as "Tyrosine Hydroxylase", instead of as both "Tyrosine Hydroxylase" and "Non-peptidergic 2".
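The behaviour described above follows from how base R handles this kind of assignment: each element of a character column holds exactly one string, so assigning a length-2 vector into a single position keeps the first value and raises a warning. A minimal sketch, using a made-up three-row data frame (`df` is hypothetical, not the poster's CellType):

```r
# Hypothetical toy data, standing in for the poster's CellType data frame
df <- data.frame(GeneID = c("Gfra2", "Mrgprd", "Sst"), stringsAsFactors = FALSE)
df$subtype <- character(nrow(df))

# One slot, two values: R keeps only the first value and warns that the
# number of items to replace is not a multiple of the replacement length
df$subtype[df$GeneID == "Gfra2"] <- c("Tyrosine Hydroxylase", "Non-peptidergic 2")

# Only the first string was stored for Gfra2
df$subtype[1]
```

The second subtype is silently dropped at assignment time, so ggplot never sees it.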
Hi Erik, welcome!
It would be easier to help if you provide some sample data, so could you please turn this into a self-contained REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
If you've never heard of a reprex before, you might want to start by reading this FAQ:
Thanks so much. I think I see. When I tried to reproduce the code, the "enframe" function wasn't recognized, despite having the tidyverse or tibble package installed.
Created on 2019-03-15 by the reprex package (v0.2.1.9000)
Note also that you must load/attach the library in each session, so if you haven't run library(tidyverse) (as in the reprex), the function enframe() will not be found.
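For readers following along, here is a minimal sketch of one way enframe() could be used for this problem: build one row per gene/subtype pair, then join that onto the data. This is an illustration under assumptions, not necessarily the code from the reprex. The log2.FC. values are made up, only three of the genes are shown, and the names subtypes, subtype_long and plot_data are hypothetical. The `cols` argument to unnest() assumes tidyr 1.0.0 or later.

```r
# Packages must be attached in every new session, or enframe() is not found
library(tibble)
library(tidyr)
library(dplyr)
library(ggplot2)

# Made-up fold changes; the real values live in CellType$log2.FC.
CellType <- tibble(
  GeneID   = c("Gfra2", "Piezo2", "Cacna1h"),
  log2.FC. = c(2.1, 1.4, 0.8)
)

# Named list: one character vector of subtypes per gene (taken from the post)
subtypes <- list(
  Gfra2   = c("Tyrosine Hydroxylase", "Non-peptidergic 2"),
  Piezo2  = "Tyrosine Hydroxylase",
  Cacna1h = c("Neurofilament 1", "Neurofilament 2")
)

# enframe() turns the named list into a tibble with a list-column;
# unnest() expands it to one row per gene/subtype pair
subtype_long <- enframe(subtypes, name = "GeneID", value = "subtype") %>%
  unnest(cols = subtype)

# Each gene's fold change is repeated once per subtype it belongs to
plot_data <- inner_join(CellType, subtype_long, by = "GeneID")

# Refer to columns by bare name inside aes() rather than CellType$...
Sub_pop_cluster <- ggplot(plot_data, aes(x = subtype, y = log2.FC., fill = GeneID)) +
  geom_col(position = "dodge")
```

With the data in this long shape, a gene that belongs to two subtypes simply occupies two rows, so it shows up in both clusters of the bar chart.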