I have a question about a relative frequency plot. I am trying to make a stacked bar plot for my microbiome data, but the y axis (relative frequency) goes far above 100. The relative frequencies of the bacteria sum to 100, so I don't understand why the bars in the plot go way higher than 100. I have read other posts but couldn't figure it out. I would really appreciate any help.
The first 10 rows of the data are the following:
The code I am using is:
library(tidyverse)
library(readxl)
library(glue)
library(ggtext)
library(patchwork)
library(reshape2)
pc = read.csv("L2_16S_R2.csv", header = TRUE)
#convert data frame from a "wide" format to a "long" format
pcm = melt(pc, id = c("Vineyard"))
View(pcm)
str(pcm)
pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(value = sum(value), .groups = "drop") %>%
  group_by(Vineyard, variable) %>%
  summarize(mean_value = mean(value), .groups = "drop") %>%
  mutate(variable = str_replace(variable,
                                "(.*)_unclassified", "Unclassified *\\1*"),
         variable = str_replace(variable,
                                "^(\\S*)$", "*\\1*")) %>%
  ggplot(aes(x = Vineyard, y = mean_value, fill = variable)) +
  geom_col() +
  labs(x = NULL,
       y = "Mean relative abundance (%)") +
  theme_classic() +
  theme(axis.text.x = element_markdown(),
        legend.text = element_markdown())
To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:
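For illustration only, a reprex for this question might look something like the sketch below. The toy data, vineyard labels, and taxon names are all made up (hypothetical) and stand in for the real L2_16S_R2.csv, and the sketch assumes each vineyard has more than one sample (row), which may not match your data. Under that assumption, it compares the per-vineyard totals you get from `sum()` with those from `mean()`, which is one possible reason for bars going above 100.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two vineyards, two samples each, three made-up taxa.
# Each row (sample) sums to 100.
pc <- data.frame(
  Vineyard = c("A", "A", "B", "B"),
  Taxon1   = c(50, 60, 20, 30),
  Taxon2   = c(30, 30, 50, 40),
  Taxon3   = c(20, 10, 30, 30)
)

# Wide to long: one row per sample-taxon pair
pcm <- pivot_longer(pc, cols = -Vineyard,
                    names_to = "variable", values_to = "value")

# Summing within each vineyard adds the samples together
summed <- pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(value = sum(value), .groups = "drop")

# Averaging within each vineyard keeps the scale of a single sample
averaged <- pcm %>%
  group_by(Vineyard, variable) %>%
  summarize(mean_value = mean(value), .groups = "drop")

# Per-vineyard totals, i.e. the bar heights a stacked plot would show
summed_totals   <- tapply(summed$value, summed$Vineyard, sum)
averaged_totals <- tapply(averaged$mean_value, averaged$Vineyard, sum)
print(summed_totals)
print(averaged_totals)
```

With this toy data the summed totals are 200 per vineyard (100 times the two samples), while the averaged totals stay at 100. If your real data behaves the same way, taking `mean(value)` directly in the first `summarize()` may be what you want, but a reprex with a small piece of your actual data would let us confirm.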