Hello! This is my first post. I was looking around for a solution to this problem, and it seemed many people had run into it.
The bug: I was using a for loop to make pie charts of relative taxon abundance by site, and I kept getting warnings that my character column "SITE" was an "Unknown or uninitialised column", even though the plots themselves would still generate. This happened every single time I ran the code, using various methods and functions.
What I changed: I added quotation marks around my character column SITE when asking my data to group by site. This made the warning stop appearing, but it would only create a single plot even though I had two sites. Here is my code below, with the change highlighted:
My first approach
# Load packages (tidyverse already attaches dplyr, ggplot2 and purrr)
library(tidyverse)
library(dplyr)
library(ggplot2)
library(purrr)
# Simulate data
small_data <- data.frame(
  Genus = c("GenusA", "GenusB", "GenusA", "GenusB", "GenusC", "GenusD", "GenusA", "GenusA", "GenusE", "GenusB", "GenusF", "GenusB", "GenusC", "GenusG", "GenusD", "GenusC", "GenusH", "GenusA", "GenusE", "GenusH"),
  target_Genus = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE),
  SITE = c("Site1", "Site2", "Site1", "Site2", "Site1", "Site1", "Site2", "Site2", "Site2", "Site2", "Site1", "Site1", "Site2", "Site1", "Site2", "Site1", "Site1", "Site2", "Site2", "Site1"),
  Reads = c(100, 150, 200, 120, 180, 300, 30, 96, 240, 190, 176, 358, 209, 652, 102, 280, 320, 520, 760, 120)
)
# Group by SITE
# The quotation marks in group_by("SITE") are the change. Running this code two times,
# once with group_by(SITE) and once with group_by("SITE"), will replicate the warning message and the fix.
grouped_data <- small_data %>%
  filter(target_Genus == TRUE) %>%
  group_by("SITE")
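A quick way to see what the quotation marks change is to count the groups each call produces. The sketch below assumes dplyr is installed and uses a small made-up data frame (`demo`, a hypothetical name) rather than `small_data`. As far as I can tell, `group_by("SITE")` does not group by the existing SITE column at all: it groups by a new constant column whose only value is the string "SITE", so every row lands in one group. That would explain the single plot.

```r
library(dplyr)

# Hypothetical toy data, not the data from the post
demo <- tibble(
  SITE  = c("Site1", "Site2", "Site1", "Site2"),
  Reads = c(100, 150, 200, 120)
)

by_column <- demo %>% group_by(SITE)    # groups by the values in the SITE column
by_string <- demo %>% group_by("SITE")  # groups by a constant string instead

n_groups(by_column)    # 2, one group per site
n_groups(by_string)    # 1, every row shares the same constant value
group_vars(by_string)  # the grouping column is a new one, not SITE itself
```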
# Function to create pie chart for each group
create_pie_chart <- function(data) {
  unique_groups <- unique(data$Genus)
  group_counts <- table(data$Genus)
  pie_data <- data.frame(Group = names(group_counts), Count = as.numeric(group_counts))
  colors <- rainbow(length(unique_groups))
  pie_chart <- ggplot(pie_data, aes(x = "", y = Count, fill = Group)) +
    geom_bar(stat = "identity", width = 1) +
    coord_polar("y") +
    scale_fill_manual(values = colors) +
    theme_void() +
    ggtitle(paste("Distribution of HAB Genera by SITE:", unique(data$SITE)))
  print(pie_chart)
}
# Apply the function to each group
grouped_data %>%
  group_map(~ create_pie_chart(.))
Because this minor change stopped the warning but didn't give me the plots I wanted, I tried another method: filtering by SITE inside a for loop so that I get a plot for every site.
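For anyone chasing the root cause: my best guess is that the warning comes from `group_map()` itself. By default it hands each group to the function without the grouping column, so `data$SITE` does not exist inside `create_pie_chart()`, and tibbles warn when `$` is used on a missing column. The sketch below shows two possible ways around that on a made-up data frame (`demo` and the result names are hypothetical): keeping the grouping column with `.keep = TRUE`, or reading the site name from the group key `.y`.

```r
library(dplyr)

# Hypothetical toy data, not the data from the post
demo <- tibble(
  SITE  = c("Site1", "Site2", "Site1", "Site2"),
  Genus = c("GenusA", "GenusB", "GenusC", "GenusA")
)

# Default: the grouping column is dropped from each group's data (.x)
cols_default <- demo %>%
  group_by(SITE) %>%
  group_map(~ names(.x))

# .keep = TRUE leaves SITE in .x, so data$SITE works inside the function
cols_kept <- demo %>%
  group_by(SITE) %>%
  group_map(~ names(.x), .keep = TRUE)

# Alternatively, read the site name from the one-row group key (.y)
site_labels <- demo %>%
  group_by(SITE) %>%
  group_map(~ paste("Site:", .y$SITE))
```

If that guess is right, the first approach might work with unquoted `group_by(SITE)` followed by `group_map(~ create_pie_chart(.x), .keep = TRUE)`, though I have only checked this on toy data.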
My second approach
# Function to create pie chart for each SITE
create_pie_chart <- function(data) {
  unique_sites <- unique(data$SITE)
  for (site in unique_sites) {
    site_data <- data %>%
      filter(target_Genus == TRUE, SITE == site)
    if (nrow(site_data) > 0) { # Check if data for the site is available
      unique_groups <- unique(site_data$Genus)
      group_counts <- table(site_data$Genus)
      pie_data <- data.frame(Group = names(group_counts), Count = as.numeric(group_counts))
      colors <- rainbow(length(unique_groups))
      pie_chart <- ggplot(pie_data, aes(x = "", y = Count, fill = Group)) +
        geom_bar(stat = "identity", width = 1) +
        coord_polar("y") +
        scale_fill_manual(values = colors) +
        theme_void() +
        ggtitle(paste("Distribution of HAB Genera by SITE:", site))
      print(pie_chart)
    }
  }
}
# Apply the function to the data frame
create_pie_chart(small_data)
This second approach, where I filter by site inside the for loop, makes the warning disappear, and it creates a plot for every unique site in the SITE column.
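The same per-site idea can also be written without an explicit loop. This is only a sketch under some assumptions: `make_site_pie()` and `demo` are hypothetical names, `count()` stands in for the `table()` step, `geom_col()` stands in for `geom_bar(stat = "identity")`, and ggplot2's default colors are used. It splits the filtered rows by site and builds one plot per piece, returning the plots in a named list instead of printing them straight away.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: builds (but does not print) one pie chart for one site
make_site_pie <- function(site_data, site) {
  pie_data <- count(site_data, Genus, name = "Count")  # rows per genus
  ggplot(pie_data, aes(x = "", y = Count, fill = Genus)) +
    geom_col(width = 1) +
    coord_polar("y") +
    theme_void() +
    ggtitle(paste("Distribution of HAB Genera by SITE:", site))
}

# Hypothetical toy data, not the data from the post
demo <- tibble(
  Genus        = c("GenusA", "GenusB", "GenusA", "GenusC"),
  target_Genus = c(TRUE, TRUE, FALSE, TRUE),
  SITE         = c("Site1", "Site1", "Site2", "Site2")
)

target_rows <- filter(demo, target_Genus)
by_site <- split(target_rows, target_rows$SITE)       # one data frame per site
plots <- Map(make_site_pie, by_site, names(by_site))  # named list of plots

# Print them when needed, e.g.: for (p in plots) print(p)
```

Keeping the plots in a list makes it easier to save or arrange them later, but the for loop above does the same job if printing is all that is needed.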
That's all! I hope this helps someone! Maybe I have it incorrect, but hopefully at least the reproducible bug is helpful to people who want to figure out the root of the problem.