Reproducible "Unknown or uninitialised column" warning and potential fix

Hello! This is my first post. I was looking around for a solution to this problem, and it seems many people have run into it.

The bug: I was using a for loop to make pie charts of relative taxon abundance by site, and I kept getting a warning that my character column "SITE" was an "Unknown or uninitialised column", even though the plots themselves were still generated. The warning appeared every time I ran the code, whichever methods and functions I tried.

What I changed: I added quotation marks around my character column SITE when asking my data to group by site. This made the warning stop appearing, but it produced only a single plot even though I had two sites. Here is my code, with the change marked in a comment:

My first approach

# load packages

library(tidyverse)
library(dplyr)
library(ggplot2)
library(purrr)

# simulate data

small_data <- data.frame(
  Genus = c("GenusA", "GenusB", "GenusA", "GenusB", "GenusC", "GenusD", "GenusA", "GenusA", "GenusE", "GenusB", "GenusF", "GenusB", "GenusC", "GenusG", "GenusD", "GenusC", "GenusH", "GenusA", "GenusE", "GenusH"),
  target_Genus = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE),
  SITE = c("Site1", "Site2", "Site1", "Site2", "Site1", "Site1", "Site2", "Site2", "Site2", "Site2", "Site1", "Site1", "Site2", "Site1", "Site2", "Site1", "Site1", "Site2", "Site2", "Site1"),
  Reads = c(100, 150, 200, 120, 180, 300, 30, 96, 240, 190, 176, 358, 209, 652, 102, 280, 320, 520, 760, 120)
)

# Group by SITE

grouped_data <- small_data %>%
  filter(target_Genus == TRUE) %>%
  # These are the added quotation marks. Running this code twice, once with
  # group_by(SITE) and once with group_by("SITE"), reproduces the warning
  # and the change that makes it go away.
  group_by("SITE")

# Function to create pie chart for each group

create_pie_chart <- function(data) {
  unique_groups <- unique(data$Genus)
  group_counts <- table(data$Genus)
  pie_data <- data.frame(Group = names(group_counts), Count = as.numeric(group_counts))
  colors <- rainbow(length(unique_groups))

  pie_chart <- ggplot(pie_data, aes(x = "", y = Count, fill = Group)) +
    geom_bar(stat = "identity", width = 1) +
    coord_polar("y") +
    scale_fill_manual(values = colors) +
    theme_void() +
    ggtitle(paste("Distribution of HAB Genera by SITE:", unique(data$SITE)))

  print(pie_chart)
}

# Apply the function to each group

grouped_data %>%
  group_map(~ create_pie_chart(.))
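
A likely explanation for the single plot: group_by("SITE") does not group by the SITE column. It groups by the constant string "SITE", which puts every row into one group. Because the real SITE column is then not a grouping variable, group_map() does not drop it, which would also explain why the warning goes away. The sketch below uses a small made-up data frame (toy is a hypothetical name, not part of the code above) to show the difference:

```r
library(dplyr)

# Toy data (hypothetical, much smaller than small_data)
toy <- data.frame(
  SITE  = c("Site1", "Site2", "Site1", "Site2"),
  Reads = c(100, 150, 200, 120)
)

by_column   <- group_by(toy, SITE)    # groups by the values in the SITE column
by_constant <- group_by(toy, "SITE")  # groups by the constant string "SITE"

n_groups(by_column)    # one group per site
n_groups(by_constant)  # a single group holding every row

# In the quoted version the real SITE column is not a grouping variable,
# so group_map() leaves it in the data passed to the function.
kept_cols <- group_map(by_constant, ~ names(.x))[[1]]
"SITE" %in% kept_cols
```

So the quotation marks silence the warning only by switching the grouping off, which is why a single plot covering both sites comes out.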

Because this small change stopped the warning but didn't give me the plots I wanted, I tried another method: filtering by SITE inside a for loop, which gives me all the plots I want.

My second approach

# Function to create pie chart for each SITE

create_pie_chart <- function(data) {
  unique_sites <- unique(data$SITE)

  for (site in unique_sites) {
    site_data <- data %>%
      filter(target_Genus == TRUE, SITE == site)

    if (nrow(site_data) > 0) {  # Check if data for the site is available
      unique_groups <- unique(site_data$Genus)
      group_counts <- table(site_data$Genus)
      pie_data <- data.frame(Group = names(group_counts), Count = as.numeric(group_counts))
      colors <- rainbow(length(unique_groups))

      pie_chart <- ggplot(pie_data, aes(x = "", y = Count, fill = Group)) +
        geom_bar(stat = "identity", width = 1) +
        coord_polar("y") +
        scale_fill_manual(values = colors) +
        theme_void() +
        ggtitle(paste("Distribution of HAB Genera by SITE:", site))

      print(pie_chart)
    }
  }
}

# Apply the function to the data frame

create_pie_chart(small_data)

This second approach, where the filtering by site happens inside the for loop, makes the warning disappear and creates a plot for every unique site in the SITE column.

That's all! I hope this helps someone! Maybe I have it incorrect, but hopefully at least the reproducible bug is helpful to people who want to figure out the root of the problem.

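Your original issue was caused by the default behaviour of group_map(), which drops the grouping column (here SITE) from the data it passes to your function, so data$SITE no longer exists inside create_pie_chart().
The most direct fix is therefore to pass .keep = TRUE to the group_map() call; with that, your original code, using group_by(SITE), works as intended.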

grouped_data %>%
  group_map(~ create_pie_chart(.), .keep = TRUE)
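
To see the effect of .keep in isolation, here is a minimal sketch on a made-up tibble (toy and the column values are hypothetical) that only reports which columns each group's data contains:

```r
library(dplyr)

# Toy data (hypothetical)
toy <- tibble(
  SITE  = c("Site1", "Site2", "Site1", "Site2"),
  Genus = c("GenusA", "GenusB", "GenusC", "GenusA")
)

# Default: the grouping column is removed from each group's data
cols_default <- toy %>%
  group_by(SITE) %>%
  group_map(~ names(.x))

# .keep = TRUE: the grouping column stays available inside the function
cols_keep <- toy %>%
  group_by(SITE) %>%
  group_map(~ names(.x), .keep = TRUE)

cols_default[[1]]  # column names without SITE
cols_keep[[1]]     # column names including SITE
```

On a tibble, asking for a column that is not there with $ returns NULL and raises the "Unknown or uninitialised column" warning, which matches what the question describes.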

That's a great fix, thank you! It does produce double the plots (two per site), but the plots seem correct.
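
The doubled plots probably come from printing twice: print() returns the plot invisibly, so create_pie_chart() returns it too, group_map() collects those plots into a list, and the list is then auto-printed at the console. One way around this, sketched here under that assumption, is group_walk(), which calls the function only for its side effects and returns the input data invisibly. The names toy and plot_site are hypothetical stand-ins for small_data and create_pie_chart:

```r
library(dplyr)
library(ggplot2)

# Toy data (hypothetical)
toy <- tibble(
  SITE  = c("Site1", "Site1", "Site2", "Site2"),
  Genus = c("GenusA", "GenusB", "GenusA", "GenusC"),
  Reads = c(100, 200, 150, 120)
)

# Simplified stand-in for create_pie_chart(): draws one pie chart per call
plot_site <- function(data) {
  p <- ggplot(data, aes(x = "", y = Reads, fill = Genus)) +
    geom_col(width = 1) +
    coord_polar("y") +
    theme_void() +
    ggtitle(paste("Site:", unique(data$SITE)))
  print(p)
}

# group_walk() discards what plot_site() returns, so nothing is printed twice
result <- toy %>%
  group_by(SITE) %>%
  group_walk(~ plot_site(.x), .keep = TRUE)
```

Keeping group_map() and assigning its result to a variable, or wrapping the call in invisible(), should have the same effect.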