R Novice: Issues encountered when averaging data by variable for a plot

Hi there,

I'm new to the community and would greatly appreciate any help with a new tab in a Shiny app I'm making to visualize hospital data. The CMS dataset I compiled has one row per hospital, with the state it's located in, its city, and its type of hospital ownership. There are also continuous variables measuring each hospital's rating in different categories such as safety, outcomes, etc.

I have a scatterplot that works well: it plots any two of the continuous criteria against each other and allows filtering by city, state, or ownership. I'm trying to create a similar, separate plot where each point shows the average for each state or ownership type instead of each individual hospital. The new plot works when averaging by hospital ownership type but fails when averaging by state. I'm not sure why the state variable works in the first plot but not the second, or why averaging by ownership works fine. The code is nearly identical, so I'm having trouble pinpointing the issue. Any help at all would be greatly appreciated!

Here are the vectors used in the code:

hospital_criteria <- c("MortAP", "HAIAP",
                       "MSPB-1 Achievement Points Num", "Overall Rating of Hospital Achievement Points Num",
                       "Total Performance Score")

criteria_display <- c("Clinical Outcomes Achievement Points",
                      "Patient Safety Achievement Points",
                      "Cost Efficiency Achievement Points",
                      "Patient Experience Achievement Points",
                      "Total Performance Score")

hospital_criteria_avg <- c("AverageMortAP",
                           "AverageHAIAP",
                           "AverageMSBP",
                           "AverageOverall",
                           "AverageTotalPerformance")

Here are the tab panels I'm using for the first and second plots:

  tabPanel("Plot of Hospitals", 
             sidebarLayout(
               sidebarPanel(
                 selectInput("x_axis_criteria", "Select X-axis Criteria:",
                             choices = setNames(hospital_criteria, criteria_display)),
                 selectInput("y_axis_criteria", "Select Y-axis Criteria:",
                             choices = setNames(hospital_criteria, criteria_display)),
                 checkboxInput("show_regression_line", "Show Regression Line", value = TRUE),
                 checkboxInput("show_hospital_labels", "Show Hospital Names", value = TRUE), # New checkbox
                 selectInput("state_filter", "Filter by State:",
                             choices = c("All states", unique(CMS$State.x))),
                 selectInput("city_filter", "Filter by City:",
                             choices = c("All cities", unique(CMS$City.x))),
                 selectInput("hospitalownership_filter", "Filter by Hospital Ownership:",
                             choices = c("All ownership types", unique(CMS$`Hospital Ownership`)))
               ),
               mainPanel(
                 plotOutput("distPlot", height = "400px")
               )
             )
    ),
    tabPanel("Plot of Averages", 
             sidebarLayout(
               sidebarPanel(
                 selectInput("hospital_criteria_avg",
                             "Select Criteria:",
                             choices = setNames(hospital_criteria_avg, criteria_display)),
                 selectInput("average_filter", "Filter by average:",
                             choices = c("State", "Hospital Ownership"))
               ),
               mainPanel(
                 plotOutput("avgPlot", height = "400px")
               )
             )
    )

And here is the server code I use for the average data, for which averaging by State does not work:

server <- function(input, output, session) {
  
  averageData <- reactive({
    data <- CMS
    
    if (input$average_filter == "Hospital Ownership") {
      data <- data %>%
        group_by(`Hospital Ownership`) %>%
        summarise(AverageMortAP = mean(!!sym("MortAP"), na.rm = TRUE),
                  AverageHAIAP = mean(!!sym("HAIAP"), na.rm = TRUE),
                  AverageMSBP = mean(!!sym("MSPB-1 Achievement Points Num"), na.rm = TRUE),
                  AverageOverall = mean(!!sym("Overall Rating of Hospital Achievement Points Num"), na.rm = TRUE),
                  AverageTotalPerformance = mean(!!sym("Total Performance Score"), na.rm = TRUE))
    } else if (input$average_filter == "State") {
      data <- data %>%
        group_by(`State.x`) %>%
        summarise(AverageMortAP = mean(!!sym("MortAP"), na.rm = TRUE),
                  AverageHAIAP = mean(!!sym("HAIAP"), na.rm = TRUE),
                  AverageMSBP = mean(!!sym("MSPB-1 Achievement Points Num"), na.rm = TRUE),
                  AverageOverall = mean(!!sym("Overall Rating of Hospital Achievement Points Num"), na.rm = TRUE),
                  AverageTotalPerformance = mean(!!sym("Total Performance Score"), na.rm = TRUE))
    }
    
    return(data)
  })
  
  output$avgPlot <- renderPlot({
    averageData() %>%
      ggplot(aes(x = !!sym(input$hospital_criteria_avg), y = AverageTotalPerformance, color = !!sym(input$average_filter))) +
      geom_point()
  })
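
One thing worth checking in the plot code above: the dropdown returns the string "State", but the grouped data has a column called State.x, so `sym(input$average_filter)` may be pointing at a column that does not exist. "Hospital Ownership" happens to match its column name exactly, which would explain why that branch works. Below is a minimal sketch of one way to handle this, assuming a lookup from dropdown label to column name. The names `toy_cms`, `group_columns`, and `average_by` are hypothetical and the data values are invented; only the column names come from the code above.

```r
library(dplyr)
library(ggplot2)
library(rlang)

# Toy stand-in for CMS: column names follow the post, values are invented.
toy_cms <- tibble(
  State.x = c("OH", "OH", "TX", "TX"),
  `Hospital Ownership` = c("Government", "Proprietary", "Government", "Proprietary"),
  MortAP = c(4, 6, 8, 4),
  `Total Performance Score` = c(30, 40, 50, 30)
)

# Hypothetical lookup: dropdown label -> actual column name in the data.
group_columns <- c("State" = "State.x",
                   "Hospital Ownership" = "Hospital Ownership")

# One summarise call serves both groupings, so the two branches cannot drift apart.
average_by <- function(data, group_col) {
  data %>%
    group_by(!!sym(group_col)) %>%
    summarise(AverageMortAP = mean(MortAP, na.rm = TRUE),
              AverageTotalPerformance = mean(`Total Performance Score`, na.rm = TRUE))
}

selected  <- "State"                     # what input$average_filter would return
group_col <- group_columns[[selected]]   # the real column name, "State.x"

avg <- average_by(toy_cms, group_col)

# The colour aesthetic uses the real column name rather than the dropdown label.
avg_plot <- ggplot(avg, aes(x = AverageMortAP, y = AverageTotalPerformance,
                            color = !!sym(group_col))) +
  geom_point()
```

In the app, `selected` would be `input$average_filter`, and the same `group_col` would feed both the summarising step and the plot. Naming the dropdown choices, as the criteria dropdowns already do with `setNames()`, would be another way to get the column name directly, though the comparisons in `averageData` would then need to match the new values.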

Here is the (working) server code for the first plot, which shows one point per hospital. I use State.x there in the same way, so I'm not sure why it works where the average plot does not.


  
  criteriaPlot <- reactive({
    x_cv <- sym(input$x_axis_criteria)
    y_cv <- sym(input$y_axis_criteria)
    
    data <- CMS
    
    if (input$city_filter != "All cities") {
      data <- data %>% filter(City.x == input$city_filter)
    }
    
    if (input$state_filter != "All states") {
      data <- data %>% filter(State.x == input$state_filter)
    }
    
    if (input$hospitalownership_filter != "All ownership types") {
      data <- data %>% filter(`Hospital Ownership` == input$hospitalownership_filter)
    }
    
    x_axis_label <- setNames(criteria_display, hospital_criteria)
    y_axis_label <- setNames(criteria_display, hospital_criteria)
    
    plot <- data %>%
      ggplot(aes(x = !!x_cv, y = !!y_cv, color = `State.x`)) +
      geom_point() +
      labs(title = "Total Performance Score vs. Criteria Selected",
           x = x_axis_label[input$x_axis_criteria],
           y = y_axis_label[input$y_axis_criteria]) +
      theme_classic(base_size = 15)
    
    if (input$show_hospital_labels) {
      plot <- plot + geom_text_repel(aes(label = `Facility Name.x`))
    }
    
    if (input$show_regression_line) {
      plot <- plot + geom_smooth(method = "lm", se = FALSE, aes(group = 1))
    }
    
    return(plot)
    
  })
  
  output$distPlot <- renderPlot({
    criteriaPlot()
  })

As a troubleshooting step, I also ran the same data manipulation and plotting code as a plain R script instead of inside the Shiny app. That worked, so the issue must be in how I incorporate it into the app.

data <- CMS
data <- data %>%
  group_by(`State.x`) %>%
  summarise(AverageMortAP = mean(!!sym("MortAP"), na.rm = TRUE),
            AverageHAIAP = mean(!!sym("HAIAP"), na.rm = TRUE),
            AverageMSBP = mean(!!sym("MSPB-1 Achievement Points Num"), na.rm = TRUE),
            AverageOverall = mean(!!sym("Overall Rating of Hospital Achievement Points Num"), na.rm = TRUE),
            AverageTotalPerformance = mean(!!sym("Total Performance Score"), na.rm = TRUE))
  

data %>%
  ggplot(aes(x = AverageMortAP, y = AverageTotalPerformance, color = State.x)) +
  geom_point()
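
A possible reason the standalone script behaves differently: it hardcodes `color = State.x`, whereas the app builds the colour aesthetic from the dropdown value. The sketch below tries to reproduce the app's behaviour outside Shiny by using a plain list as a stand-in for `input`. The data values are invented and `toy_cms`, `column_exists`, and `build_status` are hypothetical names; this is only a way to test the idea, not a confirmed diagnosis.

```r
library(dplyr)
library(ggplot2)
library(rlang)

# Toy stand-in for CMS: column names follow the post, values are invented.
toy_cms <- tibble(
  State.x = c("OH", "OH", "TX"),
  MortAP = c(4, 6, 8),
  `Total Performance Score` = c(30, 40, 50)
)

avg <- toy_cms %>%
  group_by(State.x) %>%
  summarise(AverageMortAP = mean(MortAP, na.rm = TRUE),
            AverageTotalPerformance = mean(`Total Performance Score`, na.rm = TRUE))

# Plain list standing in for Shiny's input object.
input <- list(average_filter = "State")

# Is the selected value actually a column of the summarised data?
column_exists <- input$average_filter %in% names(avg)

p <- ggplot(avg, aes(x = AverageMortAP, y = AverageTotalPerformance,
                     color = !!sym(input$average_filter))) +
  geom_point()

# Aesthetics are evaluated when the plot is built, so force a build and
# record whether it fails instead of letting the error stop the script.
build_status <- tryCatch({
  ggplot_build(p)
  "built"
}, error = function(e) "failed")
```

If `column_exists` is FALSE and `build_status` is "failed" here, while the hardcoded `color = State.x` version builds, that would point to the mismatch between the dropdown value and the column name rather than to Shiny itself.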

Thanks again for any help you might be able to provide, and please let me know if there's anything else I can provide that would be useful!
