How to order and visualize x-axis variables in ggplot2?

Hi,

I am working with a dataset and drew a line plot with geom_line() using ggplot2 in R. In the plot (see below), the order of the x-axis labels is jumbled. Is there a way to reorder these labels so that all 6hr groups appear first, followed by 24hr and then 48hr?

M_6hr_B
M_6hr_A
X_6hr_B
X_6hr_A
... and so on

library(ggplot2)
ggplot(data_long, aes(x = Condition_Timepoint, y = Value, group=Genes, color=Genes)) +
  geom_line(aes(linetype=Genes)) +
  geom_point(aes(shape=Genes)) +
  labs(title = "",
       x = "",
       y = "log (CPM+1)")

Thank you,
Toufiq

You can make a factor with the x values and set the order with the levels argument.

data_long$Condition_Timepoint <- factor(data_long$Condition_Timepoint, levels = c("M_6hr_B", "M_6hr_A", "X_6hr_B", "X_6hr_A", ...))
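
If typing out every level by hand gets tedious, the order can also be derived from the labels themselves. This is only a sketch: the `labels` vector is a made-up stand-in for `unique(data_long$Condition_Timepoint)`, and it assumes every label follows the `<condition>_<hours>hr_<replicate>` pattern. Within a timepoint the labels are sorted alphabetically here, so adjust that if you need a different order (for example B before A).

```r
# Hypothetical labels standing in for unique(data_long$Condition_Timepoint)
labels <- c("M_24hr_B", "M_48hr_A", "X_6hr_A", "M_6hr_B",
            "X_24hr_A", "X_48hr_B", "M_6hr_A", "X_6hr_B")

# Pull the number of hours out of the middle field, e.g. "M_24hr_B" -> 24
hours <- as.numeric(sub("^[^_]+_([0-9]+)hr_.*$", "\\1", labels))

# Sort by timepoint first, then alphabetically by the full label
level_order <- labels[order(hours, labels)]

# Build the factor with that explicit level order
x <- factor(labels, levels = level_order)
levels(x)
```

The same `level_order` vector can then be passed as the `levels` argument in the `factor()` call above.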

@FJCC

Thank you very much, this indeed worked. However, when I applied the same piece of code to a boxplot in an R Shiny app, it did not work.

  output$boxplot <- renderPlot({
    ggplot(data = dat(), aes(x = valtype, y = Value, fill=valtype)) +
      geom_boxplot() +
      geom_jitter() +
      theme_classic(base_size = 14) +
      labs(x = "", y = "Normalized Log (CPM+1)", title = str_to_title(input$type)) +
      theme(axis.text.x =element_text(size=12, face = "bold", color = "black", angle=10),
            axis.text.y =element_text(size=10, face = "bold", color = "black", angle=0),
            axis.title=element_text(size=15, face = "bold", color = "black"),
            plot.title = element_text(lineheight=1.0,size = 18,face = "bold"))
  })

It is impossible to tell what is happening from just your ggplot code. Please provide some data to plot. You can provide that with the output of the dput() function. If your data frame is named DF, post the output of

dput(DF)

Provide the data just as it would be returned by the dat() function in your shiny code.

@FJCC, sure, here is an example that might be helpful.

library(shiny)
library(shinydashboard)
library(ggplot2)
library(reshape2)
library(tidyverse)
Data <- structure(list(Gene_Symbols = c("Gene_1", "Gene_3", "Gene_1", 
                                        "Gene_3", "Gene_1", "Gene_3", "Gene_3", "Gene_1", "Gene_1", "Gene_3", 
                                        "Gene_1", "Gene_3", "Gene_1", "Gene_3", "Gene_3", "Gene_1", "Gene_1", 
                                        "Gene_3", "Gene_3", "Gene_1", "Gene_3", "Gene_1", "Gene_1", "Gene_3", 
                                        "Gene_3", "Gene_1", "Gene_1", "Gene_3", "Gene_1", "Gene_3", "Gene_1", 
                                        "Gene_3", "Gene_1", "Gene_3", "Gene_1", "Gene_3", "Gene_3", "Gene_1", 
                                        "Gene_1", "Gene_3", "Gene_3", "Gene_1", "Gene_1", "Gene_3", "Gene_1", 
                                        "Gene_3", "Gene_3", "Gene_1"), Value = c(-6.091545059, -6.091545059, 
                                                                                 -6.091545059, -6.070101714, -3.569172964, -3.090390025, 7.967166226, 
                                                                                 6.814141308, 4.058969425, 5.020329205, -0.809462279, 0.566028591, 
                                                                                 -5.982728293, -7.071261301, -6.176446417, -5.628321743, -7.071261301, 
                                                                                 -6.037578026, 7.260109885, 7.260109885, 7.260109885, 7.260109885, 
                                                                                 0.426459749, 1.012303668, -5.208224789, -5.208224789, -5.208224789, 
                                                                                 -5.208224789, -5.49532517, -3.827895419, 6.543398129, 7.990918822, 
                                                                                 2.761360605, 3.818956484, 1.477705428, 2.224654598, -10, -10, 
                                                                                 -10, -10, -10, -5.148680676, -5.148680676, -5.148680676, 4.321568875, 
                                                                                 5.309031806, 1.763212573, 0.219900364), Condition_Timepoint = c("M_24hr_B", 
                                                                                                                                                 "M_24hr_B", "M_48hr_B", "M_48hr_B", "M_6hr_B", "M_6hr_B", "X_24hr_B", 
                                                                                                                                                 "X_24hr_B", "X_48hr_B", "X_48hr_B", "X_6hr_B", "X_6hr_B", "M_24hr_A", 
                                                                                                                                                 "M_24hr_A", "M_48hr_A", "M_48hr_A", "M_6hr_A", "M_6hr_A", "X_24hr_A", 
                                                                                                                                                 "X_24hr_A", "X_48hr_A", "X_48hr_A", "X_6hr_A", "X_6hr_A", "M_24hr_A", 
                                                                                                                                                 "M_24hr_A", "M_48hr_A", "M_48hr_A", "M_6hr_A", "M_6hr_A", "X_24hr_A", 
                                                                                                                                                 "X_24hr_A", "X_48hr_A", "X_48hr_A", "X_6hr_A", "X_6hr_A", "M_24hr_B", 
                                                                                                                                                 "M_24hr_B", "M_48hr_B", "M_48hr_B", "M_6hr_B", "M_6hr_B", "X_24hr_B", 
                                                                                                                                                 "X_24hr_B", "X_48hr_B", "X_48hr_B", "X_6hr_B", "X_6hr_B"), Condition = c("M", 
                                                                                                                                                                                                                          "M", "M", "M", "M", "M", "X", "X", "X", "X", "X", "X", "M", "M", 
                                                                                                                                                                                                                          "M", "M", "M", "M", "X", "X", "X", "X", "X", "X", "M", "M", "M", 
                                                                                                                                                                                                                          "M", "M", "M", "X", "X", "X", "X", "X", "X", "M", "M", "M", "M", 
                                                                                                                                                                                                                          "M", "M", "X", "X", "X", "X", "X", "X")), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                                                                                                                                                        -48L))

Data$Condition_Timepoint <- factor(Data$Condition_Timepoint, levels = c( "M_6hr_A",  "M_6hr_B", "M_24hr_A",  "M_24hr_B", "M_48hr_A",  "M_48hr_B", "X_6hr_A", "X_6hr_B", "X_24hr_A", "X_24hr_B", "X_48hr_A", "X_48hr_B" ))

### app.R
ui <- fluidPage(
  titlePanel("Boxplot to visualize gene expression across variables"),
  sidebarLayout(
    sidebarPanel(
      selectizeInput("thegene", "Genes", choices = "Gene_1", multiple = FALSE),
      selectizeInput('type', 'Select variables', choices = c('Condition_Timepoint', 'Condition'), selected = 'Condition_Timepoint', multiple = TRUE),
      width = 3
    ),
    mainPanel(
      fluidRow(
        column(
          width = 12,
          plotOutput("boxplot")
        )
      )
    )
  )
)

server <- function(input, output, session) {

  updateSelectizeInput(session, "thegene", choices = Data$Gene_Symbols, server = TRUE)

  dat <- reactive({
    Data |>
      pivot_longer(cols = c('Condition_Timepoint', 'Condition'), names_to = 'type') |>
      filter(type == input$type) |>
      filter(Gene_Symbols == input$thegene)
  })

  output$boxplot <- renderPlot({
    ggplot(data = dat(), aes(x = value, y = Value, fill=value)) +
      geom_boxplot() +
      geom_jitter() +
      theme_classic(base_size = 14) +
      labs(x = "", y = "Normalized Log (CPM+1)", title = str_to_title(input$type)) +
      theme(axis.text.x =element_text(size=12, face = "bold", color = "black", angle=10),
            axis.text.y =element_text(size=10, face = "bold", color = "black", angle=0),
            axis.title=element_text(size=15, face = "bold", color = "black"),
            plot.title = element_text(lineheight=1.0,size = 18,face = "bold"))
  })

}

shinyApp(ui = ui, server = server)

The x axis order in the box plot is alphabetical because the value column is a character vector and not a factor. That happened when you pivoted the data frame: pivot_longer() stacks the Condition_Timepoint and Condition columns into the single value column, and when a factor is combined with a character column, the result is coerced to character, so the level order is lost. If you add the levels M and X to Condition_Timepoint and make Condition a factor with the levels M and X, the conversion will not happen.
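
Here is a small sketch of that coercion on a toy data frame (the `df` below is invented for illustration and is not the poster's data). It pivots once with Condition as character and once with Condition as a factor, so you can compare the class of the resulting value column.

```r
library(tidyr)

# Toy data: one factor column with a custom level order, one character column
df <- data.frame(
  id = 1:4,
  Condition_Timepoint = factor(
    c("M_6hr_A", "M_24hr_A", "X_6hr_A", "X_24hr_A"),
    levels = c("M_6hr_A", "X_6hr_A", "M_24hr_A", "X_24hr_A")
  ),
  Condition = c("M", "M", "X", "X"),
  stringsAsFactors = FALSE
)

# Factor + character: the stacked value column falls back to character
long_chr <- pivot_longer(df, cols = c(Condition_Timepoint, Condition),
                         names_to = "type")
class(long_chr$value)

# Factor + factor: the stacked value column stays a factor
df$Condition <- factor(df$Condition, levels = c("M", "X"))
long_fct <- pivot_longer(df, cols = c(Condition_Timepoint, Condition),
                         names_to = "type")
class(long_fct$value)
levels(long_fct$value)
```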


@FJCC thank you for looking into this. As suggested, I also made Condition a factor and set the order with the levels argument. This worked.

Data$Condition <- factor(Data$Condition, levels = c("M", "X"))  
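
For anyone who would rather fix the order after pivoting, a possible alternative is to re-level the value column inside the reactive, once the data has been filtered. This is just a sketch under assumptions: `relevel_value()` and `level_order` are hypothetical names, not part of the app above, and the toy `long` data frame only imitates what `dat()` might return.

```r
# Desired display order across both variables (assumed; extend as needed)
level_order <- c("M_6hr_A", "M_6hr_B", "M_24hr_A", "M_24hr_B",
                 "M_48hr_A", "M_48hr_B", "X_6hr_A", "X_6hr_B",
                 "X_24hr_A", "X_24hr_B", "X_48hr_A", "X_48hr_B",
                 "M", "X")

# Hypothetical helper: turn `value` into a factor that keeps only the
# levels actually present, in the order given by `level_order`
relevel_value <- function(long_df, level_order) {
  present <- intersect(level_order, as.character(long_df$value))
  long_df$value <- factor(long_df$value, levels = present)
  long_df
}

# Toy stand-in for the pivoted and filtered data
long <- data.frame(value = c("X_6hr_A", "M_24hr_A", "M_6hr_A"),
                   stringsAsFactors = FALSE)
out <- relevel_value(long, level_order)
levels(out$value)
```

In the app this would be called as the last step of the `dat` reactive, after the two `filter()` calls.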
