Using ggplot2::vars and .data with a character vector

Hi All,

I am writing a function that lets users determine the variable to be used in ggplot2::facet_grid() as columns (or rows).

This is an illustration of what I want to do:

set.seed(1234)
n <- 400
dat <- data.frame(x = rnorm(n),
                  y = rnorm(n),
                  cat1 = sample(c("A", "B"),
                                size = n,
                                replace = TRUE),
                  cat2 = sample(c("A", "B"),
                                size = n,
                                replace = TRUE))

tmpfct <- function(my_data,
                   my_cols = NULL,
                   facet_grid_args = list(as.table = FALSE)) {
    p <- ggplot2::ggplot(data = my_data,
                         ggplot2::aes(x = x,
                             y = y)) +
            ggplot2::geom_point()
    if (!is.null(my_cols)) {
        facet_grid_args_final <- utils::modifyList(facet_grid_args,
                                                   list(cols = ggplot2::vars(.data[[my_cols]])))
        p <- p + do.call(ggplot2::facet_grid,
                         facet_grid_args_final)
      }
    p
  }

out <- tmpfct(my_data = dat)
out

out <- tmpfct(my_data = dat,
              my_cols = "cat2")
out

out <- tmpfct(my_data = dat,
              my_cols = c("cat1", "cat2"))
out

The last call, with my_cols = c("cat1", "cat2"), led to this error:

Error in `.data[[<chr: "cat1", "cat2">]]`:
! Must subset the data pronoun with a string, not a character vector.

How can I solve this problem? I guess I need to do something in .data[[my_cols]] but I am not sure how.

Thanks a lot.

-- Shu Fai

I found one solution myself. It may not be a good one but it can solve the problem I encountered.

set.seed(1234)
n <- 400
dat <- data.frame(x = rnorm(n),
                  y = rnorm(n),
                  cat1 = sample(c("A", "B"),
                                size = n,
                                replace = TRUE),
                  cat2 = sample(c("A", "B"),
                                size = n,
                                replace = TRUE))

tmpfct <- function(my_data,
                   my_cols = NULL,
                   facet_grid_args = list(as.table = FALSE)) {
    p <- ggplot2::ggplot(data = my_data,
                         ggplot2::aes(x = x,
                             y = y)) +
            ggplot2::geom_point()
    if (!is.null(my_cols)) {
        # q = FALSE forces plain ASCII quotes so that parse() can read the string
        tmp <- sapply(my_cols,
                      function(xx) paste0(".data[[", sQuote(xx, q = FALSE), "]]"))
        tmp <- paste0("quote(ggplot2::vars(",
                      paste(tmp, collapse = ","),
                      "))")
        facet_grid_args_final <- utils::modifyList(facet_grid_args,
                                                   list(cols = eval(parse(text = tmp))))
        p <- p + do.call(ggplot2::facet_grid,
                         facet_grid_args_final)
      }
    p
  }

out <- tmpfct(my_data = dat)
out

out <- tmpfct(my_data = dat,
              my_cols = "cat2")
out

out <- tmpfct(my_data = dat,
              my_cols = c("cat1", "cat2"))
out

I solved the problem by constructing the value of cols as a string, with the expression involving .data wrapped in quote(). The string is parsed by parse() and then evaluated by eval(), which returns the unevaluated call. do.call() then evaluates that call when it invokes facet_grid(), so cols is effectively passed as ggplot2::vars(.data[['cat1']],.data[['cat2']]).
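To make the intermediate steps easier to inspect, here is a minimal sketch that runs them outside the function. The object names (pieces, txt, spec_call, spec) are only for illustration, and q = FALSE in sQuote() is an assumption on my part to keep the quotes plain ASCII in every locale.

```r
my_cols <- c("cat1", "cat2")

# One ".data[['name']]" fragment per requested column
pieces <- paste0(".data[[", sQuote(my_cols, q = FALSE), "]]")

# Wrap the fragments in quote(ggplot2::vars(...)) as a single string
txt <- paste0("quote(ggplot2::vars(", paste(pieces, collapse = ","), "))")
txt
# "quote(ggplot2::vars(.data[['cat1']],.data[['cat2']]))"

# parse() + eval() run quote(), so the result is still an unevaluated call
spec_call <- eval(parse(text = txt))
is.call(spec_call)

# Evaluating that call (which do.call() does when it calls facet_grid())
# yields the list of quoted facet variables, one per column
spec <- eval(spec_call)
length(spec)
```

The point of the quote() wrapper is that the call survives modifyList() untouched and is only evaluated at the last moment.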

Although this can solve my problem, if there is a better solution, I would love to know. Thanks.

-- Shu Fai

Hi @sfcheung, thank you for your question and I'm pleased you have found a solution and shared it with us.

Here is an alternate approach using the embrace operator {{ to pass the my_cols parameter to ggplot2::facet_grid() so it can handle single values and character vectors:

library(tidyverse)

# construct the data
set.seed(1234)
n <- 400
dat <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  cat1 = sample(c("A", "B"), size = n, replace = TRUE),
  cat2 = sample(c("A", "B"), size = n, replace = TRUE)
)

# construct the function
tmpfct2 <- function(my_data, my_cols = NULL) {
  p <- my_data |> 
    ggplot(aes(x = x, y = y)) +
    geom_point()
  
  # facet the plot on my_cols where appropriate
  if (!is.null(my_cols)) {
    p <- p +
      facet_grid({{my_cols}})
  }
  
  return(p)
}

# test `my_cols` with a character vector
tmpfct2(my_data = dat, my_cols = c('cat1', 'cat2'))

Created on 2024-09-30 with reprex v2.1.0

Does this meet your needs?
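For comparison, as far as I understand the embrace operator is mainly intended for forwarding an unquoted column name into a quoting function such as vars(); applied to an ordinary character vector it behaves like plain nested braces and simply returns the vector. The sketch below shows the unquoted-name use. The function name facet_by and its argument col are hypothetical and not from the posts above.

```r
library(ggplot2)

set.seed(1234)
n <- 400
dat <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  cat1 = sample(c("A", "B"), size = n, replace = TRUE),
  cat2 = sample(c("A", "B"), size = n, replace = TRUE)
)

# Hypothetical helper: `col` is given as a bare column name, not a string
facet_by <- function(my_data, col) {
  ggplot(my_data, aes(x = x, y = y)) +
    geom_point() +
    # {{ col }} forwards the caller's expression into vars()
    facet_grid(cols = vars({{ col }}))
}

p1 <- facet_by(dat, cat1)
```

Printing p1 draws one column of panels per level of cat1.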

@craig.parylo, thanks a lot! I was unaware of the embrace operator and will study it further. For now, though, I could not figure out how to use it in my case: I need do.call() and utils::modifyList() because the call involves other user-supplied arguments that cannot be passed through ... (dot-dot-dot).

Nevertheless, the operator still looks useful and may be applicable in other cases in the function.
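One possible way to keep do.call() and utils::modifyList() while avoiding parse() is to turn the strings into symbols and splice them into vars() with !!!. This is only a sketch: rlang::syms() and the !!! operator are not mentioned earlier in this thread, the name tmpfct3 is hypothetical, and it assumes facet_grid_args does not already contain a cols element. It also uses plain symbols rather than the .data pronoun.

```r
set.seed(1234)
n <- 400
dat <- data.frame(x = rnorm(n),
                  y = rnorm(n),
                  cat1 = sample(c("A", "B"), size = n, replace = TRUE),
                  cat2 = sample(c("A", "B"), size = n, replace = TRUE))

tmpfct3 <- function(my_data,
                    my_cols = NULL,
                    facet_grid_args = list(as.table = FALSE)) {
    p <- ggplot2::ggplot(data = my_data,
                         ggplot2::aes(x = x, y = y)) +
            ggplot2::geom_point()
    if (!is.null(my_cols)) {
        # Convert each string to a symbol, then splice all of them into vars()
        cols_spec <- ggplot2::vars(!!!rlang::syms(my_cols))
        # Merge with the other user-supplied facet_grid() arguments
        facet_grid_args_final <- utils::modifyList(facet_grid_args,
                                                   list(cols = cols_spec))
        p <- p + do.call(ggplot2::facet_grid,
                         facet_grid_args_final)
    }
    p
}

out0 <- tmpfct3(my_data = dat)
out1 <- tmpfct3(my_data = dat, my_cols = "cat2")
out2 <- tmpfct3(my_data = dat, my_cols = c("cat1", "cat2"))
```

Printing out2 should show both variables faceted as columns. Because vars() is evaluated before modifyList(), no string needs to be parsed.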

Thanks again.