Data masking or tidy select for select() when using dbplyr

Let's say I am writing a function in an R package that uses dplyr/dbplyr to query a database, and I want to select a single column:
(ref: Programming with dplyr • dplyr)

# create a local sql database using mtcars data
library(DBI)
library(RSQLite)
library(dplyr)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)
dbListTables(con)  

tbl(con, "mtcars") %>%
  dplyr::select(mpg)

I can see in the documentation for dplyr::select() that the ... uses tidy select, so in an R package I should quote the selected columns (Keep or drop columns using their names and types — select • dplyr).

  tbl(con, "mtcars")  %>%
    dplyr::select('mpg')

But I also see in the dbplyr select documentation that the ... uses data masking, so perhaps I should use the .data$ method? (Subset, rename, and reorder columns using their names — select.tbl_lazy • dbplyr).

  tbl(con, "mtcars")  %>%
    dplyr::select(.data$mpg)

Or is there a better way of handling this using {{}} or ensym()?

As a side note, I tried doing the following, but it seems select is not an exported function from dbplyr?

  tbl(con, "mtcars")  %>%
    dbplyr::select(.data$mpg)

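Indeed, it's not: select() is an S3 generic exported by dplyr, and dbplyr only registers a method for it. That method is dbplyr:::select.tbl_lazy() (note the triple colons), but you shouldn't need to call it directly, because dplyr::select() dispatches to it whenever the input is a lazy table.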
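
To make the S3 point concrete, here is a minimal sketch that checks it against the in-memory SQLite table from the question. It assumes RSQLite and dbplyr are installed; the variable names (lazy, is_lazy, exported, has_method) are just illustrative.

```r
library(DBI)
library(dplyr)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

lazy <- tbl(con, "mtcars")

# The lazy table carries the "tbl_lazy" class that the method dispatches on
is_lazy <- inherits(lazy, "tbl_lazy")

# select is not among dbplyr's exported objects ...
exported <- "select" %in% getNamespaceExports("dbplyr")

# ... but the method is defined inside dbplyr's namespace
has_method <- exists("select.tbl_lazy",
                     envir = asNamespace("dbplyr"), inherits = FALSE)

# So calling the dplyr generic is enough; dispatch finds the dbplyr method
res <- lazy %>%
  dplyr::select("mpg") %>%
  collect()

dbDisconnect(con)
```

In other words, inside a package it should be enough to import select from dplyr; dbplyr only needs to be installed so that its method gets registered.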

For your main question, I'm not sure. I suspect the real problem you'll run into is that you won't always want to select mpg, so the column has to be passed in as a function argument and forwarded with {{ }}:

extract_col <- function(data, column){
  data  %>%
    dplyr::select({{column}})
}
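
As a rough sketch of how such a wrapper could be used against the in-memory SQLite table from the question: extract_col() is the function above, while extract_col_chr() is a hypothetical variant, added here only for illustration, that takes the column name as a string and passes it through dplyr::all_of(). Both assume RSQLite and dbplyr are installed.

```r
library(DBI)
library(dplyr)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# Wrapper from the answer: {{ }} forwards the caller's unquoted column name
extract_col <- function(data, column) {
  data %>%
    dplyr::select({{ column }})
}

# Hypothetical string-based variant: all_of() selects columns named in a
# character vector and errors if any of them is missing
extract_col_chr <- function(data, column) {
  data %>%
    dplyr::select(dplyr::all_of(column))
}

res_bare <- tbl(con, "mtcars") %>%
  extract_col(mpg) %>%
  collect()

res_chr <- tbl(con, "mtcars") %>%
  extract_col_chr("mpg") %>%
  collect()

dbDisconnect(con)
```

Which of the two fits better probably depends on how the package's users will supply the column: {{ }} suits interactive, unquoted use, while all_of() suits column names that arrive as strings, for example from a config file.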
