What is the initial (discovery) query when defining a data source with tbl(con, source_table)?
My assumption is that tbl() runs a SELECT * FROM <source_table> LIMIT <n> query in the background. Can you confirm?
I am asking because I frequently run into a std::bad_alloc error, but the same query composed with glue_sql() can be fetched without any error using dbGetQuery(). Some of the unused columns trigger this error, but the custom-made query works because I specify the columns manually in the SQL query.
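For reference, the approach that works for me looks roughly like the sketch below. The in-memory SQLite connection, the tiny flights table, and the names cols, tbl_name, and min_delay are stand-ins for illustration only, not my real setup; the point is just that the columns are listed explicitly in the SQL.

```r
library(DBI)
library(glue)

# Assumption: an in-memory SQLite database stands in for the real backend.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical miniature "flights" table with one extra column ("notes")
# that the query below never touches.
dbWriteTable(con, "flights", data.frame(
  year      = c(2013L, 2013L),
  month     = c(1L, 1L),
  day       = c(1L, 2L),
  dep_delay = c(300, 10),
  arr_delay = c(280, 5),
  notes     = c("a", "b")
))

# Only these columns are requested from the database.
cols <- c("year", "month", "day", "dep_delay", "arr_delay")

# Backticks inside {} quote identifiers; the trailing * collapses the
# vector into a comma-separated list.
query <- glue_sql(
  "SELECT {`cols`*} FROM {`tbl_name`} WHERE dep_delay > {min_delay}",
  cols = cols, tbl_name = "flights", min_delay = 240, .con = con
)

result <- dbGetQuery(con, query)
dbDisconnect(con)
```

Because the column list is explicit, the problematic unused columns are never fetched.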
How many queries are sent to the database for this piece of code (without the final collect())?
flights <- tbl(con, "flights") |>
  select(year:day, dep_delay, arr_delay) |>
  filter(dep_delay > 240)
(my guess is 2)
Do you have any idea how the columns can be specified with tbl() in order to avoid this std::bad_alloc error (and still keep the flexibility of dplyr-based queries)?
Thank you.