What is the initial (discovery) query that runs when a data source is defined with tbl(con, source_table)?
My assumption is that tbl() runs a SELECT * FROM <source_table> LIMIT <n> query in the background. Can you confirm?
I am asking because I frequently run into a std::bad_alloc error, but the same query composed with glue_sql() can be fetched without any error using dbGetQuery(). Some of the unused columns trigger this error, whereas the hand-written query works because I specify the columns manually in the SQL.
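To make the comparison concrete, here is a minimal sketch of the kind of hand-written query meant here. The in-memory SQLite connection (via RSQLite), the toy flights table, its `notes` column, and the `cols`, `tbl_name`, and `min_delay` names are assumptions for illustration only, not the real backend or schema.

```r
library(DBI)
library(glue)

# Assumption: an in-memory SQLite database stands in for the real backend.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy table; `notes` plays the role of an unused column that is never selected.
dbWriteTable(con, "flights", data.frame(
  year = c(2013L, 2013L, 2013L),
  month = c(1L, 1L, 2L),
  day = c(1L, 2L, 3L),
  dep_delay = c(300, 10, 250),
  arr_delay = c(290, 5, 260),
  notes = c("a", "b", "c")
))

# Hypothetical names: the columns to fetch, the table, and the threshold.
cols <- c("year", "month", "day", "dep_delay", "arr_delay")
tbl_name <- "flights"
min_delay <- 240

# {`cols`*} quotes each name as an identifier and joins them with commas;
# {min_delay} is interpolated as a quoted value.
query <- glue_sql(
  "SELECT {`cols`*} FROM {`tbl_name`} WHERE dep_delay > {min_delay}",
  .con = con
)

# Only the listed columns are requested, so `notes` is never fetched.
result <- dbGetQuery(con, query)
dbDisconnect(con)
```

Because the column list is spelled out in the SQL, the unused columns never appear in the result set.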
How many queries are sent to the database for this piece of code (without the final collect())?
flights <- tbl(con, "flights") |>
  select(year:day, dep_delay, arr_delay) |>
  filter(dep_delay > 240)
(My guess is 2.)
Do you have any idea how the columns can be specified with tbl() in order to avoid this std::bad_alloc error (and still keep the flexibility of dplyr-based queries)?
Thank you.