Thanks, I should have done that. I managed to find a server still running dbplyr 1.3.0, so the differences become obvious via show_query()
which should hopefully be enough as explain()
is a much longer output.
Here is a simplified form of the code with names changed:
library(dplyr)
library(dbplyr)
con <- DBI::dbConnect(odbc::odbc(), "my_DSN")
my_query <-
tbl(con, in_schema("container_name", "table_name")) %>%
select(id_column, datestamp) %>%
filter(datestamp == "2019-04-30") %>%
group_by(id_column) %>%
filter(row_number() == 1L) %>%
ungroup()
This removes duplicate id values. (There may be better ways of doing this, but this is a simplified query).
dplyr 1.3.0:
> my_query %>% show_query()
<SQL>
SELECT `id_column`, `datestamp`
FROM (SELECT `id_column`, `datestamp`, row_number() OVER (PARTITION BY `id_column`) AS `zzz9`
FROM (SELECT *
FROM (SELECT `id_column`, `datestamp`
FROM container_name.table_name) `sbbdxdpnnx`
WHERE (`datestamp` = '2019-04-30')) `qcpxocjxmq`) `akxsynfdxk`
WHERE (`zzz9` = 1)
dplyr 1.4.0:
> my_query %>% show_query()
<SQL>
SELECT *
FROM (SELECT *
FROM (SELECT `id_column`, `datestamp`
FROM container_name.table_name) `dbplyr_030`
WHERE (`datestamp` = '2019-04-30')) `dbplyr_031`
WHERE (row_number() = 1)
dbplyr 1.4.0 therefore omits the ... row_number() OVER (PARTITION BY
id_column)
windowing and errors when it comes to WHERE (row_number() = 1)
.
I tried to upload mtcars as a sample dataset to our database, but failed and our admin is on holiday. I hope that this example provides enough detail to explain the cause of the error.