I'm querying an Oracle database using dbplyr
and am getting errors when I try to filter a datetime column.
Here's a small sample of the data:
tbl(con, in_schema("UDW_LMS_DMT", "ETL_CNV_REQUESTS_ISO")) %>%
select(EVENT_DT, matches("TIMESTAMP"), SESSION_ID)
# Source: lazy query [?? x 6] # Database: Oracle 18.00.0000[username@UDWTST/] EVENT_DT TIMESTAMP TIMESTAMP_YEAR TIMESTAMP_MONTH TIMESTAMP_DAY SESSION_ID <dttm> <dttm> <chr> <chr> <chr> <chr> 1 2018-01-06 00:00:00 2018-01-06 16:15:24 2018 2018-01 2018-01-06 <NA> 2 2018-01-06 00:00:00 2018-01-06 16:15:24 2018 2018-01 2018-01-06 90184df219a705a3a901e4732~ 3 2018-01-06 00:00:00 2018-01-06 16:15:26 2018 2018-01 2018-01-06 2fbec1b6890c557399b2dc8da~ 4 2018-01-06 00:00:00 2018-01-06 16:15:29 2018 2018-01 2018-01-06 4df4b2ee622fd6ec96d5daf18~ 5 2018-01-06 00:00:00 2018-01-06 16:15:49 2018 2018-01 2018-01-06 6a0b081acef830e0f087eac6e~ 6 2018-01-06 00:00:00 2018-01-06 16:15:53 2018 2018-01 2018-01-06 548b676fc7b6b0b1ece89ee41~ 7 2018-01-06 00:00:00 2018-01-06 16:15:54 2018 2018-01 2018-01-06 548b676fc7b6b0b1ece89ee41~ 8 2018-01-06 00:00:00 2018-01-06 16:15:54 2018 2018-01 2018-01-06 548b676fc7b6b0b1ece89ee41~ 9 2018-01-06 00:00:00 2018-01-06 16:16:11 2018 2018-01 2018-01-06 da25e7d97701fb60dbc967c03~ 10 2018-01-06 00:00:00 2018-01-06 16:16:42 2018 2018-01 2018-01-06 <NA> # ... with more rows
I was filtering this table using the TIMESTAMP_DAY
column, which should be a date, but is a character column in the database. While fixing some data corruption issues, our IT folks added a new column called EVENT_DT
, which is a datetime-formatted column, and asked me to use that column for filtering. So, I tried the following and got an error.
tbl(con, in_schema("UDW_LMS_DMT", "ETL_CNV_REQUESTS_ISO")) %>%
select(EVENT_DT, matches("TIMESTAMP"), SESSION_ID) %>%
filter(EVENT_DT >= as.POSIXct("2020-03-23"),
EVENT_DT <= as.POSIXct("2020-03-24"))
Error: nanodbc/nanodbc.cpp:1617: HY000: [Oracle][ODBC][Ora]ORA-01843: not a valid month <SQL> 'SELECT * FROM (SELECT * FROM (SELECT "EVENT_DT", "TIMESTAMP", "TIMESTAMP_YEAR", "TIMESTAMP_MONTH", "TIMESTAMP_DAY", "SESSION_ID" FROM (UDW_LMS_DMT.ETL_CNV_REQUESTS_ISO) ) "dbplyr_026" WHERE (("EVENT_DT" >= CAST('2020-03-23' AS TIMESTAMP)) AND ("EVENT_DT" <= CAST('2020-03-24' AS TIMESTAMP)))) "zzz72" WHERE ROWNUM <= 11.0'
Then I tried this and got a different error:
tbl(con, in_schema("UDW_LMS_DMT", "ETL_CNV_REQUESTS_ISO")) %>%
select(EVENT_DT, matches("TIMESTAMP"), SESSION_ID) %>%
filter(as.Date(EVENT_DT) >= as.Date("2020-03-23"),
as.Date(EVENT_DT) <= as.Date("2020-03-24"))
Error: nanodbc/nanodbc.cpp:1617: HY000: [Oracle][ODBC][Ora]ORA-01861: literal does not match format string <SQL> 'SELECT * FROM (SELECT * FROM (SELECT "EVENT_DT", "TIMESTAMP", "TIMESTAMP_YEAR", "TIMESTAMP_MONTH", "TIMESTAMP_DAY", "SESSION_ID" FROM (UDW_LMS_DMT.ETL_CNV_REQUESTS_ISO) ) "dbplyr_029" WHERE ((CAST("EVENT_DT" AS DATE) >= CAST('2020-03-23' AS DATE)) AND (CAST("EVENT_DT" AS DATE) <= CAST('2020-03-24' AS DATE)))) "zzz78" WHERE ROWNUM <= 11.0'
I could go back to using the TIMESTAMP_DAY
column and filter it using character values (e.g., filter(TIMESTAMP_DAY >= "2020-03-23")
, but I'd like to understand how to properly filter date and datetime values in databases with dbplyr
.