Why does a date before 1970 (e.g., 1800-07-02) display as 1969-12-31 in an RStudio notebook?
The raw data file has this field "1800-07-02 00:00:00", which was loaded as a timestamp into a PostgreSQL database for the first example below that displays as "1969-12-31 17:59:59"
Is this an RStudio problem, or a problem in the RPostgres package?
Using Postgres SQL to cast the value as a "date" results in a field that displays correctly.
```{r}
library(DBI)
library(RPostgres)
library(tidyverse)
. . .
```
```{r}
MimicDB <- dbConnect(RPostgres::Postgres(),
host = "localhost",
dbname = "mimic",
user = Sys.getenv("MIMIC_User"),
password = Sys.getenv("MIMIC_Password"),
bigint = "integer", # See RBloggers 2018-03-16 posting
options = "-c search_path=mimiciii")
```
```{sql, connection=MimicDB}
SELECT subject_id, dob, cast (dob as date) AS dobDate
FROM patients
ORDER BY DOB
LIMIT 5
```
subject_id dob dobdate
<int> <S3:POSIXct> <date>
31585 1969-12-31 17:59:59 1800-07-02
62884 1969-12-31 17:59:59 1800-07-16
31288 1969-12-31 17:59:59 1800-07-16
3240 1969-12-31 17:59:59 1800-07-18
29851 1969-12-31 17:59:59 1800-08-19
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] kableExtra_0.9.0 dbplyr_1.2.2 forcats_0.3.0 stringr_1.3.1 dplyr_0.7.7 purrr_0.2.5
[7] readr_1.1.1 tidyr_0.8.2 tibble_1.4.2 ggplot2_3.1.0 tidyverse_1.2.1 RPostgres_1.1.1
[13] DBI_1.0.0
loaded via a namespace (and not attached):
[1] tidyselect_0.2.5 haven_1.1.2 lattice_0.20-35 colorspace_1.3-2 viridisLite_0.3.0
[6] htmltools_0.3.6 yaml_2.2.0 blob_1.1.1 rlang_0.3.0.1 pillar_1.3.0
[11] glue_1.3.0 withr_2.1.2 bit64_0.9-7 modelr_0.1.2 readxl_1.1.0
[16] bindrcpp_0.2.2 bindr_0.1.1 plyr_1.8.4 munsell_0.5.0 gtable_0.2.0
[21] cellranger_1.1.0 rvest_0.3.2 evaluate_0.12 knitr_1.20 highr_0.7
[26] broom_0.5.0 Rcpp_0.12.19 scales_1.0.0 backports_1.1.2 jsonlite_1.5
[31] bit_1.1-14 hms_0.4.2 digest_0.6.18 stringi_1.2.4 grid_3.5.1
[36] rprojroot_1.3-2 cli_1.0.1 tools_3.5.1 magrittr_1.5 lazyeval_0.2.1
[41] crayon_1.3.4 pkgconfig_2.0.2 xml2_1.2.0 lubridate_1.7.4 assertthat_0.2.0
[46] rmarkdown_1.10 httr_1.3.1 rstudioapi_0.8 R6_2.3.0 nlme_3.1-137
[51] compiler_3.5.1
Same query as shown above but run in psql:
mimic=# SELECT subject_id, dob, cast (dob as date) AS dobDate
mimic-# FROM patients
mimic-# ORDER BY DOB
mimic-# LIMIT 5;
subject_id | dob | dobdate
------------+---------------------+------------
31585 | 1800-07-02 00:00:00 | 1800-07-02
62884 | 1800-07-16 00:00:00 | 1800-07-16
31288 | 1800-07-16 00:00:00 | 1800-07-16
3240 | 1800-07-18 00:00:00 | 1800-07-18
29851 | 1800-08-19 00:00:00 | 1800-08-19