I wrote a reprex and now I'm doubly confused.
I'm importing a bunch of Excel files in which the dates have been stored in a couple of different ways. I've been working with a list in which each input file is a data frame:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
library(purrr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:lubridate':
#>
#> intersect, setdiff, union
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(janitor)
char_dates <- tibble(
  date = c("July 6, 2009", "July 7, 2009"),
  name = "string dates"
)
excel_dates <-
  tibble(
    date = c("40000", "40001"),
    name = "excel dates"
  )
list(char_dates, excel_dates) %>%
  map(
    ~transmute(.x,
      common_date = case_when(
        name == "string dates" ~ mdy(date),
        name == "excel dates" ~ excel_numeric_to_date(as.double(date))
      )
    )
  )
#> Warning in excel_numeric_to_date(as.double(date)): NAs introduced by
#> coercion
#> Warning: All formats failed to parse. No formats found.
#> [[1]]
#> # A tibble: 2 x 1
#> common_date
#> <date>
#> 1 2009-07-06
#> 2 2009-07-07
#>
#> [[2]]
#> # A tibble: 2 x 1
#> common_date
#> <date>
#> 1 2009-07-06
#> 2 2009-07-07
list(char_dates, excel_dates) %>%
  map(
    ~transmute(.x,
      common_date = if_else(
        name == "string dates",
        mdy(date),
        excel_numeric_to_date(as.double(date))
      )
    )
  )
#> Warning in excel_numeric_to_date(as.double(date)): NAs introduced by
#> coercion
#> Warning in excel_numeric_to_date(as.double(date)): All formats failed to
#> parse. No formats found.
#> [[1]]
#> # A tibble: 2 x 1
#> common_date
#> <date>
#> 1 2009-07-06
#> 2 2009-07-07
#>
#> [[2]]
#> # A tibble: 2 x 1
#> common_date
#> <date>
#> 1 2009-07-06
#> 2 2009-07-07
Created on 2019-04-12 by the reprex package (v0.2.1)
Two questions:
- Why are both sets of warnings generated if both date parsing strategies succeed?
- This is not actually the behavior I observe on the real data I'm working on. In that case, the call to `lubridate::mdy()` issues the warning, but all the dates are actually NA. I also tried `parse_date_time()` with a more granular `order` specification, which also fails. `excel_numeric_to_date()` works fine and returns date values in the relevant data frames.
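One way to narrow down the first question might be to run each parser on its own, outside `case_when()` / `if_else()`. This is only a sketch: the names `string_dates`, `excel_serials`, `wrong_mdy`, and `wrong_excel` are made up for illustration, and it assumes that both right-hand sides get evaluated on the whole `date` column before any rows are selected, which would explain a warning from the branch that is never actually used.

```r
library(lubridate)
library(janitor)

# Same values as in the reprex (hypothetical variable names)
string_dates <- c("July 6, 2009", "July 7, 2009")
excel_serials <- c("40000", "40001")

# Each parser on the input it was written for
right_mdy <- mdy(string_dates)
right_excel <- excel_numeric_to_date(as.double(excel_serials))

# Each parser on the *other* input. If both branches are evaluated on
# every row of every data frame, these are the calls that would warn.
# Warnings are suppressed here so the results can be inspected quietly.
wrong_mdy <- suppressWarnings(mdy(excel_serials))
wrong_excel <- suppressWarnings(
  excel_numeric_to_date(as.double(string_dates))
)

# Inspect which calls actually produced usable dates
print(right_mdy)
print(right_excel)
print(wrong_mdy)
print(wrong_excel)
```

If the two "wrong" calls come back as all `NA`, the warnings in the reprex would be coming from the unused branch in each data frame rather than from the branch whose result is kept. That would not explain the all-`NA` result on the real data, which probably needs a look at the actual strings being passed to `mdy()`.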