I want to convert column types from integer to date after import of a XLSX data-frame using read_xlsx, where columns of type date have been been wrongly guessed as dbl (the data-frame also contains several other columns that should not be converted).
Doing this during import using the col_type argument does not work, because this would subsetting the data in the col_type argument, which is not possible (as far as I see - Import dates from XLSX).
I found this thread Using across to convert column types, but it unfortunately does not work - not even as described there.
Because it is conversion from Excel dates stored as dbl, using the as_date function of lubridate the correct origin using the origin argument has to be set.
I get an error across()
must only be used inside dplyr verbs. with this code:
read_xlsx("dataframe.xlsx") %>%
mutate(across(column3:column6 & where(is.double), as_date(., origin = "1899-12-30")))
However, this syntax works with "simple" conversion like:
read_xlsx("dataframe.xlsx") %>%
mutate(across(column3:column6 & where(is.character), as.double))
Of course, it could be done separately for each column:
read_xlsx("dataframe.xlsx") %>%
mutate(column3 = as_date(column3, origin = "1899-12-30")) %>%
mutate(column4 = as_date(column4, origin = "1899-12-30"))
... and so forth...
But this is rather unhandy, in addition since input column type has to be checked/corrected for every column, and not applicable to large datasets.
How can I fix this problem and convert from dbl to date using the across- and as_date-funtions?
Or are there other "handy" possibilities for conversion to date-type?