lubridate::ymd() behaving strangely within ifelse

I'm trying to use some ifelse logic within a mutate function to tidy up a column of strings and convert them to dates, depending on the format of the string that is encountered.

I'm finding that lubridate::ymd() seems to be giving me a different output depending on whether it is within an ifelse() statement or not, and I'm confused.

Here's a simplified reprex to demonstrate what I'm seeing:

ymd_regex <- "^20[0-9]{2}-[0-9]{1,2}-[0-9]{1,2}$"

grepl(ymd_regex, "2022-02-01")
#> [1] TRUE

lubridate::ymd("2022-02-01")
#> [1] "2022-02-01"

ifelse(grepl(ymd_regex, "2022-02-01"), lubridate::ymd("2022-02-01"), "Error")
#> [1] 19024

dplyr::if_else(grepl(ymd_regex, "2022-02-01"), lubridate::ymd("2022-02-01"), Sys.Date())
#> [1] "2022-02-01"

Created on 2023-06-20 with reprex v2.0.2

I want the "2022-02-01" date output but I'm getting the 19024 numeric output instead.
Please can someone explain what might be happening?

EDIT - sorry I have just seen from the ifelse help that:

ifelse() strips attributes

I've amended my reprex to show that dplyr::if_else() gives the output I need (though it throws warnings when used on a mixed vector within mutate()).

The reason for this is R's coercion of types that are not directly compatible: R is trying to find a compatible type between logical (from grepl) and Date (from ymd), and settles on numeric as a compromise.

class(FALSE)
class(Sys.Date())
class(c(FALSE,Sys.Date()))
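To make the coercion concrete, here is a simplified sketch of what the core of ifelse() appears to do. It is not the full base R implementation, and the two-element vectors are made up for illustration: the result starts as a copy of the logical test, and the Date values are then assigned into it.

```r
# Simplified sketch of the core of base::ifelse(); not the full implementation.
test <- c(TRUE, FALSE)
yes  <- as.Date(c("2022-02-01", "2022-02-02"))

ans <- test               # the result starts life as a copy of the logical test
ans[test] <- yes[test]    # assigning a Date into a logical vector coerces to double

class(ans)
#> [1] "numeric"
ans
#> [1] 19024     0
```

The "Date" class belongs to yes, not to test, so it never reaches the result; only the underlying count of days since 1970-01-01 survives.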

Surely the logical is just the necessary first argument for ifelse - it needs a TRUE or FALSE? This shouldn't affect the two possible returns.

I did edit my post while you were replying, sorry!

In the documentation for ifelse it does say:

Value

A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no.

And I agree with you that this doesn't seem entirely warranted, aside from perhaps a performance perspective, as it acts against expectations; part of the motivation behind dplyr::if_else is more consistent and stricter class compatibility.
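As a small illustration of that stricter behaviour (a sketch, assuming dplyr is installed; the exact error wording differs between dplyr versions, so it is replaced here with a placeholder message):

```r
library(dplyr)

# Both branches are Dates, so the class is preserved.
ok <- if_else(TRUE, as.Date("2022-02-01"), as.Date(NA))
class(ok)
#> [1] "Date"

# A Date and a character string cannot be combined, so if_else() raises an
# error instead of silently coercing. tryCatch() swaps the error for a
# placeholder string so the script keeps running.
bad <- tryCatch(
  if_else(TRUE, as.Date("2022-02-01"), "Error"),
  error = function(e) "if_else() refused to mix Date and character"
)
bad
```

This is why the original "Error" fallback has to become a Date (for example a missing Date) before if_else() will accept it.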


You'd have to go through something like this:

ymd_regex <- "^20[0-9]{2}-[0-9]{1,2}-[0-9]{1,2}$"

ifelse(grepl(ymd_regex, "2022-02-01"), lubridate::ymd("2022-02-01"), "Error")
#> [1] 19024

ifelse(TRUE,lubridate::ymd("2022-02-01","Error"))
#> Warning: 1 failed to parse.
#> [1] 19024

if(TRUE) lubridate::ymd("2022-02-01")
#> [1] "2022-02-01"
ifelse(TRUE,lubridate::ymd("2022-02-01"), "Error") |> as.Date(x = _, origin = "1970-01-01")
#> [1] "2022-02-01"
ifelse(grepl(ymd_regex, "2022-02-01"), lubridate::ymd("2022-02-01"), "Error")  |> as.Date(x = _, origin = "1970-01-01")
#> [1] "2022-02-01"

It is probably better to replace ifelse with a block test (a plain if/else).
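A minimal sketch of both routes, assuming dplyr and lubridate are installed. parse_if_ymd() is a hypothetical helper name and the three-row tibble is made-up data; neither comes from the original post. The first part is the scalar if/else block, the second is one possible way to handle a whole column inside mutate().

```r
library(dplyr)
library(lubridate)

ymd_regex <- "^20[0-9]{2}-[0-9]{1,2}-[0-9]{1,2}$"

# Scalar case: a plain if/else block returns the Date untouched.
# parse_if_ymd() is a hypothetical helper, not part of any package.
parse_if_ymd <- function(x) {
  if (grepl(ymd_regex, x)) ymd(x) else as.Date(NA)
}
parse_if_ymd("2022-02-01")
#> [1] "2022-02-01"

# Column case: case_when() keeps the Date class. quiet = TRUE silences the
# "failed to parse" warnings for strings that do not match the pattern.
raw_dates <- tibble(raw = c("2022-02-01", "2022-2-1", "not a date"))  # made-up data
parsed <- raw_dates |>
  mutate(date = case_when(
    grepl(ymd_regex, raw) ~ ymd(raw, quiet = TRUE),
    TRUE ~ as.Date(NA)
  ))
```

The warnings in the mixed-vector case arise because ymd() is evaluated on every element of the column, including the ones the condition later rejects, so silencing the parser (or pre-filtering the input) is needed in addition to the class-preserving conditional.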
