I have these two variables. I need to recode the first one so that the numbers that are really dates are converted correctly, then calculate the difference in days between the two columns, and finally create a new variable indicating whether each observation returns in the indicated time (975.75 days) or not.
# A tibble: 15 x 2
time1 time2
<chr> <dttm>
1 44358 2015-07-23 00:00:00
2 NO 2015-10-25 00:00:00
3 NO 2015-05-28 00:00:00
4 44328 2015-07-27 00:00:00
5 NO 2015-07-24 00:00:00
6 44277 2015-07-27 00:00:00
7 NO 2015-07-23 00:00:00
8 NO 2015-08-03 00:00:00
9 NO 2015-08-03 00:00:00
10 NO 2015-07-27 00:00:00
11 NO 2015-08-03 00:00:00
12 44481 2015-07-24 00:00:00
13 NO 2015-07-24 00:00:00
14 43326 2015-07-23 00:00:00
15 44414 2015-08-03 00:00:00
This is what I tried:
df %>%
  mutate(time1 = if_else(str_detect(time1, "NO"),
                         time1,
                         time1 %>%
                           as.numeric() %>%
                           excel_numeric_to_date() %>%
                           as.character())) %>%
  mutate(dias = as.numeric(difftime(as.Date(time1), time2, units = "days"))) %>%
  mutate(comeback = if_else(any(dias > ((365.25 * 3) - 120)), T, F))
But for some reason, in the new variable the rows where time1 is "NO" (and dias is therefore NA) come out as TRUE, when they should be NA.
# A tibble: 15 x 4
time1 time2 dias comeback
<chr> <dttm> <dbl> <lgl>
1 2021-06-11 2015-07-23 00:00:00 2150 TRUE
2 NO 2015-10-25 00:00:00 NA TRUE
3 NO 2015-05-28 00:00:00 NA TRUE
4 2021-05-12 2015-07-27 00:00:00 2116 TRUE
5 NO 2015-07-24 00:00:00 NA TRUE
6 2021-03-22 2015-07-27 00:00:00 2065 TRUE
7 NO 2015-07-23 00:00:00 NA TRUE
8 NO 2015-08-03 00:00:00 NA TRUE
9 NO 2015-08-03 00:00:00 NA TRUE
10 NO 2015-07-27 00:00:00 NA TRUE
11 NO 2015-08-03 00:00:00 NA TRUE
12 2021-10-12 2015-07-24 00:00:00 2272 TRUE
13 NO 2015-07-24 00:00:00 NA TRUE
14 2018-08-14 2015-07-23 00:00:00 1118 TRUE
15 2021-08-06 2015-08-03 00:00:00 2195 TRUE
Warning message:
Problem while computing `hpv_post = if_else(...)`.
i NAs introduced by coercion
What could be happening?
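A likely culprit is the any() call in the last mutate(): any() collapses the whole dias column into a single value, and any(c(TRUE, NA)) is TRUE, so that one TRUE is recycled to every row. The coercion warning probably comes from if_else() evaluating as.numeric() on every element of time1, including the "NO" strings. Below is a minimal sketch of a row-wise version, under some assumptions: the toy df only mirrors a few rows from the question, the names time1_date and threshold are hypothetical, and base R's as.Date() with origin "1899-12-30" stands in for excel_numeric_to_date().

```r
library(dplyr)

# Toy data mirroring a few rows of the question (not the full data set)
df <- tibble(
  time1 = c("44358", "NO", "NO", "44328", "43326"),
  time2 = as.POSIXct(
    c("2015-07-23", "2015-10-25", "2015-05-28", "2015-07-27", "2015-07-23"),
    tz = "UTC"
  )
)

# Hypothetical name for the cutoff used in the question (975.75 days)
threshold <- 365.25 * 3 - 120

result <- df %>%
  mutate(
    # "NO" becomes NA; the coercion warning is expected here, so it is silenced.
    # Assumption: time1 holds Excel serial dates (origin 1899-12-30).
    time1_date = as.Date(suppressWarnings(as.numeric(time1)),
                         origin = "1899-12-30"),
    # Difference in days; NA wherever time1 was "NO"
    dias = as.numeric(difftime(time1_date, as.Date(time2), units = "days")),
    # Element-wise comparison (no any()), so NA in dias stays NA
    comeback = dias > threshold
  )

print(result)
```

With the comparison done element by element, rows where dias is NA should keep NA in comeback instead of inheriting a single recycled TRUE. Storing the converted date in a separate Date column also avoids mixing "NO" and date strings in one character column.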