Datetime vector to binary/logical "night" and "day" vector

Hi all,

In my dataset I have a variable called call_received_time. I want to create another column called "shift_id" that simply shows if it was a night shift (18:00 - 08:00) or day shift (08:00-18:00) call.

So far, I have turned my character variable that was D-M-Y H:M:S into just H:M but I am now stuck. This is the first time I have ever really had anything to do with times/dates in R and the learning curve is feels steep.

#Making call_time a time & adding a shift identifier
bd$call_received_time <- format(as.POSIXct(bd$call_received_time,"%Y-%m-%d %H:%M:%S"),"%H:%M")
  mutate(shift_id = call_received_time > hms(18:00) & < hms(08:00),
         shift_id = if_else(shift_id == TRUE, "Night", "Day"))

Thanks in advance

Can you please share a small part of the data set in a copy-paste friendly format?

In case you don't know how to do it, there are many options, which include:

  1. If you have stored the data set in some R object, dput function is very handy.

  2. In case the data set is in a spreadsheet, check out the datapasta package. Take a look at this link.

1 Like

I can try!

Haven't done this for a dataset before... it seems to have worked though.

test <- data.frame(
          stringsAsFactors = FALSE,
                patient_id = c("f3088f25feefc6e0","a27900c96b5b5ded","75848116245f12d7",
                               "bfbd0a335f50af7f","5b8bd4f78e963c52",
                               "d46a94cabebf47f1"),
                 case_date = c("2019-04-23",
                               "2017-01-26","2013-01-17","2019-05-18",
                               "2019-05-18","2019-06-01"),
  final_primary_assessment = c("Short of Breath","Fracture/s","Arrhythmia",
                               "Airway Obstruction","Crush Injury","Vomiting"),
       scene_duration_time = c("2940s (~49 minutes)","660s (~11 minutes)",
                               "900s (~15 minutes)","1320s (~22 minutes)","2580s (~43 minutes)",
                               "6480s (~1.8 hours)"),
   transport_duration_time = c(NA, NA, NA, NA, NA, NA),
  cardiac_arrest_indicator = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
         unit_vehicle_type = c("Ambulance",
                               "Ambulance","Ambulance","Ambulance","Ambulance",
                               "Ambulance"),
        call_received_time = c("20:07",
                               "15:55","20:35","14:08","17:57","20:56"),
                       LST = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE),
                      Year = c(2019, 2017, 2013, 2019, 2019, 2019),
                 skill_set = as.factor(c("ICP","ICP","ICP","Paramedic",
                                         "ICP","ICP")),
     highest_dispatch_code = as.factor(c("2", "2", "1", "1", "1", "2")),
    highest_transport_code = as.factor(c("2", "2", "3", "1", "2", "2"))
)

Here you go

library(dplyr)
library(hms)

test <- data.frame(
    stringsAsFactors = FALSE,
    patient_id = c("f3088f25feefc6e0","a27900c96b5b5ded","75848116245f12d7",
                   "bfbd0a335f50af7f","5b8bd4f78e963c52",
                   "d46a94cabebf47f1"),
    case_date = c("2019-04-23",
                  "2017-01-26","2013-01-17","2019-05-18",
                  "2019-05-18","2019-06-01"),
    final_primary_assessment = c("Short of Breath","Fracture/s","Arrhythmia",
                                 "Airway Obstruction","Crush Injury","Vomiting"),
    scene_duration_time = c("2940s (~49 minutes)","660s (~11 minutes)",
                            "900s (~15 minutes)","1320s (~22 minutes)","2580s (~43 minutes)",
                            "6480s (~1.8 hours)"),
    transport_duration_time = c(NA, NA, NA, NA, NA, NA),
    cardiac_arrest_indicator = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
    unit_vehicle_type = c("Ambulance",
                          "Ambulance","Ambulance","Ambulance","Ambulance",
                          "Ambulance"),
    call_received_time = c("20:07",
                           "15:55","20:35","14:08","17:57","20:56"),
    LST = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE),
    Year = c(2019, 2017, 2013, 2019, 2019, 2019),
    skill_set = as.factor(c("ICP","ICP","ICP","Paramedic",
                            "ICP","ICP")),
    highest_dispatch_code = as.factor(c("2", "2", "1", "1", "1", "2")),
    highest_transport_code = as.factor(c("2", "2", "3", "1", "2", "2"))
)

test %>% 
    mutate(call_received_time = as_hms(paste0(call_received_time, ":00")),
           shift_id = if_else(call_received_time >= as_hms("08:00:00") & call_received_time < as_hms("18:00:00"),
                              "Day",
                              "Nigth"))
#>         patient_id  case_date final_primary_assessment scene_duration_time
#> 1 f3088f25feefc6e0 2019-04-23          Short of Breath 2940s (~49 minutes)
#> 2 a27900c96b5b5ded 2017-01-26               Fracture/s  660s (~11 minutes)
#> 3 75848116245f12d7 2013-01-17               Arrhythmia  900s (~15 minutes)
#> 4 bfbd0a335f50af7f 2019-05-18       Airway Obstruction 1320s (~22 minutes)
#> 5 5b8bd4f78e963c52 2019-05-18             Crush Injury 2580s (~43 minutes)
#> 6 d46a94cabebf47f1 2019-06-01                 Vomiting  6480s (~1.8 hours)
#>   transport_duration_time cardiac_arrest_indicator unit_vehicle_type
#> 1                      NA                    FALSE         Ambulance
#> 2                      NA                    FALSE         Ambulance
#> 3                      NA                    FALSE         Ambulance
#> 4                      NA                    FALSE         Ambulance
#> 5                      NA                    FALSE         Ambulance
#> 6                      NA                    FALSE         Ambulance
#>   call_received_time   LST Year skill_set highest_dispatch_code
#> 1           20:07:00 FALSE 2019       ICP                     2
#> 2           15:55:00 FALSE 2017       ICP                     2
#> 3           20:35:00 FALSE 2013       ICP                     1
#> 4           14:08:00  TRUE 2019 Paramedic                     1
#> 5           17:57:00 FALSE 2019       ICP                     1
#> 6           20:56:00 FALSE 2019       ICP                     2
#>   highest_transport_code shift_id
#> 1                      2    Nigth
#> 2                      2      Day
#> 3                      3    Nigth
#> 4                      1      Day
#> 5                      2      Day
#> 6                      2    Nigth

Created on 2022-07-09 by the reprex package (v2.0.1)

Hi @andresrcs

Firstly, thankyou so much for the little clue of library(hms)! I had found the as_hms() function but couldn't get it to work. I thought it was part of lubridate for some reason.

I'm getting an error when I try your code.

Error in UseMethod("mutate") : 
  no applicable method for 'mutate' applied to an object of class "character"

I also tried adding to an existing pipe and got this: (This is the first time I have ever really used rlang::last_error because I'm a true beginner and normally I just abort my plan and try to go around another way....)

Error in `mutate()`:
! Problem while computing `call_received_time =
  as_hms(paste0(call_received_time, ":00"))`.
Caused by error in `abort_lossy_cast()`:
! Lossy cast from <character> to <hms> at position(s) 1, 2, 3, 4, 5, ... (and 62927 more)
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `call_received_time =
  as_hms(paste0(call_received_time, ":00"))`.
Caused by error in `abort_lossy_cast()`:
! Lossy cast from <character> to <hms> at position(s) 1, 2, 3, 4, 5, ... (and 62927 more)

Backtrace:
  1. ... %>% mutate(Year = isoyear(case_date))
  9. hms:::as_hms.default(paste0(call_received_time, ":00"))
 10. vctrs::vec_cast(x, new_hms())
 11. vctrs `<fn>`()
 12. hms:::vec_cast.hms.character(...)
 13. hms:::abort_lossy_cast(x, to, ..., lossy = lossy)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `call_received_time =
  as_hms(paste0(call_received_time, ":00"))`.
Caused by error in `abort_lossy_cast()`:
! Lossy cast from <character> to <hms> at position(s) 1, 2, 3, 4, 5, ... (and 62927 more)

Backtrace:
     x
  1. +-... %>% mutate(Year = isoyear(case_date))
  2. +-dplyr::mutate(., Year = isoyear(case_date))
  3. +-dplyr::mutate(...)
  4. +-dplyr:::mutate.data.frame(...)
  5. | \-dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
  6. |   +-base::withCallingHandlers(...)
  7. |   \-mask$eval_all_mutate(quo)
  8. +-hms::as_hms(paste0(call_received_time, ":00"))
  9. \-hms:::as_hms.default(paste0(call_received_time, ":00"))
 10.   \-vctrs::vec_cast(x, new_hms())
 11.     \-vctrs `<fn>`()
 12.       \-hms:::vec_cast.hms.character(...)
 13.         \-hms:::abort_lossy_cast(x, to, ..., lossy = lossy)
 14.           \-rlang::abort(...)
> 

It seems the sample data you have provided doesn't actually represent your real data and the resulting time format is not valid. Can you provide a sample of your actual data set? or even better, a proper REPRoducible EXample (reprex) illustrating your issue?

I'm not sure what to tell you... I did produce both as a reprex (unless I've done it wrong...)

The code was using > reprex:::reprex_addin() and the data set is from datapasta!

This was the closest it felt like it was to working:

#Making call_time a time & adding a shift identifier
  bd$call_received_time <- format(as.POSIXct(bd$call_received_time,
                                           "%Y-%m-%d %H:%M:%S"),"%H:%M") %>%
    mutate(call_received_time = as_hms(paste0(call_received_time, ":00")),
           shift_id = if_else(call_received_time >= as_hms("08:00:00") & 
                                call_received_time < as_hms("18:00:00"),
                              "Day",
                              "Night"))
#> Error in format(as.POSIXct(bd$call_received_time, "%Y-%m-%d %H:%M:%S"), : could not find function "%>%"

The error says;

Error in UseMethod("mutate") : 
  no applicable method for 'mutate' applied to an object of class "character"

When I check the class it isn't a character type:

> class(bd$call_received_time)
[1] "POSIXct" "POSIXt"

I think you are a little bit confused about what a reprex actually is, a reprex should be reproducible just by copying your code and pasting it into our own R session and what you have posted so far is not. Please read the reprex guide I linked for you more carefully, notice that a reprex must satisfy certain conditions, regardless of the the use of the reprex package which only serves as a helper in the process.

Now about your problem

This is not valid dplyr syntax, you are trying to apply mutate() to a vector instead of a dataframe. If you pay attention to the example I gave you, I applied mutate to the sample data frame you provided, not to a column of the data frame.

Chronobiology is fun to explore in statistical design of experiment.

Shift within day is a good start.
Other useful elements include: day of week, is weekend, is holiday weekend, is holiday, week of year.

I also like what I call "time cosines". If you think about cycles like days, weeks, and such, the first and last hours are adjacent but showing them as 0:00 and 23:59 isn't helpful for machine learners. If you convert them to dimensionless, by dividing by 24 hours, scale them by 2*pi, and then create two columns, one of the sine, and one of the cosine, then spatially those two columns indicate proximity. December 31 is snuggled right next to Jan 01 in the year-cosines. That can help machine learners and other statistical tools to make sense of the cyclic nature of the phenomena.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.