Send dataframe from long to wide with timestamp and duplicate id in R

dataning · April 19, 2021, 11:22am

I'm wondering if there's a better way to reshape a dataframe from long to wide so that there is one row per patient. In my case, the result should have two rows in total.

My current approach is a typical pivot_wider() call with id_cols set to patient_id. The case_id column gives me a unique ID per row. However, there's also a timestamp column in my dataset, so I guess it's best to use that for the column names and use the remaining columns as values.

Problem: It does give me a two-row dataframe, but if time_stamp covers 24 hours per day over 365 days, I can end up with a few thousand columns. Is there a better way to manage such a situation?
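To make the size of the problem concrete, here is a rough back-of-the-envelope sketch in base R. It assumes every time_stamp value is distinct (true for the 12-row sample in this post) and that two value columns are spread; the variable names are only illustrative.

```r
# Each distinct time_stamp produces one new column per value column,
# plus one column for patient_id itself.
n_timestamps <- 12L  # distinct time_stamp values in the 12-row sample
n_value_cols <- 2L   # message_type and conditions
n_cols <- 1L + n_timestamps * n_value_cols
n_cols

# Hypothetical case: one reading per hour for a full year
n_cols_year <- 1L + (365L * 24L) * n_value_cols
n_cols_year
```

So the width grows with the number of distinct timestamps, not with the number of patients.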

    case_id patient_id message_type conditions          time_stamp
 1:  295179   72613620         high          D 2017-12-25 22:36:54
 2:  401139   72613620         high          A 2017-12-27 00:13:48
 3:  420761   72613620       normal          B 2017-12-27 04:57:13
 4:  390022   72613620         high          A 2017-12-26 21:34:10
 5:  339198   72613620         high          C 2017-12-26 09:15:56
 6:  256241   72613620          low          C 2017-12-25 13:13:32
 7:  280864   96797683         high          E 2017-12-25 19:10:21
 8:  224620   96797683         high          B 2017-12-25 05:36:00
 9:  479313   96797683       normal          A 2017-12-27 19:00:49
10:  416389   96797683         high          B 2017-12-27 03:53:39
11:  381187   96797683         high          A 2017-12-26 19:26:18
12:  209207   96797683         high          B 2017-12-25 01:53:57

library(dplyr)  # for %>%
library(tidyr)  # for pivot_wider()

df %>% 
  pivot_wider(id_cols = patient_id,
              names_from = time_stamp,
              values_from = c(message_type, conditions))  # columns 3:4
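One possible way to keep the width down, sketched under assumptions rather than offered as the definitive answer: number each patient's events in time order and pivot on that sequence number instead of the raw timestamp. The helper column `event` and the small stand-in data frame `df_small` (a few rows copied from the printed table) are hypothetical names introduced only for this example.

```r
library(dplyr)
library(tidyr)

# Small stand-in for the data in the post (rows copied from the printed table)
df_small <- tibble(
  case_id      = c(295179L, 401139L, 256241L, 280864L, 224620L),
  patient_id   = c(72613620L, 72613620L, 72613620L, 96797683L, 96797683L),
  message_type = c("high", "high", "low", "high", "high"),
  conditions   = c("D", "A", "C", "E", "B"),
  time_stamp   = as.POSIXct(
    c("2017-12-25 22:36:54", "2017-12-27 00:13:48", "2017-12-25 13:13:32",
      "2017-12-25 19:10:21", "2017-12-25 05:36:00"),
    tz = "UTC"
  )
)

wide <- df_small %>%
  group_by(patient_id) %>%
  arrange(time_stamp, .by_group = TRUE) %>%  # oldest event first within each patient
  mutate(event = row_number()) %>%           # hypothetical helper: 1, 2, 3, ... per patient
  ungroup() %>%
  pivot_wider(
    id_cols     = patient_id,
    names_from  = event,
    values_from = c(message_type, conditions, time_stamp)
  )

wide
```

With this layout the number of columns depends on the largest number of events any one patient has, and patients with fewer events get NA in the trailing columns. The timestamp is kept as a value (time_stamp_1, time_stamp_2, ...) instead of being encoded in the column names.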

df <- structure(list(
  case_id = c(295179L, 401139L, 420761L, 390022L, 339198L, 256241L,
              280864L, 224620L, 479313L, 416389L, 381187L, 209207L),
  patient_id = c(72613620L, 72613620L, 72613620L, 72613620L, 72613620L, 72613620L,
                 96797683L, 96797683L, 96797683L, 96797683L, 96797683L, 96797683L),
  message_type = c("high", "high", "normal", "high", "high", "low",
                   "high", "high", "normal", "high", "high", "high"),
  conditions = c("D", "A", "B", "A", "C", "C", "E", "B", "A", "B", "A", "B"),
  time_stamp = structure(c(1514241414, 1514333628, 1514350633, 1514324050,
                           1514279756, 1514207612, 1514229021, 1514180160,
                           1514401249, 1514346819, 1514316378, 1514166837),
                         class = c("POSIXct", "POSIXt"), tzone = "UTC")),
  row.names = c(NA, -12L), class = c("data.table", "data.frame"))
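If the goal is strictly one row per patient and the width does not matter, another option is a list-column: keep each patient's events as a nested data frame. This is only a sketch, assuming tidyr's nest() fits the downstream use; `df_small` and the list-column name `events` are hypothetical and the rows are copied from the printed table above.

```r
library(dplyr)
library(tidyr)

# Small stand-in for the data in the post (rows copied from the printed table)
df_small <- tibble(
  case_id      = c(295179L, 401139L, 256241L, 280864L, 224620L),
  patient_id   = c(72613620L, 72613620L, 72613620L, 96797683L, 96797683L),
  message_type = c("high", "high", "low", "high", "high"),
  conditions   = c("D", "A", "C", "E", "B"),
  time_stamp   = as.POSIXct(
    c("2017-12-25 22:36:54", "2017-12-27 00:13:48", "2017-12-25 13:13:32",
      "2017-12-25 19:10:21", "2017-12-25 05:36:00"),
    tz = "UTC"
  )
)

nested <- df_small %>%
  arrange(patient_id, time_stamp) %>%
  # Pack every non-id column into one list-column of per-patient data frames
  nest(events = c(case_id, message_type, conditions, time_stamp))

nested              # one row per patient
nested$events[[1]]  # all events for the first patient, oldest first
```

The table stays at two columns no matter how many timestamps there are, at the cost of needing unnest() or a map-style function to work with the individual events.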
