I'm wondering if there's a better way to reshape a data frame from long to wide so that there is one row per patient. In my case, that should be two rows in total.
My current approach is a typical pivot_wider() call with id_cols set to patient_id. The case_id column gives me a unique id per row. However, there is also a timestamp column in my dataset, so I guess it's best to use that for the names and the remaining columns for the values.
Problem: this does give me a two-row data frame, but if time_stamp covers 24 hours a day over 365 days, I can end up with a few thousand columns, because every distinct timestamp becomes its own set of columns. Is there a better way to manage such a situation?
    case_id patient_id message_type conditions          time_stamp
 1:  295179   72613620         high          D 2017-12-25 22:36:54
 2:  401139   72613620         high          A 2017-12-27 00:13:48
 3:  420761   72613620       normal          B 2017-12-27 04:57:13
 4:  390022   72613620         high          A 2017-12-26 21:34:10
 5:  339198   72613620         high          C 2017-12-26 09:15:56
 6:  256241   72613620          low          C 2017-12-25 13:13:32
 7:  280864   96797683         high          E 2017-12-25 19:10:21
 8:  224620   96797683         high          B 2017-12-25 05:36:00
 9:  479313   96797683       normal          A 2017-12-27 19:00:49
10:  416389   96797683         high          B 2017-12-27 03:53:39
11:  381187   96797683         high          A 2017-12-26 19:26:18
12:  209207   96797683         high          B 2017-12-25 01:53:57
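library(dplyr)  # for %>%
library(tidyr)  # for pivot_wider()

# One row per patient; every distinct time_stamp becomes a column suffix
df %>%
  pivot_wider(id_cols = patient_id,
              names_from = time_stamp,
              values_from = c(message_type, conditions))  # same columns as c(3:4)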
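To make the width problem concrete: pivot_wider() produces one output column per distinct names_from value for each values_from column, plus the id column. The sketch below checks that on three rows copied from the sample data. The object names `small`, `wide` and `expected_cols` are made up for illustration.

```r
library(tidyr)

# Three rows copied from the sample data; `small` is a hypothetical name.
small <- data.frame(
  patient_id   = c(72613620L, 72613620L, 96797683L),
  message_type = c("high", "low", "high"),
  conditions   = c("D", "C", "E"),
  time_stamp   = as.POSIXct(
    c("2017-12-25 22:36:54", "2017-12-25 13:13:32", "2017-12-25 19:10:21"),
    tz = "UTC"
  )
)

wide <- pivot_wider(
  small,
  id_cols     = patient_id,
  names_from  = time_stamp,
  values_from = c(message_type, conditions)
)

# One id column plus (distinct timestamps) x (number of value columns)
expected_cols <- 1 + length(unique(small$time_stamp)) * 2
ncol(wide) == expected_cols
```

All 12 timestamps in the full sample are distinct, so the same formula gives 1 + 12 * 2 = 25 columns for just two patients, and the count keeps growing with every new timestamp.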
df <- structure(list(case_id = c(295179L, 401139L, 420761L, 390022L,
339198L, 256241L, 280864L, 224620L, 479313L, 416389L, 381187L,
209207L), patient_id = c(72613620L, 72613620L, 72613620L,
72613620L, 72613620L, 72613620L, 96797683L, 96797683L, 96797683L,
96797683L, 96797683L, 96797683L), message_type = c("high", "high",
"normal", "high", "high", "low", "high", "high", "normal", "high",
"high", "high"), conditions = c("D", "A", "B",
"A", "C", "C", "E", "B",
"A", "B", "A", "B"), time_stamp = structure(c(1514241414,
1514333628, 1514350633, 1514324050, 1514279756, 1514207612, 1514229021,
1514180160, 1514401249, 1514346819, 1514316378, 1514166837), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), row.names = c(NA, -12L), class = c("data.table",
"data.frame"))
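One possible way to keep the width bounded, sketched here as an assumption rather than a recommended answer: number each patient's events in time order and pivot on that index instead of on the raw timestamp. The number of columns then depends on the largest number of events any single patient has, not on the number of distinct timestamps. The names `events` and `event_no` are hypothetical, and the rows are a subset of the sample data.

```r
library(dplyr)
library(tidyr)

# Five rows copied from the sample data; `events` is a hypothetical name.
events <- data.frame(
  patient_id   = c(72613620L, 72613620L, 72613620L, 96797683L, 96797683L),
  message_type = c("high", "normal", "low", "high", "normal"),
  conditions   = c("D", "B", "C", "E", "A"),
  time_stamp   = as.POSIXct(
    c("2017-12-25 22:36:54", "2017-12-27 04:57:13", "2017-12-25 13:13:32",
      "2017-12-25 19:10:21", "2017-12-27 19:00:49"),
    tz = "UTC"
  )
)

wide <- events %>%
  arrange(patient_id, time_stamp) %>%
  group_by(patient_id) %>%
  mutate(event_no = row_number()) %>%  # 1, 2, 3, ... within each patient
  ungroup() %>%
  pivot_wider(
    id_cols     = patient_id,
    names_from  = event_no,
    values_from = c(time_stamp, message_type, conditions)
  )

# Columns are named time_stamp_1, message_type_1, conditions_1, and so on;
# patients with fewer events get NA in the trailing columns.
wide
```

This keeps the timestamp as a value instead of encoding it in the column names. If individual patients can have thousands of events, the result is still very wide, and staying in long format (or summarising per day first) may be the more practical choice.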