Was wondering if anyone could help me out with a data tidying task.
I'm attempting to count 7 day periods in my dataset, and mark each as a week number (1-n) for each grouping.
So far my best attempt was repeating a sequence of 1:7 over the dataset, but it does not account for grouping changes. My desired output would be like so:
site patient_id date time nebs_day days
01 R01002 2019-04-04 NA 0 1
01 R01002 2019-04-05 16:57:14 1 2
01 R01002 2019-04-06 NA 0 3
01 R01002 2019-04-07 18:19:49 1 4
01 R01002 2019-04-08 NA 0 5
01 R01002 2019-04-09 15:06:44 1 6
01 R01002 2019-04-10 15:36:47 1 7
01 R01002 2019-04-11 NA 0 1
01 R01002 2019-04-12 17:42:32 1 2
01 R01002 2019-04-13 20:24:46 1 3
01 R01003 2018-09-06 17:58:41 1 1 # Changes group with patient_id
01 R01003 2018-09-07 00:00:05 3 2
01 R01003 2018-09-08 12:52:57 1 3
01 R01003 2018-09-09 00:31:59 4 4
01 R01003 2018-09-10 12:36:52 2 5
01 R01003 2018-09-11 09:39:30 2 6
01 R01003 2018-09-12 09:38:22 3 7
For week numbers, I'm trying to get an output based on days that would look like so:
site patient_id date time nebs_day days week
01 R01002 2019-04-04 NA 0 1 34 #Increases by one for each period
01 R01002 2019-04-05 16:57:14 1 2 34
01 R01002 2019-04-06 NA 0 3 34
01 R01002 2019-04-07 18:19:49 1 4 34
01 R01002 2019-04-08 NA 0 5 34
01 R01002 2019-04-09 15:06:44 1 6 34
01 R01002 2019-04-10 15:36:47 1 7 34
01 R01002 2019-04-11 NA 0 1 35
01 R01002 2019-04-12 17:42:32 1 2 35
01 R01002 2019-04-13 20:24:46 1 3 35 #Not a full 7days, but mark as 1 week
01 R01003 2018-09-06 17:58:41 1 1 1 #Count resets with new patient_id
01 R01003 2018-09-07 00:00:05 3 2 1
01 R01003 2018-09-08 12:52:57 1 3 1
01 R01003 2018-09-09 00:31:59 4 4 1
01 R01003 2018-09-10 12:36:52 2 5 1
01 R01003 2018-09-11 09:39:30 2 6 1
01 R01003 2018-09-12 09:38:22 3 7 1
Any help with this would be greatly appreciated. I have tried using days() and weeks() prior to this, but it's a requirement to distinguish 7 day periods in my dataset so it would not be fully accurate using lubridate.
Here's some sample data:
sample <- data.frame(
site = rep(1, each = 17),
patient_id = c("R01002", "R01002", "R01002", "R01002", "R01002", "R01002",
"R01002", "R01002", "R01002", "R01002", "R01003", "R01003", "R01003",
"R01003", "R01003", "R01003", "R01003"),
date = c("2019-04-04", "2019-04-05","2019-04-06", "2019-04-07", "2019-04-08", "2019-04-09", "2019-04-10","2019-04-11", "2019-04-12", "2019-04-13",
"2018-09-06", "2018-09-07", "2018-09-08", "2018-09-09", "2018-09-10",
"2018-09-11", "2018-09-12"),
nebs_day = c(0,1,0,1,0,1,1,0,1,1,1,3,1,4,2,2,3))
Thanks!