Hello RStudio Community,
I have the following treatment data (Id, Startdate, Enddate, and Trt). I would like to create a new Combinedtrt column from Trt: within each Id, if the Startdate of the current row is earlier than the Enddate of the row above, combine the Trt of the row above with the Trt of the current row (e.g. "B+C" for Id = 1, row 3); otherwise leave Combinedtrt blank. Please see the expected output below. I would highly appreciate any help.
Thanks
# treatment data
data <- data.frame(
  Id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L),
  Startdate = c("2020-01-20", "2020-02-20", "2020-03-20", "2020-04-20", "2020-05-10", "2020-01-20", "2020-02-15", "2020-03-20", "2020-04-15", "2020-05-10", "2020-07-15", "2020-08-15", "2020-10-25", "2020-11-10", "2020-04-10", "2020-04-10", "2020-08-15", "2020-10-25", "2020-10-27"),
  Enddate = c("2020-02-20", "2020-03-25", "2020-04-22", "2020-05-15", "2020-06-12", "2020-02-20", "2020-03-20", "2020-04-22", "2020-05-15", "2020-06-12", "2020-08-20", "2020-09-22", "2020-11-15", "2021-01-12", "2020-05-12", "2020-08-20", "2020-09-22", "2020-11-15", "2021-01-12"),
  Trt = factor(c("A", "B", "C", "A", "D", "A", "B", "C", "D", "D", "B", "C", "C", "D", "D", "B", "C", "C", "D")),
  stringsAsFactors = FALSE
)
# expected output data
Id  Startdate   Enddate     Trt  Combinedtrt
1   2020-01-20  2020-02-20  A
1   2020-02-20  2020-03-25  B
1   2020-03-20  2020-04-22  C    B+C
1   2020-04-20  2020-05-15  A    C+A
1   2020-05-10  2020-06-12  D    A+D
2   2020-01-20  2020-02-20  A
2   2020-02-15  2020-03-20  B    A+B
2   2020-03-20  2020-04-22  C
2   2020-04-15  2020-05-15  D    C+D
3   2020-05-10  2020-06-12  D
3   2020-07-15  2020-08-20  B
3   2020-08-15  2020-09-22  C    B+C
3   2020-10-25  2020-11-15  C
3   2020-11-10  2021-01-12  D    C+D
4   2020-04-10  2020-05-12  D
4   2020-04-10  2020-08-20  B    D+B
4   2020-08-15  2020-09-22  C    B+C
4   2020-10-25  2020-11-15  C
4   2020-10-27  2021-01-12  D    C+D
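
For reference, here is a minimal base R sketch of the rule described above. It is one possible approach, not a definitive solution. It assumes the rows are already sorted by Id and then by Startdate, that the comparison is strict (a Startdate equal to the previous Enddate does not count as an overlap, which is what the expected output shows), and that non-overlapping rows should get an empty string. The helper names (start, end, trt, prev_end, prev_trt, prev_id, overlap) are made up for illustration.

```r
# Same treatment data as in the question
data <- data.frame(
  Id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L),
  Startdate = c("2020-01-20", "2020-02-20", "2020-03-20", "2020-04-20", "2020-05-10",
                "2020-01-20", "2020-02-15", "2020-03-20", "2020-04-15",
                "2020-05-10", "2020-07-15", "2020-08-15", "2020-10-25", "2020-11-10",
                "2020-04-10", "2020-04-10", "2020-08-15", "2020-10-25", "2020-10-27"),
  Enddate = c("2020-02-20", "2020-03-25", "2020-04-22", "2020-05-15", "2020-06-12",
              "2020-02-20", "2020-03-20", "2020-04-22", "2020-05-15",
              "2020-06-12", "2020-08-20", "2020-09-22", "2020-11-15", "2021-01-12",
              "2020-05-12", "2020-08-20", "2020-09-22", "2020-11-15", "2021-01-12"),
  Trt = factor(c("A", "B", "C", "A", "D", "A", "B", "C", "D",
                 "D", "B", "C", "C", "D", "D", "B", "C", "C", "D")),
  stringsAsFactors = FALSE
)

# Convert the date strings to Date so the comparison is chronological,
# and the factor to character so the labels can be pasted together
start <- as.Date(data$Startdate)
end   <- as.Date(data$Enddate)
trt   <- as.character(data$Trt)
n     <- nrow(data)

# Values from the row above (NA for the very first row)
prev_end <- c(as.Date(NA), end[-n])
prev_trt <- c(NA_character_, trt[-n])
prev_id  <- c(NA_integer_, data$Id[-n])

# A row overlaps when the row above belongs to the same Id
# and the current Startdate is strictly before the previous Enddate
overlap <- !is.na(prev_id) & prev_id == data$Id & start < prev_end

# Combine "previous+current" for overlapping rows, blank otherwise
data$Combinedtrt <- ifelse(overlap, paste(prev_trt, trt, sep = "+"), "")

print(data)
```

If the data are not sorted, ordering them first with data[order(data$Id, as.Date(data$Startdate)), ] should make the row-above comparison meaningful. The same idea could also be written with dplyr using group_by(Id) and lag(), if that package is already in use.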