Hi everyone, I have a large dataset with two types of patient identifiers: a patient ID and a unique identifier for each admission (for example, patient C, who has been admitted 3 times, will have 3 unique admission identifiers). I also have columns for the outcome (Survived and Died) and for race. How do I manipulate my data so that multiple admissions are not counted separately and each patient's eventual outcome (Survived vs. Death) is counted once? I'm also interested in Race, to see whether it is associated with the outcome. I have included some sample data (DF) and how I eventually want my table to look (DF_Cleaned). Please help! Thank you!!
# Sample data: one row per admission
DF <- data.frame(
  Patient.ID = c("A", "B", "C", "C", "C", "D", "D"),
  Admit.ID = c("1Zz", "1Yy", "5Pp", "3Cc", "9Dd", "4Yy", "4Dd"),
  Race = c("White", "Black", "Asian", "Asian", NA, "Black", "Black"),
  Survived = c(1, 0, 1, 0, 1, 1, 1),
  Died = c(0, 1, 0, 1, 0, 0, 0))
# Desired result: one row per patient (Outcome: 1 = died, 0 = survived)
DF_Cleaned <- data.frame(
  Patient.ID = c("A", "B", "C", "D"),
  Race = c("White", "Black", "Asian", "Black"),
  Outcome = c(0, 1, 1, 0))
Created on 2020-11-20 by the reprex package (v0.3.0)
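One possible way to get from DF to DF_Cleaned is sketched below in base R. It rests on two assumptions that the post does not state outright: that Outcome should be 1 when any of a patient's admissions has Died = 1 (this matches patient C in the desired table), and that a missing Race can be filled from the same patient's other admissions. If the rows are in chronological order and only the last admission should count, the `max` step would need to be replaced.

```r
# Sample data from the post, one row per admission
DF <- data.frame(
  Patient.ID = c("A", "B", "C", "C", "C", "D", "D"),
  Admit.ID = c("1Zz", "1Yy", "5Pp", "3Cc", "9Dd", "4Yy", "4Dd"),
  Race = c("White", "Black", "Asian", "Asian", NA, "Black", "Black"),
  Survived = c(1, 0, 1, 0, 1, 1, 1),
  Died = c(0, 1, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE)

# Assumption: a patient counts as died (1) if any admission has Died = 1
outcome <- tapply(DF$Died, DF$Patient.ID, max)

# Assumption: use the first non-missing Race recorded for each patient
race <- tapply(as.character(DF$Race), DF$Patient.ID,
               function(x) x[!is.na(x)][1])

# One row per patient, aligned on the patient IDs
DF_Cleaned <- data.frame(
  Patient.ID = names(outcome),
  Race = as.vector(race[names(outcome)]),
  Outcome = as.vector(outcome),
  stringsAsFactors = FALSE)

# Counts of outcome by race, one patient counted once
race_by_outcome <- table(Race = DF_Cleaned$Race, Outcome = DF_Cleaned$Outcome)
print(DF_Cleaned)
print(race_by_outcome)
```

The cross-tabulation in `race_by_outcome` only gives counts. Whether Race and Outcome are associated would need a formal test on the full dataset, since four patients are far too few to say anything.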