I could give a more helpful answer with a reproducible example, called a reprex. In this case, even a link to the source data would help.
But the big problem is that icgc_donor_id
is class character
, and without quotation marks DO38988 is taken to be the name of a separate object, such as one created by
DO38988 <- 42
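To make the difference concrete, here is a minimal sketch on made-up data. The toy ClinData and the ID "DO99999" are assumptions for illustration only, since the real data wasn't posted.

```r
# Hypothetical toy data; the real ClinData was not posted
ClinData <- data.frame(
  icgc_donor_id = c("DO38988", "DO14966", "DO99999"),
  stringsAsFactors = FALSE
)

# Quoted: compares the column against a literal string
quoted <- ClinData$icgc_donor_id == "DO38988"

# Unquoted: R looks for an object named DO38988 and fails if none exists
unquoted <- tryCatch(
  ClinData$icgc_donor_id == DO38988,
  error = function(e) conditionMessage(e)
)

print(quoted)    # logical vector, one element per row
print(unquoted)  # the error message captured by tryCatch
```

The quoted comparison returns one TRUE/FALSE per row, while the unquoted one never gets as far as comparing anything.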
The next problem is with filter.
You can drop ClinData$
. It's not causing problems but it's unnecessary; the pipe
ClinData %>%
implies it.
But what does cause problems is that filter
is a logical test, and to test for multiple conditions you need something in the form of
df %>% filter(var1 == "a" | var1 == "b" | ... | var1 == "z")
which is definitely a lot of error-prone typing.
Instead, create a character vector of the IDs to drop
omits <- c("DO38988", "DO38968", "DO38962", "DO14966", "DO38937", "DO14870", "DO48165", "DO14534", "DO14510", "DO15735", "DO14408", "DO14440", "DO49054", "DO14726", "DO48143", "DO48070", "DO15927", "DO14152", "DO14161", "DO40263", "DO14363", "DO40646", "DO14325", "DO14333", "DO40592", "DO40586", "DO14288", "DO14290", "DO40520", "DO14165", "DO40087", "DO16045", "DO14153", "DO40322")
and then do a "negative subset"
keeps <- ClinData[!(ClinData$icgc_donor_id %in% omits), ]
I would like to have been able to test this, but wasn't going to try to reverse-engineer the source data.
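For what it's worth, here is a sketch of the negative subset on a made-up stand-in for ClinData. The age column, the IDs "DO00001" and "DO00002", and the shortened omits vector are all assumptions for illustration. Note the comma before the closing bracket, which a data frame needs for row subsetting.

```r
# Made-up stand-in for ClinData; only the ID column matters here
ClinData <- data.frame(
  icgc_donor_id = c("DO38988", "DO14966", "DO00001", "DO00002"),
  age = c(61, 58, 70, 49),
  stringsAsFactors = FALSE
)

# Shortened omit vector for illustration
omits <- c("DO38988", "DO14966")

# Base R: the trailing comma selects rows and keeps every column
keeps <- ClinData[!(ClinData$icgc_donor_id %in% omits), ]
print(keeps)

# dplyr equivalent, run only if dplyr is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  keeps_dplyr <- dplyr::filter(ClinData, !(icgc_donor_id %in% omits))
  print(keeps_dplyr)
}
```

Both versions should leave only the rows whose ID is not in omits. The dplyr form is the one that slots into the existing ClinData %>% pipeline.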