How to subset a data frame with multiple conditions on columns

Hi guys,
I have a problem selecting rows of a data frame (about 138 million rows). I want to select all the rows that satisfy a particular condition:

tcga_tumors_copy <- tcga_tumors_copy[(tcga_tumors_copy$source %in% edges_conjugateABC$A & tcga_tumors_copy$dest %in% edges_conjugateABC$B),]

edges_conjugateABC has around 5 million rows. After applying this condition, tcga_tumors_copy has 8 million rows... that isn't possible. How can I select the rows of a data frame based on multiple conditions?

Thx in advance,

Not tested, due to lack of a reprex. See the FAQ: How to do a minimal reproducible example (reprex) for beginners.

x <- tcga_tumors_copy
setA <- edges_conjugateABC$A
setB <- edges_conjugateABC$B

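# Row positions where each condition holds.
# x$source is a plain vector, so %in% is evaluated once per row.
i1 <- which(x$source %in% setA)
i2 <- which(x$dest %in% setB)

# Keep only the rows that satisfy both conditions.
result <- x[intersect(i1, i2), ]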
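
One possible reason for the unexpected row count, offered as an assumption since there is no reprex: testing source against A and dest against B separately keeps a row whenever each value appears somewhere in its column, even if that exact (source, dest) pair is not a row of edges_conjugateABC. If the goal is to keep only matching pairs, a minimal sketch on made-up toy data might look like this. The values and the tab separator are hypothetical; the separator must not occur in the real values.

```r
# Toy stand-ins for the real tables (hypothetical values).
tcga_tumors_copy <- data.frame(
  source = c("g1", "g1", "g2", "g3"),
  dest   = c("g2", "g3", "g3", "g1"),
  stringsAsFactors = FALSE
)
edges_conjugateABC <- data.frame(
  A = c("g1", "g2"),
  B = c("g2", "g3"),
  stringsAsFactors = FALSE
)

# Independent tests: source appears anywhere in A AND dest appears anywhere in B.
independent <- tcga_tumors_copy[
  tcga_tumors_copy$source %in% edges_conjugateABC$A &
    tcga_tumors_copy$dest %in% edges_conjugateABC$B, ]

# Pairwise test: build one key per row and match whole (source, dest) pairs.
key_x     <- paste(tcga_tumors_copy$source, tcga_tumors_copy$dest, sep = "\t")
key_edges <- paste(edges_conjugateABC$A, edges_conjugateABC$B, sep = "\t")
pairwise  <- tcga_tumors_copy[key_x %in% key_edges, ]

nrow(independent)  # rows passing the two separate tests
nrow(pairwise)     # rows whose exact pair exists in edges_conjugateABC
```

On this toy data the independent filter keeps the row (g1, g3) even though that pair is not in edges_conjugateABC, while the pairwise filter drops it. With millions of rows, a join such as merge() or a data.table keyed join may be faster than pasted keys, but that is untested here.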
