Hi,
I am working with an R data frame and want to extract the features (rows) whose values pass a threshold in every column, for instance value >= 50 across all columns. The input and expected output are shown below. One way is to use tidyverse filter() on each column manually, but with hundreds of columns, typing each column name would be tedious. Furthermore, each column name starts and ends with a unique string. Is there a way to revise the filter call below, or a more straightforward method, to do this?
Thank you,
Toufiq
Input:
dput(Data)
structure(list(Col1_Counts = c(100L, 10L, 2000L, 0L, 2000L, 0L,
11L, 15L, 19L, 0L, 100L, 50L, 10L, 100L, 50L), CSC1_Counts = c(150L,
50L, 150L, 3L, 50L, 0L, 12L, 16L, 20L, 23L, 1000L, 50L, 10L,
50L, 50L), BC_Counts = c(50L, 75L, 100L, 10L, 75L, 0L, 13L, 17L,
21L, 0L, 100000L, 40L, 10L, 100000L, 50L)), class = "data.frame", row.names = c("Feature_1",
"Feature_2", "Feature_3", "Feature_4", "Feature_5", "Feature_6",
"Feature_7", "Feature_8", "Feature_9", "Feature_10", "Feature_11",
"Feature_12", "Feature_13", "Feature_14", "Feature_15"))
#>            Col1_Counts CSC1_Counts BC_Counts
#> Feature_1          100         150        50
#> Feature_2           10          50        75
#> Feature_3         2000         150       100
#> Feature_4            0           3        10
#> Feature_5         2000          50        75
#> Feature_6            0           0         0
#> Feature_7           11          12        13
#> Feature_8           15          16        17
#> Feature_9           19          20        21
#> Feature_10           0          23         0
#> Feature_11         100        1000    100000
#> Feature_12          50          50        40
#> Feature_13          10          10        10
#> Feature_14         100          50    100000
#> Feature_15          50          50        50
Expected Output:
library(tidyverse)
Data %>%
  filter(Col1_Counts >= 50 & CSC1_Counts >= 50 & BC_Counts >= 50)
#>            Col1_Counts CSC1_Counts BC_Counts
#> Feature_1          100         150        50
#> Feature_3         2000         150       100
#> Feature_5         2000          50        75
#> Feature_11         100        1000    100000
#> Feature_14         100          50    100000
#> Feature_15          50          50        50
Created on 2023-02-12 with reprex v2.0.2
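For reference, here is a minimal sketch of how the per-column filter could be generalized so that no column names need to be typed. It assumes every column in Data is numeric and contains no NA values; the names threshold, keep, base_result and dplyr_result are only illustrative. The first variant uses base R, the second uses if_all() from dplyr (available from version 1.0.4) and only runs if dplyr is installed.

```r
# Rebuild the example data from the dput() output above.
Data <- data.frame(
  Col1_Counts = c(100L, 10L, 2000L, 0L, 2000L, 0L, 11L, 15L, 19L, 0L,
                  100L, 50L, 10L, 100L, 50L),
  CSC1_Counts = c(150L, 50L, 150L, 3L, 50L, 0L, 12L, 16L, 20L, 23L,
                  1000L, 50L, 10L, 50L, 50L),
  BC_Counts   = c(50L, 75L, 100L, 10L, 75L, 0L, 13L, 17L, 21L, 0L,
                  100000L, 40L, 10L, 100000L, 50L),
  row.names   = paste0("Feature_", 1:15)
)

threshold <- 50  # assumed cutoff, taken from the example above

# Base R: Data >= threshold gives a logical matrix; a row is kept only
# when the number of TRUE cells equals the number of columns.
keep <- rowSums(Data >= threshold) == ncol(Data)
base_result <- Data[keep, , drop = FALSE]

# dplyr: if_all() applies the same test to every selected column and
# keeps rows where the test holds for all of them.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr_result <- dplyr::filter(
    Data,
    dplyr::if_all(dplyr::everything(), ~ .x >= threshold)
  )
}
```

If the real data also has non-numeric columns, everything() could probably be swapped for where(is.numeric), and the base R version would need to be restricted to the numeric columns first.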