I searched for some time but could not find an answer that solves my problem. (Most topics deal with deleting columns in which all values are "NA", or focus only on rows.)
I have a large dataset of 384 rows and 214 variables, and I would like to create a copy of it without any column that contains the value "NA" in any of its rows. (No rows should be deleted.)
I have tried several approaches, but I only get a vector of "TRUE" and "FALSE".
Any help would be appreciated, as I'm new to R. Maybe there is a useful package that I'm not aware of yet.
Here is an example of removing columns with the string "NA". If you need to remove columns with NA values, use is.na() instead of str_detect().
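DF <- data.frame(A = c("A","A","A"),
                 B = c("B","NA","B"),
                 C = c("NA","C","NA"),
                 D = c("D","D","D"))
library(purrr)
library(stringr)
# TRUE for each column in which no value contains the string "NA"
COL_keep <- map_lgl(DF, ~ !any(str_detect(.x, "NA")))
# drop = FALSE keeps a data frame even if only one column remains
DFnew <- DF[, COL_keep, drop = FALSE]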
DFnew
#>   A D
#> 1 A D
#> 2 A D
#> 3 A D
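Note that str_detect(.x, "NA") matches the pattern anywhere in a string, so a value such as "BANANA" would also flag its column. Here is a minimal sketch of an exact comparison instead, assuming you only want to drop columns where a cell is exactly the string "NA". The toy data and the names DF2, COL_keep_exact and DF2new are made up for illustration, and it uses base R only.

```r
# Hypothetical toy data: column B holds "BANANA", which merely contains "NA"
DF2 <- data.frame(A = c("A", "A", "A"),
                  B = c("B", "BANANA", "B"),
                  C = c("NA", "C", "NA"))

# TRUE for each column in which no cell is exactly the string "NA"
COL_keep_exact <- vapply(DF2, function(x) !any(x == "NA"), logical(1))

# drop = FALSE keeps a data frame even if only one column remains
DF2new <- DF2[, COL_keep_exact, drop = FALSE]
names(DF2new)
```

With this comparison column B is kept and only column C is removed. If the data could also contain real NA values, the comparison itself can return NA, so this sketch assumes there are none.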
The value NA, which is what is in your data, is different from the string "NA" that you mentioned in your first post. As I said in my first post, try:
COL_keep <- map_lgl(DF, ~ !any(is.na(.x)))
Here is an example of that using the toy data modified to have NA values rather than "NA".
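DF <- data.frame(A = c("A","A","A"),
                 B = c("B",NA,"B"),
                 C = c(NA,"C",NA),
                 D = c("D","D","D"))
library(purrr)
# TRUE for each column that contains no NA values
COL_keep <- map_lgl(DF, ~ !any(is.na(.x)))
# drop = FALSE keeps a data frame even if only one column remains
DFnew <- DF[, COL_keep, drop = FALSE]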
DFnew
#>   A D
#> 1 A D
#> 2 A D
#> 3 A D
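Since you asked about packages: the same result should also be possible without any package. The following is a sketch in base R using the same toy data. The names na_count and DFbase are only illustrative. The logical vector of TRUE and FALSE is just an intermediate step; it has to be used to index the columns of the data frame.

```r
DF <- data.frame(A = c("A", "A", "A"),
                 B = c("B", NA, "B"),
                 C = c(NA, "C", NA),
                 D = c("D", "D", "D"))

# is.na(DF) gives a logical matrix; colSums() counts the NA values per column
na_count <- colSums(is.na(DF))

# Keep only the columns with zero NA values; all rows are kept
DFbase <- DF[, na_count == 0, drop = FALSE]
names(DFbase)
```

Here na_count == 0 is the TRUE/FALSE vector, and putting it in the column position of DF[ , ] selects the columns to keep.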