Replicating a table without columns containing a certain value

Rek-Tek · April 1, 2023, 9:08pm

Hello everyone,

After searching for some time, I could not find an answer that solves my problem. (The topics I found were either about deleting columns where all values are "NA", or only dealt with rows.)

I have a large dataset of 384 rows and 214 variables, and I would like to create a copy of it without any column that contains the value "NA" in any of its rows. (No rows should be deleted.)

I have tried several approaches, but I only get a vector of "TRUE" and "FALSE" values.

Any help would be appreciated, as I'm new to R. Maybe there is a useful package that I'm not aware of yet.

Thank you

FJCC · April 1, 2023, 10:25pm

Here is an example of removing columns with the string "NA". If you need to remove columns with NA values, use is.na() instead of str_detect().

DF <- data.frame(A = c("A","A","A"),
                 B = c("B","NA","B"),
                 C = c("NA","C","NA"),
                 D = c("D","D","D"))
library(purrr)
library(stringr)
COL_keep <- map_lgl(DF, ~ !any(str_detect(.x, "NA")))
DFnew <- DF[, COL_keep]
DFnew
#>   A D
#> 1 A D
#> 2 A D
#> 3 A D

Created on 2023-04-01 with reprex v2.0.2
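
As a small illustration of the distinction drawn above between the string "NA" and a missing value NA, here is a minimal base R sketch. The vectors x_string and x_missing are made up for this example and are not taken from the original data.

```r
# Hypothetical vectors, only for illustration
x_string  <- c("B", "NA", "B")  # contains the two-letter string "NA"
x_missing <- c("B", NA, "B")    # contains a real missing value

# is.na() only flags real missing values, not the string "NA"
is.na(x_string)
#> [1] FALSE FALSE FALSE
is.na(x_missing)
#> [1] FALSE  TRUE FALSE

# Comparing with the string "NA" finds the text, but yields NA (not TRUE)
# where the value itself is missing
x_string == "NA"
#> [1] FALSE  TRUE FALSE
x_missing == "NA"
#> [1] FALSE    NA FALSE
```

This is why a text-based test can miss real NA values: a comparison against a missing value is itself missing, so is.na() is the appropriate check for them.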

Rek-Tek · April 2, 2023, 11:12am

Thank you very much for the answer. Unfortunately, I still get columns with NA values in them. (See image)

I checked the class of the column names and they appear to be numeric. Could that explain the problem?

(PS: The rows do not appear in the image, but it is a time series.)

FJCC · April 2, 2023, 1:47pm

The value NA, which is what is in your data, is different from the string "NA" that you mentioned in your first post. As I said in my first post: "If you need to remove columns with NA values, use is.na() instead of str_detect()."

Try

COL_keep <- map_lgl(DF, ~ !any(is.na(.x)))

Here is an example of that using the toy data modified to have NA values rather than "NA".

DF <- data.frame(A = c("A","A","A"),
                 B = c("B",NA,"B"),
                 C = c(NA,"C",NA),
                 D = c("D","D","D"))
library(purrr)
library(stringr)
COL_keep <- map_lgl(DF, ~ !any(is.na(.x)))
DFnew <- DF[, COL_keep]
DFnew
#>   A D
#> 1 A D
#> 2 A D
#> 3 A D
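
For reference, the same column filter can be sketched in base R without loading any package. The numeric data frame below is a hypothetical stand-in for the real 384 x 214 table, and the column names t1 to t4 are invented for the example.

```r
# Hypothetical numeric data standing in for the real table
DF <- data.frame(t1 = c(1.2, 3.4, 5.6),
                 t2 = c(0.5, NA, 0.7),
                 t3 = c(NA, 2.2, NA),
                 t4 = c(9.1, 8.2, 7.3))

# Count the missing values per column; keep columns where the count is zero
COL_keep <- colSums(is.na(DF)) == 0

# drop = FALSE keeps the result a data frame even if only one column survives
DFnew <- DF[, COL_keep, drop = FALSE]

dim(DFnew)
#> [1] 3 2
names(DFnew)
#> [1] "t1" "t4"
```

All rows are kept and only the columns with at least one NA are dropped. is.na() works the same way on numeric and character columns, so the column type should not matter for this step.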

Rek-Tek · April 2, 2023, 5:01pm

It finally works. Thank you very much for the answer!

I understand my mistakes now.
