Hi, I currently have a dataset 'data' that contains more than 100 variables. I am doing some data cleaning right now, and I would like to drop all variables with more than 20% missing values.
Right now I have the following code,
library(purrr)
data2 <- data[!map_lgl(data, (is.na(.)))]
But I am getting this error message:
Error: Can't convert a logical vector to function
Call `rlang::last_error()` to see a backtrace
Is there a better/correct way to do this? Thanks!
Welcome to the community!
Does this work for you?
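# TRUE when at most 20% of the column's values are missing.
is_column_with_at_least_eighty_percent_non_missing <- function(t)
{
  mean(x = is.na(x = t)) <= 0.20
}
# Filter() keeps the columns for which the predicate returns TRUE.
Filter(f = is_column_with_at_least_eighty_percent_non_missing,
       x = data)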
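For anyone who wants to stay with purrr: the error in the question most likely comes from passing `(is.na(.))` where `map_lgl()` expects a function or a `~` formula. Here is a minimal sketch of that fix, assuming a made-up toy data frame in place of the real `data` (the column names and values are hypothetical):

```r
library(purrr)

# Hypothetical toy data frame standing in for the real `data`.
data <- data.frame(
  a = 1:10,                 # no missing values
  b = c(1:9, NA),           # 10% missing
  c = c(1:6, rep(NA, 4))    # 40% missing
)

# map_lgl() needs a function (here a ~ formula) that returns a single
# TRUE/FALSE per column: TRUE when at most 20% of the values are NA.
keep_column <- map_lgl(data, ~ mean(is.na(.x)) <= 0.20)

# Index the data frame with the logical vector to keep those columns.
data2 <- data[keep_column]
names(data2)
```

With this toy data, only column `c` exceeds the 20% threshold, so `data2` should contain `a` and `b`.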
This works! I also solved it by doing this:
data2 <- data[, which(colMeans(is.na(data)) <= 0.2), drop = FALSE]
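One way to sanity-check a `colMeans()`-based filter is to run it on a small data frame and compare it with a `Filter()` call that uses the 20% threshold from the question. This is only a sketch in base R; the toy data and the names `missing_share`, `by_colmeans` and `by_filter` are made up for illustration:

```r
# Hypothetical toy data frame; column names and values are made up.
data <- data.frame(
  a = 1:10,                 # no missing values
  b = c(1:9, NA),           # 10% missing
  c = c(1:6, rep(NA, 4))    # 40% missing
)

# Fraction of missing values in each column.
missing_share <- colMeans(is.na(data))

# Keep columns with at most 20% missing; drop = FALSE keeps the result
# a data frame even if only one column survives.
by_colmeans <- data[, missing_share <= 0.20, drop = FALSE]

# The same selection through Filter() with a predicate function.
by_filter <- Filter(function(col) mean(is.na(col)) <= 0.20, data)

# Both approaches should keep the same columns.
identical(names(by_colmeans), names(by_filter))
```

Printing `missing_share` before subsetting is a quick way to confirm that the comparison and the threshold point the right way.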
This topic was automatically closed on September 3, 2019, at 5:13pm, 7 days after the last reply. New replies are no longer allowed.