fary
October 2, 2021, 7:07pm
1
Hello everybody,
I am new to R, and I know this may seem like a basic question.
I use this code to get the number of NA values for a specific variable:
sum(is.na(df$SureAboutHeight))
But I want to know the percentage of NA values for each variable.
What should I do?
startz
October 2, 2021, 7:23pm
2
100 * sum(is.na(df$SureAboutHeight)) / nrow(df)
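A small sketch of the single-column calculation, assuming a hypothetical data frame df whose values are made up purely for illustration. nrow() is meant for data frames and matrices and returns NULL for a single column, so the denominator here is the column's length:

```r
# Hypothetical data frame standing in for the poster's df (values are invented)
df <- data.frame(
  SureAboutHeight = c("yes", NA, "no", NA),
  Height = c(170, 165, NA, 180)
)

# On a plain vector nrow() returns NULL, so it cannot serve as a denominator
nrow(df$SureAboutHeight)

# Percentage of NA in one column: count of NA values over the number of values
pct_na <- 100 * sum(is.na(df$SureAboutHeight)) / length(df$SureAboutHeight)
pct_na
```

Dividing by nrow(df) gives the same result, since every column of a data frame has nrow(df) values.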
suppressPackageStartupMessages(library("dplyr"))
# Duplicate mtcars dataset as example
mtcars_na <- mtcars
# Add in some missing data
mtcars_na[8, c(1, 3, 4)] <- NA
mtcars_na[9, c(1, 2, 3)] <- NA
# In base R
vapply(mtcars_na, function(x) {
  100 * sum(is.na(x), na.rm = TRUE) / length(x)
}, FUN.VALUE = double(1L))
#>   mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#> 6.250 3.125 6.250 3.125 0.000 0.000 0.000 0.000 0.000 0.000 0.000
# Using tidyverse approach
# Specify the function you want applied to each column
fns <- list(pct_missing = ~ 100 * sum(is.na(.), na.rm = TRUE) / length(.))
# For all columns
(mtcars_summary <- mtcars_na %>%
  summarise(across(everything(), .fns = fns)))
#> mpg_pct_missing cyl_pct_missing disp_pct_missing hp_pct_missing
#> 1 6.25 3.125 6.25 3.125
#> drat_pct_missing wt_pct_missing qsec_pct_missing vs_pct_missing
#> 1 0 0 0 0
#> am_pct_missing gear_pct_missing carb_pct_missing
#> 1 0 0 0
Created on 2021-10-02 by the reprex package (v2.0.1)
joels
October 2, 2021, 10:31pm
4
Another option is to use the mean function. is.na(x) returns TRUE if a value is NA and FALSE otherwise. mean treats TRUE as equal to 1 and FALSE as equal to 0, so the mean is the fraction of values that are NA. For example:
x = c(1,2,3,NA,NA)
is.na(x)
#> [1] FALSE FALSE FALSE TRUE TRUE
# Fraction NA
mean(is.na(x))
#> [1] 0.4
# Percentage NA
mean(is.na(x)) * 100
#> [1] 40
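The same idea extends to every column at once. The following is a minimal sketch, reusing the mtcars_na example from the earlier reply and assuming base R's colMeans(), which averages each column of the logical matrix that is.na() returns for a data frame:

```r
# Same example data as the earlier reply: mtcars with a few values set to NA
mtcars_na <- mtcars
mtcars_na[8, c(1, 3, 4)] <- NA
mtcars_na[9, c(1, 2, 3)] <- NA

# is.na() on a data frame gives a logical matrix; colMeans() averages each column,
# so each result is the fraction of NA values in that column
pct_na <- colMeans(is.na(mtcars_na)) * 100
pct_na
```

The result is a named numeric vector with one percentage per column, which should match the vapply() and across() results shown earlier in the thread.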
fary
October 3, 2021, 7:31am
5
joels:
mean(is.na(x)) * 100
Many thanks for your help.