The problem I am having is with this next section:
rodents_seasons %>%
  filter(is.na(hindfoot_length) | is.na(weight) | is.na(sex)) %>%
  count(season_name, sort = TRUE)
It is not counting the missing values correctly. I want to be able to count how much missing data there is for each season.
I do not see anything wrong with your code, and it works with some data I invented. Could you provide some data that does show the problem? You can post an easily copied version of a data frame by posting the output of the dput() function. If your data frame is named DF, run
dput(DF)
and post that output between lines with three backticks, like this
```
Output of dput() goes here
```
If your data set is large, post just enough rows to show the problem.
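For a large data frame, one way to keep the post short is to pass only a few rows to dput(). A minimal sketch, assuming the data frame is named DF as above; the stand-in data and the choice of 10 rows are arbitrary, and DF_small is just an illustrative name:

```r
# Hypothetical stand-in for your own data frame
DF <- data.frame(id = 1:100, weight = c(NA, 2:100))

# Keep only the first 10 rows (pick whatever number still shows the problem)
DF_small <- head(DF, 10)

# Print a copy-and-paste representation of that subset
dput(DF_small)
```

If the problem rows are not near the top, subsetting to the rows that show the problem before calling dput() should work just as well.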
Example of your code working:
library(dplyr)
rodents_seasons <- data.frame(id=1:6,
                              hindfoot_length=c(1,1,NA,1,1,1),
                              weight=c(2,NA,2,2,2,2),
                              sex=c(1,2,1,1,NA,2),
                              season_name=c("W","Sp","W","A","W","Sp"))
rodents_seasons
#>   id hindfoot_length weight sex season_name
#> 1  1               1      2   1           W
#> 2  2               1     NA   2          Sp
#> 3  3              NA      2   1           W
#> 4  4               1      2   1           A
#> 5  5               1      2  NA           W
#> 6  6               1      2   2          Sp
rodents_seasons %>%
  filter(is.na(hindfoot_length) | is.na(weight) | is.na(sex)) %>%
  count(season_name, sort = TRUE)
#>   season_name n
#> 1           W 2
#> 2          Sp 1
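One thing that may explain the mismatch: this pipeline counts rows that have at least one NA, not individual NA values, so a row missing two measurements still adds only 1. A small sketch of the difference, using made-up data (the data frame toy and the variable names are hypothetical):

```r
library(dplyr)

# Made-up data: the single "W" row is missing two values
toy <- data.frame(hindfoot_length = c(NA, 1),
                  weight = c(NA, 2),
                  sex = c(1, 2),
                  season_name = c("W", "Sp"))

# Number of rows with at least one NA, per season
rows_with_na <- toy %>%
  filter(is.na(hindfoot_length) | is.na(weight) | is.na(sex)) %>%
  count(season_name, sort = TRUE)

# Number of individual NA values in the "W" rows (base R, for comparison)
w_rows <- toy[toy$season_name == "W", c("hindfoot_length", "weight", "sex")]
na_cells_w <- sum(is.na(w_rows))

rows_with_na
na_cells_w
```

Here the row count for W is 1 while the number of missing values is 2.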
This does partly work, but I need the output to be merged: for example, I want it to show the season_name and the total number of missing values for that season. It is currently splitting the data up like this:
So you want the number of NA values for each season in each column? Something like this?
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
rodents_seasons <- data.frame(id=1:6,
                              hindfoot_length=c(1,1,NA,1,1,NA),
                              weight=c(2,NA,2,2,NA,2),
                              sex=c(NA,2,NA,NA,NA,2),
                              season_name=c("W","Sp","W","A","W","Sp"))
rodents_seasons
#>   id hindfoot_length weight sex season_name
#> 1  1               1      2  NA           W
#> 2  2               1     NA   2          Sp
#> 3  3              NA      2  NA           W
#> 4  4               1      2  NA           A
#> 5  5               1     NA  NA           W
#> 6  6              NA      2   2          Sp
rodents_seasons %>%
  group_by(season_name) %>%
  summarize(across(.cols = hindfoot_length:sex, .fns = ~sum(is.na(.x))))
#> # A tibble: 3 × 4
#>   season_name hindfoot_length weight   sex
#>   <chr>                 <int>  <int> <int>
#> 1 A                         0      0     1
#> 2 Sp                        1      1     0
#> 3 W                         1      1     3
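If a single total per season is what you are after, rather than one count per column, one possible sketch is to concatenate the three columns inside summarize() and count the NA values in the combined vector. The column name n_missing and the object name missing_by_season are made up for illustration:

```r
library(dplyr)

# Same example data as above
rodents_seasons <- data.frame(id = 1:6,
                              hindfoot_length = c(1, 1, NA, 1, 1, NA),
                              weight = c(2, NA, 2, 2, NA, 2),
                              sex = c(NA, 2, NA, NA, NA, 2),
                              season_name = c("W", "Sp", "W", "A", "W", "Sp"))

# One row per season; n_missing adds up the NA values across the three columns
missing_by_season <- rodents_seasons %>%
  group_by(season_name) %>%
  summarize(n_missing = sum(is.na(c(hindfoot_length, weight, sex)))) %>%
  arrange(desc(n_missing))

missing_by_season
```

With this example data the totals are 5 for W, 2 for Sp and 1 for A, which match the row sums of the per-column table.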