calculating proportions using dplyr

technocrat · September 30, 2020, 4:19am

Begin with the usual analysis in terms of f(x) = y.

x is the data at hand, y is the subset desired, and f is the function, or composite function, to turn the one into the other.

The choice of f will depend on whether a count or a proportion is required.

The OP's code gets off to a false start by filtering df down to only college graduates in STEM majors. With every other row discarded there is no denominator left, which precludes computing or testing any proportion. If only a count is needed, all that's required is:

suppressPackageStartupMessages({library(dplyr)})

# prefer dat or my_df over df, which is the name of a built-in function (stats::df)
# this avoids errors where `df` is read as a closure instead of your data

# column totals for the rows that match both conditions
mtcars %>%
  filter(cyl == 4 & carb == 2) %>%
  summarise_all(sum)
#>     mpg cyl  disp  hp  drat     wt   qsec vs am gear carb
#> 1 155.4  24 699.6 522 24.85 14.388 113.62  5  4   26   12

Created on 2020-09-29 by the reprex package (v0.3.0.9001)
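
For the proportion case, the point is to keep every row so the denominator survives. Here is a minimal sketch, again with mtcars standing in for the OP's data. The names `result`, `n` and `prop` are illustrative and not from the original question:

```r
suppressPackageStartupMessages({library(dplyr)})

# no filter() here: all 32 rows stay in play, and the logical test
# marks the subset of interest
result <- mtcars %>%
  summarise(
    n    = sum(cyl == 4 & carb == 2),   # count of matching rows
    prop = mean(cyl == 4 & carb == 2)   # matching rows / all rows
  )

result
```

Because `mean()` of a logical vector is the share of `TRUE` values, no separate division is needed. With mtcars this should give n = 6 and prop = 0.1875 (6 of 32 rows).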