Efficient descriptive stats with across and summarise

David_Stanley · June 10, 2020, 9:56pm

Hi Everyone,
I'm in psychology, and we often want to create tables of summary statistics (mean, SD, etc.) for all the variables in a data set. I know there are many other packages that can do so - but I'm trying to find a way to generate a descriptive summary table using the tidyverse, specifically across() and summarise().

I have a working solution - but it seems a bit long and convoluted. Can anyone suggest more efficient code? See the example below, which you can paste right into the Console.
Hopefully there is a shorter, more efficient way to obtain this output. It seems like I should be able to get from row_sum to wide_summary using a single pivot (or similar command) rather than the two pivot commands below. In general, I think I may be taking the long way around to get this table. If I'm not taking the long way around - that would be great to know too! My goal is to avoid giving my graduate/undergraduate classes poor code.

Thanks for any insights!
Cheers,
David

library(tidyverse)

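# Named list of summary functions; the names become column-name suffixes
mean_sd <- list(
  mean = ~mean(.x, na.rm = TRUE),
  sd = ~sd(.x, na.rm = TRUE)
)

# One row, with columns such as rating_mean and rating_sd
row_sum <- attitude %>% summarise(across(everything(), mean_sd))

# Split each column name at "_" into the variable and the statistic
long_summary <- row_sum %>%
  pivot_longer(cols = everything(),
               names_to = c("var", "stat"),
               names_sep = "_",
               values_to = "value")

# Spread the statistics back out into one column each
wide_summary <- long_summary %>%
  pivot_wider(names_from = stat, values_from = value)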

print(wide_summary)

# wide_summary
# A tibble: 7 x 3
#   var         mean    sd
#   <chr>      <dbl> <dbl>
# 1 rating      64.6 12.2
# 2 complaints  66.6 13.3
# 3 privileges  53.1 12.2
# 4 learning    56.4 11.7
# 5 raises      64.6 10.4
# 6 critical    74.8  9.89
# 7 advance     42.9 10.3
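
For reference, the single-pivot route asked about above can probably be sketched with the ".value" sentinel that pivot_longer() accepts in names_to. This is a minimal sketch rather than the one right answer; it assumes dplyr 1.0.0 or later (for across()) and that no variable name contains an underscore, which holds for attitude. The name single_pivot_summary is made up for illustration.

```r
library(dplyr)
library(tidyr)

mean_sd <- list(
  mean = ~mean(.x, na.rm = TRUE),
  sd   = ~sd(.x, na.rm = TRUE)
)

# ".value" tells pivot_longer() that the second part of each column name
# ("mean" or "sd") names an output column instead of being stored as data,
# so no pivot_wider() step is needed afterwards.
# Assumption: variable names contain no "_" (true for attitude).
single_pivot_summary <- attitude %>%
  summarise(across(everything(), mean_sd)) %>%
  pivot_longer(cols = everything(),
               names_to = c("var", ".value"),
               names_sep = "_")

print(single_pivot_summary)
```

If a data set does have underscores in its column names, splitting on "_" would break, so a different separator or names_pattern would be needed.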

nirgrahamuk · June 11, 2020, 12:06pm

It doesn't seem super convoluted to me. Besides, if you wrapped it in a function, you would hide all that code from yourself and could just call it as needed.

Here is an alternative though, and it is slightly shorter.

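# Build one summary row per column, then bind the rows into a single tibble
mysummariser <- function(x){
  purrr::map_dfr(names(x),
                 ~tibble::tibble(name = .,
                                 mean = mean(x[[.]], na.rm = TRUE),
                                 sd = sd(x[[.]], na.rm = TRUE)))
}
mysummariser(attitude)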
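
The "wrap it in a function" idea could look roughly like the sketch below, built on across() and summarise() instead of purrr. describe_numeric() is a hypothetical helper name, not a function from any package. The sketch assumes dplyr 1.0.0 or later, restricts itself to numeric columns, and splits the generated names at the last underscore so that variable names containing underscores should still work.

```r
library(dplyr)
library(tidyr)

# describe_numeric() is a hypothetical helper, named here for illustration only
describe_numeric <- function(data) {
  data %>%
    # Summarise numeric columns only; factor or character columns are dropped
    summarise(across(where(is.numeric),
                     list(mean = ~mean(.x, na.rm = TRUE),
                          sd   = ~sd(.x, na.rm = TRUE)))) %>%
    # names_pattern splits each name at its last "_"; ".value" turns the
    # statistic part ("mean" or "sd") into output columns
    pivot_longer(cols = everything(),
                 names_to = c("var", ".value"),
                 names_pattern = "^(.*)_([^_]+)$")
}

describe_numeric(attitude)
describe_numeric(iris)  # the Species factor column is skipped
```

Hiding the pivot inside a helper like this means a class only has to see the one-line call.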

David_Stanley · June 11, 2020, 3:24pm

Thanks - great to have another set of eyes give it a once over. Also, thanks for the purrr approach.
