Creating table of the mean and sd of one variable over other variables with different groups

es_dutch · April 30, 2020, 11:58am

For my master thesis I want to create a table in Rstudio, however I cannot find the right way to do it. I hope you can help me or push me in the right direction.

I want to find out whether organisations differ in their budget residual (ratio variable, percentage above or below budget) if they have a certain type of instrument in place. So I want to create a table with the mean (sd) of that budget residual for different organization sizes (small, medium, large) if they have an instrument (1, 2, 3) in place. The instrument is a dummy variable with variable 1 of having that instrument and 0 for not having that instrument. Also, I would like to have the p-value of whether the differences between the two means of having or not having that instrument for a small organisation in the table (for example with a Kruskal wallis test). I also work with latex, so I want to export it in latex format.

I already tried compareGroups package but that gives the mean of the instruments itself and not of the budget.

nirgrahamuk · April 30, 2020, 12:05pm

?

es_dutch · April 30, 2020, 12:25pm

Thank you! That does work indeed. But I have to compute every single variable by hand and I have more variables that I want to do so if there is a way to do it automatically, that would be perfect.

nirgrahamuk · April 30, 2020, 1:11pm

if you provided a reprex, and show how you do two by hand I can show you how to do all off them automated

es_dutch · April 30, 2020, 1:57pm

This is how I make the table with the values:

total2017 %>% group_by(size, strictness) %>%
  summarise(mean=mean(budget, na.rm=TRUE), sd=sd(budget, na.rm=TRUE), n=n()) %>% 
  filter(!is.na(strictness))

Though I would like the table in a different format, so that I have the different size groups in the columns next to each other.

And how I perform a Kruskal-wallis test:

kruskal.test(budget~ strictness, data=small2017)
pairwise.wilcox.test(small2017$budget, small2017$strictness, p.adjust.method="BH")

kruskal.test(budget~ strictness, data=medium2017)
pairwise.wilcox.test(medium2017$budget, small2017$strictness, p.adjust.method="BH")

kruskal.test(budget~ strictness, data=large2017)
pairwise.wilcox.test(large2017$budget, small2017$strictness, p.adjust.method="BH")

nirgrahamuk · April 30, 2020, 2:08pm

almost there.
can you provide sufficient example data, i.e. total2017 so as to produce some output against your code ?

FAQ: How to do a minimal reproducible example ( reprex ) for beginners

es_dutch · April 30, 2020, 2:26pm

Is this okay?


example <- tibble::tribble(
     ~size,     ~budget, ~strictself,
   "Small", "11,718183",          1L,
  "Medium",  "5,264815",          2L,
   "Large",  "-0,43848",          NA,
   "Small",  "0,658158",          2L,
  "Medium",   "-1,2867",          3L,
   "Large",   "-0,8482",          2L,
   "Small",  "4,584138",          1L,
  "Medium",   "7,26868",          1L,
   "Large",   "-7,1868",          NA,
   "Small",  "8,186884",          1L,
  "Medium",   "16,1548",          NA,
   "Large",  "1,516844",          2L,
   "Small",   "-5,1687",          2L,
  "Medium",  "11,15687",          2L,
   "Large",   "-4,1867",          3L,
   "Small",   "7,25687",          3L,
  "Medium", "5,1682186",          1L,
   "Large", "-6,186515",          2L,
   "Small",    "7,2687",          3L
  )

head(example)
#> # A tibble: 6 x 3
#>   size   budget    strictself
#>   <chr>  <chr>          <int>
#> 1 Small  11,718183          1
#> 2 Medium 5,264815           2
#> 3 Large  -0,43848          NA
#> 4 Small  0,658158           2
#> 5 Medium -1,2867            3
#> 6 Large  -0,8482            2

^{Created on 2020-04-30 by the reprex package (v0.3.0)}

nirgrahamuk · April 30, 2020, 3:34pm

your budget wasnt numeric so i had to alter that. and strictness has become strictself.

library(tidyverse)
library(broom) # tidy models
library(purrr) # iteration
example <- tibble::tribble(
  ~size, ~budget, ~strictself,
  "Small", "11,718183", 1L,
  "Medium", "5,264815", 2L,
  "Large", "-0,43848", NA,
  "Small", "0,658158", 2L,
  "Medium", "-1,2867", 3L,
  "Large", "-0,8482", 2L,
  "Small", "4,584138", 1L,
  "Medium", "7,26868", 1L,
  "Large", "-7,1868", NA,
  "Small", "8,186884", 1L,
  "Medium", "16,1548", NA,
  "Large", "1,516844", 2L,
  "Small", "-5,1687", 2L,
  "Medium", "11,15687", 2L,
  "Large", "-4,1867", 3L,
  "Small", "7,25687", 3L,
  "Medium", "5,1682186", 1L,
  "Large", "-6,186515", 2L,
  "Small", "7,2687", 3L
) %>% mutate(budget = parse_number(budget))

(size_strict_summary <- example %>% 
    group_by(size, strictself) %>%
  summarise(mean = mean(budget, na.rm = TRUE),
            sd = sd(budget, na.rm = TRUE),
            n = n()) %>%
  filter(!is.na(strictself)))

(krusjal_tests <- map_dfr(
  unique(size_strict_summary$size),
  ~ mutate(kruskal.test(budget ~ strictself,
                        data = filter(example, size == .)) %>% tidy(),
    esub = .
  ) %>% select(esub, everything())
))

(pairwilcox_tests <- map_dfr(
  unique(size_strict_summary$size),
  ~ mutate(pairwise.wilcox.test(filter(example, size == .)[["budget"]],
                                filter(example, size == .)[["strictself"]],
    p.adjust.method = "BH"
  ) %>% tidy(),
  esub = .
  ) %>% select(esub, everything())
))

system · May 21, 2020, 3:43pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.