Looking for examples displaying data with multi-level totals

Fwiw, the dev version of data.table recently added some functions by Jan Gorecki for this. I am just following Uwe's examples from SO:

library(data.table)
DT = data.table(salesData)

rollup(DT, 
  lapply(.SD, sum), 
  by = c("District", "Name"), 
  sets = list("District", "Name")
)[order(District, Name)]

which gives

   District Name NetSales Units       AUR
1:        1   S1     1000    50  20.00000
2:        1   S2     1500   120  12.50000
3:        1   S3     3000    52  57.69231
4:        1   NA     5500   222  90.19231
5:        2   S4     2000    12 166.66667
6:        2   S5     1500    64  23.43750
7:        2   S6     3000    52  57.69231
8:        2   NA     6500   128 247.79647
9:       NA   NA    12000   350 337.98878

I agree with the others that this is an untidy format not really useful for further analysis, but it may be handy for browsing data.

I don't really know the syntax for these functions well enough, so there is probably a simpler way to write the code above. For those who don't know data.table:

  • lapply(.SD, sum) is summing all columns not involved in the by= list.
  • [order(x,y,z)] is the same as arrange(x,y,z).

You may find other answers in the SO link helpful as well, since it's a similar question.

1 Like