Fwiw, the dev version of data.table recently added some functions by Jan Gorecki for this. I am just following Uwe's examples from SO:
library(data.table)
DT = data.table(salesData)
rollup(DT,
lapply(.SD, sum),
by = c("District", "Name"),
sets = list("District", "Name")
)[order(District, Name)]
which gives
District Name NetSales Units AUR
1: 1 S1 1000 50 20.00000
2: 1 S2 1500 120 12.50000
3: 1 S3 3000 52 57.69231
4: 1 NA 5500 222 90.19231
5: 2 S4 2000 12 166.66667
6: 2 S5 1500 64 23.43750
7: 2 S6 3000 52 57.69231
8: 2 NA 6500 128 247.79647
9: NA NA 12000 350 337.98878
I agree with the others that this is an untidy format not really useful for further analysis, but it may be handy for browsing data.
I don't really know the syntax for these functions well enough, so there is probably a simpler way to write the code above. For those who don't know data.table:
-
lapply(.SD, sum)
is summing all columns not involved in theby=
list. -
[order(x,y,z)]
is the same asarrange(x,y,z)
.
You may find other answers in the SO link helpful as well, since it's a similar question.