I am having a problem using the by argument in a data.table. Either I am missing something blindingly obvious or I have a serious problem with my R installation and no idea how to check it. The data supplied here is the actual data in use, but I have also mocked up a couple of 3x3 data.tables and get the same error.
I am trying to sum a column of integers (dollars) by another variable (status).
It works until I try to give the summed column a name.
Setup details below.
Problem
library(data.table)
# Load data & convert to data.table ---------------------------------------
DT <- structure(list(iso = c("BRA", "CHN", "IRN", "ETH", "IND", "IDN",
"ARE", "RUS", "ZAF", "EGY", "SAU", "BLR", "BOL", "CUB", "KAZ",
"MYS", "THA", "UGA", "UZB"), cty = c("Brazil", "China", "Egypt",
"Ethiopia", "India", "Indonesia", "Iran", "Russia", "South Africa",
"UAE", "Saudi Arabia", "Belarus", "Bolivia", "Cuba", "Kazakhstan",
"Malaysia", "Thailand", "Uganda", "Uzbekistan"), dollars = c(4735725L,
38190085L, 1819807L, 434151L, 16192423L, 4662888L, 870439L, 6921249L,
989308L, 2225198L, 2519571L, 301471L, 159854L, NA, 842049L, 1378901L,
1771065L, 163713L, 431926L), status = c("Member", "Member", "Member",
"Member", "Member", "Member", "Member", "Member", "Member", "Member",
"Member", "Partner", "Partner", "Partner", "Partner", "Partner",
"Partner", "Partner", "Partner")), class = "data.frame", row.names = c(NA,
-19L))
setDT(DT) ; DT
# Works -------------------------------------------------------------------
DT[, sum(dollars, na.rm = TRUE)]
# Works
DT[, sum(dollars, na.rm = TRUE), by = status]
# Crashes -----------------------------------------------------------------
DT[, sigma = sum(dollars, na.rm = TRUE), by = status]
# Desperate attempt: use a factor. Crashes ---------------------------------
DT[, sigma = sum(dollars, na.rm = TRUE), by = as.factor(status)]
Error in `[.data.table`(DT, , sigma = sum(dollars, na.rm = TRUE), by = status) :
  unused argument (sigma = sum(dollars, na.rm = TRUE))
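For reference, a minimal sketch of the likely cause, not a confirmed diagnosis. The error message suggests that `sigma = ...` is being passed to `[.data.table` as a separate named argument instead of as part of `j`. Assuming that is the case, wrapping the named expression in `.()` (data.table's alias for `list()`) should keep it inside `j`. The `toy` table and its values below are made up for illustration and are not the data above.

```r
library(data.table)

# Hypothetical toy table (illustrative values only, not the real data)
toy <- data.table(
  status  = c("Member", "Member", "Partner", "Partner"),
  dollars = c(10L, 20L, 5L, NA)
)

# Name the summed column inside .() so that the whole expression is j;
# a bare `sigma = ...` is instead treated as an unknown argument to `[`
res <- toy[, .(sigma = sum(dollars, na.rm = TRUE)), by = status]
res
```

If the goal is to add the group total as a new column on every row, rather than return one row per group, `toy[, sigma := sum(dollars, na.rm = TRUE), by = status]` is the by-reference form.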
Setup
Ubuntu 24.04
RStudio 2025.09.2+418 "Cucumberleaf Sunflower"
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_CA.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Toronto
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.18.0
loaded via a namespace (and not attached):
[1] compiler_4.5.2 tools_4.5.2 rstudioapi_0.18.0