I'm trying to come up with the most efficient data.table version of an operation I typically perform using dplyr. I have a solution, but I'm wondering if anyone has a cleaner/more efficient answer to this simple task. All I'm doing is returning a sorted vector of the unique elements in a column.
dplyr version
library(tibble)
library(dplyr)
x.tbl <- tibble(
  a = c("b", "a", "b", "a"),
  b = 1:4
)
x.tbl %>%
  distinct(a) %>%
  pull() %>%
  sort()
#> [1] "a" "b"
data.table version
library(data.table)
x.dt <- as.data.table(x.tbl)
sort(x.dt[, .N, by = a][, a])
#> [1] "a" "b"
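For comparison, here is a minimal sketch of one variant that avoids the chaining by calling base R's unique() and sort() directly inside j. It assumes that evaluating both functions on the column vector is acceptable for this task. I have not benchmarked it against the grouped version above, so I can't say whether it is actually faster. The table is rebuilt with data.table() so the snippet runs on its own.

```r
library(data.table)

# Same example data as above, built directly as a data.table
x.dt <- data.table(
  a = c("b", "a", "b", "a"),
  b = 1:4
)

# j is evaluated with the columns in scope, so unique() and sort()
# see `a` as a plain character vector; no grouping or chaining needed
res <- x.dt[, sort(unique(a))]
res
#> [1] "a" "b"
```

This still calls sort(), just inside the brackets rather than around them.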
Can anyone suggest a more efficient or cleaner way of doing this with data.table? Is there a way to eliminate the data.table chaining and the wrapping in sort()?
Thanks!