OK, I talked myself off the ledge of using `group_by(row_number())` by testing it. Turns out @hadley's trick of passing a named list to `pmap()` is roughly 10x faster (about 5.5 s vs. 51.5 s elapsed for 100 replications). Here's my test:
```r
library(rbenchmark)
library(tidyverse)
set.seed(42)
n <- 1e4
df <- data.frame(my_int = sample(1:5, n, replace = TRUE),
                 my_min = sample(1:5, n, replace = TRUE),
                 range  = sample(1:5, n, replace = TRUE))
benchmark(
  df %>%
    group_by(r = row_number()) %>%
    mutate(calc = list(runif(my_int, my_min, my_min + range))) %>%
    ungroup() %>%
    select(-r) ->
    out
)
#> test
#> 1 out <- df %>% group_by(r = row_number()) %>% mutate(calc = list(runif(my_int, my_min, my_min + range))) %>% ungroup() %>% select(-r)
#> replications elapsed relative user.self sys.self user.child sys.child
#> 1 100 51.51 1 51.42 0.06 NA NA
benchmark(
  df %>%
    mutate(data = pmap(list(n = my_int, min = my_min, max = my_min + range), runif)) -> out
)
#> test
#> 1 out <- df %>% mutate(data = pmap(list(n = my_int, min = my_min, max = my_min + range), runif))
#> replications elapsed relative user.self sys.self user.child sys.child
#> 1 100 5.5 1 5.5 0 NA NA
```
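One thing about the output above: each `benchmark()` call only has a single expression in it, so the `relative` column is 1 both times and you have to divide the elapsed times by hand. Here's a rough sketch of putting both approaches into one call so `relative` does the comparison for you. The labels `rowwise_group` and `pmap_list` are just names I made up, and I've cut `n` and `replications` down so it finishes quickly, so don't expect the timings to match the ones above.

```r
library(rbenchmark)
library(dplyr)
library(purrr)

set.seed(42)
n <- 1e3  # assumption: smaller than the 1e4 used above, only to keep the run short
df <- data.frame(my_int = sample(1:5, n, replace = TRUE),
                 my_min = sample(1:5, n, replace = TRUE),
                 range  = sample(1:5, n, replace = TRUE))

# Both expressions go into one benchmark() call, so `relative`
# is computed against the faster of the two.
res <- benchmark(
  rowwise_group = df %>%
    group_by(r = row_number()) %>%
    mutate(calc = list(runif(my_int, my_min, my_min + range))) %>%
    ungroup() %>%
    select(-r),
  pmap_list = df %>%
    mutate(calc = pmap(list(n = my_int, min = my_min, max = my_min + range), runif)),
  replications = 10,  # assumption: fewer than the 100 used above
  columns = c("test", "replications", "elapsed", "relative")
)
res
```

The named arguments show up in the `test` column, which is a lot easier to read than the full deparsed pipeline.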