Looping with non-standard evaluation (NSE)

slowkow · February 11, 2021, 9:24pm

I am trying to understand how to use NSE, but I have gotten lost.

I also struggle with thinking of the search terms to use in order to find
tutorials or code snippets.

I might be tempted to use functions like select_at() or group_by_at(),
but these say “lifecycle:superseded”.

What is the best path forward? Should we continue using superseded functions?

GOAL

I want this code to run as expected:

get_composition(d, c("size", "owner", "item"))

Failed attempts

Below, I tried a whole bunch of different things, and none of them work.

library(rlang)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

get_composition <- function(d, c1, c2, c3) {
  d %>%
    select({{ c1 }}, {{ c2 }}, {{ c3 }}) %>%
    group_by({{ c1 }}, {{ c2 }}, {{ c3 }}) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

d <- data.frame(
  size = sample(c("large", "small"), size = 1e3, replace = TRUE),
  color = sample(c("red", "blue"), size = 1e3, replace = TRUE),
  owner = sample(letters[11:15], size = 1e3, replace = TRUE),
  item = sample(letters[1:10], size = 1e3, replace = TRUE)
)

This works.

get_composition(d, size, owner, item)
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a         7    7.14
#>  2 large k     b        15   15.3 
#>  3 large k     c        10   10.2 
#>  4 large k     d        15   15.3 
#>  5 large k     e         8    8.16
#>  6 large k     f         6    6.12
#>  7 large k     g        11   11.2 
#>  8 large k     h         8    8.16
#>  9 large k     i        11   11.2 
#> 10 large k     j         7    7.14
#> # … with 90 more rows

Yep, this also works.

get_composition(d, color, owner, item)
#> `summarise()` regrouping output by 'color', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   color, owner [10]
#>    color owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 blue  k     a        13   14.1 
#>  2 blue  k     b         9    9.78
#>  3 blue  k     c         9    9.78
#>  4 blue  k     d        14   15.2 
#>  5 blue  k     e         5    5.43
#>  6 blue  k     f         6    6.52
#>  7 blue  k     g         9    9.78
#>  8 blue  k     h         9    9.78
#>  9 blue  k     i        11   12.0 
#> 10 blue  k     j         7    7.61
#> # … with 90 more rows

Can we call get_composition() in a loop like this? Nope.

for (mycol in c("size", "color")) {
  get_composition(d, mycol, owner, item) # Error: Column `mycol` is not found.
}
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(mycol)` instead of `mycol` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> Error: Must group by variables found in `.data`.
#> * Column `mycol` is not found.

OK, let’s make a character variable and try to use it …

mycol <- "size"

These attempts make a column with values equal to “size”. That’s not what we want.

get_composition(d, !!mycol, owner, item)
#> `summarise()` regrouping output by '"size"', 'owner' (override with `.groups` argument)
#> # A tibble: 50 x 5
#> # Groups:   "size", owner [5]
#>    `"size"` owner item      n percent
#>    <chr>    <chr> <chr> <int>   <dbl>
#>  1 size     k     a        24   12.1 
#>  2 size     k     b        24   12.1 
#>  3 size     k     c        17    8.59
#>  4 size     k     d        25   12.6 
#>  5 size     k     e        15    7.58
#>  6 size     k     f        14    7.07
#>  7 size     k     g        23   11.6 
#>  8 size     k     h        18    9.09
#>  9 size     k     i        20   10.1 
#> 10 size     k     j        18    9.09
#> # … with 40 more rows

get_composition(d, rlang::as_name(mycol), owner, item)
#> `summarise()` regrouping output by 'rlang::as_name(mycol)', 'owner' (override with `.groups` argument)
#> # A tibble: 50 x 5
#> # Groups:   rlang::as_name(mycol), owner [5]
#>    `rlang::as_name(mycol)` owner item      n percent
#>    <chr>                   <chr> <chr> <int>   <dbl>
#>  1 size                    k     a        24   12.1 
#>  2 size                    k     b        24   12.1 
#>  3 size                    k     c        17    8.59
#>  4 size                    k     d        25   12.6 
#>  5 size                    k     e        15    7.58
#>  6 size                    k     f        14    7.07
#>  7 size                    k     g        23   11.6 
#>  8 size                    k     h        18    9.09
#>  9 size                    k     i        20   10.1 
#> 10 size                    k     j        18    9.09
#> # … with 40 more rows

get_composition(d, rlang::enexpr(mycol), owner, item)
#> `summarise()` regrouping output by 'rlang::enexpr(mycol)', 'owner' (override with `.groups` argument)
#> # A tibble: 50 x 5
#> # Groups:   rlang::enexpr(mycol), owner [5]
#>    `rlang::enexpr(mycol)` owner item      n percent
#>    <chr>                  <chr> <chr> <int>   <dbl>
#>  1 size                   k     a        24   12.1 
#>  2 size                   k     b        24   12.1 
#>  3 size                   k     c        17    8.59
#>  4 size                   k     d        25   12.6 
#>  5 size                   k     e        15    7.58
#>  6 size                   k     f        14    7.07
#>  7 size                   k     g        23   11.6 
#>  8 size                   k     h        18    9.09
#>  9 size                   k     i        20   10.1 
#> 10 size                   k     j        18    9.09
#> # … with 40 more rows

get_composition(d, rlang::string(mycol), owner, item)
#> `summarise()` regrouping output by 'rlang::string(mycol)', 'owner' (override with `.groups` argument)
#> # A tibble: 50 x 5
#> # Groups:   rlang::string(mycol), owner [5]
#>    `rlang::string(mycol)` owner item      n percent
#>    <chr>                  <chr> <chr> <int>   <dbl>
#>  1 size                   k     a        24   12.1 
#>  2 size                   k     b        24   12.1 
#>  3 size                   k     c        17    8.59
#>  4 size                   k     d        25   12.6 
#>  5 size                   k     e        15    7.58
#>  6 size                   k     f        14    7.07
#>  7 size                   k     g        23   11.6 
#>  8 size                   k     h        18    9.09
#>  9 size                   k     i        20   10.1 
#> 10 size                   k     j        18    9.09
#> # … with 40 more rows

These attempts throw errors:

get_composition(d, rlang::quo(mycol), owner, item)
#> Error: Must subset columns with a valid subscript vector.
#> x Subscript has the wrong type `quosure/formula`.
#> ℹ It must be numeric or character.

get_composition(d, rlang::parse_expr(mycol), owner, item)
#> Error: Problem with `mutate()` input `..1`.
#> x Input `..1` must be a vector, not a symbol.
#> ℹ Input `..1` is `rlang::parse_expr(mycol)`.

get_composition(d, rlang::expr(mycol), owner, item)
#> Error: Can't subset columns that don't exist.
#> x Column `mycol` doesn't exist.

get_composition(d, rlang::ensym(mycol), owner, item)
#> Error: Problem with `mutate()` input `..1`.
#> x Input `..1` must be a vector, not a symbol.
#> ℹ Input `..1` is `rlang::ensym(mycol)`.
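
One way to keep an embrace-based function and still loop over strings is to convert each string to a symbol at the call site. The following is only a minimal sketch: count_by() is a hypothetical, shortened stand-in for get_composition(), and the data are made up. rlang::sym() builds a symbol from the string, and !! injects it, so {{ }} receives size rather than the constant "size".

```r
library(dplyr)

set.seed(1)
d <- data.frame(
  size  = sample(c("large", "small"), 100, replace = TRUE),
  color = sample(c("red", "blue"), 100, replace = TRUE),
  owner = sample(letters[11:15], 100, replace = TRUE)
)

# Hypothetical stand-in for get_composition(), reduced to two columns.
count_by <- function(d, c1, c2) {
  d %>%
    group_by({{ c1 }}, {{ c2 }}) %>%
    summarize(n = n(), .groups = "drop")
}

results <- list()
for (mycol in c("size", "color")) {
  # sym() turns the string into a symbol; !! injects it before {{ }} captures it.
  results[[mycol]] <- count_by(d, !!rlang::sym(mycol), owner)
}
```

The function itself stays unchanged, so interactive calls such as count_by(d, size, owner) keep working.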

This seems to work. It was inspired by this article.

get_composition2 <- function(d, c1, c2, c3) {
  d %>%
    select(.data[[c1]], .data[[c2]], .data[[c3]]) %>%
    group_by(.data[[c1]], .data[[c2]], .data[[c3]]) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition2(d, "size", "owner", "item")
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a         7    7.14
#>  2 large k     b        15   15.3 
#>  3 large k     c        10   10.2 
#>  4 large k     d        15   15.3 
#>  5 large k     e         8    8.16
#>  6 large k     f         6    6.12
#>  7 large k     g        11   11.2 
#>  8 large k     h         8    8.16
#>  9 large k     i        11   11.2 
#> 10 large k     j         7    7.14
#> # … with 90 more rows

This does not work as expected… how do we fix it?

get_composition3 <- function(d, ...) {
  d %>%
    select({{...}}) %>%
    group_by({{...}}) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition3(d, size, owner, item)
#> Error: object 'owner' not found

get_composition3(d, c("size", "owner", "item"))
#> Error: "x" must be an argument name

get_composition3(d, "size", "owner", "item")
#> Error: unused arguments ("owner", "item")

This seems to work. So, what’s the “stringy” way to do this?

get_composition4 <- function(d, ...) {
  d %>%
    select(...) %>%
    group_by(...) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition4(d, size, owner, item)
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a         7    7.14
#>  2 large k     b        15   15.3 
#>  3 large k     c        10   10.2 
#>  4 large k     d        15   15.3 
#>  5 large k     e         8    8.16
#>  6 large k     f         6    6.12
#>  7 large k     g        11   11.2 
#>  8 large k     h         8    8.16
#>  9 large k     i        11   11.2 
#> 10 large k     j         7    7.14
#> # … with 90 more rows

It’d be nice to have one of these expressions work as expected:

get_composition3(d, c("size", "owner", "item"))
get_composition3(d, "size", "owner", "item")

I followed the link in one of the error messages:
https://tidyselect.r-lib.org/reference/faq-external-vector.html

It led me to write this code, which also doesn’t work.

get_composition5 <- function(d, xs) {
  d %>%
    select(all_of(xs)) %>%
    group_by(all_of(xs)) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

xs <- c("size", "owner", "item")
get_composition5(d, xs)
#> Error: Problem with `mutate()` input `..1`.
#> x Input `..1` can't be recycled to size 1000.
#> ℹ Input `..1` is `all_of(xs)`.
#> ℹ Input `..1` must be size 1000 or 1, not 3.
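
The error seems to come from group_by() using data masking rather than tidyselect: all_of(xs) is evaluated as an ordinary expression and returns a length-3 character vector, which cannot be recycled to 1000 rows. A possible fix, sketched below under the assumption that dplyr 1.0.0 or later is installed, is to wrap the selection in across(). The name get_composition_chr() and the small data frame are made up for illustration.

```r
library(dplyr)

set.seed(1)
d <- data.frame(
  size  = sample(c("large", "small"), 200, replace = TRUE),
  owner = sample(letters[11:15], 200, replace = TRUE),
  item  = sample(letters[1:10], 200, replace = TRUE)
)

# Hypothetical "stringy" variant: xs is a character vector of column names.
get_composition_chr <- function(d, xs) {
  d %>%
    select(all_of(xs)) %>%
    # across() lets a tidyselect helper be used inside group_by().
    group_by(across(all_of(xs))) %>%
    # "drop_last" keeps all but the last grouping column, like the default.
    summarize(n = n(), .groups = "drop_last") %>%
    mutate(percent = 100 * n / sum(n))
}

out <- get_composition_chr(d, c("size", "owner", "item"))
```

In dplyr 1.1.0 and later, pick() can be used in place of across() for this kind of selection.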

Created on 2021-02-11 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.3 (2020-10-10)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2021-02-11                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                             
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.2)                     
#>  callr         3.5.1      2020-10-13 [2] CRAN (R 4.0.2)                     
#>  cli           2.2.0      2020-11-20 [2] CRAN (R 4.0.2)                     
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.2)                     
#>  desc          1.2.0      2018-05-01 [2] CRAN (R 4.0.2)                     
#>  devtools      2.3.0      2020-04-10 [2] CRAN (R 4.0.2)                     
#>  digest        0.6.27     2020-10-24 [2] CRAN (R 4.0.2)                     
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                     
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.2)                     
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.1)                     
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.2)                     
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.0.2)                     
#>  generics      0.1.0      2020-10-31 [2] CRAN (R 4.0.2)                     
#>  glue          1.4.2      2020-08-27 [2] CRAN (R 4.0.2)                     
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.2)                     
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.2)                     
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                     
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.2)                     
#>  magrittr      2.0.1.9000 2020-12-15 [1] Github (tidyverse/magrittr@bb1c86a)
#>  memoise       1.1.0.9000 2020-12-15 [1] Github (r-lib/memoise@0901e3f)     
#>  pillar        1.4.7      2020-11-20 [2] CRAN (R 4.0.2)                     
#>  pkgbuild      1.1.0      2020-07-13 [2] CRAN (R 4.0.2)                     
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.2)                     
#>  pkgload       1.1.0      2020-05-29 [2] CRAN (R 4.0.2)                     
#>  prettyunits   1.1.1      2020-01-24 [2] CRAN (R 4.0.2)                     
#>  processx      3.4.5      2020-11-30 [2] CRAN (R 4.0.2)                     
#>  ps            1.5.0      2020-12-05 [2] CRAN (R 4.0.2)                     
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.2)                     
#>  R6            2.5.0      2020-10-28 [2] CRAN (R 4.0.2)                     
#>  remotes       2.2.0      2020-07-21 [1] CRAN (R 4.0.2)                     
#>  rlang       * 0.4.9      2020-11-26 [2] CRAN (R 4.0.2)                     
#>  rmarkdown     2.6        2020-12-14 [1] CRAN (R 4.0.2)                     
#>  rprojroot     2.0.2      2020-11-15 [2] CRAN (R 4.0.2)                     
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.2)                     
#>  stringi       1.5.3      2020-09-09 [2] CRAN (R 4.0.2)                     
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.2)                     
#>  testthat      3.0.0      2020-10-31 [2] CRAN (R 4.0.2)                     
#>  tibble        3.0.4      2020-10-12 [2] CRAN (R 4.0.2)                     
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.2)                     
#>  usethis       1.6.1      2020-04-29 [2] CRAN (R 4.0.2)                     
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.2)                     
#>  vctrs         0.3.5      2020-11-17 [2] CRAN (R 4.0.2)                     
#>  withr         2.3.0      2020-09-22 [2] CRAN (R 4.0.2)                     
#>  xfun          0.19       2020-10-30 [1] CRAN (R 4.0.2)                     
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.2)                     
#> 
#> [1] /Users/kamil/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

slowkow · February 11, 2021, 9:36pm

I have adopted the {{}} syntax into several functions now... but I am realizing in retrospect that this might not have been a wise decision.

In fact, I now want to pass a character to those functions (instead of an expression), because I find myself calling the functions in a loop.

As far as I know, there is no way to write a loop like this:

# Not valid!
for (mycol in c(wt, mpg, cyl)) { myfunc(d, mycol) }

We always write such a loop like this:

for (mycol in c("wt", "mpg", "cyl")) { myfunc(d, mycol) }

Right?

I guess my short term solution is to copy and paste code instead of writing a loop. But I hope that someone in the RStudio community can teach me a better way.

In other words, my "solution" is to write this instead of writing a loop over the 3 columns:

myfunc(d, wt)
myfunc(d, mpg)
myfunc(d, cyl)

gabriel.de.wit · February 12, 2021, 8:02pm

Hi, sorry if I missed this in the question: what are you expecting? Could you expand on your goal with an example of what you want the result of get_composition(d, c("size", "owner", "item")) to look like?

slowkow · February 15, 2021, 4:48pm

I might have shared too many details.

Let me try again, this time with a shorter post.

Code 1

This is great! But what’s the “stringy” way to do this?

get_composition4 <- function(d, ...) {
  d %>%
    select(...) %>%
    group_by(...) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition4(d, size, owner, item)
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a         7    7.14
#>  2 large k     b        15   15.3 
#>  3 large k     c        10   10.2 
#>  4 large k     d        15   15.3 
#>  5 large k     e         8    8.16
#>  6 large k     f         6    6.12
#>  7 large k     g        11   11.2 
#>  8 large k     h         8    8.16
#>  9 large k     i        11   11.2 
#> 10 large k     j         7    7.14
#> # … with 90 more rows

Code 2

This is not working! How can we make it work like Code 1?

Specifically, we need to write a function that is going to work in a for-loop over column names.

get_composition3 <- function(d, ...) {
  d %>%
    select({{...}}) %>%
    group_by({{...}}) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition3(d, c("size", "owner", "item"))
#> Error: "x" must be an argument name

Code 3

For dplyr version 1.0.2, if we run ?select_at, then we get a documentation page that says lifecycle: superseded.

So, should we use it or not? It does seem to work...

get_composition6 <- function(d, ...) {
  d %>%
    select_at(...) %>%
    group_by_at(...) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition6(d, c("size", "owner", "item"))
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a         7    7.14
#>  2 large k     b        15   15.3 
#>  3 large k     c        10   10.2 
#>  4 large k     d        15   15.3 
#>  5 large k     e         8    8.16
#>  6 large k     f         6    6.12
#>  7 large k     g        11   11.2 
#>  8 large k     h         8    8.16
#>  9 large k     i        11   11.2 
#> 10 large k     j         7    7.14
#> # … with 90 more rows

Since we have a "stringy" version, now we can write this for loop. Hooray!

(We cannot use Code 1 in this for loop.)

for (mycol in c("size", "color")) {
  get_composition6(d, c(mycol, "owner", "item"))
}
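
Note that a for loop does not print or keep the value of each call, so the results above are discarded unless they are printed or stored. A minimal sketch of collecting them in a named list follows; count_cols() is a hypothetical helper and the data are made up.

```r
library(dplyr)

set.seed(1)
d <- data.frame(
  size  = sample(c("large", "small"), 100, replace = TRUE),
  color = sample(c("red", "blue"), 100, replace = TRUE),
  owner = sample(letters[11:15], 100, replace = TRUE)
)

# Hypothetical helper: counts rows per combination of the named columns.
count_cols <- function(d, cols) {
  d %>%
    group_by(across(all_of(cols))) %>%
    summarize(n = n(), .groups = "drop")
}

first_cols <- c("size", "color")
# lapply() returns one result per column name instead of discarding them.
results <- lapply(first_cols, function(mycol) count_cols(d, c(mycol, "owner")))
names(results) <- first_cols
```

Each element can then be inspected by name, for example results$color.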

nirgrahamuk · February 15, 2021, 4:55pm

You can use it if it serves your purpose. It has been superseded by approaches involving across(), so the "superseded" notice is there to point you to that newer functionality.

gabriel.de.wit · February 16, 2021, 12:26am

This works, but I don't know much about how grouped_df fits into the picture historically.

get_composition7 <- function(d, ...) {
  d %>%
    select(...) %>%
    grouped_df(...) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}

get_composition7(d, c("size", "owner", "item"))
#> `summarise()` regrouping output by 'size', 'owner' (override with `.groups` argument)
#> # A tibble: 100 x 5
#> # Groups:   size, owner [10]
#>    size  owner item      n percent
#>    <fct> <fct> <fct> <int>   <dbl>
#>  1 large k     a         8    8.99
#>  2 large k     b         6    6.74
#>  3 large k     c        12   13.5 
#>  4 large k     d        12   13.5 
#>  5 large k     e         8    8.99
#>  6 large k     f         7    7.87
#>  7 large k     g         6    6.74
#>  8 large k     h         5    5.62
#>  9 large k     i        15   16.9 
#> 10 large k     j        10   11.2 
#> # … with 90 more rows

jsvnc · February 18, 2021, 10:11am

I can't think of a way of making a function accept both unquoted variable names and string variable names, but to "stringify" your function get_composition(), try this:

get_composition_string <- function(d, c1, c2, c3) {
  d %>%
    select(!!rlang::sym(c1),!!rlang::sym(c2),!!rlang::sym(c3)) %>%
    group_by(!!rlang::sym(c1),!!rlang::sym(c2),!!rlang::sym(c3)) %>%
    summarize(n = n()) %>%
    mutate(percent = 100 * n / sum(n))
}
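
The same idea can probably be extended to any number of columns by converting a whole character vector with rlang::syms() and splicing it with !!!. This is a sketch rather than a tested recommendation; get_composition_syms() and the data are made up.

```r
library(dplyr)

set.seed(1)
d <- data.frame(
  size  = sample(c("large", "small"), 100, replace = TRUE),
  owner = sample(letters[11:15], 100, replace = TRUE),
  item  = sample(letters[1:10], 100, replace = TRUE)
)

# Hypothetical variant taking a character vector instead of c1, c2, c3.
get_composition_syms <- function(d, cols) {
  vars <- rlang::syms(cols)  # list of symbols, one per column name
  d %>%
    select(!!!vars) %>%      # !!! splices the symbols in as separate arguments
    group_by(!!!vars) %>%
    summarize(n = n(), .groups = "drop_last") %>%
    mutate(percent = 100 * n / sum(n))
}

out <- get_composition_syms(d, c("size", "owner", "item"))
```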

joels · February 18, 2021, 10:51am

You can use across inside group_by() (which I think is what Nir was getting at) to allow a mixture of strings and/or bare column names for the grouping columns. For example:

library(tidyverse)

set.seed(2)
d <- data.frame(
  size = sample(c("large", "small"), size = 1e3, replace = TRUE),
  color = sample(c("red", "blue"), size = 1e3, replace = TRUE),
  owner = sample(letters[11:15], size = 1e3, replace = TRUE),
  item = sample(letters[1:10], size = 1e3, replace = TRUE)
)

get_composition5 <- function(data, groups=NULL) {
  data %>%
    group_by(across({{groups}})) %>%
    summarize(n = n()) %>%  # Could also use tally() here
    ungroup %>% 
    mutate(percent = 100 * n / sum(n))
}

Now run the function with bare column names or strings:

get_composition5(d, c(size, owner, item))
#> `summarise()` has grouped output by 'size', 'owner'. You can override using the `.groups` argument.
#> # A tibble: 100 x 5
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a        12     1.2
#>  2 large k     b        16     1.6
#>  3 large k     c         8     0.8
#>  4 large k     d         9     0.9
#>  5 large k     e         9     0.9
#>  6 large k     f         8     0.8
#>  7 large k     g        16     1.6
#>  8 large k     h        10     1  
#>  9 large k     i         5     0.5
#> 10 large k     j         7     0.7
#> # … with 90 more rows

get_composition5(d, c("size", owner, item))
#> `summarise()` has grouped output by 'size', 'owner'. You can override using the `.groups` argument.
#> # A tibble: 100 x 5
#>    size  owner item      n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large k     a        12     1.2
#>  2 large k     b        16     1.6
#>  3 large k     c         8     0.8
#>  4 large k     d         9     0.9
#>  5 large k     e         9     0.9
#>  6 large k     f         8     0.8
#>  7 large k     g        16     1.6
#>  8 large k     h        10     1  
#>  9 large k     i         5     0.5
#> 10 large k     j         7     0.7
#> # … with 90 more rows

get_composition5(mtcars, c("cyl", vs))
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
#> # A tibble: 5 x 4
#>     cyl    vs     n percent
#>   <dbl> <dbl> <int>   <dbl>
#> 1     4     0     1    3.12
#> 2     4     1    10   31.2 
#> 3     6     0     3    9.38
#> 4     6     1     4   12.5 
#> 5     8     0    14   43.8
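
Because of ungroup(), the percentages above are shares of the whole table, whereas the original get_composition() computed shares within each group of the leading columns. The sketch below contrasts the two behaviours using the .groups argument of summarize(); comp_overall() and comp_within() are hypothetical names and the data are made up.

```r
library(dplyr)

set.seed(1)
d <- data.frame(
  size  = sample(c("large", "small"), 200, replace = TRUE),
  owner = sample(letters[11:15], 200, replace = TRUE),
  item  = sample(letters[1:10], 200, replace = TRUE)
)

# Percent of all rows: every group is dropped before mutate().
comp_overall <- function(data, groups) {
  data %>%
    group_by(across({{ groups }})) %>%
    summarize(n = n(), .groups = "drop") %>%
    mutate(percent = 100 * n / sum(n))
}

# Percent within the leading groups: only the last grouping column is dropped.
comp_within <- function(data, groups) {
  data %>%
    group_by(across({{ groups }})) %>%
    summarize(n = n(), .groups = "drop_last") %>%
    mutate(percent = 100 * n / sum(n))
}

overall <- comp_overall(d, c(size, owner, item))
within  <- comp_within(d, c("size", "owner", "item"))
```

In overall the percent column sums to 100 across the whole table; in within it sums to 100 inside each size and owner combination.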

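We can run the function with various combinations of grouping columns using map(), with combn() producing the group combinations (including no groups). Note that combn() is called on 0:length(g), so the combinations are column positions in d (0 selects nothing) rather than the names stored in g. That is why color appears in the output below and item does not, and why some results repeat.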

g = c("size","owner","item")
seq_along(g) %>% 
  map(~combn(0:length(g), .x, simplify=FALSE)) %>% 
  flatten %>% 
  map(~get_composition5(d, .x))

#> [[1]]
#> # A tibble: 1 x 2
#>       n percent
#>   <int>   <dbl>
#> 1  1000     100
#> 
#> [[2]]
#> # A tibble: 2 x 3
#>   size      n percent
#> * <chr> <int>   <dbl>
#> 1 large   496    49.6
#> 2 small   504    50.4
#> 
#> [[3]]
#> # A tibble: 2 x 3
#>   color     n percent
#> * <chr> <int>   <dbl>
#> 1 blue    499    49.9
#> 2 red     501    50.1
#> 
#> [[4]]
#> # A tibble: 5 x 3
#>   owner     n percent
#> * <chr> <int>   <dbl>
#> 1 k       195    19.5
#> 2 l       198    19.8
#> 3 m       221    22.1
#> 4 n       206    20.6
#> 5 o       180    18  
#> 
#> [[5]]
#> # A tibble: 2 x 3
#>   size      n percent
#> * <chr> <int>   <dbl>
#> 1 large   496    49.6
#> 2 small   504    50.4
#> 
#> [[6]]
#> # A tibble: 2 x 3
#>   color     n percent
#> * <chr> <int>   <dbl>
#> 1 blue    499    49.9
#> 2 red     501    50.1
#> 
#> [[7]]
#> # A tibble: 5 x 3
#>   owner     n percent
#> * <chr> <int>   <dbl>
#> 1 k       195    19.5
#> 2 l       198    19.8
#> 3 m       221    22.1
#> 4 n       206    20.6
#> 5 o       180    18  
#> 
#> [[8]]
#> # A tibble: 4 x 4
#>   size  color     n percent
#>   <chr> <chr> <int>   <dbl>
#> 1 large blue    249    24.9
#> 2 large red     247    24.7
#> 3 small blue    250    25  
#> 4 small red     254    25.4
#> 
#> [[9]]
#> # A tibble: 10 x 4
#>    size  owner     n percent
#>    <chr> <chr> <int>   <dbl>
#>  1 large k       100    10  
#>  2 large l        95     9.5
#>  3 large m       115    11.5
#>  4 large n        93     9.3
#>  5 large o        93     9.3
#>  6 small k        95     9.5
#>  7 small l       103    10.3
#>  8 small m       106    10.6
#>  9 small n       113    11.3
#> 10 small o        87     8.7
#> 
#> [[10]]
#> # A tibble: 10 x 4
#>    color owner     n percent
#>    <chr> <chr> <int>   <dbl>
#>  1 blue  k       104    10.4
#>  2 blue  l       113    11.3
#>  3 blue  m       111    11.1
#>  4 blue  n        86     8.6
#>  5 blue  o        85     8.5
#>  6 red   k        91     9.1
#>  7 red   l        85     8.5
#>  8 red   m       110    11  
#>  9 red   n       120    12  
#> 10 red   o        95     9.5
#> 
#> [[11]]
#> # A tibble: 4 x 4
#>   size  color     n percent
#>   <chr> <chr> <int>   <dbl>
#> 1 large blue    249    24.9
#> 2 large red     247    24.7
#> 3 small blue    250    25  
#> 4 small red     254    25.4
#> 
#> [[12]]
#> # A tibble: 10 x 4
#>    size  owner     n percent
#>    <chr> <chr> <int>   <dbl>
#>  1 large k       100    10  
#>  2 large l        95     9.5
#>  3 large m       115    11.5
#>  4 large n        93     9.3
#>  5 large o        93     9.3
#>  6 small k        95     9.5
#>  7 small l       103    10.3
#>  8 small m       106    10.6
#>  9 small n       113    11.3
#> 10 small o        87     8.7
#> 
#> [[13]]
#> # A tibble: 10 x 4
#>    color owner     n percent
#>    <chr> <chr> <int>   <dbl>
#>  1 blue  k       104    10.4
#>  2 blue  l       113    11.3
#>  3 blue  m       111    11.1
#>  4 blue  n        86     8.6
#>  5 blue  o        85     8.5
#>  6 red   k        91     9.1
#>  7 red   l        85     8.5
#>  8 red   m       110    11  
#>  9 red   n       120    12  
#> 10 red   o        95     9.5
#> 
#> [[14]]
#> # A tibble: 20 x 5
#>    size  color owner     n percent
#>    <chr> <chr> <chr> <int>   <dbl>
#>  1 large blue  k        49     4.9
#>  2 large blue  l        54     5.4
#>  3 large blue  m        60     6  
#>  4 large blue  n        41     4.1
#>  5 large blue  o        45     4.5
#>  6 large red   k        51     5.1
#>  7 large red   l        41     4.1
#>  8 large red   m        55     5.5
#>  9 large red   n        52     5.2
#> 10 large red   o        48     4.8
#> 11 small blue  k        55     5.5
#> 12 small blue  l        59     5.9
#> 13 small blue  m        51     5.1
#> 14 small blue  n        45     4.5
#> 15 small blue  o        40     4  
#> 16 small red   k        40     4  
#> 17 small red   l        44     4.4
#> 18 small red   m        55     5.5
#> 19 small red   n        68     6.8
#> 20 small red   o        47     4.7

Created on 2021-02-18 by the reprex package (v1.0.0)
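
An alternative that works from the names in g rather than from column positions might look like the sketch below. It covers only the non-empty subsets of g, uses base lapply() instead of purrr, and the helper name comp_chr() and the data are made up.

```r
library(dplyr)

set.seed(2)
d <- data.frame(
  size  = sample(c("large", "small"), 200, replace = TRUE),
  color = sample(c("red", "blue"), 200, replace = TRUE),
  owner = sample(letters[11:15], 200, replace = TRUE),
  item  = sample(letters[1:10], 200, replace = TRUE)
)

# Hypothetical helper: percent of all rows per combination of named columns.
comp_chr <- function(d, cols) {
  d %>%
    group_by(across(all_of(cols))) %>%
    summarize(n = n(), .groups = "drop") %>%
    mutate(percent = 100 * n / sum(n))
}

g <- c("size", "owner", "item")

# All non-empty subsets of g, each as a character vector of column names.
combos <- unlist(
  lapply(seq_along(g), function(k) combn(g, k, simplify = FALSE)),
  recursive = FALSE
)

results <- lapply(combos, function(cols) comp_chr(d, cols))
# Label each result with the columns it was grouped by, e.g. "size+owner".
names(results) <- vapply(combos, paste, character(1), collapse = "+")
```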

slowkow · February 18, 2021, 3:21pm

Thank you, Joel, for showing an example of group_by() with across()!

I must have been looking in the wrong places for examples and documentation. Your reply is the golden ticket for me.

I might need to spend some time trying to understand the examples shown in the documentation for ?across. This looks useful.
