When I use unnest_longer() on a list column that contains tibbles, the output is not what I expected. I am not sure how to describe the problem, so please refer to the reprex below.
With the legacy unnest(), each column of the tibbles in the list column becomes a new column. With unnest_longer(), no new columns are created; instead, the unnested column becomes a tibble column. While this looks fine when printed, it cannot be manipulated with standard dplyr functions, or if it can, it is unclear how to do so.
Is there a way to make unnest_longer() unnest the tibbles the way the legacy unnest() does? Or is there a new approach for this within the tidyverse?
library(tidyr)
library(dplyr)
tbl_list <- list(
  tibble(y = letters[1:3]),
  tibble(y = letters[4:6]),
  tibble(y = letters[7:9])
)
my_tbl <- tibble(id = 1:3, x = tbl_list)
(tbl_legacy_unnest <- my_tbl %>%
  unnest(x))
#> # A tibble: 9 x 2
#>      id y
#>   <int> <chr>
#> 1     1 a
#> 2     1 b
#> 3     1 c
#> 4     2 d
#> 5     2 e
#> 6     2 f
#> 7     3 g
#> 8     3 h
#> 9     3 i
(tbl_unnest_longer <- my_tbl %>%
  unnest_longer(x))
#> # A tibble: 9 x 2
#>      id x$y
#>   <int> <chr>
#> 1     1 a
#> 2     1 b
#> 3     1 c
#> 4     2 d
#> 5     2 e
#> 6     2 f
#> 7     3 g
#> 8     3 h
#> 9     3 i
select(tbl_legacy_unnest, y)
#> # A tibble: 9 x 1
#>   y
#>   <chr>
#> 1 a
#> 2 b
#> 3 c
#> 4 d
#> 5 e
#> 6 f
#> 7 g
#> 8 h
#> 9 i
select(tbl_unnest_longer, y)
#> Error: Can't subset columns that don't exist.
#> x Column `y` doesn't exist.
Are you sure you want to use unnest_longer() in this case? From the tidyr docs:
These principles guide their behaviour when they are called with a non-primary data type. For example, if you unnest_wider() a list of data frames, the number of rows must be preserved, so each column is turned into a list column of length one. Or if you unnest_longer() a list of data frame, the number of columns must be preserved so it creates a packed column. I'm not sure how if these behaviours are useful in practice, but they are theoretically pleasing.
The key phrase there about unnest_longer() is "the number of columns must be preserved so it creates a packed column". You probably want to use unnest() (the newer variant) instead. But in case you don't, you can still work with y (which I assume is your end goal here) by reaching it through x, like so:
mutate(tbl_unnest_longer, rev_y = rev(x$y))
#> # A tibble: 9 x 3
#>      id x$y   rev_y
#>   <int> <chr> <chr>
#> 1     1 a     i
#> 2     1 b     h
#> 3     1 c     g
#> 4     2 d     f
#> 5     2 e     e
#> 6     2 f     d
#> 7     3 g     c
#> 8     3 h     b
#> 9     3 i     a
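If you would rather have y as an ordinary top-level column after unnest_longer(), one option that should work is tidyr's unpack(), which spreads a packed column back out into regular columns. This is only a sketch under the assumption that your real data looks like the reprex; the name tbl_unpacked is made up for illustration.

```r
library(tidyr)
library(dplyr)

# Rebuild the example data from the question
my_tbl <- tibble(
  id = 1:3,
  x = list(
    tibble(y = letters[1:3]),
    tibble(y = letters[4:6]),
    tibble(y = letters[7:9])
  )
)

# unnest_longer() leaves `x` as a packed (data frame) column;
# unpack() then promotes its inner columns to the top level
tbl_unpacked <- my_tbl %>%
  unnest_longer(x) %>%
  unpack(x)

# `y` is now a regular column, so select() can find it
select(tbl_unpacked, y)
```

If the inner tibbles could share column names with the outer one, unpack() also takes a names_sep argument to prefix the new columns.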
And here's just a plain-ole unnest():
unnest(my_tbl, x)
#> # A tibble: 9 x 2
#>      id y
#>   <int> <chr>
#> 1     1 a
#> 2     1 b
#> 3     1 c
#> 4     2 d
#> 5     2 e
#> 6     2 f
#> 7     3 g
#> 8     3 h
#> 9     3 i
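To see why select(tbl_unnest_longer, y) fails while the unnest() result works, it can help to inspect how each result stores y. The checks below are a rough sketch; tbl_packed and tbl_flat are just illustrative names, not anything from the reprex.

```r
library(tidyr)
library(dplyr)

# Same example data as in the question
my_tbl <- tibble(
  id = 1:3,
  x = list(
    tibble(y = letters[1:3]),
    tibble(y = letters[4:6]),
    tibble(y = letters[7:9])
  )
)

tbl_packed <- unnest_longer(my_tbl, x)  # keeps `x` as a packed column
tbl_flat   <- unnest(my_tbl, x)         # promotes `y` to the top level

# Top-level column names differ between the two results
names(tbl_packed)
names(tbl_flat)

# In the packed result, `x` is itself a data frame holding `y`
is.data.frame(tbl_packed$x)

# The values are the same; only the nesting differs
identical(tbl_packed$x$y, tbl_flat$y)
```

So the data is all there in both cases; with the packed column you just have to go through x to get at y.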