I've run into a problem with my understanding of unnest on a particular dataset, and after numerous attempts to resolve it via Stack Overflow, blogs and tons of Googling, I had to step away from the problem. I'm certain it's my lack of understanding, but I'm hoping that asking here helps others.
I have a dataframe -- simplified here as I haven't figured out a simple way to provide code to create it. It has a column of names and then two columns of lists representing input and output observations with their timestamps (seconds since the epoch) and values:
> str(unnestquestion)
'data.frame': 3 obs. of 3 variables:
$ Input:List of 3
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 5.00e+05 NA ...
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 3.39e+05 NA ...
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 4.60e+06 NA ...
$ Ouput:List of 3
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 4.22e+06 NA ...
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 7.46e+06 NA ...
..$ : num [1:3, 1:2] 1.51e+09 1.51e+09 1.51e+09 2.39e+07 NA ...
$ name : chr "CIR0019209" "CIR0019431" "CIR0006077"
I've dreamt I resolved this in the past, but my RStudio had 50 open tabs and I over-aggressively cleaned it up recently.
Right now you have list columns of matrices, which don't unnest well. You can use purrr::map to iterate over each list column and coerce each matrix to a data.frame, which can be unnested properly:
library(tidyverse)
df_of_matrices <- data_frame(name = c('a', 'b', 'c'),
                             input = list(matrix(1:6, 3)),    # recycles 3x
                             output = list(matrix(rnorm(6), 3)))    # recycles 3x
df_of_matrices
#> # A tibble: 3 x 3
#>   name  input         output
#>   <chr> <list>        <list>
#> 1 a     <int [3 x 2]> <dbl [3 x 2]>
#> 2 b     <int [3 x 2]> <dbl [3 x 2]>
#> 3 c     <int [3 x 2]> <dbl [3 x 2]>
df_of_matrices %>%
    mutate_if(is.list, map, as_data_frame) %>%
    unnest()
#> # A tibble: 9 x 5
#>    name    V1    V2       V11        V21
#>   <chr> <int> <int>     <dbl>      <dbl>
#> 1     a     1     4 0.5319480  1.1047462
#> 2     a     2     5 1.9041804 -0.6874434
#> 3     a     3     6 0.5646727  0.2721582
#> 4     b     1     4 0.5319480  1.1047462
#> 5     b     2     5 1.9041804 -0.6874434
#> 6     b     3     6 0.5646727  0.2721582
#> 7     c     1     4 0.5319480  1.1047462
#> 8     c     2     5 1.9041804 -0.6874434
#> 9     c     3     6 0.5646727  0.2721582
The obvious limitation is that the default column names come out as a mess (V1, V2, V11, V21 above), but they can be set to something more useful in the same fashion, while converting each matrix, or afterwards on the unnested result.
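As a rough sketch of the "set the names while converting" option: the helper below (matrix_to_tibble is a made-up name, not part of any package) assigns column names to each matrix before coercing it, so the unnested result has readable columns. It assumes each matrix has exactly two columns, timestamp then value, as in the question, and it assumes a tidyr version of 1.0.0 or later, where unnest() takes an explicit cols argument.

```r
library(tidyverse)

# Example data in the same shape as above: each list element is a 3 x 2 matrix
df_of_matrices <- tibble(
  name   = c("a", "b", "c"),
  input  = list(matrix(1:6, 3)),        # recycles 3x
  output = list(matrix(rnorm(6), 3))    # recycles 3x
)

# Hypothetical helper: name the two columns, then coerce the matrix to a tibble.
# Assumes column 1 is the timestamp and column 2 is the value.
matrix_to_tibble <- function(m, prefix) {
  colnames(m) <- paste(prefix, c("time", "value"), sep = "_")
  as_tibble(m)
}

result <- df_of_matrices %>%
  mutate(
    input  = map(input,  matrix_to_tibble, prefix = "input"),
    output = map(output, matrix_to_tibble, prefix = "output")
  ) %>%
  unnest(cols = c(input, output))   # unnest both list columns in parallel

result
```

The result should have 9 rows and the columns name, input_time, input_value, output_time and output_value. Naming the columns before unnesting also avoids the name clash between the two list columns, which is what produced V11 and V21 above.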
This response is fantastic; I learned so much from a small number of lines. Thanks for showing me how to recreate the example data, and for a great use of mutate_if and purrr::map! I'll be spending more time with purrr.