Flatten output of rsample::vfold_cv() and rsample::loo_cv()

How can we flatten the output of rsample::vfold_cv() or rsample::loo_cv()?

I tried tidyr::unnest(splits) without success.

Hi @John_Rambo,

I am not entirely sure what you mean by "flatten", but if you want to extract all of the analysis/training data for each fold into one data set, this should work:

library(rsample)
library(tidyverse)

folds <- vfold_cv(mtcars)

# Pull the analysis (training) rows out of each split,
# then unnest them into one long tibble
folds %>% 
  mutate(data = map(splits, analysis)) %>% 
  unnest(data)
#> # A tibble: 288 × 13
#>    splits         id       mpg   cyl  disp    hp  drat    wt  qsec    vs    am
#>    <list>         <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 <split [28/4]> Fold01  21       6  160    110  3.9   2.62  16.5     0     1
#>  2 <split [28/4]> Fold01  21       6  160    110  3.9   2.88  17.0     0     1
#>  3 <split [28/4]> Fold01  22.8     4  108     93  3.85  2.32  18.6     1     1
#>  4 <split [28/4]> Fold01  21.4     6  258    110  3.08  3.22  19.4     1     0
#>  5 <split [28/4]> Fold01  18.7     8  360    175  3.15  3.44  17.0     0     0
#>  6 <split [28/4]> Fold01  18.1     6  225    105  2.76  3.46  20.2     1     0
#>  7 <split [28/4]> Fold01  24.4     4  147.    62  3.69  3.19  20       1     0
#>  8 <split [28/4]> Fold01  19.2     6  168.   123  3.92  3.44  18.3     1     0
#>  9 <split [28/4]> Fold01  17.8     6  168.   123  3.92  3.44  18.9     1     0
#> 10 <split [28/4]> Fold01  16.4     8  276.   180  3.07  4.07  17.4     0     0
#> # … with 278 more rows, and 2 more variables: gear <dbl>, carb <dbl>
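
The same pattern should carry over to loo_cv(), and to the held-out rows if you swap analysis() for assessment(). Calling unnest(splits) directly does not work because the splits column holds rsplit objects rather than data frames, so each split has to be converted first. Here is a minimal sketch, assuming you want the assessment row of every leave-one-out resample stacked into one tibble (the name held_out is only for illustration):

```r
library(rsample)
library(tidyverse)

loo <- loo_cv(mtcars)

# Convert each rsplit into its held-out data frame, then unnest.
# With leave-one-out, each split holds out a single row.
held_out <- loo %>% 
  mutate(data = map(splits, assessment)) %>% 
  unnest(data)

held_out
```

This should give one row per resample, with the original mtcars columns next to splits and id.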

Looks good, thank you!
