Version 1.0.0 of tidyr::unnest() returns an error where the old version merged column types

Steen_Harsted · October 11, 2019, 9:48am

The older version of tidyr::unnest(), now called tidyr::unnest_legacy(), handles unnesting of different column types by merging the column types.

The new version of tidyr::unnest() returns an error if columns have the same name but different types.

I'm guessing the new version is safer, but I never had any issues with the old version.

How should I unnest a list of dataframes with different column types going forward?

For now, I can use unnest_legacy(). But if I want to use the new unnest(), should I first run something like mutate() and map() to get identical column types, or is there an easier way around this issue?

Reproducible example:

library(tidyr)

a <- tibble(
  value = rnorm(2),
  char_vec = c(NA, "A")) # character vector

b <- tibble(
  value = rnorm(2),
  char_vec = c(NA, NA)) # logical

df <- tibble(
  file = list(a, b))

# New tidyr::unnest()
unnest(df, cols = c(file))
#> No common type for `..1$file$char_vec` <character> and `..2$file$char_vec`
#> <logical>.

# Old tidyr::unnest()
unnest_legacy(df, file)
#> # A tibble: 4 x 2
#>     value char_vec
#>     <dbl> <chr>   
#> 1  0.295  <NA>    
#> 2 -0.389  A       
#> 3  0.0308 <NA>    
#> 4 -1.31   <NA>

Created on 2019-10-11 by the reprex package (v0.3.0)

valeri · October 11, 2019, 11:16am

Hi @Steen_Harsted,

I thought there might be a way to override this check for a common type, but I didn't find one. For now, one fix for your example is to create the column as character in the first place. Clearly this is not ideal, and it is not automatic, since it doesn't check whether the column types match.

library(tidyr)

a <- tibble(
    value = rnorm(2),
    char_vec = c(NA, "A")) # character vector

b <- tibble(
    value = rnorm(2),
    char_vec = as.character(c(NA, NA))) # was logical, now character

df <- tibble(
    file = list(a, b))

# New tidyr::unnest()
unnest(df, cols = c(file))
#> # A tibble: 4 x 2
#>    value char_vec
#>    <dbl> <chr>   
#> 1 -0.346 <NA>    
#> 2 -0.960 A       
#> 3  1.04  <NA>    
#> 4  0.293 <NA>

Created on 2019-10-11 by the reprex package (v0.3.0)

BTW, where applicable, you can use NA_character_ to specify character NAs.
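
To illustrate the NA_character_ tip, here is a minimal base R sketch. The variable names are made up for the illustration; the point is only that a vector built from bare NA is logical, while one built from NA_character_ is already character:

```r
# A bare NA is logical, so a vector made only of NA is a logical vector.
all_na_logical <- c(NA, NA)
class(all_na_logical)
#> [1] "logical"

# NA_character_ is a missing value that already has type character.
all_na_character <- c(NA_character_, NA_character_)
class(all_na_character)
#> [1] "character"
```

Using NA_character_ when the data is created should have the same effect as the as.character() call in the example above.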

Steen_Harsted · October 11, 2019, 11:34am

Hi @valeri

Thank you for the reply and for looking into this.

Changing the data generation process could be a solution. I would just have to update the import function and reimport all the data.

Another solution, which was recommended to me on SO, could be to change the columns using purrr and mutate() like this:

library(dplyr) # for mutate()
library(purrr) # for map()

df %>%
  mutate(
    file = map(file, ~ mutate(.x, char_vec = as.character(char_vec)))
  ) %>%
  unnest(cols = c(file))

That solution works fine as well, but I find it somewhat complicated, at least compared to unnest_legacy(), which simply handled this particular case of unnesting without issues.
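
If several columns are affected, the per-column conversion could perhaps be generalised. The following is only a sketch, and it assumes that every logical column consisting entirely of NA is really meant to be character, which may not hold for other data. na_lgl_to_chr() is a hypothetical helper written for this example, not a tidyr function:

```r
library(tidyr)

# Hypothetical helper: convert each all-NA logical column of a data frame
# to character, leaving every other column untouched.
# Assumption: such columns are meant to hold character data.
na_lgl_to_chr <- function(data) {
  for (nm in names(data)) {
    col <- data[[nm]]
    if (is.logical(col) && all(is.na(col))) {
      data[[nm]] <- as.character(col)
    }
  }
  data
}

# Same data as in the reproducible example.
a <- tibble(
  value = rnorm(2),
  char_vec = c(NA, "A")) # character vector

b <- tibble(
  value = rnorm(2),
  char_vec = c(NA, NA)) # logical

df <- tibble(
  file = list(a, b))

# Harmonise the nested data frames first, then unnest.
df$file <- lapply(df$file, na_lgl_to_chr)
result <- unnest(df, cols = c(file))
```

After the conversion both nested tibbles have a character char_vec, so unnest() no longer has conflicting types to reconcile. This would not help when the types genuinely differ, for example numeric in one file and character in another.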

I am mainly raising the issue because I would like some comments on what the intended or best-practice workflow should be in a case like this.

Thanks again.
