Unnesting - Error: Can't combine <list> and <character>. from tibble column

jhp · August 19, 2020, 2:32am

I'm having trouble unnesting a tibble column and was hoping to get some help. I've done my best to make a reprex, but it is still quite dense because of the tibble column, so please excuse that.

How can I change the nested pap_name column specifically so that it can be unnested?

Edit: Simplified the example based on @technocrat's advice.

library(tidyverse)
df <-
  structure(list(user_id = c("6bemf", "vb76d"), registration_results = list(
    structure(list(
      pap_name = list(NULL),
      pap_file_url = list(NULL)
    ), row.names = c(
      NA,
      -1L
    ), class = c("tbl_df", "tbl", "data.frame")), structure(list(
      pap_name = "", pap_file_url = ""
    ), row.names = c(NA, -1L), class = c(
      "tbl_df",
      "tbl", "data.frame"
    ))
  )), row.names = c(NA, -2L), class = c(
    "tbl_df",
    "tbl", "data.frame"
  ))
df %>%
  unnest(registration_results)
#> Error: Can't combine `..1$pap_name` <list> and `..2$pap_name` <character>.

Created on 2020-08-18 by the reprex package (v0.3.0)

technocrat · August 19, 2020, 2:46am

is what the error message is complaining about. Behind that, there seems to be a complaint about applying nest to a function. I suggest simplifying to a minimal test case to help isolate the problem.

jhp · August 19, 2020, 3:06am

I am wondering if I can use purrr to target pap_file_url and change it from list(NULL) to something like list("")?

technocrat · August 19, 2020, 3:17am

Maybe. But first start with the tiniest step that will get you an object to explore further.
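One small exploratory step, as a sketch: rebuild the example with tibble() and list the class of every column inside each nested tibble. The name col_types is made up for illustration, and the data is a re-creation of the reprex above rather than the original object.

```r
library(tibble)
library(purrr)

# Re-creation of the thread's example (assumed equivalent to the structure() call)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# For each nested tibble, record the first class of every column
col_types <- map(df$registration_results, ~ map_chr(.x, ~ class(.x)[1]))
col_types
```

If the first nested tibble reports list columns and the second reports character columns, that mismatch is what unnest() cannot reconcile.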

jhp · August 19, 2020, 2:58pm

I guess you can think of the problem as

tibble(data = c(tibble(x = list(NULL)), 
                 tibble(x = ""))) %>% 
  unnest(data)
#> Error: Can't combine `..1$data` <list> and `..2$data` <character>.

although I'm not sure whether I am losing too much information here. My guess is that maintaining the list structure is going to be important when scaling up.

nirgrahamuk · August 19, 2020, 3:22pm

tibble(data = c(tibble(x = list(NULL)), 
                tibble(x = list("")))) %>% 
  unnest(data)

This code doesn't error.
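A likely reason, sketched with vctrs directly (the wording of the error suggests unnest() relies on vctrs to combine the pieces, though that is an assumption here): two lists share a common type, while a list and a character vector do not. The names both_lists and mixed are illustrative only.

```r
library(vctrs)

# list + list: a common type exists, so the pieces combine into one list
both_lists <- vec_c(list(NULL), list(""))

# list + character: no common type, so vec_c() raises an error;
# capture its message instead of stopping
mixed <- tryCatch(
  vec_c(list(NULL), ""),
  error = function(e) conditionMessage(e)
)

length(both_lists)
mixed
```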

jhp · August 19, 2020, 4:17pm

So my open questions are:

How do I put the second row into a list and not the first?
How do I apply that only to the nested x column and not the other columns (as in the original data)?
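One possible way to approach the first question, as a sketch only: inside each nested tibble, wrap character columns in a list and leave existing list-columns alone, so every piece ends up with list-columns. The data is a re-creation of the reprex, and as_list_cols is a hypothetical name.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

# Re-creation of the thread's example (assumed equivalent to the structure() call)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# Turn character columns of each nested tibble into list-columns;
# tibbles that already hold list-columns are left unchanged
as_list_cols <- df %>%
  mutate(registration_results = map(
    registration_results,
    ~ mutate(.x, across(where(is.character), as.list))
  )) %>%
  unnest(registration_results)

as_list_cols
```

The result keeps pap_name and pap_file_url as list-columns, which preserves the NULL entries but leaves the flattening to a later step.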

AlexisW · August 20, 2020, 2:21pm

Something like this is ugly but should work:

map_df(df$registration_results, ~ map_chr(.x, ~ if(is.list(.x)){""}else{.x}))

There are 2 nested maps:

loop on the rows of the nested column
loop on the columns of the tibble in each row of the nested column

For each column, check whether it's a character vector or a list(NULL). The following may be slightly more robust:

map_df(df$registration_results, ~ map_chr(.x, ~ if(identical(.x, list(NULL))){""}else{.x}))

Your first question could, I think, be answered by modifying what I have above. For the second, I guess you can use mutate() with across() to treat list-columns only.
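A minimal sketch of the mutate() with across() idea, assuming dplyr 1.0.0 or later and assuming every element of a list-column is either NULL or a length-one value. The helper list_col_to_chr and the name fixed are hypothetical, and the data is a re-creation of the reprex.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

# Re-creation of the thread's example (assumed equivalent to the structure() call)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# Hypothetical helper: flatten a list-column to character, mapping NULL to ""
# (assumes non-NULL elements have length one)
list_col_to_chr <- function(col) {
  map_chr(col, ~ if (is.null(.x)) "" else as.character(.x))
}

# Convert only the list-columns inside each nested tibble, then unnest
fixed <- df %>%
  mutate(registration_results = map(
    registration_results,
    ~ mutate(.x, across(where(is.list), list_col_to_chr))
  )) %>%
  unnest(registration_results)

fixed
```

Unlike the map_df() calls above, this keeps user_id alongside the unnested columns, since the conversion happens inside the original data frame.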
