I have two data.frames. Each has an indexing column, but the values do not match exactly across the two (i.e. some values are in one and not the other, and vice versa). Each also has a list column. I am trying to merge them using all=T (i.e. a full merge). However, I'm getting inconsistent behaviour in the output: if the index is missing from the first data.frame, the list value is NULL, but if it is missing from the second data.frame, the list value is NA:
tbl1 <- data.frame(
x = c(1,2)
)
tbl1$y <- list(c("a","b"),c("d"))
tbl2 <- data.frame(
x = c(1,3)
)
tbl2$z <- list(c("e"),c("f","g","h"))
> tbl1
  x    y
1 1 a, b
2 2    d
> tbl2
  x       z
1 1       e
2 3 f, g, h
> merge(tbl1,tbl2,by="x",all=T)
  x    y       z
1 1 a, b       e
2 2    d      NA
3 3 NULL f, g, h
Is there a way to handle this explicitly within the merge() function? It would also be easier if the missing output were NA in both columns, as then I could just use is.na() rather than vapply() and is.null() to find them.
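For illustration, the kind of check I currently need looks roughly like the sketch below. It is only a minimal base R example: is_missing is a hypothetical helper name I made up for this post, and it assumes a "missing" list element is either NULL or a single NA.

```r
# Rebuild the example data from above
tbl1 <- data.frame(x = c(1, 2))
tbl1$y <- list(c("a", "b"), c("d"))
tbl2 <- data.frame(x = c(1, 3))
tbl2$z <- list(c("e"), c("f", "g", "h"))

merged <- merge(tbl1, tbl2, by = "x", all = TRUE)

# Hypothetical helper (not part of any package): flags list elements
# that are either NULL or a single NA value
is_missing <- function(col) {
  vapply(
    col,
    function(el) is.null(el) || (length(el) == 1L && is.na(el)),
    logical(1)
  )
}

# One logical value per row of the merged data.frame
is_missing(merged$y)
is_missing(merged$z)
```

This works for both columns, but it is the extra vapply() step I would like to avoid by having a consistent NA in the first place.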
(PS: I am aware that I can use dplyr::full_join(), but I am creating a package and would like to minimise dependencies.)