a <- c(10,20,30,40)
b <- c('book', 'pen', 'textbook', 'pencil_case')
c <- c(TRUE,FALSE,TRUE,FALSE)
d <- c(2.5, 8, 10, 7)
e <- c(2.4, 5, 10, 7)
df1<-data.frame(a,b,c)
df2<-data.frame(b,c,d,e)
Is there a function that will create a new data frame combining the df1 and df2 where it automatically detects columns that overlap, combines them, but then leaves the unique ones (in this case, just e) in the data frame as well?
Important! I know you can do this via the merge function where you pass the overlapping column names into the "by" argument, but it can be a bit tiresome and error proof to manually input all the overlapping columns when there are many of them!
library(dplyr)
a <- c(10,20,30,40)
b <- c('book', 'pen', 'textbook', 'pencil_case')
c <- c(TRUE,FALSE,TRUE,FALSE)
d <- c(2.5, 8, 10, 7)
e <- c(2.4, 5, 10, 7)
df1<-data.frame(a,b,c)
df2<-data.frame(b,c,d,e)
inner_join(df1, df2)
Joining, by = c("b", "c")
a b c d e
1 10 book TRUE 2.5 2.4
2 20 pen FALSE 8.0 5.0
3 30 textbook TRUE 10.0 10.0
4 40 pencil_case FALSE 7.0 7.0
I do not understand why you say that only e is a unique column. Column a only appears in df1 and columns d and e only appear in df2. Thus the join is done on the shared columns b and c.