I've prepared the following simple Shiny app aiming to import multiple datasets with that contain different columns. I'd like to merge them using a loop since I could upload more than two datasets and I'd like to discard the unmatching columns.
The data I'd like to upload, as an example, are the following:
data_A <- data.frame(
Sample = as.factor(c("x_1", "x_2", "x_3")),
Pop = as.factor(c("FF", "FF", "FF")),
aaa = c(31L, 27L, 17L),
aaa.1 = c(3L, 7L, 19L),
bbb = c(10L, 31L, 37L),
bbb.1 = c(23L, 1L, 13L),
ccc = c(20L, 18L, 37L),
ccc.1 = c(9L, 12L, 10L),
ddd = c(3L, 6L, 28L),
ddd.1 = c(32L, 27L, 27L),
eee = c(34L, 33L, 3L),
eee.1 = c(38L, 13L, 17L),
fff = c(21L, 26L, 40L),
fff.1 = c(38L, 4L, 28L)
)
data_B <- data.frame(
Sample = as.factor(c("x_4", "X_5", "X_6")),
Pop = as.factor(c("GG", "GG", "GG")),
aaa = c(21L, 24L, 24L),
aaa.1 = c(9L, 1L, 36L),
ccc = c(3L, 7L, 30L),
ccc.1 = c(37L, 13L, 23L),
fff = c(6L, 6L, 3L),
fff.1 = c(11L, 3L, 30L)
)
data_C <- data.frame(
Sample = as.factor(c("x_7", "x_8", "x_9")),
Pop = as.factor(c("ZZ", "ZZ", "ZZ")),
aaa = c(31L, 27L, 17L),
aaa.1 = c(3L, 7L, 19L),
bbb = c(10L, 31L, 37L),
bbb.1 = c(23L, 1L, 13L),
ccc = c(20L, 18L, 37L),
ccc.1 = c(9L, 12L, 10L),
fff = c(21L, 26L, 40L),
fff.1 = c(38L, 4L, 28L)
)
and I'd like to obtain the following final dataset:
data_def <- data.frame(
Sample = as.factor(c("x_1", "x_2", "x_3", "x_4", "X_5", "X_6", "x_7","x_8", "x_9")),
Pop = as.factor(c("FF", "FF", "FF", "GG", "GG", "GG", "ZZ", "ZZ","ZZ")),
aaa = c(31L, 27L, 17L, 21L, 24L, 24L, 31L, 27L, 17L),
aaa.1 = c(3L, 7L, 19L, 9L, 1L, 36L, 3L, 7L, 19L),
ccc = c(20L, 18L, 37L, 3L, 7L, 30L, 20L, 18L, 37L),
ccc.1 = c(9L, 12L, 10L, 37L, 13L, 23L, 9L, 12L, 10L),
fff = c(21L, 26L, 40L, 6L, 6L, 3L, 21L, 26L, 40L),
fff.1 = c(38L, 4L, 28L, 11L, 3L, 30L, 38L, 4L, 28L)
)
So, I prepared the following function and I tried to use intersect
function, but I noticed it works only in case of two datasets, so I'd like to use it in a loop since I'd need to merge event more than two datasets. Here I report an example with 3 datasets, but I'd like to extend it to a N number of uploaded datasets:
datasets <- list(data_A,data_B,data_C)
numfiles = length(datasets)
mydata = datasets
variables = list()
for (i in 1:numfiles)
{
variables[[i]] = c(colnames(mydata[[i]]))
}
InAll = lapply(variables, FUN = intersect)
do.call(rbind, mydata[,inAll])
I tried different solutions but I could not achieve my goal. I'd be very grateful if anybody could help me with this matter!