I'd like to know what by.x and by.y do in merge().
I thought that setting by.x = BY and by.y = BY would create a common key between data frames x and y when they share no common column, or when they share many. I tried to verify this but failed: R doesn't treat the two columns as the same key.
So please let me know whether my understanding is wrong or my code has a problem.
Data preparation
xx
  name1 math
1     a    1
2     b    2
3     c    3
4     d    8
yy
  name2 english
1     c       4
2     b       5
3     a       6
4     e       9
FAIL CASE 1
merge(xx,yy, by.xx = name1, by.yy = name2)
   name1 math name2 english
1      a    1     c       4
2      b    2     c       4
3      c    3     c       4
4      d    8     c       4
5      a    1     b       5
6      b    2     b       5
7      c    3     b       5
8      d    8     b       5
9      a    1     a       6
10     b    2     a       6
11     c    3     a       6
12     d    8     a       6
13     a    1     e       9
14     b    2     e       9
15     c    3     e       9
16     d    8     e       9
FAIL CASE 2 (same result as CASE 1)
merge(xx,yy, name1 = NAME , name2 = NAME)
   name1 math name2 english
1      a    1     c       4
2      b    2     c       4
3      c    3     c       4
4      d    8     c       4
5      a    1     b       5
6      b    2     b       5
7      c    3     b       5
8      d    8     b       5
9      a    1     a       6
10     b    2     a       6
11     c    3     a       6
12     d    8     a       6
13     a    1     e       9
14     b    2     e       9
15     c    3     e       9
16     d    8     e       9
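You have to pass the column names as a character vector (or, for a single column, as a quoted string) rather than as bare names. The arguments must also be spelled by.x and by.y no matter what your data frames are called: by.xx, by.yy, name1 = and name2 = are not arguments of merge(), so they are silently ignored. With no shared column name to fall back on, merge() then returns the Cartesian product of the two data frames, which is the 16-row result you saw. See below: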
xx <- read.table(text = "name1 math
1 a 1
2 b 2
3 c 3
4 d 8",
                 header = TRUE,
                 stringsAsFactors = FALSE)
yy <- read.table(text = "name2 english
1 c 4
2 b 5
3 a 6
4 e 9",
                 header = TRUE,
                 stringsAsFactors = FALSE)
merge(x = xx,
      y = yy,
      by.x = "name1",
      by.y = "name2")
#>   name1 math english
#> 1     a    1       6
#> 2     b    2       5
#> 3     c    3       4
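To make the difference between the failing and working calls concrete, here is a minimal sketch using the same xx and yy, rebuilt with data.frame() so the snippet runs on its own. Passing by = character(0) is my way of reproducing what the default by = intersect(names(x), names(y)) evaluates to when the two data frames share no column name. The all = TRUE call and the two-column data frames a and b are hypothetical additions for illustration, not part of the original question.

```r
# Same data as in the question, built directly so the example is self-contained.
xx <- data.frame(name1 = c("a", "b", "c", "d"),
                 math = c(1, 2, 3, 8),
                 stringsAsFactors = FALSE)
yy <- data.frame(name2 = c("c", "b", "a", "e"),
                 english = c(4, 5, 6, 9),
                 stringsAsFactors = FALSE)

# With an empty key, merge() returns the Cartesian product: 4 x 4 = 16 rows.
# This is what happened in both fail cases, where no valid key was supplied.
cross <- merge(xx, yy, by = character(0))
nrow(cross)

# Quoted column names in by.x / by.y give the intended inner join.
inner <- merge(xx, yy, by.x = "name1", by.y = "name2")
inner

# all = TRUE also keeps unmatched rows ("d" from xx, "e" from yy), filled with NA.
outer <- merge(xx, yy, by.x = "name1", by.y = "name2", all = TRUE)
outer

# Hypothetical two-column key: by.x and by.y are character vectors of equal
# length, matched position by position ("id" with "key", "yr" with "year").
a <- data.frame(id = c(1, 1, 2), yr = c(2020, 2021, 2020), v = 1:3)
b <- data.frame(key = c(1, 2), year = c(2021, 2020), w = c(10, 20))
both <- merge(a, b, by.x = c("id", "yr"), by.y = c("key", "year"))
both
```

The key column in the result takes its name from x (name1 here), which is why name2 does not appear in the merged output.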
Thank you so much. I really appreciate it!
I wonder how people here answer the questions correctly in such a short time without any compensation.
It's amazing, especially for people who study by themselves like me.