reprex - identify rows in dataframe d2 column c3 not in dataframe d1 column c1
c1 <- c("A", "B", "C", "D", "E")
c2 <- c("a", "b", "c", "d", "e")
c3 <- c("A", "z", "C", "z", "E", "F")
c4 <- c("a", "x", "x", "d", "e", "f")
d1 <- data.frame(c1, c2, stringsAsFactors = FALSE)
d2 <- data.frame(c3, c4, stringsAsFactors = FALSE)
x <- unique(d1["c1"])
y <- d2[,"c3"]
id <- which(!(y %in% x) )
I am trying to find the indices of the rows in d2 whose c3 value does not appear in x.
Your problem is in defining x:
x <- unique(d1["c1"])
x
#> c1
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> 5 E
class(x)
#> [1] "data.frame"
Here you wanted to extract one column of d1, but you still have a data.frame. You need to extract the column so that the result is a vector:
x <- unique(d1[["c1"]])
# or x <- unique(d1[,"c1"])
x
#> [1] "A" "B" "C" "D" "E"
class(x)
#> [1] "character"
Then it'll work. See sections 20.5.2 and 20.5.3 of the r4ds book for more visual explanations.
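For completeness, here is a minimal sketch of the whole reprex with that fix applied, next to the original version for comparison. The names x_df and bad_id are only illustrative and are not from the original post. As far as I can tell, when the right-hand side of %in% is a data.frame, the column is compared as a single deparsed string, so nothing matches and every row gets flagged.

```r
d1 <- data.frame(c1 = c("A", "B", "C", "D", "E"),
                 c2 = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(c3 = c("A", "z", "C", "z", "E", "F"),
                 c4 = c("a", "x", "x", "d", "e", "f"),
                 stringsAsFactors = FALSE)

y <- d2[, "c3"]

# Original approach: single-bracket indexing keeps a data.frame
x_df <- unique(d1["c1"])        # illustrative name
bad_id <- which(!(y %in% x_df)) # illustrative name
bad_id
#> [1] 1 2 3 4 5 6

# Fixed approach: double-bracket indexing returns a character vector
x <- unique(d1[["c1"]])
id <- which(!(y %in% x))
id
#> [1] 2 4 6

# Rows of d2 whose c3 value is not present in d1$c1
d2[id, ]
```

If you only need the rows rather than their positions, d2[!(d2$c3 %in% d1$c1), ] should give the same rows without the intermediate which() call.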