Dplyr filter from another dataframe

I am trying to keep only the rows where the column value is one of the values in a column of a separate dataframe.
I tried the following:

top100frame<-Datpar %>% filter(Channel.ID %in% helper1$Channel.ID)

But it does not work, and instead just copies all entries of the dataframe into the new variable. Can someone spot my mistake?

Hi,
It's hard to help you since you don't provide a reproducible example.

Maybe you should consider the function dplyr::semi_join. The code you posted might be quite slow if helper1 contains a lot of Channel.ID values.

I am sorry, beginner here. I must have explained it badly, since I do not want to join anything. helper1$Channel.ID is a column of strings, and I want to keep only the rows of Datpar where Channel.ID is one of those values. Does that make more sense?

I understood what you want to do. Your code seems ok. I still think semi_join is more appropriate for your problem. I unfortunately can't help you more without a reproducible example.
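
For what it's worth, here is a minimal sketch of how the two approaches compare. The data below is made up for illustration (the `views` column and all values are assumptions, not the poster's real data); only the names `Datpar`, `helper1` and `Channel.ID` come from the question. A semi join does not add any columns from `helper1`; it only keeps the rows of `Datpar` that have a match.

```r
library(dplyr)

# Hypothetical stand-ins for the poster's data frames
Datpar <- tibble(
  Channel.ID = c("a", "b", "c", "d"),
  views      = c(10, 20, 30, 40)
)
helper1 <- tibble(Channel.ID = c("b", "d", "z"))

# Approach from the question: membership test with %in%
via_in <- Datpar %>% filter(Channel.ID %in% helper1$Channel.ID)

# Filtering join: keeps rows of Datpar with a match in helper1,
# without bringing over any columns from helper1
via_join <- Datpar %>% semi_join(helper1, by = "Channel.ID")

via_in
via_join
```

On this toy data both calls should keep the rows with Channel.ID "b" and "d".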

As @Flo_P said, you should provide us with a reprex. It helps us help you, and you will end up with a much clearer explanation of what you are trying to do.

Is this the sort of thing you are trying to do?

suppressPackageStartupMessages(library(tidyverse))
df <- tibble::tribble(
  ~id, ~v,
  1L, "a",
  2L, "b",
  3L, "c",
  4L, "d"
)

sel <- tibble::tribble(
  ~id, ~v,
  11L, "c",
  8L, "d"
)

# %in% tests membership; == would recycle sel$v and compare position by position
filter(df, v %in% sel$v)
#> # A tibble: 2 x 2
#>      id v    
#>   <int> <chr>
#> 1     3 c    
#> 2     4 d

Created on 2018-04-16 by the reprex package (v0.2.0).
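
One thing to watch with this kind of filter is the difference between `==` and `%in%`. As a small sketch with plain vectors (the reversed order of `sel_v` is an assumption made purely to show the effect): `==` compares position by position and recycles the shorter vector, while `%in%` asks whether each value appears anywhere in the other vector.

```r
df_v  <- c("a", "b", "c", "d")
sel_v <- c("d", "c")  # same values as in the example, order swapped

# == recycles sel_v to "d", "c", "d", "c" and compares element by element
df_v == sel_v

# %in% checks membership, so the order of sel_v does not matter
df_v %in% sel_v
```

Here the `==` comparison finds no matches at all, whereas `%in%` flags "c" and "d".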


Check your data types to make sure that Channel.ID is the same type in both datasets.
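
A quick way to do that check might look like the sketch below. The data is hypothetical (a factor column in one frame, a character column in the other, chosen only to illustrate a mismatch); converting both sides to character before filtering is one possible way to rule the type out as a cause, not necessarily the fix for the original problem.

```r
library(dplyr)

# Hypothetical data: IDs stored as a factor in one frame, character in the other
Datpar  <- data.frame(Channel.ID = factor(c("a", "b", "c")))
helper1 <- data.frame(Channel.ID = c("b", "c"), stringsAsFactors = FALSE)

# Inspect the column types in both data frames
class(Datpar$Channel.ID)
class(helper1$Channel.ID)

# Bring both columns to the same type before filtering
Datpar  <- Datpar  %>% mutate(Channel.ID = as.character(Channel.ID))
helper1 <- helper1 %>% mutate(Channel.ID = as.character(Channel.ID))

top100frame <- Datpar %>% filter(Channel.ID %in% helper1$Channel.ID)
top100frame
```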
