I'm currently trying to filter out all rows after some condition, including the rows which meet that condition. With dplyr::between(), it seems like I can do that, except the result still contains the row that meets the condition.
Here's an example:
library(dplyr)
mtcars %>%
  as_tibble() %>%
  filter(between(row_number(), 1, which(mpg == 17.8)))
#> # A tibble: 11 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> 11 17.8 6 168. 123 3.92 3.44 18.9 1 0 4 4
So, this contains the row in which mpg = 17.8. Any idea how I can get it so it doesn't include that row? Thanks!
At the moment you have to use inequalities. Note that the strict lower bound 1 < row_number() used here also drops the first row:
library(dplyr)
mtcars %>%
  as_tibble() %>%
  filter(1 < row_number(), row_number() < which(mpg == 17.8))
#> Warning in filter_impl(.data, quo): hybrid evaluation forced for
#> `row_number`. Please use dplyr::row_number() or library(dplyr) to remove
#> this warning.
#> Warning in filter_impl(.data, quo): hybrid evaluation forced for
#> `row_number`. Please use dplyr::row_number() or library(dplyr) to remove
#> this warning.
#> # A tibble: 9 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 2 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 3 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 4 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 5 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 6 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 7 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 8 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 9 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
(Side note: I have no idea why the warnings are popping up; this usage seems unambiguous. Using dplyr::row_number() does make them go away. Reported.)
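If the first row should stay in the result, one option is to drop the lower bound altogether and keep only the strict upper bound. This is a minimal sketch, assuming exactly one row has mpg == 17.8 (true for mtcars); the name `kept` is only for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Keep every row strictly before the row where mpg == 17.8.
# Assumes which() returns exactly one index, as it does for mtcars.
kept <- mtcars %>%
  as_tibble() %>%
  filter(dplyr::row_number() < which(mpg == 17.8))

nrow(kept)
#> [1] 10
```

Writing dplyr::row_number() explicitly follows the suggestion in the warning message above.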
I usually use inequalities anyway, since in English "between" is generally exclusive, whereas dplyr::between() is based on the SQL BETWEEN operator, which is inclusive.
Using inequalities removes any ambiguity.
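To make the difference concrete, here is a small sketch on a plain integer vector (the names `x`, `inclusive` and `exclusive` are made up for illustration):

```r
library(dplyr, warn.conflicts = FALSE)

x <- 1:5

# between() includes both endpoints: same as x >= 2 & x <= 4
inclusive <- between(x, 2, 4)
inclusive
#> [1] FALSE  TRUE  TRUE  TRUE FALSE

# Strict inequalities exclude both endpoints
exclusive <- x > 2 & x < 4
exclusive
#> [1] FALSE FALSE  TRUE FALSE FALSE
```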
cderv
August 12, 2018, 9:40pm
As you are working with row numbers, which are integers, x <= 1 is equivalent to x < 2.
So if you don't want the row that meets the condition, just use the previous row number (which(mpg == 17.8) - 1) as the upper bound.
library(dplyr, warn.conflicts = FALSE)
mtcars %>%
  as_tibble() %>%
  filter(between(row_number(), 1, which(mpg == 17.8) - 1))
#> # A tibble: 10 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
Created on 2018-08-12 by the reprex package (v0.2.0).
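One caveat: the approaches above rely on which(mpg == 17.8) returning exactly one index. If the condition could match several rows, or none, a possible alternative (a sketch, not something proposed earlier in this thread) is to keep rows while the running count of matches is still zero. The name `before_first` is hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)

# cumsum() over the logical condition stays at 0 until the first match,
# so this keeps only the rows before the first row where mpg == 17.8.
# If nothing matches, every row is kept.
before_first <- mtcars %>%
  as_tibble() %>%
  filter(cumsum(mpg == 17.8) == 0)

nrow(before_first)
#> [1] 10
```

This avoids row numbers entirely, at the cost of being a little less explicit about the intent.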