dplyr::slice extended to vectors

Hello everyone.

I often find myself with an index vector and a data object, and I need to:

  • slice the data, if it's a data.frame
  • index the data, if it's a vector or list

I was wondering: is there already a function that does this in base R or the tidyverse? It bothers me to reimplement it in each of my projects, and I'm quite sure I'm not the only one who needs it.

If no such function is available, I think it would be nice to extend dplyr::slice to handle vectors and lists (by indexing them). What do you think?
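For concreteness, the behaviour described above could be sketched in base R roughly as follows. The name `slice_any` is hypothetical (it is not an existing base R or dplyr function), and the sketch assumes that `idx` is a positional index and that anything other than a data.frame should be indexed with a single bracket.

```r
# Hypothetical helper: `slice_any` is an illustrative name, not an existing
# function in base R or dplyr.
slice_any <- function(x, idx) {
  if (is.data.frame(x)) {
    # Data frames (including tibbles): subset rows.
    # drop = FALSE keeps the result a data frame even with one column.
    x[idx, , drop = FALSE]
  } else {
    # Atomic vectors and lists: plain element indexing.
    x[idx]
  }
}

idx <- c(1, 3)
slice_any(letters, idx)             # elements 1 and 3 of a character vector
slice_any(list(1, "a", TRUE), idx)  # a list holding elements 1 and 3
slice_any(mtcars, idx)              # rows 1 and 3 of a data frame
```

Matrices and arrays are not handled specially here: they fall into the second branch and are indexed element-wise, which may or may not be what a given project needs.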

Thanks

slice is nice, but [ , ] is easier to use.

For example, using mtcars

# we have already created an index
idx <- c(3, 4, 6, 18, 20, 21, 26)
mtcars[idx,]
#>                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
# no index at hand: build one from a condition
mtcars[which(mtcars$cyl == 6 & mtcars$carb == 1),]
#>                 mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant        18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Created on 2023-02-17 with reprex v2.0.2

{data.table} has a similar parsimonious syntax.

dplyr::slice arguably has a place because it can improve readability/transparency over the square-bracket syntax for accessing data.frames. For example, in mydat[x,y], is x or y the row or the column? slice(mydat,5) is in that sense 'clearer'.
But the square-bracket syntax for accessing a vector is trivially myvec[x]; would slice(myvec,x) therefore be an improvement? It's arguably a step back for readability: people used to seeing data.frames get sliced by slice might, on seeing an object being sliced, wrongly expect it to be a data.frame when it's actually a vector.

Just my thoughts.
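To make the row/column question raised above concrete, here is a small base R sketch. The objects `df` and `v` are made up for illustration; the point is only how the position of the index changes what the brackets return.

```r
# Toy objects, invented for this illustration
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
v  <- c(10, 20, 30)

df[2, ]  # index before the comma: row 2, still a data frame
df[, 2]  # index after the comma: column b, no longer a data frame
df[2]    # single index, no comma: column b, kept as a data frame
v[2]     # a vector has one dimension, so a single index is unambiguous
```

Note that a single index on a data frame selects columns rather than rows, so `x[idx]` alone does not behave the same way for data frames and vectors.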

The order is [row,col], and slice is preferable only within a tidyverse-based workflow. A user confused by the difference between a data frame and a vector is likely to be more confused by slice, whose documentation describes its first argument as follows:

## Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See *Methods*, below, for more details.
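As a rough illustration of what that argument description implies, the sketch below passes a plain vector to dplyr::slice and catches whatever condition is raised. It assumes dplyr may or may not be installed, so the call is guarded; the variable names `vec` and `res` are made up here, and the exact error text depends on the dplyr version, so it is not reproduced.

```r
vec <- c(10, 20, 30)

# Only attempt the call if dplyr is available (assumption: it may not be).
if (requireNamespace("dplyr", quietly = TRUE)) {
  res <- tryCatch(
    dplyr::slice(vec, 2),
    # A plain vector is not one of the documented .data types, so an
    # error is expected here; keep its message instead of stopping.
    error = function(e) conditionMessage(e)
  )
  print(res)
}

# The base R equivalent for a vector is simply:
vec[2]
```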
