Using NSE to filter a list of lists by value

I do a lot of batch/parametric data analysis and have found that lists of lists that define/configure the processing tasks and hold the results next to the config that produced them are maximally flexible and convenient. This leads naturally to the condition where I want to filter my config/results using arbitrary boolean criteria applied to their values. I'm looking for or planning to implement something that operates in a manner similar to dplyr::filter or purrr::keep, but evaluates NSEs against the list elements and returns a subset of the original list containing the matches (i.e. rather than operating on rows in the context of tbls).

Looking at quo(), UQ(), !!, etc. I think I see a path to the implementation, but there is so much in purrr and elsewhere that is similar that I half expect to stumble upon a pre-existing implementation or a discussion involving similar capabilities. After searching I haven't quite found it. So the question is how to implement the following hypothetical list_filter function:

config = list( list(a=1), list(a=3, b=1), list(b=2))

config %>% list_filter( a == 1 )         # returns config[1]
config %>% list_filter( ! is.null(b) )   # returns config[2:3]
config %>% list_filter( ! is.null(a) )   # returns config[1:2]
config %>% list_filter( a > 1 & b <= 2 ) # returns config[2]

Thanks!


(Second try to post because the first resulted in an internal server error)

Here is an incomplete and brittle way of doing this. The basic idea is to convert your predicate into a quosure, then pull that quosure apart and use its pieces to build an expression that can be evaluated.

The problem is that a predicate like a == 1 needs to be converted into something like config[[1]][["a"]] == 1 for each list in your list of lists.
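As an aside, a possible alternative to rewriting the expression: `rlang::eval_tidy()` accepts a `data` argument, so the names in a quoted predicate can be looked up in a single list element directly. A minimal sketch, assuming a current rlang; `pred`, `r1`, `r2` and `r3` are names made up for illustration:

```r
library(rlang)

config <- list(list(a = 1), list(a = 3, b = 1), list(b = 2))

# A quoted predicate; `pred` is an illustrative name
pred <- quote(a == 1)

# Names in the predicate are resolved in the list element first,
# so this is equivalent to config[[1]][["a"]] == 1
r1 <- eval_tidy(pred, data = config[[1]])
r2 <- eval_tidy(pred, data = config[[2]])

# The third element has no `a`. Assuming no variable called `a`
# exists in the calling environment, the lookup raises an error,
# which is caught here and treated as "no match"
r3 <- tryCatch(
    eval_tidy(pred, data = config[[3]]),
    error = function(e) FALSE
)

print(c(r1, r2, r3))
#> [1]  TRUE FALSE FALSE
```

Note the caveat in the last step: if a variable named `a` does exist in the calling environment, it would be picked up silently instead of raising an error, so missing names need explicit handling in anything more serious.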

This example is brittle because it assumes that the predicate looks something like a == 1 and not something more complicated. Any production solution would need to verify that the arguments passed in are predicates and handle more complicated predicates.
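One way to get past single-operator predicates might be to skip the text manipulation and evaluate the whole quosure against each element. Below is a rough sketch of the `list_filter()` from the question, under assumptions of my own: names missing from an element are bound to `NULL` (so that `is.null(b)` works), and any result other than a single `TRUE` counts as no match. This is not an existing purrr or dplyr function.

```r
library(rlang)

# Hypothetical helper sketching the list_filter() asked about
list_filter <- function(l, pred) {
    # capture the predicate without evaluating it
    q <- enquo(pred)
    # every name used by any element of the list
    all_names <- unique(unlist(lapply(l, names)))

    keep <- vapply(l, function(el) {
        # bind names this element lacks to NULL (assumption, see above)
        absent <- setdiff(all_names, names(el))
        padded <- c(el, setNames(vector("list", length(absent)), absent))
        # evaluate the predicate with the padded element as data mask;
        # zero-length or NA results are treated as FALSE
        isTRUE(eval_tidy(q, data = padded))
    }, logical(1))

    # return the original (unpadded) elements that matched
    l[keep]
}

config <- list(list(a = 1), list(a = 3, b = 1), list(b = 2))

list_filter(config, a == 1)          # config[1]
list_filter(config, !is.null(b))     # config[2:3]
list_filter(config, !is.null(a))     # config[1:2]
list_filter(config, a > 1 & b <= 2)  # config[2]
```

Padding with `NULL` means a comparison such as `a > 1` on an element without `a` yields `logical(0)`, which `isTRUE()` turns into `FALSE`, so no special casing per operator is needed. Since the list is the first argument, the function can also be used at the end of a `%>%` pipe.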

There is not much documentation on how to do this and the tidyverse is in flux so this might not work in the future. I put together a tutorial on how to do things like this with the caveat that the tidyverse is in flux.

suppressPackageStartupMessages(library(tidyverse))
config = list(list(a = 1), list(a = 3, b = 1), list(b = 2))

list_config <- function(l, pred) {
    # make a quosure out of the predicate
    q <- rlang::enquo(pred)
    # pull the left-hand side, right-hand side, and operator
    # out of the quosure's expression
    e <- rlang::quo_get_expr(q)
    ql <- rlang::expr_text(e[[2]])
    qr <- rlang::expr_text(e[[3]])
    qop <- rlang::expr_text(e[[1]])
    
    # make logic selection vector
    s <- purrr::map_lgl(seq_along(l),  function(i) {
        c <- l[[i]]
        if (is.null(c[[ql]])) {
            # false if element is missing
            FALSE
        } else {
            # build a text expression from the parts of the predicate
            exp <- glue::glue("c[['{ql}']] {qop} {qr}")
            # parse the text and evaluate it in this function's
            # environment, where `c` is defined
            rlang::eval_tidy(rlang::parse_expr(exp))
        }
    })
    # select the elements that meet the predicate
    l[s]
}

list_config(config, a > 0)
#> [[1]]
#> [[1]]$a
#> [1] 1
#> 
#> 
#> [[2]]
#> [[2]]$a
#> [1] 3
#> 
#> [[2]]$b
#> [1] 1
list_config(config, a == 1)
#> [[1]]
#> [[1]]$a
#> [1] 1