Hi there,
This question may fall into the "why would I do this?" category, but I was hoping to at least get some feedback on why the following is a bad idea. Or maybe it isn't.
I am building a function that always draws upon the same SQLite database. The data source is always the same. What varies is a vector supplied to that function, which creates the desired subset of data. The advantage of piping this vector to the function is that it wraps everything up nicely in a pipe and also takes advantage of dbplyr's laziness when it queries the data. So my question is about piping that vector to another function, rather than piping data frames, and what types of problems that might create. Take this simple example below (using the nycflights13 data):
## Simple function
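func1 <- function(carrier_code, data = nycflights13::flights) {
  # %in% rather than ==, so a vector of several carrier codes is matched
  # element-wise instead of being silently recycled
  flights_sub <- dplyr::filter(data, carrier %in% carrier_code)
  # one group per carrier
  flights_sub <- dplyr::group_by(flights_sub, carrier)
  # average departure delay per carrier, ignoring missing values
  dplyr::summarise(flights_sub, avg_dep_delay = mean(dep_delay, na.rm = TRUE))
}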
Now imagine if the first two lines were much more involved, using some sf joins, all to arrive at a vector of carriers. Then I pull out the vector of interest, which is then piped to my trivial function func1:
library(dplyr)
library(nycflights13)

airlines %>%
  filter(carrier %in% c("AA", "AS")) %>%
  pull(carrier) %>% # left with a character vector, which is piped to func1
  func1()
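To make the database side concrete, here is a rough sketch of the kind of function I have in mind. It is only an illustration: the in-memory SQLite database, the table name "flights", and the function name func_db are stand-ins I made up for this post, not my real setup. It assumes the DBI, RSQLite, dbplyr, dplyr and nycflights13 packages are installed.

```r
library(dplyr)

# Assumption: an in-memory SQLite database stands in for the real one
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "flights", as.data.frame(nycflights13::flights))

# Hypothetical function: the vector of carrier codes is the first argument,
# so it can be piped in; the data source is always the same table
func_db <- function(carrier_codes, con) {
  dplyr::tbl(con, "flights") %>%
    # !! inlines the local vector into the query that dbplyr generates
    dplyr::filter(carrier %in% !!carrier_codes) %>%
    dplyr::group_by(carrier) %>%
    dplyr::summarise(avg_dep_delay = mean(dep_delay, na.rm = TRUE))
}

# Nothing is pulled into R yet: res is a lazy query
res <- c("AA", "AS") %>% func_db(con = con)
dplyr::show_query(res)

# collect() runs the query and returns a local tibble
out <- dplyr::collect(res)
DBI::dbDisconnect(con)
```

The vector only ends up as the values in the SQL filter, so the subsetting happens in the database and only the summarised rows come back to R.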
Something about this just feels wrong, or against the spirit of tidy tools, but I wanted to check here. Is there a case for or against piping vectors instead of data frames in the context of creating tidy tools?
Any input is much appreciated.
Sam