Hi, first question here so please let me know if I am missing anything.
I want to create a function that passes parameters to the t_test function from infer (part of tidymodels). I believe quosures need to be used, but I'm relatively new to using them. In general, I haven't had an issue with other tidyverse functions, but I get an error when I try to pass !!var into t_test.
Am I misunderstanding how quosures should be used? I see there is a newer convention of using double curly braces ({{ }}), but it yields the same error.
Hi David, thanks for the detailed walkthrough! Here is another attempt using the default gss dataset as an example. Essentially I would like to create a function that wraps around infer::t_test where I can bind multiple outputs row-wise, but I'm starting with the simplest case of a single call to infer::t_test.
I have also tried mduvekot's suggestion to no avail -- I'm happy to post another reprex if helpful but below is the reprex using my original example:
library(infer)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
infer::t_test(gss, hours ~ college)
#> Warning: The statistic is based on a difference or ratio; by default, for
#> difference-based statistics, the explanatory variable is subtracted in the
#> order "no degree" - "degree", or divided in the order "no degree" / "degree"
#> for ratio-based statistics. To specify this order yourself, supply `order =
#> c("no degree", "degree")`.
#> # A tibble: 1 × 7
#> statistic t_df p_value alternative estimate lower_ci upper_ci
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 -1.12 366. 0.264 two.sided -1.54 -4.24 1.16
perform_t_test <- function(group_col, response_col) {
  infer::t_test(gss, !!response_col ~ !!group_col)
}
perform_t_test(college, hours)
#> Error in `infer::t_test()`:
#> ! The response variable `! and !response_col` cannot be found in this
#> dataframe.
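t_test takes an ordinary R formula here, and `!!` is not processed inside a plain formula, so the operators end up in the formula verbatim (which is why the error message mentions `!response_col`). One approach would be to convert each argument to a character value and then construct the formula to feed to t_test: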
library(infer)
library(dplyr)
library(tidyr)
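perform_t_test <- function(group_col, response_col) {
  # Capture the bare column names and convert them to strings
  group_col <- as_label(enquo(group_col))
  response_col <- as_label(enquo(response_col))
  # Build the formula, e.g. hours ~ college, from the two strings
  form <- as.formula(paste(response_col, "~", group_col))
  infer::t_test(gss, formula = form)
}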
perform_t_test(college, hours)
#> Warning: The statistic is based on a difference or ratio; by default, for
#> difference-based statistics, the explanatory variable is subtracted in the
#> order "no degree" - "degree", or divided in the order "no degree" / "degree"
#> for ratio-based statistics. To specify this order yourself, supply `order =
#> c("no degree", "degree")`.
#> # A tibble: 1 × 7
#> statistic t_df p_value alternative estimate lower_ci upper_ci
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 -1.12 366. 0.264 two.sided -1.54 -4.24 1.16
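A variation on the same idea, in case it is useful: the sketch below captures each column with `rlang::ensym()` (which only accepts a bare name or a string) and builds the formula with base R's `stats::reformulate()` instead of `paste()`. The extra `data` argument and the `...` pass-through are my own additions for illustration, not something from the original question, so adjust them to taste.

```r
library(infer)
library(dplyr)

perform_t_test <- function(data, group_col, response_col, ...) {
  # ensym() captures a single bare column name; as_string() converts it to text
  group <- rlang::as_string(rlang::ensym(group_col))
  response <- rlang::as_string(rlang::ensym(response_col))
  # reformulate() builds `response ~ group` from the two strings
  form <- stats::reformulate(group, response = response)
  # `...` forwards extra arguments such as `order` to infer::t_test()
  infer::t_test(data, formula = form, ...)
}

# Single call; supplying `order` avoids the warning about the default order
res_hours <- perform_t_test(gss, college, hours, order = c("no degree", "degree"))

# Binding several results row-wise (`age` is another numeric column in gss)
res_age <- perform_t_test(gss, college, age, order = c("no degree", "degree"))
results <- bind_rows(hours = res_hours, age = res_age, .id = "response")
```

Because each call returns a one-row tibble, `bind_rows()` with `.id` should give one row per response variable, labelled by the names supplied.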