The original column is numeric. This doesn't seem to convert it to a factor; instead it makes every value equal to the column name. Why does the LHS recognise it correctly, but not the RHS?
df |>
  mutate( {{x}} := as.factor( {{x}} ))
I think we might need a bit more info than what you've given - is this in a function? As this works:
library(tidyverse)
myfunc = function(df, x){
  df |>
    mutate( {{x}} := as.factor( {{x}} ))
}
myfunc(tibble(mtcars), carb)
# A tibble: 32 x 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear carb
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4 4
 2  21       6  160    110  3.9   2.88  17.0     0     1     4 4
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4 1
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3 1
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3 2
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3 1
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3 4
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4 2
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4 2
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4 4
# ... with 22 more rows
Hmm I see that.
library(tidyverse)
ddff = 'tibble(mtcars)'
xx = 'carb'
myfunc = function(df, x){
  {{ df }} |>
    mutate( {{x}} := as.factor( {{x}} ))
}
myfunc(ddff, xx)
Something like this is what I would like.
Ah I see! This is challenging, but there's a pretty good article here:
This will run.
library(tidyverse)
ddff = 'tibble(mtcars)'
xx = 'carb'
myfunc = function(df, x){
  x = sym(x)
  eval(parse(text=df)) |>
    mutate({{x}} := as.factor({{x}}))
}
myfunc(ddff, xx)
The eval(parse(text=df)) line evaluates whatever string you provide as "df" as R code, so 'tibble(mtcars)' becomes the actual tibble.
The x = sym(x) line turns the "x" argument, again provided as a string, into a symbol, which the curly-curly {{x}} inside mutate() can then inject as a column name.
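A possible alternative to the sym() approach above, sketched under the assumption that the column name arrives as a string and the data frame is passed as an object rather than as a string of code. The .data pronoun looks the column up by name, and a glue-style name on the left of := sets the output column, so no conversion to a symbol is needed. The helper name to_factor is hypothetical.

```r
library(dplyr)

# Hypothetical helper: `df` is a data frame, `col` is a column name given as a string.
to_factor <- function(df, col) {
  df |>
    # "{col}" interpolates the string into the new column's name;
    # .data[[col]] fetches the existing column by that name.
    mutate("{col}" := as.factor(.data[[col]]))
}

out <- to_factor(tibble(mtcars), "carb")
class(out$carb)
#> [1] "factor"
```

Passing the data frame itself also sidesteps eval(parse()), which will run any code contained in the string it is given.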
I have additional details. It seems that similar methods won't work for recipes. I tried messing around with sym(x), x, {{ x }}, and !!sym(x). Neither the link you provided nor a bit of searching online turned up explicit information on using recipes inside functions.
df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
                xx = as.factor(c('1', '0', '1')))
myfunc = function(dataa, x){
  recc = recipe(as.formula(paste0(x, ' ~ text')), data = dataa) |>
    step_tokenize(!!sym(x)) |> # tried some variations
    prep() |>
    bake(NULL) |>
    View()
}
ddff = 'df'
label = 'xx'
myfunc(ddff, label)
#Can't convert <textrecipes_tokenlist> to <character>.
#Run `rlang::last_error()` to see where the error occurred.
I think your issue is that you haven't evaluated your data frame. This runs:
library(tidymodels)
library(textrecipes)
df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
                xx = as.factor(c('1', '0', '1')))
myfunc = function(dataa, x){
  recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
    step_tokenize(!!sym(x)) |>
    prep() |>
    bake(NULL)
  return(recc)
}
ddff = 'df'
label = 'xx'
myfunc(ddff, label)
Though this comes with the caveat that I don't really know what step_tokenize is meant to do! Should it be label = "text" rather than "xx"?
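If the data frame itself can be passed instead of its name, eval(parse()) is not needed at all. A minimal sketch, assuming separate (hypothetical) arguments for the outcome column and the text column: reformulate() from base R builds the formula from strings, and the !!sym() pattern for step_tokenize() is the one already used in this thread. The helper name make_rec is made up for illustration.

```r
library(tidymodels)
library(textrecipes)

df <- data.frame(
  text = c("blah blah blah", "hello, hi, welcome", "what, why"),
  xx   = as.factor(c("1", "0", "1"))
)

# Hypothetical helper: `data` is a data frame; `outcome` and `text_col`
# are column names given as strings.
make_rec <- function(data, outcome, text_col) {
  # reformulate("text", response = "xx") builds the formula xx ~ text
  recipe(reformulate(text_col, response = outcome), data = data) |>
    step_tokenize(!!rlang::sym(text_col))
}

baked <- make_rec(df, "xx", "text") |> prep() |> bake(NULL)
baked  # print to the console to inspect the result
```

Keeping the outcome and the text column as two arguments avoids tokenizing the outcome by accident, which is what happens when the same string is used for both.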
Oh oops, I think the evaluation of prep() might have been skipped there. This might be a better reprex.
library(tidymodels)
library(textrecipes)
rm(list = ls())
df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
                xx = as.factor(c('1', '0', '1')))
myfunc = function(dataa, x){
  recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
    step_tokenize(!!sym(x))
  return(recc)
}
ddff = 'df'
label = 'xx'
myfunc(ddff, label) |> prep() |> bake(NULL) |> View()
# Error: Can't convert <textrecipes_tokenlist> to <character>.
I think !!sym(x) is not being converted properly.
I think the issue might literally just be the use of View(). When you stop the pipe at bake(), it looks to me like the code runs okay; the RStudio viewing pane just can't seem to display a "tknlist".
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
library(textrecipes)
rm(list = ls())
df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
                xx = as.factor(c('1', '0', '1')))
myfunc = function(dataa, x){
  recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
    step_tokenize(!!sym(x))
  return(recc)
}
ddff = 'df'
label = 'text'
myfunc(ddff, label) |> prep() |> bake(NULL)
#> # A tibble: 3 x 1
#> text
#> <tknlist>
#> 1 [3 tokens]
#> 2 [3 tokens]
#> 3 [2 tokens]
Created on 2021-12-02 by the reprex package (v2.0.1)
Just some friendly advice: rm(list = ls()) at the top of a script is a habit best avoided. The consequences might be that Jenny Bryan will set your computer on fire!
Project-oriented workflow (tidyverse.org)