How can I set limits for custom qualitative parameters in custom parsnip models?

lbuifire · February 5, 2023, 7:20am

Does anyone remember this series of threads from 2020?

Why use stepwise when there are so many better alternatives? Because your boss tells you to.

I decided to do God's work: to make my own custom model for at least one type of stepwise regression, if only to convince people that there are better tidymodels functions out there. This way nobody has an excuse not to use tidymodels.

I am following the directions from tidymodels - How to build a parsnip model to make my own package. What I am stuck on is how to specify the allowed values for qualitative parameters.

I would like to make a dials-like constructor for forward/backward selection. However, that is a text parameter, so I am not sure how to set up the constructor arguments to limit new values.

A function reference for a constructor that will be used to generate tuning parameter values. This should be a character vector with a named element called fun that is the constructor function. There is an optional element pkg that can be used to call the function using its namespace. If referencing functions from the dials package, quantitative parameters can have additional arguments in the list for trans and range while qualitative parameters can pass values via this list.

# Part Zero: define some necessary pre-model dependencies
step_direction <- dials::new_qual_param(
  type = "character",
  values = c("both", "backward", "forward"),
  label = c(step_direction = "direction"),
  finalize = NULL
)

# set the likelihood function for AIC / BIC.  I prefer 2 for AIC to keep it simple
# 8 was picked because log2(256) is 8 and my data is near n = 256
step_likely <- dials::new_qual_param(
  type = "double",
  range = c(2,8),
  trans = NULL,
  inclusive = c(TRUE, TRUE),
  label = c(step_likely = "k"),
  finalize = NULL
)

Edit: Updated the question for clarification

hannah · February 6, 2023, 2:49pm

It looks to me like you have figured out what you wanted to do with dials and how to make a qualitative parameter object. However, your step_likely() should be using dials::new_quant_param(), since you're trying to make a quantitative parameter object in that code snippet.
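A minimal sketch of that change, assuming both parameters are wrapped in constructor functions in the style of dials. The wrapper form and the default arguments are illustrative assumptions, not code from this thread:

```r
library(dials)

# Qualitative parameter: the allowed values are fixed by `values`.
step_direction <- function(values = c("both", "backward", "forward")) {
  new_qual_param(
    type = "character",
    values = values,
    label = c(step_direction = "direction"),
    finalize = NULL
  )
}

# Quantitative parameter: the limits are set by `range` and `inclusive`.
step_likely <- function(range = c(2, 8), trans = NULL) {
  new_quant_param(
    type = "double",
    range = range,
    inclusive = c(TRUE, TRUE),
    trans = trans,
    label = c(step_likely = "k"),
    finalize = NULL
  )
}

# Build the two parameter objects.
direction_param <- step_direction()
likely_param <- step_likely()
```

Because both are now functions that return parameter objects, they can be passed to helpers such as `dials::grid_regular(step_direction(), step_likely())` to generate candidate values.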

This issue thread has more information on how to link up parsnip and dials, in case that's handy in your endeavor: Creating Parsnip Model Functions · Issue #832 · tidymodels/parsnip · GitHub

Max · February 6, 2023, 3:51pm

Sorry :-/

I do believe that not including this in our tools is the best idea, but I see that it is frustrating if you need it.

You can add questions here as you need them and we will help facilitate.

lbuifire · February 24, 2023, 5:45am

Thank you for the honest replies, @Max and @hannah. I was able to figure out the parameter part thanks to your tips, and I got the function to work with tuning. I hope to use this experience for other projects (see **). However, I am a bit stuck on what feels like a silly bug that appears when I re-run the code.

The first time I run the code after starting up the console, it works. Every subsequent run in the same session fails. Reading into this, I found this passage in the tutorial:

ADD PARSNIP MODELS TO ANOTHER PACKAGE
The process here is almost the same. All of the previous functions are still required but their execution is a little different.
....
Warning: To use a new model and/or engine in the broader tidymodels infrastructure, we recommend your model definition declarations (e.g. set_new_model() and similar) reside in a package. If these definitions are in a script only, the new model may not work with the tune package, for example for parallel processing.

Reading through all of that, the last line made me wonder if a package would fix my load bug. I could probably get a friend to initialize a package on GitHub, but I think I should first know the minimum that is required for a "tidymodels R package".
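One possible cause of a "works once, then fails" pattern, offered as an assumption rather than a diagnosis of the code in this thread: `parsnip::set_new_model()` raises an error when the model name is already registered in the current session, so a script that registers a model fails when it is run a second time. A hedged sketch of a guard, where the model name `stepwise_reg` and the helper `register_stepwise_reg()` are hypothetical:

```r
library(parsnip)

# Hypothetical helper: register the model only if this session has not
# already done so, so the script can be sourced repeatedly.
register_stepwise_reg <- function() {
  if ("stepwise_reg" %in% get_from_env("models")) {
    return(invisible(FALSE))
  }
  set_new_model("stepwise_reg")
  set_model_mode(model = "stepwise_reg", mode = "regression")
  # The set_model_engine(), set_model_arg(), set_fit(), and set_pred()
  # calls would follow here, as in "How to build a parsnip model".
  invisible(TRUE)
}

register_stepwise_reg()  # registers the model on the first call
register_stepwise_reg()  # does nothing on later calls
```

In a package, the same tutorial suggests placing the declarations in a wrapper function like this one and calling it from `.onLoad()`, so that registration happens once when the package is loaded.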

How would you recommend writing a package that uses tidymodels?
What else would you recommend?

(**) P.S. Right now I am taking a Data Mining course and a Simulation course that both use the old version of caret that Max wrote. The professor required stepwise as part of the cross-over between the two. So the original purpose of this function was to be the model equivalent of the Student's t-test: it is great for learning, but in reality almost nothing actually follows a normal distribution (except errors), so you have to use a nonparametric KS test or Mann-Whitney test instead. This stepwise regression using tidymodels would be a "Student's t-model".
After I tackle this simple simulation, I would like to try more complex ones, such as predicting distributions by having fitdistrplus run in tidymodels.

lbuifire · March 12, 2023, 7:13pm

Thank you for your replies. Since I have not heard back I will make a new thread.
