Pardon what is probably a novice question in a few ways, but I'm interested in writing functions that take and return tibbles and whose output (possibly) also carries its own class.
My use case is a package for a type of clustering that is called Latent Profile Analysis in my field. I noticed that many beginners to this analysis found the greatest challenge to be figuring out what form the output took. My (proposed) solution was to have the main function in the package take a tibble (or a data.frame) and output a modified tibble - namely, one with the classification, or the profile to which the observation is assigned, in a new column.
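To make the "tibble in, tibble out" idea concrete, here is a minimal sketch. The function name `estimate_profiles_sketch()` and the `n_profiles` argument are hypothetical, and `stats::kmeans()` is used purely as a stand-in for the actual profile-estimation step; it is not how the package fits its models.

```r
library(tibble)

# Hypothetical main function: takes a data.frame or tibble and returns a
# tibble with one extra column holding the assigned profile.
# stats::kmeans() is only a stand-in for the real estimation step.
estimate_profiles_sketch <- function(df, n_profiles = 2) {
  stopifnot(is.data.frame(df))
  # Cluster on the numeric columns only
  num <- df[vapply(df, is.numeric, logical(1))]
  fit <- stats::kmeans(num, centers = n_profiles)
  out <- as_tibble(df)
  out$profile <- as.integer(fit$cluster)
  out
}

set.seed(1)
res <- estimate_profiles_sketch(iris, n_profiles = 3)
res
```

The appeal of this shape is that the result can go straight into the usual data-manipulation verbs without any extraction step.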
If this is a good idea, I'm also curious how to additionally have a class system (so that generic functions like plot() would work on the output). It's not clear to me whether the class system is a good idea - would intermediate steps (e.g., use of filter() on the output) strip the new class at some point? Would it be preferable to have a function like plot_profiles() that simply works on the modified tibble?
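One way to explore the class question is to put a subclass on top of the tibble and test directly whether it survives the intermediate steps. The sketch below is only an illustration under assumptions: the class name `lpa_profiles` and the constructor `new_lpa_profiles()` are hypothetical, and whether subsetting or filter() keeps the class can depend on the installed versions of tibble and dplyr, so the result is printed rather than assumed.

```r
library(tibble)

# Hypothetical constructor: puts the class "lpa_profiles" in front of the
# usual tibble classes.
new_lpa_profiles <- function(x) {
  stopifnot(is.data.frame(x), "profile" %in% names(x))
  new_tibble(x, nrow = nrow(x), class = "lpa_profiles")
}

# S3 method, so the generic plot() dispatches on the new class.
plot.lpa_profiles <- function(x, ...) {
  counts <- table(x$profile)
  graphics::barplot(counts, xlab = "Profile", ylab = "Observations", ...)
  invisible(x)
}

d <- new_lpa_profiles(
  tibble(score = c(1.2, 3.4, 2.2, 5.1), profile = c(1L, 2L, 1L, 2L))
)
class(d)

# Check whether the class survives row subsetting in your setup
sub <- d[d$score > 2, ]
inherits(sub, "lpa_profiles")

# Same check for dplyr::filter(), if dplyr is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(inherits(dplyr::filter(d, score > 2), "lpa_profiles"))
}
```

A function like plot_profiles() sidesteps the question, since it only needs the `profile` column to be present and does not care about the class.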
Relatedly, I'm also considering an option that defaults to outputting a modified tibble, with the alternative being to return an object with its own class - containing the output of the function and other data, like the fitted model object - on which generic functions would work (unlike for the tibble output). Does this seem like a good idea?
So, in summary, there would be the (default) tibble output - which would be especially easy to use interactively - but also the option to output a model object of its own class, for which functions that extract information (or create other output, like plots) would be written; some of these would be generic functions and others would not. To make it concrete, the interface would be something like:
# by default and for interactive use
main_function(..., to_return = "tibble")
# with a class system, for more fine-grained output from the fitted model object
x <- main_function(..., to_return = "class_name")
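A rough sketch of how such a `to_return` switch could look is below. Everything beyond the argument name and its two values is an assumption made for illustration: `main_function_sketch()`, the `n_profiles` argument, the list layout of the classed object, and the use of `stats::kmeans()` as a stand-in for the fitted model.

```r
library(tibble)

# Hypothetical version of the interface above. stats::kmeans() stands in
# for the real model-fitting step.
main_function_sketch <- function(df, n_profiles = 2,
                                 to_return = c("tibble", "class_name")) {
  to_return <- match.arg(to_return)  # first value is the default
  num <- df[vapply(df, is.numeric, logical(1))]
  fit <- stats::kmeans(num, centers = n_profiles)
  out <- as_tibble(df)
  out$profile <- as.integer(fit$cluster)
  if (to_return == "tibble") {
    return(out)
  }
  # Classed object: keeps the tibble and the fitted model together
  structure(list(data = out, model = fit), class = "class_name")
}

# A generic that works on the classed object (but not on the plain tibble)
print.class_name <- function(x, ...) {
  cat("Profile solution with", length(unique(x$data$profile)), "profiles\n")
  invisible(x)
}

set.seed(1)
a <- main_function_sketch(iris, n_profiles = 3)
b <- main_function_sketch(iris, n_profiles = 3, to_return = "class_name")
print(b)
```

Because the classed object stores the tibble in `b$data`, the interactive workflow is still reachable from the richer return value.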
While this question is highly specific, I wonder if it could also be relevant more widely, as package developers (like those for corrr or skimr) take a "tidy" approach with the functions in their packages.
This is a bit of a brainstorming question and so I appreciate any insight that can be shared with this novice package developer. If interested, the package tidyLPA is only on GitHub here.