How should functions behave when they receive bad inputs?

mfrasco · June 26, 2025, 7:15pm

When building packages that are used by other people, is there a best practice in the R community for validating the inputs provided to a function?

For example, let's say I have a function that will only provide a meaningful result if the input is an integer vector. If the input contains non-integer numeric values, the results will not be accurate or useful.

Let's assume that the documentation clearly specifies that the input to the function should be an integer vector. I can think of three options for how to proceed.

1. My function should do nothing and assume that the inputs are provided correctly.
2. My function should validate the inputs by checking the type. If the expectations are not met, the function can raise an error.
3. My function should validate the inputs by checking the type. If the expectations are not met, the function can attempt to change the input (e.g. convert from numeric to integer).

I do not like option 3, since that type of input transformation should be done by the user. However, I can see arguments for both option 1 and option 2. Is there a preference in the R community?

Thanks for your help

startz · June 26, 2025, 8:44pm

Option 2, please, because I make dumb errors all the time and appreciate extra protection. The only exception I can see is when the function already absorbs a great deal of compute time and the error checking takes a substantial amount of time compared to the function's underlying purpose.

wasd · June 27, 2025, 12:10pm

The answer is always "it depends", but for most cases I appreciate option 2. Failing early is better than potentially failing silently (which can happen in options 1 or 3), and by controlling the point of failure you can make more informative suggestions to the user.
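A minimal sketch of what option 2 could look like in base R. The function name `count_events()` and the wording of the message are illustrative assumptions, not taken from any package; the point is only that the check runs before any real work and the error tells the user how to fix the call.

```r
# Hypothetical function: requires an integer vector and fails early otherwise.
count_events <- function(x) {
  if (!is.integer(x)) {
    stop(
      "`x` must be an integer vector, not an object of type '", typeof(x), "'. ",
      "Convert it explicitly (e.g. with as.integer()) if that is what you intend.",
      call. = FALSE
    )
  }
  # Placeholder for the real work.
  sum(x)
}

# Valid input passes straight through.
ok <- count_events(c(1L, 2L, 2L))

# Invalid input fails at the boundary; here the message is captured instead of aborting.
msg <- tryCatch(count_events(c(1, 2.5)), error = function(e) conditionMessage(e))
```

Using `call. = FALSE` keeps the internal call out of the error message, which usually reads better for package users.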

Try not to go too far in this either, though. It can suck away a lot of effort and it doesn't really ever end. The more you protect the user against their own mistakes, the more inventive your user will be at creating their mistakes. (Including the case where the user is just future you, after you've forgotten how the internals of your old code work.)

Please be extremely reserved about doing option 3, though. Swapping between integers and doubles can be a perfectly safe convenience feature (in what I imagine to be most contexts), but even then I wouldn't go so far as converting a non-integer-valued double (e.g. the value 1.001) to an integer.
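As a rough sketch of that restrained version of option 3, the hypothetical helper below (the name `as_integer_strict()` is an assumption for illustration) converts a double to an integer only when every value is already a whole number that fits in R's integer range, and raises an error otherwise.

```r
# Hypothetical helper: accept doubles only when the conversion loses nothing.
as_integer_strict <- function(x) {
  if (is.integer(x)) {
    return(x)
  }
  if (!is.double(x)) {
    stop("`x` must be numeric, not of type '", typeof(x), "'.", call. = FALSE)
  }
  # Flag infinite values, fractional values such as 1.001, and values outside
  # the integer range. Missing values are left alone and become NA_integer_.
  bad <- !is.na(x) &
    (is.infinite(x) | x != trunc(x) | abs(x) > .Machine$integer.max)
  if (any(bad)) {
    stop(
      "`x` contains values that cannot be represented exactly as integers, e.g. ",
      x[bad][1], ".",
      call. = FALSE
    )
  }
  as.integer(x)
}

whole <- as_integer_strict(c(1, 2, 3))   # whole-number doubles are converted
err <- tryCatch(as_integer_strict(1.001), error = function(e) e)  # rejected
```

The comparison is exact, with no tolerance, so a value like `1.0000001` is rejected rather than silently rounded.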