Argument naming conventions in tidyverse

Moody_Mudskipper · June 6, 2018, 1:16pm

Are the rules used for naming arguments laid out somewhere?

dplyr and purrr use dotted arguments (.arg); tidyr doesn't. dplyr names its first argument .data, tidyr uses data, and purrr uses .x, but not always: sometimes it even uses x.

I know that the tidyverse generally aims for consistency, so I'm sure a lot of thought went into this, but I don't get it.

Another way to phrase my question: if I want to build functions that integrate well with the tidyverse, how should I name my arguments?

martin.R · June 6, 2018, 1:23pm

Please see this thread:

Moody_Mudskipper · June 6, 2018, 2:48pm

Thanks Martin, it answers all my questions.

I'll summarize it as I understand it:

... combined with NSE (non-standard evaluation) can lead to argument name conflicts, so it's a good idea to use dotted arguments in that case.
In other cases, don't use them: they won't be there to conflict with the former, and it's less typing.
Not everything is consistent, because of legacy and priorities. dplyr is one of the older tidyverse packages, so it has the most inconsistencies.
The corollary is that, to be up to date with the current tidyverse, it's better to look at younger functions.
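
To make the first point concrete, here is a minimal sketch in base R. Both functions are hypothetical helpers written for this post, not taken from any tidyverse package, and they only illustrate the ... part of the problem (no NSE involved): an undotted argument captures a name that the caller meant to pass through ..., while the dotted one does not.

```r
# Hypothetical helpers, base R only; not part of any tidyverse package.

# Undotted argument `n`: a column the caller wants to name `n` is captured
# by the function's own argument instead of going through `...`.
add_cols <- function(data, ..., n = 1) {
  new_cols <- list(...)
  for (nm in names(new_cols)) data[[nm]] <- new_cols[[nm]]
  data[seq_len(n), , drop = FALSE]
}

# Dotted argument `.n`: it no longer collides with a column named `n`.
add_cols_dotted <- function(.data, ..., .n = 1) {
  new_cols <- list(...)
  for (nm in names(new_cols)) .data[[nm]] <- new_cols[[nm]]
  .data[seq_len(.n), , drop = FALSE]
}

df <- data.frame(a = 1:3)

res_plain  <- add_cols(df, n = 2)         # `n = 2` is read as "keep 2 rows"
res_dotted <- add_cols_dotted(df, n = 2)  # `n = 2` becomes a new column

names(res_plain)   # "a"
names(res_dotted)  # "a" "n"
```

With the undotted version there is no way to create a column called n at all, which is the kind of conflict the dot prefix is meant to avoid.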

I suppose the following conventions, though not exhaustive, would be a good start (for whether to add a dot or not, see the previous bullet points):

data.frame (in the wide sense) input: data (don't use tbl or df)
if it's an element of a list, vector, or data.frame, it should be named x (y, z if it takes a second or third element)
a list: l
predicate function: p (though some functions use predicate)
other function: f
list of functions: funs
an integer: n
an environment: env
an id column name (as string): id
for infix %operators%: x and y (%>% uses lhs and rhs, but newer ones don't)
type: type
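
As a rough sketch of how a few of these names might look in practice, here are some small base R functions. All of them (keep_if, apply_first, %or%, add_id) are hypothetical and exist only to show the naming; they are not tidyverse functions.

```r
# Hypothetical functions that only illustrate the naming list above.

# x: a list or vector, p: a predicate function
keep_if <- function(x, p) {
  x[vapply(x, p, logical(1))]
}

# x: a list or vector, f: any other function, n: an integer
apply_first <- function(x, f, n = 1L) {
  lapply(x[seq_len(n)], f)
}

# Infix operator taking x and y
`%or%` <- function(x, y) if (is.null(x)) y else x

# data: a data frame, id: name of the id column, given as a string
add_id <- function(data, id = "id") {
  data[[id]] <- seq_len(nrow(data))
  data
}

evens  <- keep_if(1:6, function(v) v %% 2 == 0)
firsts <- apply_first(list(1, 4, 9), sqrt, n = 2L)
value  <- NULL %or% "default"
tagged <- add_id(data.frame(a = c("u", "v")), id = "row")
```

None of these take ..., so following the summary above they stay undotted.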