I am helping a client improve a suite of R packages that they use internally. We have come across a question and are not sure if the answer is substantive or just one of style. I'm hoping that the community can help.
Background: The packages do something useful for the organization, but they were written without any regard to passing R CMD check. Most of my current effort goes into making the packages pass R CMD check.
As a toy example, the packages previously listed dplyr in Depends:, and had a lot of functions like this:
#' Happy Select
#'
#' Just like dplyr's select function. But also prints an inspiring message.
#'
#' @param df a data.frame
#' @param ... other parameters to pass to dplyr's select function
#' @export
happy_select = function(df, ...) {
  print("Today is a wonderful day, isn't it?")
  select(df, ...)
}
Note that there was no @importFrom dplyr select. The code works because dplyr is listed in Depends:. But it triggers two NOTEs:
checking dependencies in R code ... NOTE
Package in Depends field not imported from: ‘dplyr’
These packages need to be imported from (in the NAMESPACE file)
for when this namespace is loaded but not attached.
checking R code for possible problems ... NOTE
happy_select: no visible global function definition for ‘select’
Undefined global functions or variables:
select
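My understanding of the "loaded but not attached" wording in the first NOTE is roughly the following. This is only a sketch of the distinction, not something taken from the client's packages; it uses splines (which ships with R and is not attached by default) as a stand-in for dplyr, and the variable names are mine.

```r
# Loading a namespace makes the package available to code that imports it,
# but does not put it on the search path. This is what Imports: gives you.
loadNamespace("splines")
loaded_only <- "splines" %in% loadedNamespaces() &&
  !("package:splines" %in% search())

# Attaching puts the package on the search path, so bare names resolve
# for everyone. This is what Depends: (or library()) gives you.
library(splines)
attached <- "package:splines" %in% search()
```

In a fresh session I would expect both loaded_only and attached to be TRUE, which is why a bare select() only works by accident of dplyr being attached.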
I fixed NOTEs like these by moving dplyr from Depends: to Imports: and adding #' @importFrom dplyr select. Note that if you have a large number of packages in Depends: (as we did) you also get this NOTE:
Depends: includes the non-default packages:
...
Adding so many packages to the search path is excessive and importing
selectively is preferable.
The functions now look like this, and generate no complaints from R CMD check:
#' Happy Select
#'
#' Just like dplyr's select function. But also prints an inspiring message.
#'
#' @param df a data.frame
#' @param ... other parameters to pass to dplyr's select function
#' @importFrom dplyr select
#' @export
happy_select = function(df, ...) {
  print("Today is a wonderful day, isn't it?")
  select(df, ...)
}
My client has now asked me an interesting question that I am not sure how to answer. They have seen a lot of code that always names the package each function is called from. Using that convention, the last line of happy_select would be dplyr::select(df, ...) instead of just select(df, ...).
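For concreteness, here is a sketch of what I think the fully qualified variant would look like. It assumes dplyr stays in Imports:. As far as I can tell, the @importFrom tag is no longer needed in this version, because nothing from dplyr is referenced by bare name.

```r
#' Happy Select
#'
#' Just like dplyr's select function. But also prints an inspiring message.
#'
#' @param df a data.frame
#' @param ... other parameters to pass to dplyr's select function
#' @export
happy_select = function(df, ...) {
  print("Today is a wonderful day, isn't it?")
  # Fully qualified call: looked up in dplyr's namespace each time it runs,
  # so no @importFrom is required (dplyr must still be listed in Imports:).
  dplyr::select(df, ...)
}
```

Calling happy_select(mtcars, mpg, cyl) should behave the same as the @importFrom version, provided dplyr is installed.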
My personal opinion is that R CMD check seems not to care either way, so the code is likely "safe" as-is, and that dplyr::select seems more cautious and might be useful to future readers who don't know where select is coming from.
That is, I don't really have a strong opinion on this one way or the other. Is there an accepted convention for this in the community? And if so, is there anything substantive to back it up rather than just aesthetics?
Thanks.