Moving to dplyr threw up warnings I couldn't explain, until I tracked them down to my homemade "mode" function. MostCommon takes the most common (i.e. mode) value in its input and returns the maximum value in the event of a tie. I am very open to being told of a better alternative, or how to improve it.
The warning I got was "Returning more (or less) than 1 row per summarise() group was deprecated in dplyr 1.1.0", from a simple group_by/summarise, but only when using MostCommon. The same scenario with just min or max as the aggregate function does not trigger it.
The cause: when the data frame has no rows, MostCommon still gets called, but with a zero-length input, and it returns that zero-length vector. With dplyr 1.0.0, group_by/summarise didn't complain. With dplyr 1.1.0 it throws the warning.
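A quick base R check of why min and max behave differently here from a function built on unique(). This is only a sketch, and `empty` is an illustrative name that does not appear in the reprex:

```r
empty <- numeric(0)

# max() always collapses to a single value; on empty input it warns and returns -Inf.
length(suppressWarnings(max(empty)))  # 1

# unique() keeps whatever length it is given, so empty in means empty out.
length(unique(empty))                 # 0
```

A length-one result satisfies summarise(), whereas a length-zero result counts as "less than 1 row per group".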
My current approach (which came out of writing this up) is to change the return value to the below. Is there a better way of returning the right (non-)value? Reprex below.
if (length(ux) == 0) {
  NA
} else {
  ux # this returns the NA with the right class, i.e. that of x
}
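One possible way to get a length-one NA that keeps the class of the input is to index with `NA_integer_`, since a bare `NA` is always logical. This is a minimal base R sketch; `typed_na` is a hypothetical helper name, not something from dplyr or from the reprex:

```r
# typed_na() is a hypothetical helper: indexing with NA_integer_ yields
# exactly one missing value of the same type/class as x, even if x is empty.
typed_na <- function(x) x[NA_integer_]

typed_na(numeric(0))            # NA_real_
typed_na(character(0))          # NA_character_
class(typed_na(Sys.Date()[0]))  # "Date"
```

Whether this is preferable to returning a plain `NA` depends on how strictly the downstream code checks column types.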
library(tidyverse)
library(reprex)
MostCommon <- function(x) {
  ux <- unique(x)
  uxnotna <- ux[which(!is.na(ux))]
  if (length(uxnotna) > 0) {
    tab <- tabulate(match(x, uxnotna))
    candidates <- uxnotna[tab == max(tab)]
    if (is.logical(x)) {
      any(candidates) # return TRUE if any are TRUE; max() would return an integer
    } else {
      max(candidates) # return the highest (i.e. max) value
    }
  } else {
    ux # this returns the NA with the right class, i.e. that of x
  }
}
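For comparison, here is a sketch of the same logic with a guard for empty input placed before `unique()` is called. The name `MostCommonSafe` and the guard are assumptions for illustration, not a confirmed fix. It relies on `x[NA_integer_]` returning one NA of the same type as `x`:

```r
# Hypothetical variant of MostCommon; name and guard are illustrative assumptions.
MostCommonSafe <- function(x) {
  # summarise() expects exactly one value per group, even for empty input.
  if (length(x) == 0) {
    return(x[NA_integer_]) # length-one NA that keeps the type of x
  }
  ux <- unique(x)
  uxnotna <- ux[!is.na(ux)]
  if (length(uxnotna) == 0) {
    return(ux) # x was all NA, so unique(x) is a single NA of the right type
  }
  tab <- tabulate(match(x, uxnotna))
  candidates <- uxnotna[tab == max(tab)]
  if (is.logical(x)) any(candidates) else max(candidates)
}

MostCommonSafe(c(1, 2, 2, 3, 3)) # tie between 2 and 3, so the larger value wins
MostCommonSafe(numeric(0))       # a single NA_real_ rather than numeric(0)
```

Non-empty inputs follow the same steps as the original function. Only the zero-length case differs.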
#####
emptymtcars <- mtcars %>%
  filter(cyl > max(cyl))
emptymtcarscylsummary <- emptymtcars %>%
  group_by(cyl, gear) %>%
  summarise(
    count = n(),
    hp = mean(hp),
    carb = MostCommon(carb)
  )
#> Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
#> dplyr 1.1.0.
#> ℹ Please use `reframe()` instead.
#> ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
#> always returns an ungrouped data frame and adjust accordingly.
#> `summarise()` has grouped output by 'cyl', 'gear'. You can override using the
#> `.groups` argument.
emptymtcarscylsummary2 <- emptymtcars %>%
  reframe(
    count = n(),
    hp = mean(hp),
    carb = MostCommon(carb),
    .by = c(cyl, gear)
  )
Created on 2023-02-01 with reprex v2.0.2