Strange results from function base::min() with option rn.na = TRUE when called from function `dplyr::summarise()`

Can you explain to me if this is a bug or a feature of the dplyr::summarise() function?

structure(list(Month = c(2, 2, 2, 3, 3, 3), 
  daily_temperature_2m_min = c(23.4, 23.2, 23.4, 24, 24.6, 24.6)), 
  row.names = c(NA, -6L), class = c("data.frame")) |>
  dplyr::summarise(Low = base::min(daily_temperature_2m_min, rn.na = TRUE), .by = Month)
  Month Low
1     2   1
2     3   1
structure(list(Month = c(2, 2, 2, 3, 3, 3),
  daily_temperature_2m_min = c(23.4, 23.2, 23.4, 24, 24.6, 24.6)), 
  row.names = c(NA, -6L), class = c("data.frame")) |> 
  dplyr::summarise(Low = base::min(daily_temperature_2m_min), .by = Month)
  Month  Low
1     2 23.2
2     3 24.0

Is it really possible that adding the option rn.na = TRUE inside dplyr::summarise() makes base::min() return nothing but ones?

I don't have time to investigate this failure mode, but you do have a typo: you intended na.rm but wrote rn.na.

This works:

structure(
  list(
    Month = c(2, 2, 2, 3, 3, 3),
    daily_temperature_2m_min = c(23.4, 23.2, 23.4, 24, 24.6, 24.6)
  ),
  row.names = c(NA, -6L),
  class = c("data.frame")
) |>
  dplyr::summarise(Low = base::min(daily_temperature_2m_min, na.rm = TRUE),
                   .by = Month)

gives

  Month  Low
1     2 23.2
2     3 24.0

It is a "feature", if you will, of min() itself rather than of dplyr:

> base::min(c(23.4, 23.2, 23.4, 24, 24.6, 24.6), rn.na = TRUE)
[1] 1

Thanks for the clarification. So this is a feature: rn.na = TRUE is treated as a second value passed to base::min(), and TRUE as an integer is 1.

I'll try to explain as best I can.
rn.na is not a parameter that base::min() expects (though na.rm is).
Therefore, min() interprets rn.na = TRUE as the value TRUE with a name (rn.na) attached to it.
min() ignores names, so only the TRUE is evaluated and, as is conventional in many programming languages, there is an implicit conversion from TRUE to 1L (just as there would be from FALSE to 0L).
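A minimal sketch of the two steps described above, assuming only base R. The helper dot_names() is hypothetical and exists only to show what lands in `...`; it is not part of base R or dplyr.

```r
# Hypothetical helper: report the names attached to the arguments
# that a function receives through `...`
dot_names <- function(...) names(list(...))

# The misspelled argument does not match na.rm, so it is captured by `...`
# as an ordinary value that happens to carry the name "rn.na".
captured <- dot_names(c(4, 6), rn.na = TRUE)

# TRUE takes part in the comparison as the number 1.
typo_result  <- min(c(23.4, 23.2), rn.na = TRUE)
plain_result <- min(c(23.4, 23.2), 1)
```

Here captured holds an empty name for the first argument and "rn.na" for the second, and typo_result equals plain_result.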

Let's go through some simple examples.

> min(c(4,6),2)
[1] 2
> min(c(4,6),x=2)
[1] 2

We see that the minimum of 2, 4 and 6 is 2, regardless of whether the 2 is associated with some name like x.
In your example, rn.na is just like x: it does not match any expected parameter, so the name is effectively ignored.

> min(c(4,6),rn.na=2)
[1] 2

This is the same as when 2 was tagged with the name x.

On to the next example:

> min(c(4,6),na.rm=2)
[1] 4

na.rm is actually an expected parameter, which controls how NA values are handled. 2 is treated as TRUE because nonzero numbers coerce to TRUE in R, so this is the same as writing

min(c(4,6),na.rm=TRUE)

which evaluates to 4, just as if na.rm had been omitted (as there are no NA values).
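None of the examples so far contain an NA, so na.rm never gets a chance to matter. The sketch below adds one missing value to show where na.rm and the misspelled rn.na diverge. The wrapper strict_min() is a hypothetical illustration of how such typos could be made to fail loudly; it is not part of base R or dplyr.

```r
x <- c(4, NA, 6)

with_na_rm <- min(x, na.rm = TRUE)  # NA is dropped before taking the minimum
with_typo  <- min(x, rn.na = TRUE)  # TRUE is just another value; NA is kept

# Hypothetical wrapper: refuse any named argument other than na.rm,
# so a misspelling raises an error instead of being coerced to a number.
strict_min <- function(..., na.rm = FALSE) {
  dots <- list(...)
  dot_names <- names(dots)
  bad <- dot_names[nzchar(dot_names)]
  if (length(bad) > 0) {
    stop("unexpected named argument(s): ", paste(bad, collapse = ", "))
  }
  do.call(min, c(dots, list(na.rm = na.rm)))
}
```

Here with_na_rm is 4 while with_typo is NA, and strict_min(x, rn.na = TRUE) stops with an error rather than returning a misleading value. The trade-off is that the wrapper also rejects harmless names such as x = 2.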

I hope this helps.
