In the same vein as this question regarding head, why was the percentage sign % chosen as the delimiter for user-defined operators? And while we're at it, why was a delimiter required at all?
Does R even support operator overloading? I'm guessing it doesn't, or there probably wouldn't be a need for a delimiter. I'm curious about this answer as well.
EDIT: Looking into it, it looks like R does support operator overloading, but it's considered bad taste to overload predefined operators (+, >, <, =, etc.). Still, the question remains why % was chosen.
I believe := is an example of overloading exploited by data.table and subsequently rlang.
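To make the := remark concrete, here is a minimal sketch, assuming only base R. The parser accepts := as an operator, but base R ships no function of that name, so a package is free to supply one. The toy definition below is purely illustrative and is not what data.table or rlang actually do.

```r
# quote() parses without evaluating, so this works even with no `:=` defined
e <- quote(a := 1)
op_name <- as.character(e[[1]])   # name of the function the call refers to

# Base R itself provides no `:=` function
in_base <- exists(":=", envir = baseenv(), inherits = FALSE)

# Hypothetical definition: return the right-hand side as a named list
`:=` <- function(lhs, rhs) {
  setNames(list(rhs), deparse(substitute(lhs)))
}

res <- (total := 1 + 2)
res$total
# [1] 3
```

Because := shares the low precedence of assignment, the whole of 1 + 2 ends up as the right-hand argument.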
No idea on why %, but addressing your last question:
Operators are treated in a special way when parsing (official docs), so R probably wants to be sure it knows what's an operator. Remember, R allows a symbol to be shared by a function and a non-function:
"%in%" <- 1
`%in%`
# [1] 1
`%in%`(2, 1:4)
# [1] TRUE
Now imagine no delimiters were required. And let's say package foo exports an operator named bar, which is basically just an identity check:
bar <- function(x, y) identical(x, y)
Consider this code:
library(foo)
bar <- NULL
bar bar NULL
Should it check if the newly-defined bar is NULL (which it is)? Or should it check if the function foo::bar is NULL (which it isn't)? And this is just with single-symbol expressions on either side. Imagine trying to parse this:
bar bar bar NULL
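For contrast, here is a small sketch of the same scenario with the delimiters in place. The operator %bar% is a hypothetical stand-in for what package foo would export; it is defined locally here so the snippet runs on its own.

```r
# Hypothetical operator, standing in for the one package foo would export
`%bar%` <- function(x, y) identical(x, y)

# An ordinary (non-function) value bound to the undelimited name
bar <- NULL

# The parser reads this as: symbol, operator, constant
result <- bar %bar% NULL
result
# [1] TRUE
```

The % signs tell the parser which token is the operator before any lookup happens, so the value bar and the operator %bar% never compete for the same position.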
From @hadley's Advanced R:
It is possible to override the definitions of these special functions [+, for, [, etc.], but this is almost certainly a bad idea. However, there are occasions when it might be useful: it allows you to do something that would have otherwise been impossible. For example, this feature makes it possible for the dplyr package to translate R expressions into SQL expressions. Domain specific languages uses this idea to create domain specific languages that allow you to concisely express new concepts using existing R constructs.
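A rough sketch of the idea in that quote, under the assumption that translation means evaluating an expression in an environment where + has been redefined. The names sql_env, price and tax are made up for illustration, and this is not dplyr's actual implementation.

```r
# Hypothetical translation environment; not dplyr's real code
sql_env <- new.env()

# Inside sql_env, "+" builds a SQL-like string instead of adding numbers
assign("+", function(e1, e2) paste0("(", e1, " + ", e2, ")"), envir = sql_env)

# Column names are stand-ins bound as plain strings
assign("price", "price", envir = sql_env)
assign("tax", "tax", envir = sql_env)

translated <- eval(quote(price + tax), sql_env)
translated
# [1] "(price + tax)"

# Outside that environment, "+" is untouched
1 + 2
# [1] 3
```

Keeping the redefinition inside a dedicated environment is what makes this tolerable: ordinary code never sees the altered +.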
"Old" S had %% and %/. "New" S had %%, %/% and %*%. When additional infix operators were needed, I suspect it seemed like a natural extension to allow anything in between the %.
You can see that the parser must've supported it early in the evolution of R with the commit that implemented %in%: https://github.com/wch/r-source/commit/5f581abd52b10e3b1ac6d3de3bcbe5853fbd6e00
Maybe this is just a reflection of the languages I've been in contact with, but operator overloading is something I tend to associate with object-oriented languages. I mean, R has OO and class systems, but I'm not sure that those systems are really tied into type safety at the parser level the way that, say, C++'s or even Python's classes are.
JavaScript is an interesting comparison. It goes even further than R: its objects are prototype-based (the class keyword added in ES2015 is syntax over prototypes), and it doesn't allow operator overloading. I couldn't imagine operator overloading even making sense in that language.