Your thoughts on interstitial comments «like this» to make our R and Rmarkdown code more understandable

Dear Posit Community,
As programming languages, R and Rmarkdown enable us to

  1. express instructions for computers to execute
  2. express these instructions in ways that we humans can understand.

When it comes to writing understandable code, the ball's in our court as programmers, but programming language syntax and semantics determine what's possible. Compare, for example,

  • APL: {(+⌿⍵)÷≢⍵}
  • R: mean(w, na.rm=TRUE)

Integrated Development Environments (IDEs) like RStudio can also play a powerful role, and I'm particularly interested here in the way they can allow us to write one thing (e.g., $e^{i \pi}+1=0$, [Posit Forum](https://forum.posit.co/)) which displays in another way.

This makes me think about interstitial comments, i.e., comments in the interstices, the intervening gaps between other things. Languages like C have them, e.g.,

int i /* index over rows */ = N - 1 /* last row */;

...though I haven't seen much code that tries to take advantage of this style.

R, however, has only # comments to end of line
...which limits the kinds of comments we can make.
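
For comparison, the closest approximation in current R is probably to put each argument on its own line and give each one a trailing # comment. A minimal sketch, in which the data frame contents and the temporary file path are made up for illustration:

```r
# Hypothetical stand-ins for the objects used in the examples in this post
New_ClassList  <- data.frame(student = c("Ada", "Grace"), grade = c(90, 95))
ClassList_file <- tempfile(fileext = ".rds")

# One argument per line leaves room for an end-of-line comment on each
saveRDS(
  New_ClassList,         # the updated class list ...
  file = ClassList_file  # ... goes to the local file
)
```

This works, but the comment sits beside the code rather than inside it, and the call has to be spread over several lines.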

Now, as someone who has experienced the frustration of trying to change code that was actually within a multiline comment (a long time ago, in a vi editor far, far away), I suggest the following with some hesitation, but also at a time where IDEs are doing more and more to enable us to communicate better in code.

What do you think about interstitial comments «like this» to make our R and Rmarkdown code more understandable?

Imagine using these comments to insert additional text to improve readability:

saveRDS(New_ClassList, «to the local» ClassList_file)

Or using them to provide alternate display text for code, so that the following

saveRDS(New_ClassList, «to local file»⸨ClassList_file⸩)

could be displayed as

saveRDS(New_ClassList, «to local file»)

while being interpreted by R as

saveRDS(New_ClassList, ClassList_file)
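
To make the idea concrete, here is a rough sketch of the two translations as plain R functions built on regular expressions. The names strip_interstitial and display_interstitial are hypothetical, and nothing like this exists in R or RStudio as far as I know. A real implementation would need a proper parser, since this version would also rewrite « and » that appear inside string literals.

```r
# Hypothetical helper: turn annotated source text into plain R code
strip_interstitial <- function(src) {
  # «display text»⸨code⸩  ->  code
  src <- gsub("«[^»]*»⸨([^⸩]*)⸩", "\\1", src, perl = TRUE)
  # any remaining «comment», plus trailing whitespace  ->  nothing
  gsub("«[^»]*»\\s*", "", src, perl = TRUE)
}

# Hypothetical helper: what an IDE might show instead of the raw source
display_interstitial <- function(src) {
  # «display text»⸨code⸩  ->  «display text»
  gsub("(«[^»]*»)⸨[^⸩]*⸩", "\\1", src, perl = TRUE)
}

annotated <- "saveRDS(New_ClassList, «to local file»⸨ClassList_file⸩)"
strip_interstitial(annotated)    # "saveRDS(New_ClassList, ClassList_file)"
display_interstitial(annotated)  # "saveRDS(New_ClassList, «to local file»)"
```

The output of strip_interstitial() could then be handed to parse(text = ...) like any other R source text.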

I'm curious to know what you think about this kind of literate programming idea, its potential pros and cons, and whether such a thing could ever happen in the R universe.
Cheers,
David

I would go with the classic "comments should explain the why, not the how" (discussed in many places, e.g. here). So for your example, I would actually suggest splitting it into two lines and turning the comment into a variable name:

to_local_path <- ClassList_file
saveRDS(New_ClassList, to_local_path)

I don't really understand what this code is supposed to mean, which makes it hard to write a clear variable name, but that is the general idea: if you have a comment that explains what you're doing, make it the name of a variable or a function. That way you can't later change the code, forget the comment, and end up with a comment that doesn't match the code.

Looking at your example

saveRDS(New_ClassList, «to the local» ClassList_file)

this also reminds me a lot of object-oriented languages that use classes. Here «to the local» might be a class which enforces things for ClassList_file, and, if its name is chosen well, allows the reader to better understand what ClassList_file represents.
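
As a rough illustration of that idea in R itself, here is a minimal S3 sketch. The names local_path and save_class_list are made up for this example, and the check for what counts as "local" is only a guess at the intent:

```r
# Hypothetical constructor: wraps a path and enforces that it is "local",
# here taken to mean "a single string that is not a URL" (an assumption)
local_path <- function(path) {
  stopifnot(
    is.character(path), length(path) == 1,
    !grepl("^[A-Za-z][A-Za-z0-9+.-]*://", path)
  )
  structure(path, class = "local_path")
}

# Hypothetical wrapper: refuses any file argument that was not validated
save_class_list <- function(class_list, file) {
  stopifnot(inherits(file, "local_path"))
  saveRDS(class_list, file = unclass(file))
}

ClassList_file <- local_path(tempfile(fileext = ".rds"))
save_class_list(data.frame(student = "Ada"), ClassList_file)
```

Passing a plain string or a URL now fails with an error instead of silently contradicting the description.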


The main point of literate programming, in my eyes, is for cases with a lot of explanation of the "why", e.g. you want to explain all these equations and illustrate them with plots, and you just interleave the code as a way to support the explanation. But the code is usually not the important part on a first read: just like in a math textbook, you might skip the proofs the first time through, but it's important that they are there and support the main text.

Typically, most R package vignettes (DESeq2 is a good example IMO) can spend most of their time explaining why you need to run this or that step, and just happen to give you the code in passing (but the fact that the explanation was created by running the code guarantees that they match).

So, back to your question: in a way, having additional forms of comments misses what I think is the main point. The goal is to tie the comments to the code, so that a wrong comment makes the code fail immediately and forces you to address it. Inline comments make it worse by adding more ways for the comments to diverge from the code without anyone noticing.
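
One way to sketch that in R is to replace a comment with an assertion, so the statement is checked every time the code runs. The file name and the specific checks below are assumptions for illustration, and named stopifnot() conditions need R 4.0.0 or later:

```r
ClassList_file <- file.path(tempdir(), "ClassList.rds")  # hypothetical path

# Instead of a comment such as "# ClassList_file is a local .rds file",
# state it as a check that stops the script as soon as it is no longer true
stopifnot(
  "ClassList_file must end in .rds" = grepl("\\.rds$", ClassList_file),
  "ClassList_file's folder must exist" = dir.exists(dirname(ClassList_file))
)

saveRDS(data.frame(student = "Ada"), ClassList_file)
```

If someone later points ClassList_file somewhere else, the script stops with the message that used to be the comment.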