I use RMarkdown (specifically {bookdown}) for the sake of reproducibility and dynamic control of the human-readable text. I wonder if it is possible to automatically convert dplyr::filter() conditions to human-readable text.
If I understand you correctly, you would like to define the filtering condition as a variable and then plug it into both dplyr::filter() and the text, so that both update at the same time? As far as I know, there is no simple solution for this unless you set up a lot more complex logic, which would defeat the purpose. Is there a specific reason you want to do this? I would think the easiest approach for now is to just edit the text in the different instances.
Example condition 2:
City miles per gallon vs. highway miles per gallon for cars with engine displacements
greater than or equal to `r threshold `.
Example condition 3:
City miles per gallon vs. highway miles per gallon for cars with engine displacements
between `r threshold[1]` and `r threshold[2]`.
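To make the two captions above concrete, here is a minimal sketch of the chunk side, where the threshold is defined once and reused in both filter() and the inline text. The data frame `cars` and its columns are hypothetical stand-ins, and the range is named `threshold_range` here only to avoid clashing with the scalar `threshold` (in example 3 it plays the role of `threshold[1]` and `threshold[2]`).

```r
library(dplyr)

# Hypothetical stand-in data with the columns the captions refer to.
cars <- data.frame(
  displ = c(1.8, 2.0, 3.1, 4.2, 5.7),
  cty   = c(18, 21, 16, 14, 13),
  hwy   = c(29, 30, 25, 20, 17)
)

# Example condition 2: a single threshold, defined once.
threshold <- 3
cars_large <- cars %>% filter(displ >= threshold)

# Example condition 3: a range, defined once as a length-2 vector.
threshold_range <- c(2, 4.5)
cars_mid <- cars %>%
  filter(between(displ, threshold_range[1], threshold_range[2]))
```

Changing `threshold` then updates both the filtered data and any inline `r threshold` in the text, but the wording ("greater than or equal to") still has to be kept in sync by hand.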
Maybe someone knows a more elegant solution, but the logic inside the filters can become very complex, and automatic translation to text would be difficult, I imagine.
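One possible halfway point, offered only as a sketch: store the condition as an unevaluated expression, splice it into filter(), and deparse the same expression for the text. This yields the R code as a string rather than real prose, so it does not solve the translation problem. The data frame `cars` and the names `cond` and `cond_text` are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in data.
cars <- data.frame(displ = c(1.8, 2.0, 3.1, 4.2, 5.7))

# Store the condition once as an unevaluated expression.
cond <- quote(displ >= 3)

# Splice the stored expression into filter() with the !! operator.
cars_large <- cars %>% filter(!!cond)

# Turn the same expression into a string that can be shown in the text.
cond_text <- deparse(cond)
```

The string in `cond_text` could then be placed in the document with an inline chunk, so the displayed condition cannot drift from the one actually applied.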
Thank you for your reply. I expected that there would be no such solution.
The specific reason for this "automation" is that I currently have to remember to update the text every time I change the filtering conditions. This is a problem at the early stage of a project, when I'm writing the text explanations alongside the data wrangling (so that I don't forget why I do things). If something changes in the upstream analysis, I might need to change the filtering conditions and then forget to update the text accordingly.
Anyway, it is not a big issue, and I just need to re-read the text once in a while to check that it is still okay.
I completely understand that; this happens to me all the time as well.
What I would do is use a regex to quickly find all the places in my code where this occurs, and then check whether the text matches. Here is a regex string that searches your code for any filtering done with the filter() function:
(filter\([^>=<%]+[>=<]+)|(filter\(\s*between\()
If you paste this into the RStudio search box and check the regex option, you should be able to jump from filter to filter and quickly check if the text below matches.
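The same pattern can also be run from R itself, as a rough sketch, to list the lines that need checking. The vector `rmd_lines` is a hypothetical stand-in for a file's contents, which in practice could come from readLines() on the .Rmd file.

```r
# Regex from the post, with backslashes doubled for an R string literal.
filter_pattern <- "(filter\\([^>=<%]+[>=<]+)|(filter\\(\\s*between\\()"

# Hypothetical lines standing in for the contents of an .Rmd file.
rmd_lines <- c(
  "mpg %>% filter(displ >= threshold)",
  "mpg %>% select(cty, hwy)",
  "mpg %>% filter(between(displ, 2, 4))"
)

# Indices of lines with a filter() call that uses a comparison or between().
hits <- grep(filter_pattern, rmd_lines, perl = TRUE)
```

Note that the pattern only catches filter() calls containing a comparison operator or between(), so something like filter(df, is.na(y)) would not be reported.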
For me it depends on the variability, variety, and complexity of the filter statements you intend to have. If you limit yourself to single filter statements that are each one of a few fixed types, then it should be relatively trivial, although somewhat menial, to set up a function that works as you describe. It's a question of the level of ambition, for me.
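As a minimal sketch of that idea, assuming only three fixed filter types: the helper below returns both the filtered data and a matching phrase. The function name `filter_with_text`, its arguments, and the data frame `cars` are all hypothetical, and it uses base subsetting rather than dplyr::filter() to keep the example dependency-free.

```r
# Hypothetical helper: applies one of a few fixed filter types to a column
# and returns the filtered data together with a human-readable phrase.
filter_with_text <- function(data, column, type, value) {
  type <- match.arg(type, c("at_least", "at_most", "between"))
  x <- data[[column]]

  # Logical vector marking the rows to keep.
  keep <- switch(type,
    at_least = x >= value[1],
    at_most  = x <= value[1],
    between  = x >= value[1] & x <= value[2]
  )

  # Phrase describing the same condition.
  text <- switch(type,
    at_least = paste("greater than or equal to", value[1]),
    at_most  = paste("less than or equal to", value[1]),
    between  = paste("between", value[1], "and", value[2])
  )

  # which() drops rows where the condition is NA, as filter() would.
  list(data = data[which(keep), , drop = FALSE], text = text)
}

# Usage with hypothetical stand-in data.
cars <- data.frame(displ = c(1.8, 2.0, 3.1, 4.2, 5.7))
res <- filter_with_text(cars, "displ", "between", c(2, 4.5))
```

A caption could then end with an inline `r res$text`, so the wording follows the filter type as well as the values. Anything beyond a few fixed types (combined conditions, several columns) would need more cases, which is where the effort grows.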