Why avoid the `%<>%` operator from magrittr?

Chuchu · June 8, 2021, 1:22pm

Hi,

I was looking for proper usage of pipe operators and found the R Style Guide book by Hadley Wickham. When reading the section on pipe operators, I noticed this in 4.6:

The magrittr package provides the %<>% operator as a shortcut for modifying an object in place. Avoid this operator.

I was confused since I have seen some examples that use the %<>% operator, and I am wondering whether there are particular reasons for not using it.

Thanks in advance!

mara · June 8, 2021, 2:11pm

Modifying in place is useful if and only if you absolutely mean to do it. It also means that if you run the same code twice, you'll (often) get different results, since you've overwritten the original data or variable. So, basically, it's a powerful operator that you can use, but it's one that we avoid for the most part in the tidyverse.

Style guides are really general patterns which, of course, have appropriate exceptions.
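
A minimal sketch of the "run it twice, get different results" point, assuming magrittr is installed. The vector `scores` and its values are made up for illustration:

```r
library(magrittr)

scores <- c(10, 20, 30)

# Assigning to a new name: rerunning this line always gives the same result,
# because `scores` itself is never changed.
doubled <- scores %>% multiply_by(2)

# Modifying in place: each rerun starts from the already-modified value.
scores %<>% multiply_by(2)  # scores is now 20, 40, 60
scores %<>% multiply_by(2)  # the same line run again: 40, 80, 120
```

Rerunning the `doubled` line is harmless, while rerunning the `%<>%` line keeps changing `scores`.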

EeethB · June 8, 2021, 2:32pm

I just wanted to hop on and say thanks for this question. I love it! So many questions on here are concrete, technical questions, and while I know that's appropriate, I personally enjoy the freedom we have here to ask opinion- and discussion-based questions.

Chuchu · June 8, 2021, 3:48pm

Thank you for the reply. I think this operator is going to be helpful when modifying a dataset for something like:

# pseudo code
df <- read_csv("some file")
df %<>%
  rename("some variable") %>%
  mutate("some variable operations")

I agree that it's pretty limited, e.g. the %<>% operator won't be appropriate as in the following pseudo code:

# the original dataset just disappears with the summary
df %<>%
  group_by("some grouping variables") %>%
  summarize("some summaries")

# for summaries it's better to have a new dataset, e.g.
df_summary <-
  df %>%
  group_by("some grouping variables") %>%
  summarize("some summaries")

So my conclusion is I'm going to use this operator only when cleaning/wrangling a dataset.

Thanks again for your great answer, and if you have any comments, please let me know 😊
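
A runnable version of the pseudo code above might look like the sketch below. It assumes dplyr and magrittr are installed, and it uses the built-in `mtcars` data as a stand-in for the CSV file; the new column names are made up for illustration:

```r
library(dplyr)
library(magrittr)  # dplyr re-exports %>% but not %<>%

# Stand-in for read_csv("some file")
cars <- as_tibble(mtcars)

# Cleaning/wrangling: overwriting `cars` is intended here
cars %<>%
  rename(miles_per_gallon = mpg) %>%
  mutate(weight_lb = wt * 1000)  # wt is recorded in units of 1000 lbs

# Summaries go into a new object so the cleaned data is kept
cars_summary <- cars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(miles_per_gallon))
```

Note that running the `cars %<>%` step a second time would fail, because the column `mpg` no longer exists after the rename.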

nirgrahamuk · June 8, 2021, 4:40pm

In this situation the content on the right (read.csv) does not depend on the left (df), so <- would be preferred.

Chuchu · June 8, 2021, 6:00pm

Thank you for the correction. The code has been updated.

martin.R · June 9, 2021, 12:48pm

My opinion/usage:
There is absolutely nothing wrong with using %<>% if you know what you are doing, i.e. you are aware of the pitfalls mentioned above.

isadora · June 9, 2021, 4:36pm

Agreed, I am a big fan of it because when you clean data the point is that you DO want to overwrite the data. You don't make a new variable or dataset for every stage. Since R code is reproducible, if you make a mistake you can always rerun up to the mistake and change it. No biggie. I spent a long time looking for a way to put the compound pipe in the addins; AddinexamplesWV has it, and after you add it you can create a keyboard shortcut, just like for %>%.

Matthias · June 10, 2021, 7:21am

So the %<>% operator is just a shortcut for
df = df %>%
do something?
So it takes a dataset, runs some stuff and replaces it? Is this correct?

Then clearly it can be confusing, but it also avoids the need to find new names (or add 1, 2, 3) in a longer pipeline of processing steps. As long as everything is run from the start...

martin.R · June 10, 2021, 8:49am

Yes.
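
A small sketch of that equivalence, assuming magrittr is installed (the vectors `a` and `b` are made up for illustration):

```r
library(magrittr)

a <- c(3, 1, 2)
b <- c(3, 1, 2)

# Explicit reassignment
a <- a %>% sort() %>% rev()

# Compound assignment pipe: the result of the whole chain is assigned back to b
b %<>% sort() %>% rev()

identical(a, b)  # TRUE
```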

I'm not sure how it can be confusing: it does precisely what it states. It cannot even be easily used by accident because magrittr needs to be loaded first and there is no inbuilt keyboard shortcut.

The original question was about a style guide, which is inevitably a subjective judgement as to what makes sense and is user-friendly.

Matthias · June 10, 2021, 12:07pm

Thanks for the clarification!
Actually I may consider using this for the above mentioned situation.

isadora · June 10, 2021, 2:57pm

I will also point out that using = to assign objects instead of <- IS much more discouraged, since it is best to leave = for arguments. It could be thought of as a style thing, but it has become a stronger convention.

martin.R · June 10, 2021, 3:16pm

Sorry to be the one to disagree again.

Whilst I prefer <- over =, this is not a convention. It's just a personal preference which most R users appear to share, but it is very far from universal. Even one of R's original authors advocates = and strongly dislikes <-.

The reason I raise this is that I have observed new users being told not to use = which I think is a wrong message to give them (particularly when there are so many more important issues).

Chuchu · June 10, 2021, 3:21pm

Wow, that's an interesting topic too, because I'm one of those people who have been told to use <- rather than =. I wonder why the recommendation for this assignment operator has changed?

isadora · June 10, 2021, 3:57pm

Yes I think there is a generational change aspect to it, so it may be a convention among "newer" users (6 year user here, not sure how "new" that makes me but certainly not original author "old"). I don't see what's wrong with discouraging = for object assignment though, I think it helps beginners understand how objects are different from arguments, which I think is the point of why it is discouraged.

isadora · June 10, 2021, 3:59pm

To help beginners understand and conceptually demarcate the difference between setting an argument, assigning an object, and testing equality. See Hadley's two responses
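
A small base R sketch of that distinction (the function name `demo_assignment` is made up for illustration):

```r
demo_assignment <- function() {
  # `=` inside a call names an argument; no variable x is created here.
  m1 <- mean(x = c(1, 2, 3))
  x_created <- exists("x", envir = environment(), inherits = FALSE)

  # `<-` inside a call assigns y in this environment, then passes the value on.
  m2 <- mean(y <- c(1, 2, 3))
  y_created <- exists("y", envir = environment(), inherits = FALSE)

  # `==` tests equality and returns a logical value.
  same_mean <- m1 == m2

  list(x_created = x_created, y_created = y_created, same_mean = same_mean)
}

res <- demo_assignment()
```

Here `res$x_created` is FALSE while `res$y_created` is TRUE, which is the kind of difference the `<-` recommendation is meant to keep visible.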
