Hi everyone, I'm trying to understand why my code doesn't work with the second syntax. I'm new to R, and I don't understand why the quotation marks make a difference in my results. In the first syntax, I get the result I want, with any number less than 10 labeled as "short" and any number larger than 1 labeled as "tall." In the second syntax, however, every label is "tall," when that's clearly not right. Why? Thank you.
This takes some explaining and I might not be completely precise in my explanation.
The functions of the tidyverse, like mutate(), take a data frame as their first argument. Any bare names (that is, names without quotes) used in the function are first looked up as column names in that data frame. In
starwars %>%
mutate(height_in_cm = height/10)
mutate() gets the starwars data frame and looks for a column named height to do the required calculation. The same happens in
you would get the same result.
In many functions outside of the tidyverse, column names and other names do need to be quoted. It just depends on how the function was written.
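To make the difference concrete, here is a minimal sketch. The original code from the question isn't shown in this thread, so the data frame df, its values, the cutoff of 10, and the use of if_else() are all assumptions for illustration only. With a bare name, height is looked up as a column; with quotes, "height" is just a piece of text, and R compares that text against the number as text, which in typical locales comes out FALSE for every row.

```r
library(dplyr)

# Hypothetical data, not taken from the original question
df <- data.frame(name = c("A", "B", "C"), height = c(5, 12, 8))

# Bare name: `height` is found as a column of df and compared row by row
bare <- df %>%
  mutate(label = if_else(height < 10, "short", "tall"))

# Quoted name: "height" is a literal string, so the test becomes a single
# text comparison ("height" vs "10") and the one result is recycled to all rows
quoted <- df %>%
  mutate(label = if_else("height" < 10, "short", "tall"))

print(bare)
print(quoted)
```

Printing both results side by side should show labels that vary by row in bare and a single repeated label in quoted, which matches the symptom described in the question.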
Got it, so it's important for me to understand that quotation marks make R treat the name as text rather than as an existing column. I will make sure to keep this in mind moving forward! Thank you!
To follow up on the discussion, an alternative way of expressing the reason for the behavior described by @FJCC is that the column name "height in cm" is non-syntactic — it violates the naming rules in R by including space characters. However, if instead you use backticks, `height in cm`, then R will treat it as a proper column name.
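As a rough sketch of the backtick approach (the data frame df, its values, and the cutoff of 10 are made up for illustration, since the original code isn't shown here), a column whose name contains spaces can be created and referenced like this:

```r
library(dplyr)

# Hypothetical data, not taken from the original question
df <- data.frame(height = c(50, 120, 80))

# make.names() shows what a syntactic version of the name would look like;
# it differs from the original, which is one way to see the name is non-syntactic
syntactic_version <- make.names("height in cm")

# Backticks let the non-syntactic name be used as a real column name,
# both when creating the column and when referring to it afterwards
out <- df %>%
  mutate(`height in cm` = height / 10) %>%
  mutate(label = if_else(`height in cm` < 10, "short", "tall"))

print(syntactic_version)
print(out)
```

Inside the backticks the name is still a bare name as far as mutate() is concerned, so it is looked up as a column rather than treated as text.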