I have been having a problem with a class I am teaching. The students are importing an Excel file that contains the Unicode character \u2103 (℃) in the header row. They are then using janitor::clean_names().
For most of the students, janitor::clean_names() converts the column name to "temperature_c" both in RStudio and when rendering with Quarto.
For about 20% of the students, janitor::clean_names() converts the column name to "temperature_u_00b0_c" in RStudio (U+00B0 is the Unicode code point for "°") but to "temperature_c" when rendered with Quarto. This then causes problems with the rest of their code when they render the document.
In both RStudio and Quarto, the "℃" is imported correctly as UTF-8 and gives the same output with charToRaw() (e2 84 83), so it is not an import problem. Somehow janitor is treating the character differently depending on how R is being run.
All the affected students are using R 4.2.1 with the current version of RStudio on Windows. The students might have Norwegian locales; I haven't been able to check that.
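The checks described above can be collected into a short snippet to run once in the RStudio console and once in a Quarto chunk, so the two outputs can be compared side by side. This is only a sketch: the l10n_info() and Sys.getlocale() calls are additions here to capture the locale, which has not been checked yet, and nothing in it has been run on an affected machine.

```r
# Diagnostic sketch: run in the RStudio console and in a Quarto chunk, then compare.
x <- "Temperature (\u2103)"

print(charToRaw(x))               # raw bytes of the header string
print(Encoding(x))                # declared encoding of the string
print(l10n_info())                # whether the session considers itself UTF-8
print(Sys.getlocale("LC_CTYPE"))  # character-type locale (e.g. Norwegian vs. English)
print(R.version.string)           # R version in use
```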
I have given the students some quick fixes, but it would be good to avoid this confusing problem.
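A workaround along these lines, sketched here with base R only and not tested on the affected machines, is to replace the symbol with a plain ASCII "C" before the names are cleaned, so that the result no longer depends on how the character is transliterated. The helper name replace_degree_celsius is hypothetical and chosen just for this illustration.

```r
# Hypothetical helper: swap U+2103 (degree Celsius sign) for a plain "C"
# in the column names before they reach janitor::clean_names().
replace_degree_celsius <- function(df) {
  names(df) <- gsub("\u2103", "C", names(df), fixed = TRUE)
  df
}

# Small stand-in for the imported Excel data.
df <- data.frame(1, check.names = FALSE)
names(df) <- "Temperature (\u2103)"

df <- replace_degree_celsius(df)
print(names(df))

# Clean the names afterwards if janitor is installed.
if (requireNamespace("janitor", quietly = TRUE)) {
  print(names(janitor::clean_names(df)))
}
```

Because the header is plain ASCII by the time janitor sees it, the same name should come out in both RStudio and Quarto, although this does not explain the underlying difference.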
Minimal example:
tibble::tibble("Temperature (℃)"= 1) |> janitor::clean_names() |> names()
#temperature_u_00b0_c