Should Unicode / UTF-8 be avoided within .R files?

ace · February 18, 2019, 5:40am

On a whim, I wanted to change how I write code from

c("sigma[phi]", "sigma[epsilon]", "rho[phi]")

to this

c("σ[φ]", "σ[ε]", "ρ[φ]")

This caused some problems that made me give up. I can fix them (I think), but this post isn't about that.

I would just like to know what the general practice is regarding Unicode characters in R files. Should they be avoided?

Elle · February 18, 2019, 2:23pm

Hi Ace, is this article any help? https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding

kevinushey · February 19, 2019, 7:43pm

It doesn't need to be avoided, but there are unfortunately a number of landmines you can hit when using arbitrary Unicode text with R, on Windows in particular. In general, the recommendation is to use characters representable in the native system encoding when possible.

There have been rumblings of native UTF-8 locale support coming to Windows in the future, but IIUC that feature has not landed yet.
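
One common way to sidestep the question (not something stated in this thread, so treat it as an assumption) is to keep the .R file itself pure ASCII and write the Greek letters as `\u` escapes. The sketch below rewrites the labels from the original question that way. `has_non_ascii()` is a hypothetical helper made up for illustration, not a base R function.

```r
# Same labels as in the question, written with \u escapes so the source
# file stays pure ASCII regardless of the platform's native encoding.
labels <- c("\u03c3[\u03c6]", "\u03c3[\u03b5]", "\u03c1[\u03c6]")

# Count characters (not bytes) in each label.
nchar(labels)

# has_non_ascii() is a hypothetical helper, not part of base R: it relies on
# iconv() returning NA for a string that cannot be converted to ASCII.
has_non_ascii <- function(x) is.na(iconv(x, from = "UTF-8", to = "ASCII"))

# Compare the plain ASCII spelling with the first Unicode label.
has_non_ascii(c("sigma[phi]", labels[1]))
```

The strings still contain the actual Greek letters at run time; only the source file avoids them. The trade-off is readability, since `"\u03c3[\u03c6]"` is harder to scan than `"σ[φ]"`.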
