I use RStudio on Windows 10 and on Linux. I sometimes work with data (most often from *.sav files) containing non-English characters (usually Portuguese), and I end up with a lot of encoding errors. On Linux everything works fine (en_US.UTF-8), but on Windows it does not. Data that I imported and saved in a *.Rdata file also appears with encoding errors on Windows. Is there a solution for this issue?
I can't tell exactly what the problem is, so here are several possible solutions.
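In Global Options, set the default text encoding to UTF-8.
When loading data, you can specify the encoding, e.g. 'read.table(file, encoding = "UTF-8")'.
Sys.getlocale() shows your session's locale, and Encoding(x) shows the declared encoding of a character vector x (not of a file on disk).
iconv(data, "CP949", "UTF-8") converts data from CP949 to UTF-8. CP949 is a Korean code page, so replace it with whatever encoding your data actually uses; for Portuguese text on Windows that is often latin1 or CP1252.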
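As a rough sketch of the inspection and conversion steps above: the example below builds a small Latin-1 string from raw bytes (so it does not depend on how this page or your script is encoded), checks its declared encoding, and converts it with iconv(). The sample bytes and the variable names x and y are made up for illustration, and "latin1" is only an assumption about what the source encoding might be.

```r
# Show the character-type locale of the current session
Sys.getlocale("LC_CTYPE")

# Build the Latin-1 bytes for "ção" (e7 = ç, e3 = ã, 6f = o)
x <- rawToChar(as.raw(c(0xe7, 0xe3, 0x6f)))

# Declare what the bytes really are; this changes the mark, not the bytes
Encoding(x) <- "latin1"
Encoding(x)

# Convert the bytes themselves from Latin-1 to UTF-8
y <- iconv(x, from = "latin1", to = "UTF-8")
Encoding(y)
charToRaw(y)
```

If the converted text still looks wrong, the "from" encoding is probably not the one the data actually uses; charToRaw() on a problematic value is a quick way to see which bytes you really have.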
Thanks for your suggestions. The iconv() function didn't work in my case: I tried converting the "problematic" data frame column, and it just showed different strange characters. In this case, I'm using a data frame imported from an online query.
You might need to re-mark the encoding on some character vectors, e.g. (assuming they truly are UTF-8)
Encoding(cv) <- "UTF-8"
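To illustrate the difference between re-marking and converting, here is a minimal sketch. It assumes the vector already holds valid UTF-8 bytes that R simply has not marked as such; the sample bytes are made up for illustration.

```r
# UTF-8 bytes for "ção" (c3 a7 = ç, c3 a3 = ã, 6f = o), built from raw
# bytes so the example does not depend on the script's own encoding
bytes <- as.raw(c(0xc3, 0xa7, 0xc3, 0xa3, 0x6f))
cv <- rawToChar(bytes)

# Sanity check before re-marking: are these bytes valid UTF-8 at all?
validUTF8(cv)

# Re-mark: tell R how to interpret the bytes, without changing them
Encoding(cv) <- "UTF-8"
Encoding(cv)

# The bytes are untouched, and R now counts 3 characters, not 5 bytes
identical(charToRaw(cv), bytes)
nchar(cv, type = "chars")
```

If validUTF8() returns FALSE, re-marking will not help; the data is in some other encoding and needs iconv() with the correct "from" value instead.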
However, there are a number of assorted issues re: the handling of UTF-8 encoding (and printing of UTF-8 characters) on Windows, so it's possible that everything is indeed encoded and 'working' correctly; it's simply not printing correctly. Unfortunately, many of these issues are R issues rather than RStudio issues and so any associated fixes would need to be implemented upstream.
Thank you, @kevinushey!
Nevertheless, can you suggest a default solution? In other words, would I have to do this every time I import a UTF-8 data frame?
I'm not sure if there's an automatic solution. Some APIs for reading files (e.g. read.table()) have arguments for assuming an encoding (e.g. the fileEncoding argument); without more information it's hard to say. This is somewhat outside the bounds of IDE-specific questions, so you might want to follow up in a separate category.
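In the absence of an automatic fix, one possible workaround is a small helper that re-marks every character column of a data frame after import. This is only a sketch: mark_utf8() is a hypothetical name, not part of R or of any package, and it assumes the imported strings really are UTF-8 bytes. It does not touch factor levels or column names.

```r
# Hypothetical helper: mark all character columns of a data frame as UTF-8.
# Assumes the underlying bytes are already valid UTF-8.
mark_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  if (!any(is_chr)) {
    return(df)  # nothing to re-mark
  }
  df[is_chr] <- lapply(df[is_chr], function(col) {
    Encoding(col) <- "UTF-8"
    col
  })
  df
}

# Usage on a made-up data frame whose text column holds unmarked UTF-8 bytes
raw_txt <- rawToChar(as.raw(c(0xc3, 0xa7, 0xc3, 0xa3, 0x6f)))
survey <- data.frame(answer = raw_txt, score = 1, stringsAsFactors = FALSE)
survey <- mark_utf8(survey)
Encoding(survey$answer)
```

Calling such a helper once, right after each import, keeps the fix in one place rather than repeating Encoding<- for every column.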
Thanks, @kevinushey. The data comes from a query hosted in LimeSurvey, and I import it through the limer package. I will ask the package's creator for a solution.