Hi,
I am writing a document with knitr and LaTeX in Spanish and I am using RStudio.
I need to write comment lines with accented characters inside the R chunks, but compiling the Rnw document fails with the message "These lines contain invalid UTF-8 character..."
Is there any way of doing it?
Thank you!!
Lola
Can you supply an example and the output of sessionInfo()?
Yes, thank you!
I have created a simple document named example.Rnw with a chunk which includes comment lines in Spanish with accents. This is the document:
\documentclass{article}
\usepackage[ansinew]{inputenc}
\usepackage[spanish]{babel}
\begin{document}
Below I include a chunk which includes comment lines in Spanish with words with accents ('Creación' and 'función'):
<<echo=TRUE,prompt=TRUE,comment=NA>>=
# Creación de una secuencia
y<-1:5
# De forma equivalente se puede usar la función seq()
@
\end{document}
When I compile it, e.g. with knit('example.Rnw'), I get the following error:
> knit('example.Rnw')
processing file: example.Rnw
Error in sub(re, "", x, perl = TRUE) : input string 1 is invalid UTF-8
In addition: Warning message:
In xfun::read_utf8(input) :
The file example.Rnw is not encoded in UTF-8. These lines contain invalid UTF-8 characters: 5, 7, 9
This is the sessionInfo:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252
system code page: 65001
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.36
loaded via a namespace (and not attached):
[1] compiler_4.1.2 magrittr_2.0.1 tools_4.1.2 stringi_1.7.5 stringr_1.4.0
[6] xfun_0.28
I must admit that I am lost. I have never worked with a .Rnw file.
However I do not understand what
\usepackage[ansinew]{inputenc}
is doing. What happens if you try
\usepackage[utf8]{inputenc}
Otherwise, to run a quick test, try knitting this as a .Rmd file and see what happens:
---
title: "Playing"
author: "toucan"
date: "13/12/2021"
output: pdf_document
---
H~2~O is a liquid. 2^10^ is 1024.
C~5~ is something else.
```{r plot, echo=FALSE}
xx <- 1:20
# Québec être ça
plot(xx)
```
This runs for me with this setup:
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 21.10
Matrix products: default
BLAS: /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_CA.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_CA.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] tictoc_1.0.1 vroom_1.5.7 ggbiplot_0.55 scales_1.1.1 plyr_1.8.6 fs_1.5.2 lubridate_1.8.0 forcats_0.5.1 stringr_1.4.0
[10] dplyr_1.0.7 purrr_0.3.4 readr_2.1.1 tidyr_1.1.4 tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 arrow_6.0.1
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 xfun_0.29 haven_2.4.3 colorspace_2.0-2 vctrs_0.3.8 generics_0.1.1 htmltools_0.5.2 yaml_2.2.1 utf8_1.2.2
[10] rlang_0.4.12 pillar_1.6.4 glue_1.6.0 withr_2.4.3 DBI_1.1.1 bit64_4.0.5 dbplyr_2.1.1 modelr_0.1.8 readxl_1.3.1
[19] lifecycle_1.0.1 munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0 rvest_1.0.2 evaluate_0.14 knitr_1.37 tzdb_0.2.0 fastmap_1.1.0
[28] parallel_4.1.2 fansi_1.0.2 broom_0.7.9 Rcpp_1.0.8 backports_1.3.0 jsonlite_1.7.3 bit_4.0.4 hms_1.1.1 digest_0.6.29
[37] stringi_1.7.6 cli_3.1.0 tools_4.1.2 magrittr_2.0.1 crayon_1.4.2 pkgconfig_2.0.3 ellipsis_0.3.2 rsconnect_0.8.24 xml2_1.3.3
[46] reprex_2.0.1 rstudioapi_0.13 assertthat_0.2.1 rmarkdown_2.11 httr_1.4.2 R6_2.5.1 compiler_4.1.2
It seems your file is not saved with UTF-8 encoding. Try saving it as UTF-8.
You can do that in most editors, including the RStudio IDE: https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding-in-the-RStudio-IDE
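For anyone who wants to confirm this diagnosis before re-saving anything, here is a minimal R sketch. It relies only on base R's readLines() and validUTF8(); the helper name invalid_utf8_lines() and the temporary demo file are made up for illustration and are not part of knitr or xfun.

```r
# Hypothetical helper (not part of knitr or xfun): return the numbers of the
# lines in a file whose bytes are not valid UTF-8.
invalid_utf8_lines <- function(path) {
  # readLines() keeps the bytes as they are on disk; no re-encoding happens.
  lines <- readLines(path, warn = FALSE)
  which(!validUTF8(lines))
}

# Demo on a temporary file: 0xF3 is "ó" in Windows-1252 but is not valid
# UTF-8 on its own, so the second line should be reported.
demo <- tempfile(fileext = ".Rnw")
writeBin(
  c(charToRaw("y <- 1:5\n# Creaci"), as.raw(0xF3), charToRaw("n\n")),
  demo
)
invalid_utf8_lines(demo)
```

Running invalid_utf8_lines("example.Rnw") on the real file should point at the same lines that the knitr warning lists, assuming the file has not been changed in between.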
Thank you for your answer.
I need to include \usepackage[ansinew]{inputenc} to be able to have accents in the body text. If I replace it with what you suggest, the accents are replaced by other symbols, so it does not work.
On the other hand I tried your example with RMarkdown and it works but I need latex for what I'm doing.
I was using Sweave before, and having accents in the chunks was not an issue there, but it is with knitr.
I would appreciate any other suggestion...
Thanks a lot!
Thank you for your answer!
If I save my file with that encoding I lose all the accents in the text so it does not work....
Did you select "Save with Encoding"?
You need to convert the file to UTF-8, not just change the encoding. The accents should stay.
Notepad++ is another editor that I know can do this.
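As a rough sketch of what "converting" means here, the bytes of the file can be re-encoded with base R's iconv(). The helper name convert_to_utf8() is hypothetical, and the source encoding "CP1252" is an assumption based on the Spanish_Spain.1252 locale in the sessionInfo() output earlier in the thread; adjust it if the file was saved differently, and keep a backup of the original.

```r
# Hypothetical helper: re-encode a text file to UTF-8.
# `from` defaults to "CP1252" (Windows-1252), which is an assumption.
convert_to_utf8 <- function(path, from = "CP1252", out = path) {
  lines <- readLines(path, warn = FALSE)            # bytes as stored on disk
  utf8 <- iconv(lines, from = from, to = "UTF-8")   # NA where conversion fails
  if (anyNA(utf8)) stop("some lines could not be converted from ", from)
  con <- file(out, open = "wb")                     # binary mode: plain LF line ends
  on.exit(close(con))
  writeLines(utf8, con, useBytes = TRUE)            # write the UTF-8 bytes untouched
  invisible(out)
}

# Demo on a temporary file that contains "# Creación" in Windows-1252.
demo <- tempfile(fileext = ".Rnw")
writeBin(c(charToRaw("# Creaci"), as.raw(0xF3), charToRaw("n\n")), demo)
convert_to_utf8(demo)
validUTF8(readLines(demo, warn = FALSE))
```

After a conversion like this, the preamble would need \usepackage[utf8]{inputenc} rather than the ansinew option, so that LaTeX interprets the bytes the same way knitr does.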
Yes, I did that, but when I open the file again in RStudio it shows funny symbols where I had accents in the text. How can I fix that?
On the other hand, it compiles correctly if I include \usepackage[utf8]{inputenc} in the preamble.
Thanks a lot!
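One plausible explanation for the funny symbols (an assumption, since nothing in this thread confirms it) is that the file now holds UTF-8 bytes but is being reopened as Windows-1252. The base R sketch below reproduces that effect for a single character.

```r
# "ó" stored as UTF-8 is the two bytes c3 b3.
utf8_bytes <- charToRaw(enc2utf8("\u00f3"))

# Treat those bytes as Windows-1252 text (the assumed wrong guess) and convert
# them to UTF-8 for display: each byte turns into a character of its own.
misread <- iconv(rawToChar(utf8_bytes), from = "CP1252", to = "UTF-8")
misread
```

If that is what is happening, the file itself is fine, and File > Reopen with Encoding... in RStudio with UTF-8 selected should show the accents again.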
Oh I forgot about that too. Then the encoding of the file was not the issue. Sorry for misleading you.
Can you supply us with a minimal working example that includes the preamble, an R chunk and some text with the accented characters?
By the way, my understanding is that my .Rmd example uses LaTeX to produce the PDF.
Hi,
I sent one minimal example in my second email. I understand that it includes what you are asking for.
This was the example:
Thank you!!!
Sorry, I misread your second post. I was confusing it with the first one.