How to use non-English characters in a bib file in RMarkdown with vitae package?

sbac · April 14, 2021, 8:53am

This is an RMarkdown file that I use to knit a cv in pdf.
I have no problem using Portuguese characters in the body of the Rmd file.

But when the same characters come from the bib file (test.bib), loaded with `pubs <- bibliography_entries("test.bib")`, they are not encoded correctly in the output.

This only happens in Windows. When I use my Mac the output is fine!

---
header-includes:
   - \usepackage[T1]{fontenc}
   - \usepackage[utf8]{inputenc}
   - \usepackage[portuges]{babel}
   - \usepackage{apalike}
output: vitae::hyndman
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
library(vitae)
```

# Publicações
Sérgio Conceição é o autor de um trabalho sobre habitação.

```{r publications}
library(dplyr)
pubs <- bibliography_entries("test.bib")
pubs
```

This is my test.bib file:

@article{conc2021,
  title={História da Habitação},
  author={Conceição, Sérgio},
  journal={Portuguese History},
  number={1},
  year={2021}
}

This is the output:

cderv · April 14, 2021, 9:56am

Could this be related to the following issue?

github.com/mitchelloharawild/vitae

Special Characters in References not rendering correctly

opened 01:05PM - 23 Feb 21 UTC

closed 11:35AM - 28 Jul 21 UTC

fmsabatini

bug

I'm having trouble with my awesomeCV. After updating to R 4.0.2 it's not rendering my reference list correctly anymore. The special characters (umlaut, apostrophs and so on) are not picked up as UTF-8 symbols. I spent all morning updating all packages, as well as pandoc. Yet, the references are not displaying as expected:

![image](https://user-images.githubusercontent.com/51127026/108846589-08c1af80-75df-11eb-9f5b-978b74d84e0d.png)

SessionInfo()

```
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] tools stats graphics grDevices utils datasets methods base

other attached packages:
[1] rmarkdown_2.7 dplyr_1.0.0 vitae_0.4.2.9000

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6 highr_0.8 pillar_1.4.4 compiler_4.0.1 prettyunits_1.1.1 remotes_2.2.0 testthat_2.3.2 digest_0.6.25 pkgbuild_1.0.8 pkgload_1.1.0
[11] jsonlite_1.6.1 tibble_3.0.1 memoise_1.1.0 evaluate_0.14 lifecycle_0.2.0 pkgconfig_2.0.3 rlang_0.4.10 cli_2.0.2 rstudioapi_0.11 curl_4.3
[21] yaml_2.2.1 xfun_0.21 stringr_1.4.0 withr_2.2.0 knitr_1.29 hms_0.5.3 desc_1.2.0 generics_0.0.2 fs_1.4.1 vctrs_0.3.6
[31] devtools_2.3.0 tidyselect_1.1.0 rprojroot_2.0.2 glue_1.4.1 R6_2.4.1 processx_3.4.2 fansi_0.4.1 sessioninfo_1.1.1 readr_1.3.1 purrr_0.3.4
[41] callr_3.4.3 magrittr_1.5 ps_1.3.3 ellipsis_0.3.1 htmltools_0.5.0 usethis_1.6.1 assertthat_0.2.1 utf8_1.1.4 tinytex_0.24 stringi_1.4.6
[51] crayon_1.3.4
```

Pandoc Version

```
rmarkdown::pandoc_version()
[1] ‘2.11.2’
```

I appreciate this is a problem linked to the locale of my Windows 10.
I tried to change my locale to UTF-8, but without success

```
> Sys.getlocale()
[1] "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
```

```
Sys.setlocale("LC_CTYPE", "UTF-8")
[1] ""
Warning message:
In Sys.setlocale("LC_CTYPE", "UTF-8") :
  OS reports request to set locale to "UTF-8" cannot be honored
```

Suggestions?

sbac · April 14, 2021, 9:57am

@cderv Please see the edit to my post. Yes, it is the same problem: it happens on Windows 10, but not on macOS or Linux.

cderv · April 16, 2021, 11:02am

If it is Windows only, it may be linked to Windows not using UTF-8 by default, so that a file is not read with the correct encoding.
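To illustrate what "not read with the correct encoding" can look like, here is a minimal base R sketch. It assumes the reader decodes UTF-8 bytes as Latin-1 (which matches the Windows-1252 code page shown in the GitHub issue's locale for these particular characters). It is not a reproduction of what vitae does internally, only the general mechanism.

```r
# ASCII-only source: \u escapes avoid depending on this script's own encoding.
original <- enc2utf8("Concei\u00e7\u00e3o")

# Take the raw UTF-8 bytes and drop the encoding mark ...
bytes <- charToRaw(original)
unmarked <- rawToChar(bytes)

# ... then decode them as if they were Latin-1, as a mis-configured reader would.
garbled <- iconv(unmarked, from = "latin1", to = "UTF-8")

nchar(original)  # 9 characters
nchar(garbled)   # 11: each two-byte character became two separate characters
print(garbled)
```

If the garbled names in a knitted PDF show this kind of doubling, a UTF-8 file being decoded with a single-byte code page is a plausible cause.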

You should also make sure that all your files are UTF-8 encoded.
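One way to check this from R rather than by eye is sketched below. `is_utf8_file()` is a hypothetical helper written for this example (it is not part of vitae or rmarkdown); it only relies on base R's `readBin()`, `rawToChar()` and `validUTF8()`. The demo writes two small temporary files so that it runs on its own.

```r
# Hypothetical helper (not part of vitae): TRUE if the file's bytes are valid UTF-8.
is_utf8_file <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  validUTF8(rawToChar(bytes))
}

# Self-contained demo on two temporary files.
utf8_file <- tempfile(fileext = ".bib")
latin1_file <- tempfile(fileext = ".bib")

# A UTF-8 encoded fragment of a bib entry.
writeBin(charToRaw(enc2utf8("author={Concei\u00e7\u00e3o}")), utf8_file)
# The letters C, c-cedilla, a-tilde, o as single Latin-1 bytes (not valid UTF-8).
writeBin(as.raw(c(0x43, 0xe7, 0xe3, 0x6f)), latin1_file)

is_utf8_file(utf8_file)    # TRUE
is_utf8_file(latin1_file)  # FALSE
```

On the real files this would be `is_utf8_file("test.bib")` and the same call on the Rmd file. A `TRUE` result only says the bytes on disk are valid UTF-8; it does not say how a package decodes them when reading.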

If it is the same problem, I would follow the GitHub issue from here.

I can't do much more. Hope it helps.

sbac · April 16, 2021, 2:47pm

I tried changing the Windows default encoding to UTF-8 and also the language to Portuguese, but neither worked.
It is strange, because when I open test.bib with Notepad on Windows it displays correctly. The file is UTF-8 encoded.

cderv · April 16, 2021, 5:33pm

Seems like an issue with vitae then. I don't know how this package works so I can't help further. Sorry

jlacko · April 16, 2021, 6:05pm

Encoding is a pain - and the English-speaking guys have it easy.

I am not familiar with {vitae}, but two tricks that I found helpful when hacking character encoding in general are:

- using the Unicode escape function `stringi::stri_escape_unicode()` to encode characters that the current codepage has difficulty handling
- re-encoding characters from Unicode back to Unicode with `stringi::stri_encode(yer_string, from = 'UTF-8', to = 'UTF-8')` - this is a truly ugly hack, but it has worked for me when nothing else would...
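
A minimal sketch of both tricks on a plain character vector, assuming the {stringi} package is installed. How to apply them to the object returned by `bibliography_entries()` is not covered here, since that depends on vitae's internals; the variable names are only for illustration.

```r
library(stringi)

# \u escapes keep this script ASCII-only, so it does not depend on the
# encoding of the source file itself.
name <- "S\u00e9rgio Concei\u00e7\u00e3o"

# Trick 1: replace non-ASCII characters with \uXXXX escape sequences.
escaped <- stri_escape_unicode(name)
# stri_unescape_unicode() reverses the escaping.
restored <- stri_unescape_unicode(escaped)

# Trick 2: re-encode from UTF-8 to UTF-8; the text is unchanged, but the
# result is explicitly marked as UTF-8.
reencoded <- stri_encode(name, from = "UTF-8", to = "UTF-8")

print(escaped)
print(Encoding(reencoded))
```

The escaped form contains only ASCII characters, so it survives a round trip through a non-UTF-8 code page and can be unescaped afterwards.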
