CRAN package and non-Latin "annotations"

I am the developer of the ggwordcloud package, which uses ggplot2 as a backbone to produce "nice" wordclouds. I was contacted yesterday by the CRAN team because my package was "attempting to plot non-Latin annotations", which makes sense, as I am using UTF-8 encoded strings in various languages in the vignette.

The easiest fix would be to fall back to a demo wordcloud using only Latin characters... but I would really like to keep the current one, which highlights the beauty of the different writing systems. My question is thus: is there a clean way to produce a plot with ggplot using non-Latin characters that passes the CRAN check?

I am not entirely sure what that means, but I am fairly sure that non-Latin text is allowed in graphics.

Did you get an actual email from CRAN? Can you share the relevant parts of that email?

Or do you mean the "Note: found 203 marked UTF-8 strings" note on the "CRAN Package Check Results for Package ggwordcloud" page? That note is allowed; there is nothing to do about it.

Here is the cryptic mail I received, along with other package maintainers:

CRAN packages attempting to plot non-Latin annotations

Prof Brian Ripley

That is

    AMR ggwordcloud grwat moranajp vivainsights

Please do look at your package's check output, especially its -Ex.pdf
file and the warnings about encoding in the -Ex.Rout file.

How to plot such charsets on a pdf() device was covered many years ago
in the reference

Please correct before 2023-10-29 to safely retain the package on CRAN.

One of the maintainers, Martin Chan, asked yesterday for more information, so far without success...



If you run R CMD check on your package, go into the ggwordcloud.Rcheck directory and look at the ggwordcloud-Ex.pdf and ggwordcloud-Ex.Rout files. The PDF file is missing some fonts for me, and the .Rout file has a bunch of warnings:

Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,  :
  conversion failure on '愛' in 'mbcsToSbcs': dot substituted for <e6>
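
To see whether a local check run shows the same warnings, something like the following could be used. This is only a sketch: `find_conversion_warnings` is a helper name made up for this post, and the path assumes `R CMD check` was run from the current directory so that `ggwordcloud.Rcheck` exists there.

```r
# Hypothetical helper (not part of any package): return the lines of an
# -Ex.Rout file that mention a "conversion failure".
find_conversion_warnings <- function(rout_path) {
  lines <- readLines(rout_path, warn = FALSE)
  grep("conversion failure", lines, fixed = TRUE, value = TRUE)
}

# Assumed location of the check output; adjust to wherever the check ran.
rout <- file.path("ggwordcloud.Rcheck", "ggwordcloud-Ex.Rout")
if (file.exists(rout)) {
  print(find_conversion_warnings(rout))
}
```

An empty result would suggest that the examples no longer trigger the encoding warnings on that machine, although the result can differ between platforms.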

I don't know how to fix this, but maybe that R News article (Non-Standard Fonts in PostScript and PDF Graphics, page 41) has some clues.

Thank you. It seems that I need to include and register a font including those characters.
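
For reference, here is a minimal sketch of one approach in the spirit of that article: opening the `pdf()` device with one of the CID-keyed font families that R predefines (see `names(pdfFonts())`), here `"Japan1"`. The choice of family and the test character are assumptions made for illustration. As far as I understand, CID-keyed fonts are not embedded in the file, so the viewer needs a suitable font, and a single family covers only one script, which would not be enough for a multilingual wordcloud.

```r
# Sketch: draw a CJK character on a pdf() device that uses a CID-keyed family.
out <- tempfile(fileext = ".pdf")

pdf(out, family = "Japan1")  # "Japan1" is one of the predefined pdfFonts() entries
plot.new()
text(0.5, 0.5, "\u611b")     # U+611B, the character from the warning above
dev.off()
```

With ggplot2 the family would also have to reach the text layers, so this base-graphics example only shows the device side of the problem.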

I think I understand what is going on. I was using examples in the documentation involving non-Latin characters in (gg)plot. As is often the case with non-Latin characters, the rendering depends a lot on the device used... and the pdf() device, used to build the documentation PDF, has trouble with those characters. If I modify the examples to use only Latin characters, the "conversion failure" warnings disappear, and I now think they were the reason CRAN contacted me.
I will submit a new version as soon as possible to confirm my hypothesis.
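
The device dependence can be checked outside the package with a few lines of base graphics. The sketch below draws one CJK character on the default `pdf()` device and collects whatever warnings are raised; whether any appear, and their exact wording, depends on the platform and R version, so no particular output is claimed here.

```r
# Sketch: plot a non-Latin label on the default pdf() device and record warnings.
out <- tempfile(fileext = ".pdf")
msgs <- character()

withCallingHandlers(
  {
    pdf(out)
    plot.new()
    text(0.5, 0.5, "\u611b")  # U+611B, the character from the warning above
    dev.off()
  },
  warning = function(w) {
    # Store the message and keep going instead of printing it.
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)
```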

R does not run the examples for the documentation PDF. The documentation PDF is fine, unless it also contains figures with non-ASCII characters.

The other PDF is created by R CMD check and it includes the output of all plotting from your examples.

Indeed, if you only plot ASCII stuff in your examples, that is a workaround.
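
If it helps, a small check like the one below can confirm that the labels used in an example are pure ASCII before submitting. `is_ascii` is a helper name invented for this post, and the sample labels are placeholders.

```r
# Hypothetical helper: TRUE for strings whose code points are all below 128.
is_ascii <- function(x) {
  vapply(
    enc2utf8(x),
    function(s) all(utf8ToInt(s) < 128L),
    logical(1),
    USE.NAMES = FALSE
  )
}

labels <- c("love", "amour", "\u611b")
is_ascii(labels)
# TRUE TRUE FALSE
```

Any label flagged `FALSE` would then be replaced, or moved out of the examples that `R CMD check` runs.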

Thank you for the clarification! I think a "Latin characters only" policy in the examples will be enough to pass the CRAN check... even though the only difference is in this PDF, which is created solely for the check!

