CRAN package and non latin "annotations"

Dear all,

I am the developer of the ggwordcloud package (https://cran.r-project.org/web/packages/ggwordcloud/index.html), which uses ggplot2 as a backbone to produce "nice" wordclouds. I was contacted yesterday by the CRAN team because my package was "attempting to plot non-Latin annotations", which makes sense as I am using UTF-8 encoded strings in various languages in the vignette.

The easiest fix would be to fall back to a demo wordcloud using only Latin characters... but I would really like to keep the current one, which shows off the beauty of the different writing systems. My question is thus: is there a clean way to produce a plot with ggplot using non-Latin characters that passes the CRAN check?

Thank you,

Erwan

I am not entirely sure what that means, but I am fairly sure that non-latin text is allowed in graphics.

Did you get an actual email from CRAN? Can you share the relevant parts of that email?

Or do you mean the "Note: found 203 marked UTF-8 strings" note on CRAN at CRAN Package Check Results for Package ggwordcloud? That note is allowed, there is nothing to do with it.

Here is the cryptic mail I received, along with other package maintainers:

CRAN packages attempting to plot non-Latin annotations

Prof Brian Ripley ripley@stats.ox.ac.uk

That is

    AMR ggwordcloud grwat moranajp vivainsights

Please do look at your package's check output, especially its -Ex.pdf
file and trhe warnings about encoding in the -Ex.Rout file.

How to plot such charsets on a pdf() device was covered many years ago
in the reference https://www.r-project.org/doc/Rnews/Rnews_2006-2.pdf.

Please correct before 2023-10-29 to safely retain the package on CRAN.

One of the maintainers, Martin Chan, asked yesterday for more information, so far without success...

Yours,

Erwan

If you run R CMD check on your package, go into the ggwordcloud.Rcheck directory and look at the ggwordcloud-Ex.pdf and ggwordcloud-Ex.Rout files. The PDF file is missing some fonts for me, and the .Rout file has a bunch of warnings:

Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,  :
  conversion failure on '愛' in 'mbcsToSbcs': dot substituted for <e6>

I don't know how to fix this, but maybe that R News article (Non-Standard Fonts in PostScript and PDF Graphics, page 41) has some clues.
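In case it helps to locate these warnings quickly, here is a minimal sketch in R that scans an -Ex.Rout file for "conversion failure" lines. The helper name find_conversion_failures is hypothetical, and the demonstration writes a small temporary file that mimics the output above (with an ASCII placeholder character) instead of reading a real check directory.

```r
# Hypothetical helper (not part of R or ggwordcloud): return every line of an
# -Ex.Rout file that reports a failed character conversion.
find_conversion_failures <- function(rout_path) {
  lines <- readLines(rout_path, warn = FALSE, encoding = "UTF-8")
  grep("conversion failure on", lines, value = TRUE, fixed = TRUE)
}

# Demonstration on a temporary file that imitates the check output.
# The 'x' stands in for the non-Latin character reported in the real warning.
demo_rout <- tempfile(fileext = ".Rout")
writeLines(c(
  "> # some example code",
  "Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,  :",
  "  conversion failure on 'x' in 'mbcsToSbcs': dot substituted for <e6>"
), demo_rout)

failures <- find_conversion_failures(demo_rout)
print(failures)
```

On a real package you would point it at the file inside the check directory, for example find_conversion_failures("ggwordcloud.Rcheck/ggwordcloud-Ex.Rout").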

Thank you. It seems that I need to include and register a font including those characters.
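As a rough illustration of that idea, here is a minimal sketch that opens a pdf() device with one of the CID font families that grDevices predefines for CJK text ("Japan1") and draws a single Japanese character with base graphics. This is an assumption about what a fix could look like, not something confirmed by CRAN, and a single CID family only covers one script, so it would probably not be enough for a wordcloud mixing many languages.

```r
# "Japan1" is one of the CID font families predefined for the pdf() device;
# stop early if this R build does not list it.
stopifnot("Japan1" %in% names(grDevices::pdfFonts()))

out_file <- tempfile(fileext = ".pdf")

# Open the PDF device with the CID family as its default font family.
grDevices::pdf(out_file, family = "Japan1")
plot.new()
# "\u611b" is the character from the warning above, written as an escape
# so that the script itself stays ASCII.
text(0.5, 0.5, "\u611b", cex = 4)
invisible(grDevices::dev.off())

print(file.exists(out_file))
```

The resulting PDF references the font rather than embedding it, so whether the glyph actually displays depends on the fonts available to the viewer.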

I think I understand what is going on. I was using examples in the documentation involving non-Latin characters in (gg)plot. As often with non-Latin characters, the rendering depends a lot on the device used... and the pdf() one, used to build the documentation PDF, has trouble with those characters. If I modify the examples to use only Latin characters, the "conversion failure" warnings disappear, and I now think those warnings were the reason CRAN contacted me.
I will submit a new version as soon as possible to confirm my hypothesis.

R does not run the examples for the documentation PDF. The documentation PDF is fine, well, unless you also have figures with non-ascii characters in it.

The other PDF is created by R CMD check and it includes the output of all plotting from your examples.

Indeed, if you only plot ASCII stuff in your examples, that is a workaround.
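If you go that way, a small check like the one below can confirm that the labels used in an example are ASCII only. It is just a sketch: is_ascii is a hypothetical helper name, and the labels vector is made-up demo data, not the ggwordcloud dataset.

```r
# Hypothetical helper: TRUE for strings whose code points are all below 128.
is_ascii <- function(x) {
  vapply(
    enc2utf8(x),
    function(s) all(utf8ToInt(s) < 128L),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Made-up labels; "\u611b" is a non-Latin character written as an escape.
labels <- c("love", "\u611b", "amour")

# Keep only the labels that the default pdf() device can render safely.
ascii_only <- labels[is_ascii(labels)]
print(ascii_only)
```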

Thank you for the clarification! I think the "no non-Latin characters" policy in the examples will work to pass the CRAN check... even if the only difference is in this PDF, which is created only for the check!
