CRAN package and non latin "annotations"

Erwan_Le_Pennec · October 16, 2023, 5:43am

Dear all,

I am the developer of the ggwordcloud package (https://cran.r-project.org/web/packages/ggwordcloud/index.html), which uses ggplot2 as a backbone to produce "nice" wordclouds. I was contacted yesterday by the CRAN team because my package was "attempting to plot non-Latin annotations", which makes sense as I am using UTF-8 encoded strings in various languages in the vignette.

The easiest fix would be to fall back to a demo wordcloud using only Latin characters... but I would really like to keep the current one, which stresses the beauty of the different ways of writing. My question is thus: is there a clean way to produce a plot with ggplot2 using non-Latin characters that passes the CRAN check?

Thank you,

Erwan

Gabor · October 16, 2023, 6:54am

I am not entirely sure what that means, but I am fairly sure that non-latin text is allowed in graphics.

Did you get an actual email from CRAN? Can you share the relevant parts of that email?

Or do you mean the "Note: found 203 marked UTF-8 strings" note on the "CRAN Package Check Results for Package ggwordcloud" page? That note is allowed; there is nothing to do about it.

Erwan_Le_Pennec · October 17, 2023, 7:52am

Here is the cryptic email that I received, along with other package maintainers:

CRAN packages attempting to plot non-Latin annotations

Prof Brian Ripley ripley@stats.ox.ac.uk

That is

    AMR ggwordcloud grwat moranajp vivainsights

Please do look at your package's check output, especially its -Ex.pdf
file and the warnings about encoding in the -Ex.Rout file.

How to plot such charsets on a pdf() device was covered many years ago
in the reference https://www.r-project.org/doc/Rnews/Rnews_2006-2.pdf.

Please correct before 2023-10-29 to safely retain the package on CRAN.

One of the maintainers, Martin Chan, asked yesterday for more information, so far without success...

Yours,

Erwan

Gabor · October 17, 2023, 9:14am

If you run R CMD check on your package, go into the ggwordcloud.Rcheck directory and look at the ggwordcloud-Ex.pdf and ggwordcloud-Ex.Rout files. The PDF file is missing some fonts for me, and the .Rout file has a bunch of warnings:

Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,  :
  conversion failure on '愛' in 'mbcsToSbcs': dot substituted for <e6>

I don't know how to fix this, but maybe that R News article (Non-Standard Fonts in PostScript and PDF Graphics, page 41) has some clues.
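To see quickly whether a check run produced these warnings, the -Ex.Rout file can be scanned from R. This is only a minimal sketch: `find_conversion_failures` is a hypothetical helper name, and the path assumes the check was run from the directory that contains ggwordcloud.Rcheck.

```r
# Hypothetical helper (not part of R or ggwordcloud): return the lines of an
# -Ex.Rout file that mention an encoding conversion failure.
find_conversion_failures <- function(rout_path) {
  lines <- readLines(rout_path, warn = FALSE)
  # useBytes = TRUE avoids problems with bytes that are invalid in the
  # current locale; the search pattern itself is plain ASCII.
  grep("conversion failure", lines, fixed = TRUE, useBytes = TRUE, value = TRUE)
}

# Assumed location of the check output; adjust to your own setup.
rout <- file.path("ggwordcloud.Rcheck", "ggwordcloud-Ex.Rout")
if (file.exists(rout)) {
  print(find_conversion_failures(rout))
}
```

An empty result would suggest that the examples no longer trigger the 'mbcsToSbcs' warnings on the pdf() device.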

Erwan_Le_Pennec · October 17, 2023, 9:16am

Thank you. It seems that I need to include and register a font including those characters.
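For reference, a minimal sketch of what using such a font on the pdf() device can look like in base graphics. It assumes the built-in CID font family "Japan1" (one of the families listed by `pdfFonts()`) is acceptable; this family covers Japanese glyphs only, CID fonts are not embedded in the file, so the viewer needs a matching font, and I have not confirmed that this is enough for a ggplot2 example to pass the CRAN check.

```r
# Sketch only: draw one Japanese character on a pdf() device that uses the
# built-in CID font family "Japan1" instead of the default Latin font.
out <- tempfile(fileext = ".pdf")

pdf(out, family = "Japan1")
plot.new()
text(0.5, 0.5, "\u611b", cex = 4)  # U+611B, the character from the warning
invisible(dev.off())

# Other CID families shipped with R can be inspected with names(pdfFonts()).
```

Since R CMD check opens its own pdf() device for the examples, the font family would presumably have to be set inside the plot itself rather than in a `pdf()` call, which is an assumption I have not tested.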

Erwan_Le_Pennec · October 17, 2023, 12:36pm

I think I understand what is going on. I was using examples in the documentation involving non-Latin characters in (gg)plot. As often with non-Latin characters, the rendering depends a lot on the device used... and the pdf() one, used to build the documentation PDF, has trouble with those non-Latin characters. If I modify the examples to use only Latin characters, I remove the "conversion failure" warnings, which I now think were the reason CRAN contacted me.
I will submit a new version as soon as possible to confirm my hypothesis.

Gabor · October 17, 2023, 12:55pm

R does not run the examples for the documentation PDF. The documentation PDF is fine, unless it also contains figures with non-ASCII characters.

The other PDF is created by R CMD check and it includes the output of all plotting from your examples.

Indeed, if you only plot ASCII stuff in your examples, that is a workaround.
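A minimal sketch of that workaround in base R, assuming the example data is a character vector of words; `is_ascii` and the sample `words` are hypothetical names used only for illustration.

```r
# Hypothetical example data: two ASCII words, one Japanese, one Russian.
words <- c("love", "amour", "\u611b", "\u043b\u044e\u0431\u043e\u0432\u044c")

# TRUE for strings whose code points are all below 128, i.e. pure ASCII.
is_ascii <- function(x) {
  vapply(x, function(w) all(utf8ToInt(w) < 128L), logical(1), USE.NAMES = FALSE)
}

# Keep only the words that are safe to plot on the default pdf() device.
ascii_words <- words[is_ascii(words)]
print(ascii_words)
```

The examples would then plot `ascii_words`, while the vignette, which is not rendered through the check's pdf() device, could keep the full multilingual list.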

Erwan_Le_Pennec · October 17, 2023, 1:57pm

Thank you for the clarification! I think the "no non-Latin characters" policy in the examples will work to pass the CRAN check... even if the only difference is in this PDF, which is created only for the check!
