Dear all,
We have production and development servers (with the same configuration) that had been running a Shiny application (under Shiny Server Pro) for a few years without any issues; the problems started after upgrading from R 3.4 to R 3.5.0.
We are seeing segfault errors on both servers; e.g., in /var/log/messages we see lines such as:
Sep 14 13:41:33 server kernel: R[20882]: segfault at 1 ip 00007f462caa0120 sp 00007ffe15b00aa0 error 4 in libR.so[7f462c903000+407000]
Sometimes, on the production system (which is behind a firewall) but not on the development server, we see messages about the C stack being close to the limit; e.g., in the /var/log/shiny-server/*.log files we sometimes see lines like:
Loading required package: rcdk
Loading required package: rcdklibs
Loading required package: rJava
Loading required package: fingerprint
Loading required package: rcellminerData
Consider citing this package: Luna A, et al. rcellminer: exploring molecular profiles and drug response of the NCI-60 cell lines in R. PMID: 26635141; citation("rcellminer")
Error: C stack usage 186609093604 is too close to the limit
Execution halted
Error in exists(name, envir = ld, inherits = FALSE) :
not a BUILTIN function
Calls: <Anonymous> -> cleanup -> :: -> getExportedValue
Warning: stack imbalance in 'lazyLoadDBfetch', 12 then -8
Fatal error: error during cleanup
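For what it is worth, the reported stack usage (186609093604 bytes, about 187 GB) is far larger than any plausible stack size, which makes us suspect R's stack bookkeeping is being confused rather than that we have a genuinely deep recursion. As a first diagnostic we could add something like the following (base R only; see ?Cstack_info) to the app start-up, to compare the stack limit R sees under shiny-server with what it sees in an interactive session:

    ## Print the C stack limit and current usage as seen by this R session,
    ## to compare the shiny-server environment with an interactive one.
    info <- Cstack_info()
    print(info)  # named vector: size, current, direction, eval_depth
    if (!is.na(info[["size"]])) {
      cat(sprintf("C stack: %.0f of %.0f bytes used (%.1f%%)\n",
                  info[["current"]], info[["size"]],
                  100 * info[["current"]] / info[["size"]]))
    }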
When we get the above, our website becomes unreachable, and there is always a job running at about 200% of a CPU. Killing that job restores access to the website. It appears (and we may be wrong about this) that if we simply kill the job, it isn't long before the website becomes unreachable again and another job at 200% must be killed, whereas if we kill the job and then restart shiny-server, it seems to take much longer before the problem recurs. That said, restarting shiny-server may have no real effect on how soon the problem recurs; the timing may simply depend on what people are running at the time.
Both servers are running CentOS 6.10 (Linux 2.6.32-696.23.1.el6.x86_64). They both have the same shiny-server configuration; however, the production system uses the XHR-streaming protocol and the development one uses the WebSocket protocol.
It seems we may have some kind of memory leak; however, we do not have any unbounded recursive calls.
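To test the leak hypothesis, one thing we are considering is logging R's memory use from inside the app, so that a leak would show up as steady growth in the shiny-server log. A minimal sketch (the interval and message format are just illustrative) would be:

    library(shiny)

    server <- function(input, output, session) {
      ## Log R's heap use once a minute; message() writes to stderr,
      ## which shiny-server captures in its per-app log file.
      observe({
        invalidateLater(60 * 1000)  # re-run every 60 seconds
        mem <- gc()                 # matrix: Ncells/Vcells x used/trigger/max
        message(sprintf("heap used: %.1f Mb (Ncells) + %.1f Mb (Vcells)",
                        mem[1, 2], mem[2, 2]))
      })
      ## ... actual server logic ...
    }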
Note that after changing from R 3.4 to R 3.5.0 we did reinstall all of the R packages, including our internal packages, from source, so we are not still using packages that were installed under R 3.4. Our internal packages were developed on a macOS machine and installed on the servers from source with the options --preclean, --clean, and --resave-data. Our packages have R and data folders, with lazy loading requested for the data. The largest data object that could be loaded is about 39 MB.
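As a sanity check that no stale 3.4-era builds survived (we understand that packages built under an older R can cause exactly this kind of lazy-loading error on R 3.5.0), we could run something like this on both servers:

    ## List any installed package that was not rebuilt under R 3.5.x;
    ## installed.packages() reports the R version each package was built under.
    ip  <- installed.packages()
    old <- ip[package_version(ip[, "Built"]) < "3.5.0",
              c("Package", "Built"), drop = FALSE]
    print(old)  # should come back empty if everything was rebuilt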
Do you have any ideas or hints about this issue?
Thank you!