SparklyR/Spark crashing

I would like to set up RStudio with sparklyr on a single Linux node. The dev team is three to four people.
I am aware of the constraints of a single-node Spark cluster. I want to know what happens when more users start using sparklyr in RStudio and Spark runs short of resources. Ideally it would hold the new jobs in a queue rather than crash.
Could you share the best way to allocate Spark's resources in this situation?

Hi @deekbakk, welcome to Community! When a dev begins a Spark session, the session requests a specific amount of memory and a number of CPU cores. If these are not specified, Spark assumes the defaults for the given version of Spark. This means that if no resources are left after, for example, 2 devs are already on the server, then the 3rd dev will not be able to start a session; the connection will time out. Here is a resource that may help with requesting resources from a Standalone cluster: sparklyr - Configuring Spark Connections
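
As a rough sketch of what an explicit request could look like, each dev might cap their own session so that resources remain for the others. The master URL and the memory and core values below are assumptions for illustration only; size them to the node's actual RAM and core count divided by the number of concurrent users.

```r
library(sparklyr)

# Start from the default sparklyr configuration and override the resource settings.
conf <- spark_config()

# Assumed values: adjust to what the node can spare per developer.
conf$spark.executor.memory <- "4G"  # memory per executor for this session
conf$spark.executor.cores  <- 2     # cores per executor
conf$spark.cores.max       <- 2     # upper limit on total cores this session may claim

# Hypothetical master URL for a Standalone cluster running on the same node.
sc <- spark_connect(master = "spark://localhost:7077", config = conf)

# ... run the analysis ...

# Disconnect when finished so the cores and memory are released for the next dev.
spark_disconnect(sc)
```

Setting `spark.cores.max` matters most here: on a Standalone cluster, a session that leaves it unset can claim every available core unless the cluster-wide default has been changed.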

Hi @edgararuiz ,

Apologies for the delayed response, and thanks a lot for your suggestion. Noted and understood. I will check the configuration link you shared and keep you posted if I have any further questions.

