However, it does not work and gives me the following error:
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running `options(sparklyr.log.console = TRUE)` followed by `sc <- spark_connect(...)` for more debugging info.
I set the sparklyr.log.console option, but the error message contains no additional information.
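A minimal sketch of this step, assuming a plain local connection (the arguments of the original call are not shown here, so the spark_connect() call below is only illustrative):

```r
library(sparklyr)

# Enable console logging before connecting, as the error message suggests.
options(sparklyr.log.console = TRUE)

# Assumption: a bare local connection; the real call may pass more arguments.
sc <- spark_connect(master = "local")
```

The option has to be set in the same R session before spark_connect() is called.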
I also tried installing Spark again through the package instead of using my SPARK_HOME. Unfortunately, that did not help either.
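A rough sketch of that attempt, assuming sparklyr's spark_install() with its default version and assuming the first locally installed version is the one to use:

```r
library(sparklyr)

# Download and install a local copy of Spark managed by sparklyr
# (default version; assumption, the version actually needed may differ).
spark_install()

# List the locally installed versions and their directories.
installed <- spark_installed_versions()
print(installed)

# Point the connection at the installed copy rather than SPARK_HOME.
# Assumption: the first row is the installation to use.
sc <- spark_connect(master = "local", spark_home = installed$dir[1])
```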
Is there something I am missing here? How can I get it to run? Ultimately, I need to run this on a YARN cluster. However, neither local nor YARN mode works right now.
It should be the Cloudera Spark. I added the spark_home parameter to make sure the right installation is used. However, omitting the spark_home parameter does not change anything.
It seems I am missing something that causes the gateway not to respond.
Ultimately, I need to run the script on a Hadoop YARN cluster (master = "yarn"). I tried master = "yarn" instead of local, but this does not help either.
The code above also gives me the gateway error. Unfortunately, I cannot get any more detail than this. The sparklyr.log.console option does not seem to have any effect. I also tried setting it as conf$sparklyr.log.console <- TRUE, but nothing happened.
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running `options(sparklyr.log.console = TRUE)` followed by `sc <- spark_connect(...)` for more debugging info.
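The config-based attempt described above can be sketched roughly as follows. The Spark home path is a hypothetical placeholder, not the real location on the cluster:

```r
library(sparklyr)

conf <- spark_config()

# Same logging option as before, set through the config object.
conf$sparklyr.log.console <- TRUE

# Hypothetical path; replace with the Cloudera Spark installation in use.
spark_home <- "/path/to/cloudera/spark"

# master = "yarn" for the cluster; master = "local" fails the same way for me.
sc <- spark_connect(master = "yarn", spark_home = spark_home, config = conf)
```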
The R session runs in a CDSW session if this matters.
The error now also says that the executor memory must be a positive number, which is odd because I set it to 2g:
Exception in thread "main" org.apache.spark.SparkException: Executor memory must be a positive number
at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:657)
at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:274)
at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:251)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:120)
at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$1.<init>(SparkSubmit.scala:909)
at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:909)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:922)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:931)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running `options(sparklyr.log.console = TRUE)` followed by `sc <- spark_connect(...)` for more debugging info.
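For reference, a minimal sketch of how an executor memory of 2g is usually passed through the sparklyr config. This assumes the standard Spark property spark.executor.memory is the key involved; the setting that actually triggered the exception may have been written differently:

```r
library(sparklyr)

conf <- spark_config()

# Memory is given as a string with a unit suffix, e.g. "2g".
# Assumption: spark.executor.memory is the property being set.
conf$spark.executor.memory <- "2g"

# Inspect the value that will be handed to spark-submit.
print(conf$spark.executor.memory)

sc <- spark_connect(master = "yarn", config = conf)
```

Printing the config entry before connecting shows whether the value is still the string "2g" at that point.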
Unfortunately, master = "local" still does not work at all.