I am trying to connect to Spark from RStudio. We are currently using the Cloudera Hadoop distribution, which runs Spark 2.2. I tested everything from the edge node and was able to create a Spark context and execute my queries. Everything worked fine from RStudio until yesterday, when we suddenly started having issues.
library(dplyr)
library(sparklyr)
config <- spark_config()
config$spark.driver.memory <- "8G"
config$spark.executor.memory <- "8G"
config$spark.executor.instances <- "2"
config$spark.executor.cores <- "4"
config$spark.kryoserializer.buffer.max <- "2000m"
config$spark.driver.maxResultSize <- "4G"
config$spark.akka.frameSize <- "768"
sc <- spark_connect(master = "yarn-client",
                    version = "2.2.0",
                    config = config,
                    spark_home = '/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2')
Error in force(code) :
  Failed while connecting to sparklyr to port (8880) for sessionid (14727): Sparklyr gateway did not respond while retrieving ports information after 60 seconds
    Path: /opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2/bin/spark-submit
    Parameters: --class, sparklyr.Shell, '/usr/lib64/R/library/sparklyr/java/sparklyr-2.2-2.11.jar', 8880, 14727
    Log: /tmp/RtmpoNJQEH/file151b437c0313b_spark.log

---- Output Log ----
18/11/12 13:54:50 INFO sparklyr: Session (14727) is starting under 127.0.0.1 port 8880
18/11/12 13:54:50 INFO sparklyr: Session (14727) found port 8880 is not available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) found port 8884 is available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is registering session in gateway
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is waiting for registration in gateway

---- Error Log ----
I also verified the sparklyr version; it is 0.9.2.
Can someone please let me know what could be wrong?