Hi,
I am having trouble getting started with sparklyr and a local install of Spark on Windows 10. Any help is appreciated; I am new to Spark.
tl;dr: It looks like I am missing %SPARK_HOME%/launcher/target/scala-2.xx. Where does that come from?
All the details:
I installed sparklyr 0.9.4 from CRAN, then installed Spark 2.4.0 using
sparklyr::spark_install(version = "2.4.0")
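The install itself appears to succeed. As a sanity check, I confirmed what sparklyr thinks is installed (a quick sketch; the directory in the comment is just what it reports on my machine):
library(sparklyr)
# Lists the locally installed Spark builds and their directories;
# for me this shows Spark 2.4.0 / Hadoop 2.7 under
# C:\Users\kjohnson\AppData\Local\spark\spark-2.4.0-bin-hadoop2.7
spark_installed_versions()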
When I try to start Spark with
sc <- spark_connect(master = "local", version = "2.4.0")
I get this error:
Error in force(code) :
Failed while connecting to sparklyr to port (8880) for sessionid (37723): Gateway in localhost:8880 did not respond.
Path: C:\Users\kjohnson\AppData\Local\spark\spark-2.4.0-bin-hadoop2.7\bin\spark-submit2.cmd
Parameters: --class, sparklyr.Shell, "C:\Program Files\R\Library\sparklyr\java\sparklyr-2.3-2.11.jar", 8880, 37723
Log: C:\Users\KJOHNS~1.CAL\AppData\Local\Temp\RtmpuKTspW\file3fa013955648_spark.log
---- Output Log ----
---- Error Log ----
Calls: spark_connect ... tryCatchOne -> <Anonymous> -> abort_shell -> <Anonymous> -> force
In addition: Warning message:
In system2(spark_submit_path, args = shell_args, stdout = stdout_param, :
'CreateProcess' failed to run 'C:\Users\kjohnson\AppData\Local\spark\SPARK-~1.7\bin\SPARK-~1.CMD --class sparklyr.Shell "C:\Program Files\R\Library\sparklyr\java\sparklyr-2.3-2.11.jar" 8880 37723'
If I run the same command directly from the Windows command line, I get a somewhat more informative error:
C:\> C:\Users\kjohnson\AppData\Local\spark\spark-2.4.0-bin-hadoop2.7\bin\spark-submit2.cmd --class sparklyr.Shell "C:\Program Files\R\Library\sparklyr\java\sparklyr-2.3-2.11.jar" 8880 76708
Exception in thread "main" java.lang.IllegalStateException: Cannot find any build directories.
at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248)
at org.apache.spark.launcher.AbstractCommandBuilder.getScalaVersion(AbstractCommandBuilder.java:242)
at org.apache.spark.launcher.AbstractCommandBuilder.buildClassPath(AbstractCommandBuilder.java:196)
at org.apache.spark.launcher.AbstractCommandBuilder.buildJavaCommand(AbstractCommandBuilder.java:117)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:261)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:164)
at org.apache.spark.launcher.Main.buildCommand(Main.java:110)
at org.apache.spark.launcher.Main.main(Main.java:63)
Deleting C:\Users\KJOHNS~1.CAL\AppData\Local\Temp\spark-class-launcher-output-26749.txt
1 file deleted
Looking at the AbstractCommandBuilder source code named in the stack trace, it appears that getScalaVersion() is looking for a directory at either %SPARK_HOME%/launcher/target/scala-2.12 or %SPARK_HOME%/launcher/target/scala-2.11.
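If I am reading getScalaVersion() correctly, those directories are only probed when the SPARK_SCALA_VERSION environment variable is unset, which seems to be the case in my session:
# getScalaVersion() checks this variable before probing
# launcher/target/scala-2.xx; it comes back empty ("") for me
Sys.getenv("SPARK_SCALA_VERSION")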
My SPARK_HOME is C:\Users\kjohnson\AppData\Local\spark\spark-2.4.0-bin-hadoop2.7\bin\.. and there is no launcher directory there.
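To make sure I am not misreading the layout, here is roughly how I checked from R (the path is hard-coded to my install; %LOCALAPPDATA% resolves to C:\Users\kjohnson\AppData\Local here):
spark_home <- file.path(Sys.getenv("LOCALAPPDATA"), "spark",
                        "spark-2.4.0-bin-hadoop2.7")
# Top-level contents of the binary distribution: bin, conf, jars, etc.,
# but no launcher directory
list.files(spark_home)
# Neither of the directories the launcher probes for exists
dir.exists(file.path(spark_home, "launcher", "target", "scala-2.11"))
dir.exists(file.path(spark_home, "launcher", "target", "scala-2.12"))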
So, is there something else I need to install? Am I doing something wrong?
I retried, specifying Spark version 2.3.2 for spark_install() and spark_connect(), with the same result.
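That is, essentially:
# Same failure mode with 2.3.2 as with 2.4.0
sparklyr::spark_install(version = "2.3.2")
sc <- spark_connect(master = "local", version = "2.3.2")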
Thanks for any help!!
Kent