Hello,
We are trying to connect RStudio to Hive using the following code:
install.packages("rJava")
install.packages("RJDBC",dep=TRUE)
options( java.parameters = "-Xmx8g" )
library("DBI")
library("rJava")
library("RJDBC")
cp = c("/usr/hdp/current/hive-client/lib/hive-jdbc.jar",
"/usr/hdp/current/hadoop-client/hadoop-common.jar")
.jinit(classpath=cp)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
"/usr/hdp/current/hive-client/lib/hive-jdbc.jar",
identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2:<SERVER_NAME>", "user", "pass")
show_databases <- dbGetQuery(conn, "show databases")
show_databases
Currently I get "java.lang.NoClassDefFoundError: org/apache/thrift/TException". Previously, after reinstalling the packages, I got "java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback".
This looks like a required library not being on the classpath, which I am currently debugging. Is there another way to connect RStudio to Hive, or has anyone encountered a similar problem?
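One way to narrow down a NoClassDefFoundError like the ones above is to check which jar, if any, actually contains the missing class. The sketch below is not part of the original code: `jars_containing` is a hypothetical helper, and the directory is assumed from the paths already used in this post. It relies only on base R, reading each jar's index with `unzip(list = TRUE)`.

```r
# Hypothetical helper: return the jars (from a vector of jar paths) that
# contain a given class. Jars are zip archives, so base R can list them.
jars_containing <- function(jars, class_name) {
  # "org.apache.thrift.TException" -> "org/apache/thrift/TException.class"
  entry <- paste0(gsub(".", "/", class_name, fixed = TRUE), ".class")
  has_entry <- vapply(jars, function(jar) {
    # Unreadable or corrupt archives are treated as not containing the class
    names_in_jar <- tryCatch(unzip(jar, list = TRUE)$Name,
                             error = function(e) character(0))
    entry %in% names_in_jar
  }, logical(1))
  jars[has_entry]
}

# Assumed HDP layout, taken from the paths used in this thread
jars <- list.files("/usr/hdp/current/hive-client/lib",
                   pattern = "\\.jar$", full.names = TRUE)
print(jars_containing(jars, "org.apache.thrift.TException"))
```

If the result is empty, the class probably lives in a jar outside that directory, and that jar's location would also need to be in `cp` before `.jinit()` is called.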
Thanks!
Another option would be to use ODBC instead of JDBC. Here are a couple of links that may be of help on how to set that up:
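As a rough illustration of the ODBC route, the sketch below uses the DBI and odbc packages. It assumes a Hive ODBC driver is already installed and a data source has been configured on the machine; the DSN name "Hive" and the helper `connect_hive_odbc` are placeholders, not something taken from this thread.

```r
# Hypothetical helper: open a Hive connection through an ODBC data source.
# Assumes the DBI and odbc packages are installed and that a DSN with this
# name has been configured (the name "Hive" is a placeholder).
connect_hive_odbc <- function(dsn = "Hive") {
  DBI::dbConnect(odbc::odbc(), dsn = dsn)
}

# Example usage (uncomment once the DSN is configured):
# conn <- connect_hive_odbc()
# DBI::dbGetQuery(conn, "show databases")
# DBI::dbDisconnect(conn)
```

Calling `odbc::odbcListDataSources()` should show which DSNs are visible to R, which can help confirm the data source name before connecting.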
Thanks edgaruiz, I'll try that out.
This was fixed by having the user first run kinit (we're using Kerberos) and then executing the following code:
# install.packages("DBI")
# install.packages("rJava")
# install.packages("RJDBC",dep=TRUE)
# install.packages("odbc")
library(DBI)
library(rJava)
library(RJDBC)
print("Attempting Hive Connection...")
# Collect every jar under the Hadoop and Hive client directories.
hadoop.class.path <- list.files(path = "/usr/hdp/current/hadoop-client", pattern = "\\.jar$", full.names = TRUE)
hive.class.path <- list.files(path = "/usr/hdp/current/hive-client/lib", pattern = "\\.jar$", full.names = TRUE)
hadoop.lib.path <- list.files(path = "/usr/hdp/current/hadoop/lib", pattern = "\\.jar$", full.names = TRUE)
mapred.class.path <- list.files(path = "/usr/hdp/current/hadoop-mapreduce-client/lib", pattern = "\\.jar$", full.names = TRUE)
# hadoop-common.jar lives in hadoop-client, so hadoop.class.path already covers it.
cp <- c(hive.class.path, hadoop.lib.path, mapred.class.path, hadoop.class.path)
.jinit(classpath=cp)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver","/usr/hdp/current/hive-client/lib/hive-jdbc.jar",identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2://JDBC-provided-by-ambari-server")
show_databases <- dbGetQuery(conn, "show databases")
print("Connected.")
print(show_databases)
Hopefully this helps someone else.
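Since the fix depended on running kinit first, it may be worth checking for a ticket from R before attempting the connection. This is only a sketch under assumptions: `has_kerberos_ticket` is a hypothetical helper, and it assumes a `klist` command on the PATH that supports the `-s` (silent, exit status only) flag, as MIT Kerberos does.

```r
# Hypothetical helper: TRUE if klist reports a valid Kerberos ticket cache.
# Assumes `klist -s` is available (silent mode, result given by exit status).
has_kerberos_ticket <- function() {
  if (!nzchar(Sys.which("klist"))) {
    return(FALSE)  # klist not installed or not on the PATH
  }
  status <- suppressWarnings(
    system2("klist", "-s", stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

if (!has_kerberos_ticket()) {
  message("No valid Kerberos ticket found; run kinit before connecting.")
}
```

Running this check before `dbConnect()` gives a clearer message than a Java stack trace when the ticket is missing or has expired.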
Closed by system on June 19, 2019, 1:14pm.
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.