I am trying to use Spark as the Hive execution engine, but I get the error below. I have Spark 1.5.0 installed and am using Hive 1.1.0 with Hadoop 2.7.0.
The Hive_emp table is created in Hive as an ORC-format table.
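For reference, a table like this can be created with a statement along the following lines (a sketch only; the exact DDL was not shared, and the column names and types are taken from the desc Hive_emp output further below):
hive (Koushik)> CREATE TABLE Hive_emp (
              >   empid  INT,
              >   empnm  VARCHAR(50),
              >   deptid INT
              > ) STORED AS ORC;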
hive (Koushik)> insert into table Hive_emp values (2,'Koushik',1);
Query ID = hduser_20150921072727_feba8363-258d-4d0b-8976-662e404bca88
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.generateSparkConf(HiveSparkClientFactory.java:140)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:56)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:116)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:113)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:95)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkConf
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 25 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. org/apache/spark/SparkConf
I have also set the spark path and the execution engine in the Hive shell.
hduser@ubuntu:~$ spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.0
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
SQL context available as sqlContext.
scala> exit;
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
hduser@ubuntu:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/auxlib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/auxlib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j.properties
hive (default)> use Koushik;
OK
Time taken: 0.593 seconds
hive (Koushik)> set spark.home=/usr/local/src/spark;
I have also created a .hiverc as shown below:
hduser@ubuntu:/usr/lib/hive/conf$ cat .hiverc
SET hive.cli.print.header=true;
set hive.cli.print.current.db=true;
set hive.auto.convert.join=true;
SET hbase.scan.cacheblock=0;
SET hbase.scan.cache=10000;
SET hbase.client.scanner.cache=10000;
SET hive.execution.engine=spark;
Details of the error in DEBUG mode are shown below:
hduser@ubuntu:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/auxlib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/auxlib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j.properties
hive (default)> use Koushik;
OK
Time taken: 0.625 seconds
hive (Koushik)> set hive --hiveconf hive.root.logger=DEBUG
              > ;
hive (Koushik)> set hive.execution.engine=spark;
hive (Koushik)> desc Hive_emp;
OK
col_name data_type comment
empid int
empnm varchar(50)
deptid int
Time taken: 0.173 seconds, Fetched: 3 row(s)
hive (Koushik)> select * from Hive_emp;
OK
hive_emp.empid hive_emp.empnm hive_emp.deptid
Time taken: 1.689 seconds
hive (Koushik)> insert into table Hive_emp values (2,'Koushik',1);
Query ID = hduser_20151015112525_c96a458b-34f8-42ac-ab11-52c32479a29a
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.lang.NoSuchMethodError: org.apache.spark.scheduler.LiveListenerBus.addListener(Lorg/apache/spark/scheduler/SparkListener;)V
at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.<init>(LocalHiveSparkClient.java:85)
at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.getInstance(LocalHiveSparkClient.java:69)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:56)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:116)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:113)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:95)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. org.apache.spark.scheduler.LiveListenerBus.addListener(Lorg/apache/spark/scheduler/SparkListener;)V
hive (Koushik)>
I executed the above insert twice and it failed both times. Please find the hive.log generated today: hive.log
I was facing the same issue on my Ubuntu 14.04 VirtualBox as well. Here are the steps I followed to fix it:
hive> set spark.home=/usr/local/spark;
hive> set spark.master=local;
hive> SET hive.execution.engine=spark;
Add the spark-assembly jar file as shown below:
hive> ADD jar /usr/local/spark/lib/spark-assembly-1.4.0-hadoop2.6.0.jar;
The reason for this error is that Hive is not able to find the spark assembly jar.
Export SPARK_HOME=/usr/local/src/spark, or add the spark assembly jar to the Hive lib folder; that will resolve the issue.
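A minimal sketch of those two fixes, assuming the paths already mentioned in this question (/usr/local/src/spark as the Spark installation and /usr/lib/hive as the Hive home); the assembly jar file name must match the Spark build that is actually installed:
# Option 1: let Hive pick up the Spark installation via SPARK_HOME
export SPARK_HOME=/usr/local/src/spark
# Option 2: copy (or symlink) the spark assembly jar into Hive's lib directory
cp /usr/local/src/spark/lib/spark-assembly-1.5.0-hadoop2.6.0.jar /usr/lib/hive/lib/
After restarting the Hive CLI, set hive.execution.engine=spark again and re-run the insert.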