
Spark and Java: Exception thrown in awaitResult

I am trying to connect from a Java application to a Spark cluster running inside a virtual machine at IP _10.20.30.50_ and port _7077_, and run the word count example:

// Connect to the standalone master running in the virtual machine
SparkConf conf = new SparkConf().setMaster("spark://10.20.30.50:7077").setAppName("wordCount");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> textFile = sc.textFile("hdfs://localhost:8020/README.md");
String result = Long.toString(textFile.count());
// Split each line into words, pair every word with 1, then sum the counts per word
JavaRDD<String> words = textFile.flatMap((FlatMapFunction<String, String>) s -> Arrays.asList(s.split(" ")).iterator());
JavaPairRDD<String, Integer> pairs = words.mapToPair((PairFunction<String, String, Integer>) s -> new Tuple2<>(s, 1));
JavaPairRDD<String, Integer> counts = pairs.reduceByKey((Function2<Integer, Integer, Integer>) (a, b) -> a + b);
counts.saveAsTextFile("hdfs://localhost:8020/tmp/output");
sc.stop();
return result;

The Java application prints the following log output and stack trace:

Running Spark version 2.0.1
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Changing view acls to: lii5ka
Changing modify acls to: lii5ka
Changing view acls groups to:
Changing modify acls groups to:
SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(lii5ka); groups with view permissions: Set(); users  with modify permissions: Set(lii5ka); groups with modify permissions: Set()
Successfully started service 'sparkDriver' on port 61267.
Registering MapOutputTracker
Registering BlockManagerMaster
Created local directory at /private/var/folders/4k/h0sl02993_99bzt0dzv759000000gn/T/blockmgr-51de868d-3ba7-40be-8c53-f881f97ced63
MemoryStore started with capacity 2004.6 MB
Registering OutputCommitCoordinator
Logging initialized @48403ms
jetty-9.2.z-SNAPSHOT
Started o.s.j.s.ServletContextHandler@1316e7ec{/jobs,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@782de006{/jobs/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@2d0353{/jobs/job,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@381e24a0{/jobs/job/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@1c138dc8{/stages,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@b29739c{/stages/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@63f6de31{/stages/stage,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@2a04ddcb{/stages/stage/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@2af9688e{/stages/pool,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@6a0c5bde{/stages/pool/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@3f5e17f8{/storage,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@33b86f5d{/storage/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@5264dcbc{/storage/rdd,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@5a3ebf85{/storage/rdd/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@159082ed{/environment,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@6522c585{/environment/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@115774a1{/executors,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@3e3a3399{/executors/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@2f2c5959{/executors/threadDump,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@5c51afd4{/executors/threadDump/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@76893a83{/static,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@19c07930{/,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@54eb0dc0{/api,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@5953786{/stages/stage/kill,null,AVAILABLE}
Started ServerConnector@2eeb8bd6{HTTP/1.1}{0.0.0.0:4040}
Started @48698ms
Successfully started service 'SparkUI' on port 4040.
Bound SparkUI to 0.0.0.0, and started at http://192.168.0.104:4040
Connecting to master spark://10.20.30.50:7077...
Successfully created connection to /10.20.30.50:7077 after 25 ms (0 ms spent in bootstraps)
Connecting to master spark://10.20.30.50:7077...
Still have 2 requests outstanding when connection from /10.20.30.50:7077 is closed
Failed to connect to master 10.20.30.50:7077

org.apache.spark.SparkException: Exception thrown in awaitResult
        at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) ~[scala-library-2.11.8.jar:na]
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167) ~[scala-library-2.11.8.jar:na]
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106) ~[spark-core_2.11-2.0.1.jar:2.0.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_102]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
Caused by: java.io.IOException: Connection from /10.20.30.50:7077 closed
        at org.apache.spark.network.client.TransportResponseHandler.channelInactive(TransportResponseHandler.java:128) ~[spark-network-common_2.11-2.0.1.jar:2.0.1]
        at org.apache.spark.network.server.TransportChannelHandler.channelInactive(TransportChannelHandler.java:109) ~[spark-network-common_2.11-2.0.1.jar:2.0.1]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:257) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:182) ~[spark-network-common_2.11-2.0.1.jar:2.0.1]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
        ... 1 common frames omitted

In the Spark master log on _10.20.30.50_, I see the following error message:

16/11/05 14:47:20 ERROR OneForOneStrategy: Error while decoding incoming Akka PDU of length: 1298
akka.remote.transport.AkkaProtocolException: Error while decoding incoming Akka PDU of length: 1298
Caused by: akka.remote.transport.PduCodecException: Decoding PDU failed.
    at akka.remote.transport.AkkaPduProtobufCodec$.decodePdu(AkkaPduCodec.scala:167)
    at akka.remote.transport.ProtocolStateActor.akka$remote$transport$ProtocolStateActor$$decodePdu(AkkaProtocolTransport.scala:580)
    at akka.remote.transport.ProtocolStateActor$$anonfun$4.applyOrElse(AkkaProtocolTransport.scala:375)
    at akka.remote.transport.ProtocolStateActor$$anonfun$4.applyOrElse(AkkaProtocolTransport.scala:343)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    at akka.actor.FSM$class.processEvent(FSM.scala:604)
    at akka.remote.transport.ProtocolStateActor.processEvent(AkkaProtocolTransport.scala:269)
    at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:598)
    at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:592)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
    at akka.remote.transport.ProtocolStateActor.aroundReceive(AkkaProtocolTransport.scala:269)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
    at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
    at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
    at akka.remote.WireFormats$AkkaProtocolMessage.<init>(WireFormats.java:6643)
    at akka.remote.WireFormats$AkkaProtocolMessage.<init>(WireFormats.java:6607)
    at akka.remote.WireFormats$AkkaProtocolMessage$1.parsePartialFrom(WireFormats.java:6703)
    at akka.remote.WireFormats$AkkaProtocolMessage$1.parsePartialFrom(WireFormats.java:6698)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:141)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
    at akka.remote.WireFormats$AkkaProtocolMessage.parseFrom(WireFormats.java:6821)
    at akka.remote.transport.AkkaPduProtobufCodec$.decodePdu(AkkaPduCodec.scala:168)
    ... 19 more

Additional information

  • The example works fine when I use new SparkConf().setMaster("local") instead
  • I can connect to the Spark master from the very same machine with _spark-shell --master spark://10.20.30.50:7077_
10
Michael Lihs

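At first glance this looks like a network error, but it is actually a Spark version mismatch in disguise. Point your application at the correct version of the Spark jars (mostly the assembly jars).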
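One way to rule out the network before looking at versions is a plain TCP probe against the master port. The following is a minimal sketch, not part of Spark; the class name MasterPortProbe and the 2000 ms timeout are assumptions made for illustration. If it prints true, the port is reachable at the TCP level (as the "Successfully created connection" line in the question's log also suggests), so the failure would more likely sit in the protocol spoken over that connection.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class MasterPortProbe {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, or timed out: a genuine network-level problem.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are the ones from the question; the timeout is an arbitrary choice.
        System.out.println(canConnect("10.20.30.50", 7077, 2000));
    }
}
```

A successful probe only shows that something is listening; it says nothing about whether client and master speak the same RPC protocol.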

This issue can be caused by a version mismatch in an RPC call that uses Protocol Buffers (protobuf).

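InvalidProtocolBufferException is thrown when a protocol message being parsed is invalid in some way, e.g. it contains a malformed varint or a negative byte length.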

  • In my experience with protobuf, InvalidProtocolBufferException occurs only when a message cannot be parsed (if you are parsing a protobuf message programmatically, the message length may be zero or the message may be corrupted...).

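  • Spark (at least the version running on this master, judging by the Akka frames in its log) uses Akka actors for message passing between the master/driver and the workers, and internally Akka uses Google's protobuf to communicate. See the following method from AkkaPduCodec.scala: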

    override def decodePdu(raw: ByteString): AkkaPdu = {
        try {
          val pdu = AkkaProtocolMessage.parseFrom(raw.toArray)
          if (pdu.hasPayload) Payload(ByteString(pdu.getPayload.asReadOnlyByteBuffer()))
          else if (pdu.hasInstruction) decodeControlPdu(pdu.getInstruction)
          else throw new PduCodecException("Error decoding Akka PDU: Neither message nor control message were contained", null)
        } catch {
          case e: InvalidProtocolBufferException ⇒ throw new PduCodecException("Decoding PDU failed.", e)
        }
      }
    

In your case, however, because of the version mismatch, a message written by the newer version cannot be parsed by the older version's parser, or something along those lines.
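To see why a parser reports "invalid tag (zero)", it helps to recall that every protobuf field starts with a varint tag whose upper bits hold the field number, and field number 0 is not allowed. The sketch below re-implements just that check in plain Java; it is an illustration, not protobuf's own code, and the class and method names are made up for this example. The second input is a hypothetical frame from some other wire protocol that happens to begin with zero bytes.

```java
public class TagCheckSketch {

    /** Decodes the base-128 varint at the start of the frame (at most 5 bytes). */
    public static int readVarint32(byte[] frame) {
        int result = 0;
        for (int i = 0; i < 5; i++) {
            if (i >= frame.length) {
                throw new IllegalArgumentException("truncated varint");
            }
            int b = frame[i];
            result |= (b & 0x7F) << (7 * i);   // low 7 bits carry payload
            if ((b & 0x80) == 0) {             // high bit clear: last byte
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    /** Returns the field number of the first tag, rejecting field number 0. */
    public static int firstFieldNumber(byte[] frame) {
        int tag = readVarint32(frame);
        int fieldNumber = tag >>> 3;           // the low 3 bits are the wire type
        if (fieldNumber == 0) {
            throw new IllegalArgumentException("invalid tag (zero)");
        }
        return fieldNumber;
    }

    public static void main(String[] args) {
        // 0x0A encodes field 1 with wire type 2 (length-delimited).
        System.out.println(firstFieldNumber(new byte[] {0x0A, 0x00})); // prints 1
        try {
            // Hypothetical non-protobuf frame that starts with zero bytes.
            firstFieldNumber(new byte[] {0x00, 0x00, 0x00, 0x00});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage()); // prints "rejected: invalid tag (zero)"
        }
    }
}
```

Bytes that were never protobuf in the first place fail this check immediately, which is consistent with a client and a master that use different RPC wire formats.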

If you are pulling in other dependencies through Maven, please review them as well.

10
Ram Ghadiyaram

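It turned out that I had Spark version 1.5.2 running in the virtual machine but was using version 2.0.1 of the Spark library in Java. I fixed the issue by using the matching Spark library version in my pom.xml: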

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.5.2</version>
</dependency>

Another problem (which showed up later) was that I also had to pin the Scala version the library was built with. This is the _2.10 suffix in the artifactId.
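Both versions are visible in the jar names that appear in the stack trace (for example spark-core_2.11-2.0.1.jar), so a mismatch can be spotted mechanically. The following is a minimal sketch under stated assumptions: the class SparkArtifactCheck is hypothetical, it assumes the usual module_scalaVersion-sparkVersion.jar naming, and it conservatively treats any difference in Spark version as a mismatch. The second jar name is the one Maven's default naming would give the dependency from the pom.xml above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SparkArtifactCheck {

    // Matches names such as spark-core_2.11-2.0.1.jar:
    // group 1 = module, group 2 = Scala binary version, group 3 = Spark version.
    private static final Pattern JAR_NAME =
            Pattern.compile("(spark-[a-z-]+)_(\\d+\\.\\d+)-(\\d+\\.\\d+\\.\\d+)\\.jar");

    /** Returns {module, scalaBinaryVersion, sparkVersion}; throws if the name does not match. */
    public static String[] parse(String jarName) {
        Matcher m = JAR_NAME.matcher(jarName);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a Spark artifact jar: " + jarName);
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    /** True when the client jar's Spark version equals the version running on the cluster. */
    public static boolean matchesCluster(String jarName, String clusterSparkVersion) {
        return parse(jarName)[2].equals(clusterSparkVersion);
    }

    public static void main(String[] args) {
        String clusterVersion = "1.5.2"; // version running in the VM, per this answer
        // Jar seen in the question's stack trace.
        System.out.println(matchesCluster("spark-core_2.11-2.0.1.jar", clusterVersion)); // prints false
        // Jar implied by the corrected pom.xml.
        System.out.println(matchesCluster("spark-core_2.10-1.5.2.jar", clusterVersion)); // prints true
    }
}
```

The Scala binary version returned by parse can be compared against the cluster's Scala build in the same way.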

Basically, @RamPrassad's answer pointed me in the right direction, but it did not give clear advice on what I had to do to fix my problem.

By the way, I could not update Spark in the virtual machine, because it came with the HortonWorks distribution...

2
Michael Lihs