
Why do we need to add "fork in run := true" when running a Spark sbt application?

I created a simple Spark app using sbt. Here is my code:

import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()

    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))
  }
}

Below is my build.sbt:

name := """sbt-sample-app"""

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.Apache.spark" % "spark-sql_2.11" % "2.1.1"

When I try sbt run, I get the following error:

$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[info] Running HelloWorld 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/06/01 10:09:10 INFO SparkContext: Running Spark version 2.1.1
17/06/01 10:09:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/01 10:09:11 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
17/06/01 10:09:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/06/01 10:09:11 INFO SecurityManager: Changing view acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing view acls groups to: 
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls groups to: 
17/06/01 10:09:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(user); groups with view permissions: Set(); users  with modify permissions: Set(user); groups with modify permissions: Set()
17/06/01 10:09:12 INFO Utils: Successfully started service 'sparkDriver' on port 39662.
17/06/01 10:09:12 INFO SparkEnv: Registering MapOutputTracker
17/06/01 10:09:12 INFO SparkEnv: Registering BlockManagerMaster
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/06/01 10:09:12 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-c6db1535-6a00-4760-93dc-968722e3d596
17/06/01 10:09:12 INFO MemoryStore: MemoryStore started with capacity 408.9 MB
17/06/01 10:09:13 INFO SparkEnv: Registering OutputCommitCoordinator
17/06/01 10:09:13 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/06/01 10:09:13 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
17/06/01 10:09:13 INFO Executor: Starting executor ID driver on host localhost
17/06/01 10:09:13 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34488.
17/06/01 10:09:13 INFO NettyBlockTransferService: Server created on 127.0.0.1:34488
17/06/01 10:09:13 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/06/01 10:09:13 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:34488 with 408.9 MB RAM, BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:14 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse'.
[error] (run-main-0) scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
[error]   parent = URLClassLoader with NativeCopyLoader with RawResources(
[error]   urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ...,/home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error]   parent = java.net.URLClassLoader@7c4113ce,
[error]   resourceMap = Set(app.class.path, boot.class.path),
[error]   nativeTemp = /tmp/sbt_c2afce
[error] )
[error]   root = sun.misc.Launcher$AppClassLoader@677327b6
[error]   cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
[error] ) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
[error]   urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error]   parent = java.net.URLClassLoader@7c4113ce,
[error]   resourceMap = Set(app.class.path, boot.class.path),
[error]   nativeTemp = /tmp/sbt_c2afce
[error] ) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,...openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
  parent = URLClassLoader with NativeCopyLoader with RawResources(
  urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
  parent = java.net.URLClassLoader@7c4113ce,
  resourceMap = Set(app.class.path, boot.class.path),
  nativeTemp = /tmp/sbt_c2afce
)
  root = sun.misc.Launcher$AppClassLoader@677327b6
  cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
  urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
  parent = java.net.URLClassLoader@7c4113ce,
  resourceMap = Set(app.class.path, boot.class.path),
  nativeTemp = /tmp/sbt_c2afce
) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,.../jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:123)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:22)
    at org.apache.spark.sql.catalyst.ScalaReflection$$typecreator42$1.apply(ScalaReflection.scala:614)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:782)
    at org.apache.spark.sql.catalyst.ScalaReflection$.localTypeOf(ScalaReflection.scala:39)
    at org.apache.spark.sql.catalyst.ScalaReflection$.optionOfProductType(ScalaReflection.scala:614)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:51)
    at org.apache.spark.sql.Encoders$.scalaInt(Encoders.scala:281)
    at org.apache.spark.sql.SQLImplicits.newIntEncoder(SQLImplicits.scala:54)
    at HelloWorld$.main(HelloWorld.scala:9)
    at HelloWorld.main(HelloWorld.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:run for the full output.
17/06/01 10:09:15 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:181)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73)
17/06/01 10:09:15 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
java.lang.RuntimeException: Nonzero exit code: 1
    at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 7 s, completed 1 Jun, 2017 10:09:15 AM

But when I add fork in run := true to my build.sbt, the app runs fine.

Here is the new build.sbt:

name := """sbt-sample-app"""

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.Apache.spark" % "spark-sql_2.11" % "2.1.1"

fork in run := true

And here is the output:

$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[success] Total time: 0 s, completed 1 Jun, 2017 10:15:43 AM
[info] Updating {file:/home/user/Projects/sample-app/}sample-app...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[warn] Scala version was updated by one of library dependencies:
[warn]  * org.scala-lang:scala-library:(2.11.7, 2.11.0) -> 2.11.8
[warn] To force scalaVersion, add the following:
[warn]  ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) }
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 1 Scala source to /home/user/Projects/sample-app/target/scala-2.11/classes...
[info] Running HelloWorld 
[error] Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[error] 17/06/01 10:16:13 INFO SparkContext: Running Spark version 2.1.1
[error] 17/06/01 10:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[error] 17/06/01 10:16:14 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
[error] 17/06/01 10:16:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls groups to: 
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls groups to: 
[error] 17/06/01 10:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(user); groups with view permissions: Set(); users  with modify permissions: Set(user); groups with modify permissions: Set()
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'sparkDriver' on port 37747.
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering MapOutputTracker
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering BlockManagerMaster
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[error] 17/06/01 10:16:14 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-edf40c39-a13e-4930-8e9a-64135bfa9770
[error] 17/06/01 10:16:14 INFO MemoryStore: MemoryStore started with capacity 1405.2 MB
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering OutputCommitCoordinator
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[error] 17/06/01 10:16:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
[error] 17/06/01 10:16:15 INFO Executor: Starting executor ID driver on host localhost
[error] 17/06/01 10:16:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39113.
[error] 17/06/01 10:16:15 INFO NettyBlockTransferService: Server created on 127.0.0.1:39113
[error] 17/06/01 10:16:15 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:39113 with 1405.2 MB RAM, BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse/'.
[error] 17/06/01 10:16:18 INFO CodeGenerator: Code generated in 395.134683 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 9.077969 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 23.652705 ms
[error] 17/06/01 10:16:19 INFO SparkContext: Starting job: foreach at HelloWorld.scala:10
[error] 17/06/01 10:16:19 INFO DAGScheduler: Got job 0 (foreach at HelloWorld.scala:10) with 1 output partitions
[error] 17/06/01 10:16:19 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:19 INFO DAGScheduler: Parents of final stage: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Missing parents: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10), which has no missing parents
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 127.0.0.1:39113 (size: 3.3 KB, free: 1405.2 MB)
[error] 17/06/01 10:16:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
[error] 17/06/01 10:16:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
[error] 17/06/01 10:16:20 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 6227 bytes)
[error] 17/06/01 10:16:20 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
[info] 2
[info] 3
[info] 4
[error] 17/06/01 10:16:20 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1231 bytes result sent to driver
[error] 17/06/01 10:16:20 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 152 ms on localhost (executor driver) (1/1)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
[error] 17/06/01 10:16:20 INFO DAGScheduler: ResultStage 0 (foreach at HelloWorld.scala:10) finished in 0.181 s
[error] 17/06/01 10:16:20 INFO DAGScheduler: Job 0 finished: foreach at HelloWorld.scala:10, took 0.596960 s
[error] 17/06/01 10:16:20 INFO SparkContext: Invoking stop() from shutdown hook
[error] 17/06/01 10:16:20 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
[error] 17/06/01 10:16:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
[error] 17/06/01 10:16:20 INFO MemoryStore: MemoryStore cleared
[error] 17/06/01 10:16:20 INFO BlockManager: BlockManager stopped
[error] 17/06/01 10:16:20 INFO BlockManagerMaster: BlockManagerMaster stopped
[error] 17/06/01 10:16:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
[error] 17/06/01 10:16:20 INFO SparkContext: Successfully stopped SparkContext
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Shutdown hook called
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Deleting directory /tmp/spark-77d00e78-9f76-4ab2-bc40-0b99940661ac
[success] Total time: 37 s, completed 1 Jun, 2017 10:16:20 AM

Can someone help me understand the reason behind this?

15 · himanshuIIITian

An excerpt from "Getting Started with SBT for Scala" by Shiti Saxena:

Why do we need to fork the JVM?

When a user runs code using the run or console commands, the code runs on the same virtual machine as SBT. In some cases, running the code can cause SBT to crash, for example through a System.exit call or non-terminating threads (e.g., when running tests on the code while simultaneously working on it).

If a test causes the JVM to shut down, you need to restart SBT. To avoid such scenarios, it is important to fork the JVM.

Forking the JVM is not required to run your code if the code follows the constraints listed below; otherwise, it must be run in a forked JVM:

  • No threads are created, or the program ends once the user-created threads terminate on their own.
  • System.exit is used to end the program, and user-created threads terminate when interrupted.
  • No deserialization is done, or the deserialization code ensures that the right class loader is used.
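The stack trace in the question falls under that last constraint: Spark's encoder machinery resolves classes such as scala.Option through runtime reflection, and under sbt's in-process ClasspathFilter class loader that lookup fails with the ScalaReflectionException shown above, so the app has to run in a forked JVM. A minimal sketch of the corresponding build.sbt setting (sbt 0.13 syntax, as in the question; the extra options are illustrative assumptions, not required):

// Run the application in a separate JVM so Spark sees a plain application
// class loader instead of sbt's layered ClasspathFilter.
fork in run := true

// Optional, illustrative only: give the forked driver its own heap and forward stdin.
javaOptions in run ++= Seq("-Xms512M", "-Xmx2G")
connectInput in run := true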
17 · ZakukaZ

I could not find out exactly why, but here is their build file and their recommendation:

https://github.com/deanwampler/spark-scala-tutorial/blob/master/project/Build.scala

I hope someone can give a better answer.

Edited code:

import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()

    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))
  }
}

build.sbt

name := """untitled"""

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.Apache.spark" % "spark-sql_2.11" % "2.1.1"
1 · Sam Upra

From the documentation given here:

By default, the run task runs in the same JVM as sbt. However, forking is required under certain circumstances. Or, you may want to fork Java processes when implementing new tasks.

By default, a forked process uses the same Java and Scala versions being used for the build, and the working directory and JVM options of the current process. This page describes how to enable and configure forking for both run and test tasks. Each kind of task may be configured separately by scoping the relevant keys, as described below.

To enable forking in run, simply add:

fork in run := true
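As that page also notes, forking can be enabled and tuned separately per task kind by scoping the related keys. A small sketch of what that might look like (sbt 0.13 syntax; the specific values are illustrative assumptions rather than part of the answer):

// Fork the run and test tasks independently.
fork in run := true
fork in Test := true

// JVM options for the forked run (example value only).
javaOptions in run += "-Xmx2G"

// sbt logs a forked process's stderr at the error level by default, which is
// why Spark's INFO lines appear as [error] in the output above; sending the
// forked output straight to stdout avoids that.
outputStrategy := Some(StdoutOutput)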
1 · koiralo