I created a simple Spark app using sbt. Here is my code:
import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()
    import spark.implicits._
    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))
  }
}
And here is my build.sbt:
name := """sbt-sample-app"""
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1"
When I try to run the app with `sbt run`, I get the following error:
$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[info] Running HelloWorld
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/06/01 10:09:10 INFO SparkContext: Running Spark version 2.1.1
17/06/01 10:09:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/01 10:09:11 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
17/06/01 10:09:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/06/01 10:09:11 INFO SecurityManager: Changing view acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls to: user
17/06/01 10:09:11 INFO SecurityManager: Changing view acls groups to:
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls groups to:
17/06/01 10:09:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set()
17/06/01 10:09:12 INFO Utils: Successfully started service 'sparkDriver' on port 39662.
17/06/01 10:09:12 INFO SparkEnv: Registering MapOutputTracker
17/06/01 10:09:12 INFO SparkEnv: Registering BlockManagerMaster
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/06/01 10:09:12 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-c6db1535-6a00-4760-93dc-968722e3d596
17/06/01 10:09:12 INFO MemoryStore: MemoryStore started with capacity 408.9 MB
17/06/01 10:09:13 INFO SparkEnv: Registering OutputCommitCoordinator
17/06/01 10:09:13 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/06/01 10:09:13 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
17/06/01 10:09:13 INFO Executor: Starting executor ID driver on host localhost
17/06/01 10:09:13 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34488.
17/06/01 10:09:13 INFO NettyBlockTransferService: Server created on 127.0.0.1:34488
17/06/01 10:09:13 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/06/01 10:09:13 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:34488 with 408.9 MB RAM, BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:13 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 34488, None)
17/06/01 10:09:14 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse'.
[error] (run-main-0) scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
[error] parent = URLClassLoader with NativeCopyLoader with RawResources(
[error] urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ...,/home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error] parent = java.net.URLClassLoader@7c4113ce,
[error] resourceMap = Set(app.class.path, boot.class.path),
[error] nativeTemp = /tmp/sbt_c2afce
[error] )
[error] root = sun.misc.Launcher$AppClassLoader@677327b6
[error] cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
[error] ) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
[error] urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
[error] parent = java.net.URLClassLoader@7c4113ce,
[error] resourceMap = Set(app.class.path, boot.class.path),
[error] nativeTemp = /tmp/sbt_c2afce
[error] ) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,...openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
parent = URLClassLoader with NativeCopyLoader with RawResources(
urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
parent = java.net.URLClassLoader@7c4113ce,
resourceMap = Set(app.class.path, boot.class.path),
nativeTemp = /tmp/sbt_c2afce
)
root = sun.misc.Launcher$AppClassLoader@677327b6
cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar)
) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar),
parent = java.net.URLClassLoader@7c4113ce,
resourceMap = Set(app.class.path, boot.class.path),
nativeTemp = /tmp/sbt_c2afce
) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,.../jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found.
at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:123)
at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:22)
at org.apache.spark.sql.catalyst.ScalaReflection$$typecreator42$1.apply(ScalaReflection.scala:614)
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:782)
at org.apache.spark.sql.catalyst.ScalaReflection$.localTypeOf(ScalaReflection.scala:39)
at org.apache.spark.sql.catalyst.ScalaReflection$.optionOfProductType(ScalaReflection.scala:614)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:51)
at org.apache.spark.sql.Encoders$.scalaInt(Encoders.scala:281)
at org.apache.spark.sql.SQLImplicits.newIntEncoder(SQLImplicits.scala:54)
at HelloWorld$.main(HelloWorld.scala:9)
at HelloWorld.main(HelloWorld.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:run for the full output.
17/06/01 10:09:15 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:181)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73)
17/06/01 10:09:15 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/06/01 10:09:15 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 7 s, completed 1 Jun, 2017 10:09:15 AM
However, when I add `fork in run := true` to build.sbt, the app runs fine.

New build.sbt:
name := """sbt-sample-app"""
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1"
fork in run := true
The output is as follows:
$ sbt run
[info] Loading global plugins from /home/user/.sbt/0.13/plugins
[info] Loading project definition from /home/user/Projects/sample-app/project
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/)
[success] Total time: 0 s, completed 1 Jun, 2017 10:15:43 AM
[info] Updating {file:/home/user/Projects/sample-app/}sample-app...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[warn] Scala version was updated by one of library dependencies:
[warn] * org.scala-lang:scala-library:(2.11.7, 2.11.0) -> 2.11.8
[warn] To force scalaVersion, add the following:
[warn] ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) }
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 1 Scala source to /home/user/Projects/sample-app/target/scala-2.11/classes...
[info] Running HelloWorld
[error] Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[error] 17/06/01 10:16:13 INFO SparkContext: Running Spark version 2.1.1
[error] 17/06/01 10:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[error] 17/06/01 10:16:14 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0)
[error] 17/06/01 10:16:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls to: user
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls groups to:
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls groups to:
[error] 17/06/01 10:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set()
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'sparkDriver' on port 37747.
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering MapOutputTracker
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering BlockManagerMaster
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[error] 17/06/01 10:16:14 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-edf40c39-a13e-4930-8e9a-64135bfa9770
[error] 17/06/01 10:16:14 INFO MemoryStore: MemoryStore started with capacity 1405.2 MB
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering OutputCommitCoordinator
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[error] 17/06/01 10:16:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040
[error] 17/06/01 10:16:15 INFO Executor: Starting executor ID driver on host localhost
[error] 17/06/01 10:16:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39113.
[error] 17/06/01 10:16:15 INFO NettyBlockTransferService: Server created on 127.0.0.1:39113
[error] 17/06/01 10:16:15 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:39113 with 1405.2 MB RAM, BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 39113, None)
[error] 17/06/01 10:16:15 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse/'.
[error] 17/06/01 10:16:18 INFO CodeGenerator: Code generated in 395.134683 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 9.077969 ms
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 23.652705 ms
[error] 17/06/01 10:16:19 INFO SparkContext: Starting job: foreach at HelloWorld.scala:10
[error] 17/06/01 10:16:19 INFO DAGScheduler: Got job 0 (foreach at HelloWorld.scala:10) with 1 output partitions
[error] 17/06/01 10:16:19 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:19 INFO DAGScheduler: Parents of final stage: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Missing parents: List()
[error] 17/06/01 10:16:19 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10), which has no missing parents
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 1405.2 MB)
[error] 17/06/01 10:16:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 127.0.0.1:39113 (size: 3.3 KB, free: 1405.2 MB)
[error] 17/06/01 10:16:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
[error] 17/06/01 10:16:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
[error] 17/06/01 10:16:20 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 6227 bytes)
[error] 17/06/01 10:16:20 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
[info] 2
[info] 3
[info] 4
[error] 17/06/01 10:16:20 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1231 bytes result sent to driver
[error] 17/06/01 10:16:20 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 152 ms on localhost (executor driver) (1/1)
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
[error] 17/06/01 10:16:20 INFO DAGScheduler: ResultStage 0 (foreach at HelloWorld.scala:10) finished in 0.181 s
[error] 17/06/01 10:16:20 INFO DAGScheduler: Job 0 finished: foreach at HelloWorld.scala:10, took 0.596960 s
[error] 17/06/01 10:16:20 INFO SparkContext: Invoking stop() from shutdown hook
[error] 17/06/01 10:16:20 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
[error] 17/06/01 10:16:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
[error] 17/06/01 10:16:20 INFO MemoryStore: MemoryStore cleared
[error] 17/06/01 10:16:20 INFO BlockManager: BlockManager stopped
[error] 17/06/01 10:16:20 INFO BlockManagerMaster: BlockManagerMaster stopped
[error] 17/06/01 10:16:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
[error] 17/06/01 10:16:20 INFO SparkContext: Successfully stopped SparkContext
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Shutdown hook called
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Deleting directory /tmp/spark-77d00e78-9f76-4ab2-bc40-0b99940661ac
[success] Total time: 37 s, completed 1 Jun, 2017 10:16:20 AM
Can anyone help me understand the reason behind this?
An excerpt from "Getting Started with SBT for Scala" by Shiti Saxena:

Why do we need to fork the JVM?

When a user runs code using the run or console commands, the code runs in the same virtual machine as SBT. In some cases, running the code may crash SBT, for example because of a System.exit call or threads that never terminate (say, when tests are running while you are working on the code at the same time).

If a test causes the JVM to shut down, you need to restart SBT. To avoid such scenarios, it is important to fork the JVM.

The code need not run in a forked JVM if it meets the constraints listed below; otherwise, it must run in a forked JVM:

- No threads are created, or the program exits when user-created threads terminate on their own
- System.exit is used to end the program, and user-created threads terminate when interrupted
- No deserialization is done, or the deserialization code ensures that the proper class loader is used
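The failure above matches the last constraint: the stack trace shows Spark's `ScalaReflection` failing to resolve `scala.Option` through sbt's `ClasspathFilter` class loader, which is exactly the "wrong class loader" situation that forking avoids. A minimal sketch of a forked-run setup for an sbt 0.13 build.sbt (the heap option and output strategy are illustrative additions, not settings from the original project):

```scala
// Run the app in a separate JVM so Spark's reflection sees a normal
// application class loader instead of sbt's layered class loaders.
fork in run := true

// Optional: give the forked JVM its own heap size (illustrative value).
javaOptions in run += "-Xmx2G"

// Optional: pass the forked process's output through unchanged, so Spark's
// INFO log lines are not re-tagged as [error] by sbt's logger.
outputStrategy := Some(StdoutOutput)
```

Without `outputStrategy`, sbt logs a forked process's stderr at error level, which is why the successful run above still shows Spark's INFO messages prefixed with `[error]`.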
I couldn't find out exactly why, but here are their build file and recommendations:
https://github.com/deanwampler/spark-scala-tutorial/blob/master/project/Build.scala
I hope someone can give a better answer.
Edited code:
import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()
    import spark.implicits._
    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))
  }
}
build.sbt
name := """untitled"""
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1"
From the documentation given here:

By default, the run task runs in the same JVM as sbt. Forking is required under certain circumstances, however. Or, you might want to fork Java processes when implementing new tasks.

By default, a forked process uses the same Java and Scala versions being used for the build, and the working directory and JVM options of the current process. This page discusses how to enable and configure forking for both run and test tasks. Each kind of task may be configured separately by scoping the relevant keys, as described below.

To enable forking for run, simply set:

fork in run := true
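Following the note about scoping, the same keys can be configured per task. A small sketch for sbt 0.13 (the heap values are illustrative):

```scala
// Fork the run task into its own JVM:
fork in run := true

// Scope the same key to Test to also fork test execution,
// and give each task its own JVM options:
fork in Test := true
javaOptions in run += "-Xmx2G"
javaOptions in Test += "-Xmx1G"
```

Scoping this way keeps a heavyweight Spark `run` isolated from sbt while letting you tune test JVMs independently.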