大神求助,IntelliJ运行Scala程序问题
1个回答
展开全部
打成Jar包,直接用spark-submit 方式运行就可以成功,直接用IntelliJ就总是失败,不知道怎么回事,求大神指点。程序是
object HelloSpark {
def main(args:Array[String]): Unit ={
if(args.length!=2){
System.err.println("Usage:HelloSpark <Input> <Output>")
System.exit(1)
}
val conf = new SparkConf().setAppName("helloSpark").setMaster("spark://master:7077").set("spark.executor.memory", "1g")
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
val sc = new SparkContext(conf)
sc.textFile(args(0)).map(_.split("\t")).filter(_.length==6).map(x=>(x(1),1)).reduceByKey(_+_).map(x=>(x._2,x._1)).
sortByKey(false).map(x=>(x._2,x._1)).saveAsTextFile(args(1))
sc.stop()
}
这是网上的一个实例,步骤都是一步一步来的。
16/02/23 16:39:53 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, worker1): java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
......
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/02/23 16:39:53 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, worker0, partition 0,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on executor worker2: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 1]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, worker2, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor worker2: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 2]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 4, worker1, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 4) on executor worker1: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 3]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 5, worker1, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 5) on executor worker1: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 4]
16/02/23 16:39:53 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
16/02/23 16:39:53 INFO TaskSchedulerImpl: Cancelling stage 0
16/02/23 16:39:53 INFO TaskSchedulerImpl: Stage 0 was cancelled
16/02/23 16:39:53 INFO DAGScheduler: ShuffleMapStage 0 (map at HelloSpark.scala:20) failed in 10.015 s
16/02/23 16:39:53 INFO DAGScheduler: Job 0 failed: sortByKey at HelloSpark.scala:21, took 10.156984 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 5, worker1): java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
......
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
......
Caused by: java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
......
16/02/23 16:39:53 INFO SparkContext: Invoking stop() from shutdown hook
16/02/23 16:39:53 INFO SparkUI: Stopped Spark web UI at :4040
16/02/23 16:39:53 INFO SparkDeploySchedulerBackend: Shutting down all executors
......
Process finished with exit code 1
这个问题我解决了,在设置SparkConf()的时候,需要把本机生成的JAR包路径进行制定,如:
val conf = new SparkConf().setAppName("SogouResult").setMaster("spark://master:7077")
.setJars(List("D:\\IDEA workspace\\helloSpark\\out\\artifacts\\helloSpark_jar\\helloSpark.jar"))
object HelloSpark {
def main(args:Array[String]): Unit ={
if(args.length!=2){
System.err.println("Usage:HelloSpark <Input> <Output>")
System.exit(1)
}
val conf = new SparkConf().setAppName("helloSpark").setMaster("spark://master:7077").set("spark.executor.memory", "1g")
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
val sc = new SparkContext(conf)
sc.textFile(args(0)).map(_.split("\t")).filter(_.length==6).map(x=>(x(1),1)).reduceByKey(_+_).map(x=>(x._2,x._1)).
sortByKey(false).map(x=>(x._2,x._1)).saveAsTextFile(args(1))
sc.stop()
}
这是网上的一个实例,步骤都是一步一步来的。
16/02/23 16:39:53 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, worker1): java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
......
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/02/23 16:39:53 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, worker0, partition 0,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on executor worker2: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 1]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, worker2, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor worker2: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 2]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 4, worker1, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 4) on executor worker1: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 3]
16/02/23 16:39:53 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 5, worker1, partition 1,NODE_LOCAL, 2134 bytes)
16/02/23 16:39:53 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 5) on executor worker1: java.lang.ClassNotFoundException (com.spark.firstApp.HelloSpark$$anonfun$2) [duplicate 4]
16/02/23 16:39:53 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
16/02/23 16:39:53 INFO TaskSchedulerImpl: Cancelling stage 0
16/02/23 16:39:53 INFO TaskSchedulerImpl: Stage 0 was cancelled
16/02/23 16:39:53 INFO DAGScheduler: ShuffleMapStage 0 (map at HelloSpark.scala:20) failed in 10.015 s
16/02/23 16:39:53 INFO DAGScheduler: Job 0 failed: sortByKey at HelloSpark.scala:21, took 10.156984 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 5, worker1): java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
......
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
......
Caused by: java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
......
16/02/23 16:39:53 INFO SparkContext: Invoking stop() from shutdown hook
16/02/23 16:39:53 INFO SparkUI: Stopped Spark web UI at :4040
16/02/23 16:39:53 INFO SparkDeploySchedulerBackend: Shutting down all executors
......
Process finished with exit code 1
这个问题我解决了,在设置SparkConf()的时候,需要把本机生成的JAR包路径进行制定,如:
val conf = new SparkConf().setAppName("SogouResult").setMaster("spark://master:7077")
.setJars(List("D:\\IDEA workspace\\helloSpark\\out\\artifacts\\helloSpark_jar\\helloSpark.jar"))
已赞过
已踩过<
评论
收起
你对这个回答的评价是?
推荐律师服务:
若未解决您的问题,请您详细描述您的问题,通过百度律临进行免费专业咨询