I created a Spark environment using -
(ns something
(:require [flambo.conf :as conf]
[flambo.api :as f]))
(def c (-> (conf/spark-conf)
(conf/master "spark://formcept008.lan:7077")
(conf/app-name "clustering"))) ;; app-name
(def sc (f/spark-context c))
Then I created an RDD -
(f/parallelize sc DATA)
Now, when I perform some operation on this data, such as (f/take rdd 3) and so on, I get an error -
17/11/28 14:35:00 ERROR Utils: Exception encountered
org.apache.spark.SparkException: Failed to register classes with Kryo
	at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:274)
	at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:259)
	at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala)
	at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:79)
	at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70)
	at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply(ParallelCollectionRDD.scala:70)
	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1273)
	at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: flambo.kryo.BaseFlamboRegistrator
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:124)
	at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:124)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:124)
	... 27 more
17/11/28 14:35:00 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalStateException
	at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Any thoughts on this, please?
Posted on 2018-04-19 10:33:37
Solved it. Added all of the project's jars to the Spark configuration using -
(conf/jars (map #(.getPath %) (.getURLs (java.lang.ClassLoader/getSystemClassLoader))))
It registers all the classes. Since the problem is solved, closing it.
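For context, a fuller sketch of the fix (the `example.core` namespace is hypothetical; note that enumerating the system classloader's URLs like this assumes a `URLClassLoader`, i.e. Java 8 or earlier):

```clojure
(ns example.core
  (:require [flambo.conf :as conf]
            [flambo.api :as f]))

;; Collect the path of every jar on the driver's classpath, so the
;; executors can also load classes such as flambo.kryo.BaseFlamboRegistrator.
(def classpath-jars
  (map #(.getPath %)
       (.getURLs (java.lang.ClassLoader/getSystemClassLoader))))

(def c
  (-> (conf/spark-conf)
      (conf/master "spark://formcept008.lan:7077")
      (conf/app-name "clustering")
      (conf/jars classpath-jars)))  ;; ship these jars to the workers

(def sc (f/spark-context c))
```

When submitting an uberjar with spark-submit instead of running from a REPL, this step is usually unnecessary, since the uberjar itself is distributed to the executors.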
Posted on 2017-11-28 18:36:45
flambo doesn't seem to be on your classpath, which is why you get:
java.lang.ClassNotFoundException: flambo.kryo.BaseFlamboRegistrator
Are you running this from a REPL, or via a lein or boot task?
If you're using Leiningen, check the classpath (lein classpath) and the dependency tree (lein deps :tree).
Also, it doesn't hurt to do a lein clean to make sure the target folder isn't causing problems.
Stack trace analysis: the "Failed to register classes with Kryo" error is caused by the missing flambo.kryo.BaseFlamboRegistrator
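The checks above can be run from the project root roughly like this (a sketch assuming a standard Leiningen project; the grep patterns are just examples):

```shell
# Does Leiningen's computed classpath actually contain a flambo jar?
lein classpath | tr ':' '\n' | grep -i flambo

# Is flambo present in the dependency tree, and at which version?
lein deps :tree | grep -i flambo

# Remove stale compiled output in target/ before retrying
lein clean
```

If the first command prints nothing, the dependency is missing from project.clj (or being excluded), which would explain the ClassNotFoundException.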
https://stackoverflow.com/questions/47528322