How do I connect to Neo4j from Spark worker nodes?

Stack Overflow user
Asked on 2017-03-09 12:48:35
Answers: 1 · Views: 414 · Followers: 0 · Votes: 2

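I need to fetch a small subgraph inside a Spark map function. I have tried AnormCypher and neo4j-spark-connector, but neither works. AnormCypher fails with a java IOException (I build the connection inside a mapPartitions function and test against a server on localhost). neo4j-spark-connector fails with the Task not serializable exception shown below.

Is there a good way to get a subgraph (or to connect to a graph database such as Neo4j) from within Spark worker nodes?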

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:792)
    at ....

My code snippet, using neo4j-spark-connector 2.0.0-m2:

val neo = Neo4j(sc) // this runs on the driver

// this is called from inside a map function
def someFunctionToBeMapped(p: List[Long]) = {
  val metaGraph = neo.cypher("match p = (a:TourPlace) -[r:could_go_to] -> (b:TourPlace) " +
      "return a.id, r.distance, b.id")
    .loadRowRdd
    .map(row => ((row(0).asInstanceOf[Long], row(2).asInstanceOf[Long]), row(1).asInstanceOf[Double]))
    .collect()
    .toList
  // ... (the snippet ends here in the original post)
}

The AnormCypher code is:

def partitionMap(partition: Iterator[List[Long]]) = {
  import org.anormcypher._
  import play.api.libs.ws._
  // Provide an instance of WSClient
  val wsclient = ning.NingWSClient()
  // Setup the Rest Client
  // Need to add the Neo4jConnection type annotation so that the default
  // Neo4jConnection -> Neo4jTransaction conversion is in the implicit scope
  implicit val connection: Neo4jConnection = Neo4jREST("127.0.0.1", 7474, "neo4j", "000000")(wsclient)
  //
  // Provide an ExecutionContext
  implicit val ec = scala.concurrent.ExecutionContext.global

  val res = partition.filter( placeList => {

    val startPlace = Cypher("match p = (a:TourPlace) -[r:could_go_to] -> (b:TourPlace)"  +
      "return p")().flatMap( row => row.data )
  })
  wsclient.close()
  res
}
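
One possible contributor to the IOException, offered as a guess rather than a confirmed diagnosis: `Iterator.filter` is lazy, so in the snippet above `wsclient.close()` runs before Spark pulls any element from `res`, and the Cypher calls would then execute against a closed client. The sketch below reproduces that ordering in plain Scala. `FakeClient`, `lazyVersion` and `eagerVersion` are hypothetical names used only for illustration; they are not part of AnormCypher or Spark.

```scala
object LazyIteratorDemo {
  // Hypothetical stand-in for a network client (not an AnormCypher class).
  final class FakeClient {
    private var open = true
    def query(x: Long): Boolean = {
      if (!open) throw new java.io.IOException("client already closed")
      x % 2 == 0
    }
    def close(): Unit = { open = false }
  }

  // Same shape as the question's partitionMap: lazy filter, then close.
  def lazyVersion(partition: Iterator[Long]): Iterator[Long] = {
    val client = new FakeClient
    val res = partition.filter(client.query) // nothing is evaluated yet
    client.close()
    res // consuming this later calls query on a closed client
  }

  // Forces evaluation while the client is still open, then closes it.
  def eagerVersion(partition: Iterator[Long]): Iterator[Long] = {
    val client = new FakeClient
    try partition.filter(client.query).toList.iterator
    finally client.close()
  }
}
```

Materializing with `.toList` holds the whole filtered partition in memory, which is a reasonable trade-off only when partitions are small.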

1 Answer

Stack Overflow user

Answered on 2017-12-30 10:44:46

I used Spark in standalone mode and was able to connect to the Neo4j database.

Versions used:

Spark 2.1.0

neo4j-spark-connector 2.1.0-m2

My code:

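import org.apache.spark.{SparkConf, SparkContext}
import org.neo4j.spark._

val sparkConf = new SparkConf().setAppName("Neo4j").setMaster("local")
val sc = new SparkContext(sparkConf)
println("***Getting Started ****")
// Neo4j(sc) is created on the driver
val neo = Neo4j(sc)
// Load the internal id of every node as a DataFrame
val df = neo.cypher("MATCH (n) RETURN id(n) as id").loadDataFrame
println(df.count)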

Spark submit:

spark-submit --class package.classname --jars pathofneo4jsparkconnectoryJAR --conf spark.neo4j.bolt.password=***** targetJarFile.jar
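
The code above runs the query on the driver. For the case in the question, where the subgraph is needed inside a map function, calling `neo.cypher(...)` from the mapped function cannot work: it would create an RDD inside a task, which Spark does not support, and the closure most likely also drags in the non-serializable SparkContext held by `neo`. A minimal sketch of one workaround, assuming the subgraph is small enough to fit in memory: load the edges once on the driver, broadcast them, and read the broadcast value inside the map. The names `BroadcastSubgraph` and `edgesWithin` and the sample place lists are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.neo4j.spark._

object BroadcastSubgraph {
  // Hypothetical pure helper: keep only edges whose endpoints are both in `places`.
  def edgesWithin(places: List[Long],
                  edges: Map[(Long, Long), Double]): Map[(Long, Long), Double] = {
    val wanted = places.toSet
    edges.filter { case ((from, to), _) => wanted(from) && wanted(to) }
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("BroadcastSubgraph"))
    val neo = Neo4j(sc) // used on the driver only

    // Run the query from the question once, on the driver, and collect it.
    val edges: Map[(Long, Long), Double] = neo
      .cypher("MATCH (a:TourPlace)-[r:could_go_to]->(b:TourPlace) " +
        "RETURN a.id, r.distance, b.id")
      .loadRowRdd
      .map(row => ((row(0).asInstanceOf[Long], row(2).asInstanceOf[Long]),
        row(1).asInstanceOf[Double]))
      .collect()
      .toMap

    // Ship the collected edges to the executors once.
    val edgesBc = sc.broadcast(edges)

    // Hypothetical input: each element is a list of place ids.
    val placeLists = sc.parallelize(Seq(List(1L, 2L, 3L), List(2L, 4L)))

    // The closure captures only the broadcast handle, not `neo` or `sc`.
    val subgraphs = placeLists.map(p => edgesWithin(p, edgesBc.value))
    subgraphs.collect().foreach(println)

    sc.stop()
  }
}
```

If the graph is too large to broadcast, the alternative is to open a database connection per partition inside `mapPartitions` and close it only after the partition has been fully consumed.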

Votes: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/42686987
