I have data with the following structure:
{"id": "1", "name": "A. I. Lazarev", "org": "United States Department of State", "tags": [{"t": "Infrared"}, {"t": "Near-infrared spectroscopy"}, {"t": "Infrared astronomy"}, {"t": "Data collection"}], "pubs": [{"i": "1542417502", "r": 6}], }
{"id": "2", "name": "Stevan Spremo", "tags": [{"t": "Micro-g environment"}, {"t": "Antibiotics"}, {"t": "Bacteriology"}], "pubs": [{"i": "222163962", "r": 0}], }
{"id": "3", "name": "Bricchi G", "pubs": [{"i": "2417067698", "r": 1}, {"i": "2406980973", "r": 1}]}有些行有标记,有些行有组织,有些行两者都有,有些行两者都没有。
我想增加作者和标签之间的关系,(2)作者和组织之间的关系,(3)作者和出版物之间的关系。我已经将发布作为节点,因此一旦获得(1)和(2),获得(3)应该是相当简单的。
我一直试图使用以下代码:
CALL apoc.periodic.iterate(
"CALL apoc.load.json('file:/test.txt') YIELD value AS q RETURN q",
"UNWIND q.id as id
CREATE (a:Author {id:id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
WITH q, a
UNWIND q.tags as tags
MERGE (t:Tag {{name: tags.t}})
CREATE (a)-[:HAS_TAGS]->(t)
WITH q, a
WHERE q.org is not null
MERGE (o:Organization {name: q.org})
CREATE (a)-[:AFFILIATED_WITH]->(o)",
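{batchSize:10000, iterateList:true, parallel:false})
Tags and organizations appear many times in the data, but there should be only one node for each, so I use MERGE to create unique nodes for them.
The problem with this code is that it creates duplicate AFFILIATED_WITH relationships. In fact, it creates as many AFFILIATED_WITH relationships as the author has tags.
How can I change the Cypher query so that it does not create duplicate relationships?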
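Answered 2019-02-21 21:55:53
After this clause:
UNWIND q.tags as tags
your query has as many data rows as the current q has tags (each row carrying its own q, a, id, and tags values). Every subsequent operation then runs once per data row. That is why you are creating too many AFFILIATED_WITH relationships.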
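The row multiplication is easy to see in isolation. The following is a minimal sketch that can be run on its own, without APOC or the data file; the inline map is a made-up stand-in for one record, and 'Some Org', 'A', 'B', and 'C' are placeholder values, not data from the question.

```cypher
// Hypothetical inline record standing in for one line of test.txt
WITH {id: '1', org: 'Some Org', tags: [{t: 'A'}, {t: 'B'}, {t: 'C'}]} AS q
UNWIND q.tags AS tag
// One row per tag; q.org is repeated on each of the three rows
RETURN q.id AS id, tag.t AS tag, q.org AS org
```

This returns three rows, so any MERGE or CREATE placed after the UNWIND would also run three times for this one record.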
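To fix this, you need to reduce the number of data rows at the appropriate point (this also speeds up processing, since unnecessary repeated operations are avoided). In your example, simply change the second WITH q, a clause to WITH DISTINCT q, a: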
CALL apoc.periodic.iterate(
"CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
"CREATE (a:Author {id:q.id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
WITH q, a
UNWIND q.tags as tags
MERGE (t:Tag {name: tags.t})
CREATE (a)-[:HAS_TAGS]->(t)
WITH DISTINCT q, a
WHERE q.org is not null
MERGE (o:Organization {name: q.org})
CREATE (a)-[:AFFILIATED_WITH]->(o)",
{batchSize:10000, iterateList:true, parallel:false}
)
I also simplified the query by removing the unnecessary UNWIND q.id as id clause, and fixed some syntax issues.
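One caveat that the answer does not raise: UNWIND over a null or empty list produces no rows at all, so a record without tags (such as id 3 in the sample data) would drop out of the query before the organization step is reached. The following is a sketch of one possible alternative, not the answer's method, that uses FOREACH instead of UNWIND. FOREACH applies its updates once per list element but leaves the number of rows unchanged, so neither DISTINCT nor WHERE is needed. The property list on Author is shortened here for brevity, and tag and orgName are illustrative variable names.

```cypher
// coalesce() turns a missing tags list into an empty list, so FOREACH simply does nothing.
// The CASE expression yields a one-element list when q.org exists and an empty list otherwise.
CALL apoc.periodic.iterate(
"CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
"CREATE (a:Author {id:q.id, name:q.name})
FOREACH (tag IN coalesce(q.tags, []) |
  MERGE (t:Tag {name: tag.t})
  CREATE (a)-[:HAS_TAGS]->(t))
FOREACH (orgName IN CASE WHEN q.org IS NULL THEN [] ELSE [q.org] END |
  MERGE (o:Organization {name: orgName})
  CREATE (a)-[:AFFILIATED_WITH]->(o))",
{batchSize:10000, iterateList:true, parallel:false}
)
```

With this shape, each input record stays a single row from start to finish, so every author gets at most one AFFILIATED_WITH relationship regardless of how many tags it has.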
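Update
If you want to add AUTHORED relationships (as requested in the comments on this answer), you should do so before creating the AFFILIATED_WITH relationships, because the WHERE q.org is not null clause filters out some of the q rows. Also, whenever you use CREATE to create a relationship, Cypher requires you to specify a direction for it.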
CALL apoc.periodic.iterate(
"CALL apoc.load.json('file:///test.txt') YIELD value AS q RETURN q",
"CREATE (a:Author {id:q.id, name:q.name, citations:q.n_citation, publications:q.n_pubs})
WITH q, a
UNWIND q.tags as tags
MERGE (t:Tag {name: tags.t})
CREATE (a)-[:HAS_TAGS]->(t)
WITH DISTINCT q, a
UNWIND q.pubs as pubs
MERGE (p:Quanta {id: pubs.i})
CREATE (a)-[r:AUTHORED {rank: pubs.r}]->(p)
WITH DISTINCT q, a
WHERE q.org is not null
MERGE (o:Organization {name: q.org})
CREATE (a)-[:AFFILIATED_WITH]->(o)",
{batchSize:10000, iterateList:true, parallel:false}
)
https://stackoverflow.com/questions/54802054