我正在尝试使用load从CSV加载大约100万行。我在Windows机器上使用Neo4j企业3.2.2。我增加了我的头部堆栈到7g,但仍在击中
Neo.TransientError.General.OutOfMemoryError对于如何使用当前的密码查询加载这个CSV,有什么建议吗?
查询:
using periodic commit 200 load csv with headers from "file:///LabsTab.Txt" as csvLine fieldterminator '\t' with csvLine where csvLine.ObservationName <> "Cancellation Reason"
optional match (visit:Visit {VisitID: csvLine.VisitID})
merge (provider:Provider {ProviderName: csvLine.ProviderName}) on create set provider.ProviderID = csvLine.OrderingProviderID
merge (vlo:VisitLabOrder) on create set vlo.ProviderID = csvLine.OrderingProviderID on create set vlo.FillerOrderNo = csvLine.FillerOrderNo on create set vlo.OrderStartDtTm = apoc.date.parse(csvLine.OrderStartDtTm, "s", "yyyy/mm/dd hh:mm")
on create set vlo.OrderStart = csvLine.OrderStartDtTm
merge(lab:Lab{FillerOrderNo: csvLine.FillerOrderNo, OrderingProviderID: csvLine.OrderingProviderID, OrderingProvider: csvLine.ProviderName})
on create set lab.SpecimenCollectionDtTm = apoc.date.parse(csvLine.SpecimenCollectionDtTm, "s", "yyyy/mm/dd hh:mm")
on create set lab.SpecimentReceivedDtTm = apoc.date.parse(csvLine.SpecimenReceivedDtTm, "s", "yyyy/mm/dd hh:mm")
on create set lab.AnalysisDtTm= apoc.date.parse(csvLine.AnalysisDtTm, "s", "yyyy/mm/dd hh:mm")
merge(vlr:VisitLabResult{FillerOrderNo: csvLine.FillerOrderNo, ProviderID: csvLine.ProviderID}) on create set vlr.ResultStatusChangeDtTm = apoc.date.parse(csvLine.ResultStatusChangeDtTm, "s", "yyyy/mm/dd hh:mm")
on create set vlr.ResultStatusChange = csvLine.ResultStatusChangeDtTm
merge (labobs:LabObservation{UniversalServiceName: csvLine.UniversalServiceName, UniversalServiceID: csvLine.UniversalServiceID, ObservationName: csvLine.ObservationName, ObservationValue: csvLine.ObservationValue, Units: csvLine.Units})
//Merge (visit)-[r:Lab_tested]->(vlo)-[:Lab_tested]->(lab)-[:Observation_result]->(labobs)
//merge (lab)-[:Lab_resulted]->(vlr)-[:Lab_resulted]->(visit)
//merge (vlr)<-[:Ordered]-(provider)-[:Ordered]->(vlo)发布于 2017-08-02 17:47:08
您应该将LOAD CSV与USING PERIODIC COMMIT一起使用。
来自医生们
如果CSV文件包含大量行(接近数十万或数百万行),则可以使用
USING PERIODIC COMMIT来指示Neo4j在多行之后执行提交。这减少了事务状态的内存开销。默认情况下,提交将每1000行进行一次。
可以在USING PERIODIC COMMIT之后更改指定所需编号的默认行为,如下所示:
USING PERIODIC COMMIT 500
LOAD CSV FROM 'https://neo4j.com/docs/developer-manual/3.2/csv/artists.csv' AS line
CREATE (:Artist { name: line[1], year: toInt(line[2])})另外,ON CREATE SET可以由MERGE指定一次。每个作业都可以用,分隔。我不知道这些更改是否会产生影响,但请尝试:)
using periodic commit 200 load csv with headers from "file:///LabsTab.Txt" as csvLine fieldterminator '\t'
with csvLine where csvLine.ObservationName <> "Cancellation Reason"
optional match (visit:Visit {VisitID: csvLine.VisitID})
merge (provider:Provider {ProviderName: csvLine.ProviderName})
on create set provider.ProviderID = csvLine.OrderingProviderID
merge (vlo:VisitLabOrder)
on create set vlo.ProviderID = csvLine.OrderingProviderID,
vlo.FillerOrderNo = csvLine.FillerOrderNo,
vlo.OrderStartDtTm = apoc.date.parse(csvLine.OrderStartDtTm, "s", "yyyy/mm/dd hh:mm"),
vlo.OrderStart = csvLine.OrderStartDtTm
merge(lab:Lab{FillerOrderNo: csvLine.FillerOrderNo, OrderingProviderID: csvLine.OrderingProviderID, OrderingProvider: csvLine.ProviderName})
on create set lab.SpecimenCollectionDtTm = apoc.date.parse(csvLine.SpecimenCollectionDtTm, "s", "yyyy/mm/dd hh:mm"),
lab.SpecimentReceivedDtTm = apoc.date.parse(csvLine.SpecimenReceivedDtTm, "s", "yyyy/mm/dd hh:mm"),
lab.AnalysisDtTm= apoc.date.parse(csvLine.AnalysisDtTm, "s", "yyyy/mm/dd hh:mm")
merge(vlr:VisitLabResult{FillerOrderNo: csvLine.FillerOrderNo, ProviderID: csvLine.ProviderID})
on create set vlr.ResultStatusChangeDtTm = apoc.date.parse(csvLine.ResultStatusChangeDtTm, "s", "yyyy/mm/dd hh:mm"),
vlr.ResultStatusChange = csvLine.ResultStatusChangeDtTm
merge (labobs:LabObservation{UniversalServiceName: csvLine.UniversalServiceName, UniversalServiceID: csvLine.UniversalServiceID, ObservationName: csvLine.ObservationName, ObservationValue: csvLine.ObservationValue, Units: csvLine.Units})
//Merge (visit)-[r:Lab_tested]->(vlo)-[:Lab_tested]->(lab)-[:Observation_result]->(labobs)
//merge (lab)-[:Lab_resulted]->(vlr)-[:Lab_resulted]->(visit)
//merge (vlr)<-[:Ordered]-(provider)-[:Ordered]->(vlo)https://stackoverflow.com/questions/45467625
复制相似问题