py2neo

Question

I have a relational database whose tables I exported to csv files. I imported two of them and created nodes from selected columns, as shown in the code below:

import csv
from py2neo import authenticate, Graph, Node

authenticate("localhost:7474", "neo4j", "my_password")
graph_db = Graph()
graph_db.delete_all()

# Import all rows and columns of the csv files.
# ("rb" is the Python 2 csv convention; on Python 3 open with "r" and newline="".)
with open('File1.csv', "rb") as abc_file, open('File2.csv', "rb") as efg_file:
    data1 = csv.reader(abc_file, delimiter=';')
    data2 = csv.reader(efg_file, delimiter=';')
    next(data1)  # skip the header row
    next(data2)

    # Create a node for every row of the "Contact Email" column of abc_file.
    # Iterate over the csv reader, not the raw file object.
    for row in data1:
        nodes1 = Node("Contact_Email", email=row[0])
        contact_graph = graph_db.create(nodes1)

    # Create nodes for every row of the "Building_Name" and "Person_Created"
    # columns of efg_file.
    for row in data2:
        nodes2 = Node("Building_Name", name=row[0])
        nodes3 = Node("Person_Created", name=row[1])
        building_graph = graph_db.create(nodes2, nodes3)

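Suppose "File1.csv" has 60 emails under the "Contact_Email" column, which is the Primary_Key. It is used as the Foreign_Key in "File2.csv" under the "Person_Created" column. There are 14 buildings listed under "Building Name", each with a corresponding email in the "Person_Created" column. My questions are:

1) How do I match the 14 emails in the "Person_Created" column of File2.csv against the emails in the "Contact Email" column of File1.csv, so that no duplicates are created?

2) And how do I create a relationship between "Building Names" (in File2.csv) and "Person_Created" (in File1.csv) without any duplicates, something like "Building1234 is DESIGNED_BY abc@xyz.com"?

How can I do this in py2neo, with or without Cypher?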

Answer 1

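Create an index or a unique constraint on the contact email.

It is probably a good idea to give the node's property a name (e.g. email).

While iterating over Person_Created, use the email foreign-key value to create a Contact Email node with the property email.

Because the index/constraint is in place, the node will be created conditionally, that is, only if no node with that email exists yet.

In the same iteration, also create the relationship between Person Created and Contact Email.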
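The steps above could be sketched with Cypher roughly as follows. This is a minimal sketch under stated assumptions, not the answerer's own code: it assumes the py2neo 2.x call `graph_db.cypher.execute(statement, parameters)`, the older `{param}` parameter syntax and `CREATE CONSTRAINT ... ASSERT` form from that era of Neo4j, and the column order from the question (building name in column 0, creator email in column 1 of File2.csv). The helper names `create_constraint`, `read_rows` and `link_buildings` are hypothetical.

```python
import csv

# Labels and property names are taken from the question's code.
CONSTRAINT = "CREATE CONSTRAINT ON (c:Contact_Email) ASSERT c.email IS UNIQUE"

# MERGE only creates a node or relationship when no match exists,
# so repeated emails and repeated runs do not produce duplicates.
LINK = (
    "MERGE (c:Contact_Email {email: {email}}) "
    "MERGE (b:Building_Name {name: {name}}) "
    "MERGE (b)-[:DESIGNED_BY]->(c)"
)


def create_constraint(graph_db):
    """Hypothetical helper: run once, before importing any rows."""
    graph_db.cypher.execute(CONSTRAINT)


def read_rows(csv_file):
    """Yield (building_name, creator_email) from a ';'-delimited file object."""
    reader = csv.reader(csv_file, delimiter=";")
    next(reader)  # skip the header row
    for row in reader:
        yield row[0], row[1]


def link_buildings(graph_db, rows):
    """Hypothetical helper: one MERGE statement per (name, email) pair."""
    for name, email in rows:
        graph_db.cypher.execute(LINK, {"name": name, "email": email})
```

With a real connection this would be called as `link_buildings(graph_db, read_rows(efg_file))` inside the `with open(...)` block. Newer Neo4j versions use `$email` instead of `{email}` for parameters, so the statement text may need adjusting.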

Answer 2

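Py2neo provides a number of uniqueness functions for this. Take a look at this page, in particular merge_one and friends. The node values these return can then be stored and used to build unique relationships and paths.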
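As a rough illustration of that approach without writing Cypher by hand, the sketch below assumes the py2neo 2.x methods `Graph.merge_one(label, property_key, property_value)` and `Graph.create_unique(...)` together with the `Relationship` class. The helper name `link_building` is hypothetical; the labels and the DESIGNED_BY type come from the question.

```python
def link_building(graph_db, building_name, creator_email):
    """Hypothetical helper: merge both nodes, then link them exactly once."""
    # Imported inside the function so the sketch loads without py2neo installed.
    from py2neo import Relationship

    # merge_one returns a matching node if one exists, otherwise creates it.
    building = graph_db.merge_one("Building_Name", "name", building_name)
    contact = graph_db.merge_one("Contact_Email", "email", creator_email)

    # create_unique only adds the relationship if it is not already present.
    graph_db.create_unique(Relationship(building, "DESIGNED_BY", contact))
    return building, contact
```

Calling `link_building(graph_db, row[0], row[1])` for each row of File2.csv would reuse the Contact_Email nodes already created from File1.csv, provided those were stored under the same label and property.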

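Note that for better performance you may want to look at Cypher transactions or batching. Without these, every operation requires its own call to the server, which is slow at scale.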
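One way the batching idea might look is sketched below. It assumes the py2neo 2.x transaction API (`graph_db.cypher.begin()`, then `append`, `process` and `commit` on the returned transaction) and the older `{param}` Cypher parameter syntax. The helper name `link_buildings_batched` and the `batch_size` default are hypothetical choices for illustration.

```python
# Same MERGE pattern as a per-row import, but queued instead of sent one by one.
LINK = (
    "MERGE (c:Contact_Email {email: {email}}) "
    "MERGE (b:Building_Name {name: {name}}) "
    "MERGE (b)-[:DESIGNED_BY]->(c)"
)


def link_buildings_batched(graph_db, rows, batch_size=1000):
    """Hypothetical helper: rows are (building_name, creator_email) pairs."""
    tx = graph_db.cypher.begin()
    pending = 0
    for name, email in rows:
        # append only queues the statement locally; nothing is sent yet.
        tx.append(LINK, {"name": name, "email": email})
        pending += 1
        if pending >= batch_size:
            tx.process()  # send the queued statements, keep the transaction open
            pending = 0
    tx.commit()  # send anything still queued and commit
```

For the 14 buildings in the question this amounts to a single round trip at commit time; the intermediate `process` calls only matter for much larger files.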


py2neo - Match and Merge two nodes coming from two different csv, and create relationship

python

neo4j

cypher