Batch processing in Cypher, or uploading multiple files from the Neo4j Browser
I am using the following query to load data from a CSV file into Neo4j:
CREATE CONSTRAINT ON (e:Entity) ASSERT e.entity IS UNIQUE;
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:/file1.csv' AS line FIELDTERMINATOR '|'
WITH line
MERGE (e0:Entity {entity: line.entities_0_entity})
ON CREATE SET e0.confidence = toFloat(line.entities_0_confidence)
MERGE (e1:Entity {entity: line.entities_1_entity})
ON CREATE SET e1.confidence = toFloat(line.entities_1_confidence)
MERGE (e0)-[r:REL {name: line.relation_relation, confidence: toFloat(line.relation_confidence)}]->(e1)
RETURN *
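Can anyone tell me the equivalent query for loading the data from the Neo4j command line, or a way to change the file name dynamically in the browser, or to pass it as a pattern such as "file:/file*"?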
You can simply put all the files in Neo4j's import directory and then load them all with a shell script:
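#!/bin/sh
# Post one LOAD CSV statement per file in the import directory.
for file in /Users/ikwattro/dev/_graphs/310/import/*
do
  # With the default configuration, file:/// URLs are resolved relative
  # to the import directory, so only the base name is sent.
  name=$(basename "$file")
  # Double quotes let the shell expand $name inside the JSON body.
  curl -H "Content-Type: application/json" \
    -d "{\"statements\": [{\"statement\": \"LOAD CSV WITH HEADERS FROM 'file:///$name' AS row ...\"}]}" \
    http://localhost:7474/db/data/transaction/commit
done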
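A variant of this loop can pass the file URL as a statement parameter instead of splicing it into the Cypher text, which avoids most of the shell quoting. The sketch below is only a dry run that prints each request body. `build_payload` and `IMPORT_DIR` are hypothetical names, the rest of the statement is left elided as in the script, and it assumes file names contain no double quotes or backslashes.

```shell
#!/bin/sh
# Dry run: print the JSON body that would be POSTed for each CSV file.
# build_payload is a hypothetical helper, not part of Neo4j.
build_payload() {
  # Only the base name is sent, on the assumption that file:/// URLs
  # are resolved relative to the import directory.
  name=$(basename "$1")
  # Single quotes keep $url literal: it is a Cypher parameter, not a shell variable.
  printf '{"statements": [{"statement": "LOAD CSV WITH HEADERS FROM $url AS row ...", "parameters": {"url": "file:///%s"}}]}\n' "$name"
}

# IMPORT_DIR is an assumed variable; it falls back to the current directory.
for file in "${IMPORT_DIR:-.}"/*.csv
do
  [ -e "$file" ] || continue   # the glob matched nothing
  build_payload "$file"
done
```

To actually send the requests, the output of `build_payload "$file"` could be piped into `curl -H "Content-Type: application/json" -d @- http://localhost:7474/db/data/transaction/commit`, where `-d @-` makes curl read the body from standard input.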
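Neo4j itself offers no standard way to specify multiple files to import.
If you want to run the same Cypher statement several times, adjusting one or more values each time, you can use the APOC procedure apoc.periodic.iterate.
In your example, you will want to run the CREATE CONSTRAINT statement beforehand (and only once).
For example: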
CALL apoc.periodic.iterate(
"
WITH ['file1', 'x', 'y'] AS filenames
UNWIND filenames AS name
RETURN name
",
"
// USING PERIODIC COMMIT is omitted here: it is not allowed inside the
// transaction that apoc.periodic.iterate opens for each batch.
LOAD CSV WITH HEADERS FROM 'file:/' + $name + '.csv' AS line FIELDTERMINATOR '|'
WITH line
MERGE (e0:Entity {entity: line.entities_0_entity})
ON CREATE SET e0.confidence = toFloat(line.entities_0_confidence)
MERGE (e1:Entity {entity: line.entities_1_entity})
ON CREATE SET e1.confidence = toFloat(line.entities_1_confidence)
MERGE (e0)-[r:REL {name: line.relation_relation, confidence: toFloat(line.relation_confidence)}]->(e1)
",
{batchSize: 1}); // one file per transaction
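This query runs the LOAD CSV statement 3 times (sequentially, since the procedure's parallel option defaults to false), passing one of the strings ("file1", "x", and "y") as the name parameter each time.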
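Since the file names have to be listed explicitly, a small shell helper can stand in for a wildcard such as "file:/file*" by generating the list literal from the contents of a directory. This is only a sketch: `cypher_list` is a hypothetical helper, and it assumes the file names contain no quotes or newlines.

```shell
#!/bin/sh
# cypher_list DIR: print a Cypher list literal holding the base names
# (without the .csv extension) of the CSV files in DIR.
# Hypothetical helper; assumes names contain no quotes or newlines.
cypher_list() {
  list=""
  for f in "$1"/*.csv
  do
    [ -e "$f" ] || continue          # the glob matched nothing
    name=$(basename "$f" .csv)       # strip the directory and the extension
    list="${list:+$list, }'$name'"   # add a comma separator after the first item
  done
  printf '[%s]\n' "$list"
}
```

Calling `cypher_list` with the path of the import directory prints a literal that can be pasted into the first statement of the apoc.periodic.iterate call in place of the hard-coded list.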