OrientDB concurrent graph operations in Java
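I am trying to use OrientDB (v2.1.2) in a multithreaded environment (Java 8) in which I update a vertex from within multiple threads. I am aware that OrientDB uses MVCC, so those operations may fail and have to be executed again.
I wrote a small unit test that tries to provoke such a situation by waiting on a cyclic barrier within the threads I fork. Unfortunately, the test fails with an obscure exception that I don't understand: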
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
INFO: OrientDB auto-config DISKCACHE=10,427MB (heap=3,566MB os=16,042MB disk=31,720MB)
Thread [0] running
Thread [1] running
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Exception in thread "Thread-4" com.orientechnologies.orient.core.exception.OSchemaException: Cluster with id 11 already belongs to class testedge_1442840424480
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.checkClustersAreAbsent(OSchemaShared.java:1264)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:983)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:400)
at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.createClass(OSchemaProxy.java:100)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.call(OrientBaseGraph.java:1387)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.call(OrientBaseGraph.java:1384)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.executeOutsideTx(OrientBaseGraph.java:1739)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1384)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1368)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1353)
at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:928)
at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:832)
at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest.lambda[=11=](OrientDBTinkerpopMultithreadingTest.java:31)
at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest$$Lambda/1446001495.run(Unknown Source)
at java.lang.Thread.run(Thread.java:745)
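The test uses a simple in-memory database. I don't understand why OrientDB is checking some cluster actions:
Cluster with id 11 already belongs to class testedge
Somehow this problem only occurs when I try to create two edges with the same label.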
private OrientGraphFactory factory = new OrientGraphFactory("memory:tinkerpop").setupPool(5, 20);

@Test
public void testConcurrentGraphModifications() throws InterruptedException {
    OrientGraph graph = factory.getTx();
    Vertex v = graph.addVertex(null);
    graph.commit();
    CyclicBarrier barrier = new CyclicBarrier(2);

    List<Thread> threads = new ArrayList<>();

    // Spawn two threads
    for (int i = 0; i < 2; i++) {
        final int threadNo = i;
        threads.add(run(() -> {
            System.out.println("Running thread [" + threadNo + "]");
            // Start a new transaction and modify vertex v
            OrientGraph tx = factory.getTx();
            Vertex v2 = tx.addVertex(null);
            v.addEdge("testedge", v2);
            try {
                barrier.await();
            } catch (Exception e) {
                e.printStackTrace();
            }
            tx.commit();
        }));
    }

    // Wait for all spawned threads
    for (Thread thread : threads) {
        thread.join();
    }
}

protected Thread run(Runnable runnable) {
    Thread thread = new Thread(runnable);
    thread.start();
    return thread;
}
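In general, I would be very thankful for an example that demonstrates how to deal with MVCC conflicts when using OrientDB in an embedded, multithreaded Java environment.
Update:
I noticed that the problem no longer occurs when I reload the vertex within my thread via tx.getVertex(vertex.getId()) (and not via .reload()). I get all sorts of errors when I just pass the vertex object reference to my thread and use it there. I assume the OrientVertex class is not thread-safe.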
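- You are right: graph elements are not thread-safe.
- The reason for your exception is the following. When you create an edge, the graph database underneath creates a document class whose name equals the edge label. If that class does not exist yet, the transaction is committed automatically and the new class is created in the schema. Each class is mapped to a cluster in the database (a cluster is like a table). Because you add the edges concurrently, both threads create the same class at the same time and, as a result, try to create the same cluster. So one thread wins and the other fails with an exception saying that a cluster with the given name has already been created. I suggest that you create all classes (that is, edge labels) up front, if possible, before you add edges at runtime.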
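One more suggestion. You should think of an OrientGraph instance as a connection to the server. The best usage is the following:
- Set up a pool in OrientGraphFactory.
- Acquire a graph instance before the transaction.
- Execute the transaction.
- Call .shutdown(); do not create long-living graph instances.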
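Since the question asks how to handle MVCC conflicts, here is a minimal sketch of the usual retry loop in plain Java. It is not an OrientDB API: the class name MvccRetry, the method run and the parameter maxAttempts are made up for illustration, and the conflict exception type is passed in as a parameter so the sketch compiles without the OrientDB jars.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of OrientDB): re-runs a unit of work
// when it fails with the given "version conflict" exception type.
final class MvccRetry {
    private MvccRetry() {}

    /**
     * Runs work and returns its result. If work throws an exception of
     * conflictType, it is run again, up to maxAttempts attempts in total.
     * Any other runtime exception is rethrown immediately.
     */
    static <T> T run(Class<? extends RuntimeException> conflictType,
                     int maxAttempts,
                     Supplier<T> work) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (RuntimeException e) {
                if (!conflictType.isInstance(e)) {
                    throw e; // not a conflict: do not retry
                }
                last = e; // conflict: remember it and try again
            }
        }
        throw last; // every attempt hit a conflict
    }
}
```

With OrientDB, the conflict type would presumably be OConcurrentModificationException.class, and each attempt would follow the list above: get a graph with factory.getTx(), look the vertex up again with tx.getVertex(id) so that the attempt sees the current record version, add the edge, call tx.commit(), and call tx.shutdown() in a finally block. Re-fetching inside the retried work matters, because reusing a vertex object loaded before the conflict would carry the stale version into the next attempt.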