OrientDB concurrent graph operations in Java

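I'm trying to use OrientDB (v2.1.2) in a multithreaded environment (Java 8) in which I update a vertex from multiple threads. I'm aware that OrientDB uses MVCC, so these operations may fail and have to be executed again.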

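I wrote a small unit test that tries to provoke such a situation by waiting on a cyclic barrier inside the threads I spawn. Unfortunately, the test fails with an obscure exception that I don't understand: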

Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
INFO: OrientDB auto-config DISKCACHE=10,427MB (heap=3,566MB os=16,042MB disk=31,720MB)
Thread [0] running 
Thread [1] running 
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Exception in thread "Thread-4" com.orientechnologies.orient.core.exception.OSchemaException: Cluster with id 11 already belongs to class testedge_1442840424480
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.checkClustersAreAbsent(OSchemaShared.java:1264)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:983)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:400)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.createClass(OSchemaProxy.java:100)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.call(OrientBaseGraph.java:1387)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.call(OrientBaseGraph.java:1384)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.executeOutsideTx(OrientBaseGraph.java:1739)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1384)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1368)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1353)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:928)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:832)
    at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest.lambda[=11=](OrientDBTinkerpopMultithreadingTest.java:31)
    at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest$$Lambda/1446001495.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:745)

The test uses a simple in-memory database. I don't understand why OrientDB is checking some cluster operations:

Cluster with id 11 already belongs to class testedge

Somehow this problem only occurs when I try to create two edges with the same label.

private OrientGraphFactory factory = new OrientGraphFactory("memory:tinkerpop").setupPool(5, 20);

@Test
public void testConcurrentGraphModifications() throws InterruptedException {
    OrientGraph graph = factory.getTx();
    Vertex v = graph.addVertex(null);
    graph.commit();
    CyclicBarrier barrier = new CyclicBarrier(2);

    List<Thread> threads = new ArrayList<>();

    // Spawn two threads
    for (int i = 0; i < 2; i++) {
        final int threadNo = i;
        threads.add(run(() -> {
            System.out.println("Running thread [" + threadNo + "]");
            // Start a new transaction and modify vertex v
            OrientGraph tx = factory.getTx();
            Vertex v2 = tx.addVertex(null);
            v.addEdge("testedge", v2);
            try {
                barrier.await();
            } catch (Exception e) {
                e.printStackTrace();
            }
            tx.commit();
        }));
    }

    // Wait for all spawned threads
    for (Thread thread : threads) {
        thread.join();
    }
}

protected Thread run(Runnable runnable) {
    Thread thread = new Thread(runnable);
    thread.start();
    return thread;
}

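In general, I would be very grateful for an example that demonstrates how to handle MVCC conflicts when using OrientDB in an embedded, multithreaded Java environment.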


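Update:

I noticed that the problem no longer occurs when I reload the vertex inside my thread via tx.getVertex(vertex.getId()) (and not via .reload()). When I just passed the vertex object reference to my thread and used it there, I got all kinds of errors. I assume the OrientVertex class is not thread-safe.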

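  1. You are right: none of the graph elements are thread-safe.
  2. The reason for your exception is the following. When you create an edge, the graph database underneath creates a document class whose name equals the edge's label. If that class does not exist yet, the transaction is committed automatically and the new class is created in the schema. Each class is mapped to a cluster in the database (a cluster is something like a table). When you add edges concurrently, both threads try to create the same class at the same time, and therefore the same cluster. One thread wins and the other fails with an exception saying that the cluster already belongs to the class. In practice, I suggest you create all classes (that is, edge labels) up front wherever possible, before you add edges at runtime.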
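A minimal sketch of creating the edge classes up front, assuming the Blueprints API that ships with OrientDB 2.1 (`OrientGraphFactory.getNoTx()`, `getEdgeType`, `createEdgeType`). The class name `EdgeTypeSetup` and the helper `ensureEdgeTypes` are made up for illustration and are not part of OrientDB:

```java
import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;
import com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx;

public class EdgeTypeSetup {

    /** Hypothetical helper: creates the edge class for each label if it is missing. */
    public static void ensureEdgeTypes(OrientGraphFactory factory, String... labels) {
        // Schema changes run outside a transaction, so use a non-transactional graph.
        OrientGraphNoTx graph = factory.getNoTx();
        try {
            for (String label : labels) {
                if (graph.getEdgeType(label) == null) {
                    graph.createEdgeType(label);
                }
            }
        } finally {
            // Release the instance; do not keep it around.
            graph.shutdown();
        }
    }

    public static void main(String[] args) {
        OrientGraphFactory factory =
                new OrientGraphFactory("memory:tinkerpop").setupPool(5, 20);
        // Run once, from a single thread, before any worker thread adds edges.
        ensureEdgeTypes(factory, "testedge");
        factory.close();
    }
}
```

Calling this once during startup means no worker thread ever hits the automatic class creation inside `addEdge`, which is where the two threads in the question collided.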

One more suggestion: you should think of an OrientGraph instance as a connection to the server. The best usage is as follows:

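  1. Set up a pool in OrientGraphFactory.
  2. Acquire a graph instance before each transaction.
  3. Execute the transaction.
  4. Call .shutdown(); do not create long-lived graph instances.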
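For the MVCC conflicts themselves, the usual approach is to repeat the whole unit of work (steps 2 to 4) when the commit reports a conflict. The sketch below is deliberately library-free so it runs as-is: `ConflictException` is a hypothetical stand-in for OrientDB's `OConcurrentModificationException`, and `runWithRetry` is an illustrative helper, not an OrientDB API. The comments show roughly where the OrientDB calls would go.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RetryOnConflict {

    /** Hypothetical stand-in for OrientDB's OConcurrentModificationException. */
    public static class ConflictException extends RuntimeException {
        public ConflictException(String message) {
            super(message);
        }
    }

    /**
     * Runs one unit of work and repeats it from scratch when it reports a
     * conflict. Rethrows the last conflict once maxAttempts is exhausted.
     */
    public static <T> T runWithRetry(int maxAttempts, Supplier<T> unitOfWork) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return unitOfWork.get();
            } catch (ConflictException e) {
                last = e; // another transaction won; start over with fresh data
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        String result = runWithRetry(5, () -> {
            // With OrientDB, this body would roughly be:
            //   OrientGraph tx = factory.getTx();          // step 2
            //   try {
            //       Vertex v = tx.getVertex(id);           // load by ID in this thread
            //       ... modify the graph ...               // step 3
            //       tx.commit();
            //   } catch (OConcurrentModificationException e) {
            //       tx.rollback();
            //       throw e;                               // let the helper retry
            //   } finally {
            //       tx.shutdown();                         // step 4
            //   }
            if (calls.incrementAndGet() < 3) {
                throw new ConflictException("simulated version conflict");
            }
            return "committed";
        });
        System.out.println(result + " after " + calls.get() + " attempts");
        // prints "committed after 3 attempts"
    }
}
```

Each attempt acquires its own graph instance and looks the vertex up again by ID, so no graph element is ever shared between threads or reused across attempts.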