AQL Query Really Slow (~20 seconds)

The following query takes about 20 seconds to execute:

FOR p IN PATHS(locations, connections, "outbound", { maxLength: 1 }) FILTER p.source._key == "26094" RETURN p.vertices[*].name

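I believe this is a simple query (and the database is not that large), so it should execute fairly quickly. I must be doing something wrong. This is the query result: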

==> [object ArangoQueryCursor - count: 286, hasMore: false]

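The locations (vertex) collection has 23753 documents and the connections (edge) collection has 123414 documents.

I also tried filtering by _id, but the performance was about the same.

Is there anything I can do to get better performance?

Here is the .explain() report for the query: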

 { 
  "plan" : { 
    "nodes" : [ 
      { 
        "type" : "SingletonNode", 
        "dependencies" : [ ], 
        "id" : 1, 
        "estimatedCost" : 1, 
        "estimatedNrItems" : 1 
      }, 
      { 
        "type" : "CalculationNode", 
        "dependencies" : [ 
          1 
        ], 
        "id" : 2, 
        "estimatedCost" : 2, 
        "estimatedNrItems" : 1, 
        "expression" : { 
          "type" : "function call", 
          "name" : "PATHS", 
          "subNodes" : [ 
            { 
              "type" : "array", 
              "subNodes" : [ 
                { 
                  "type" : "collection", 
                  "name" : "locations" 
                }, 
                { 
                  "type" : "collection", 
                  "name" : "connections" 
                }, 
                { 
                  "type" : "value", 
                  "value" : "outbound" 
                }, 
                { 
                  "type" : "object", 
                  "subNodes" : [ 
                    { 
                      "type" : "object element", 
                      "name" : "maxLength", 
                      "subNodes" : [ 
                        { 
                          "type" : "value", 
                          "value" : 1 
                        } 
                      ] 
                    } 
                  ] 
                } 
              ] 
            } 
          ] 
        }, 
        "outVariable" : { 
          "id" : 2, 
          "name" : "2" 
        }, 
        "canThrow" : true 
      }, 
      { 
        "type" : "EnumerateListNode", 
        "dependencies" : [ 
          2 
        ], 
        "id" : 3, 
        "estimatedCost" : 102, 
        "estimatedNrItems" : 100, 
        "inVariable" : { 
          "id" : 2, 
          "name" : "2" 
        }, 
        "outVariable" : { 
          "id" : 0, 
          "name" : "p" 
        } 
      }, 
      { 
        "type" : "CalculationNode", 
        "dependencies" : [ 
          3 
        ], 
        "id" : 4, 
        "estimatedCost" : 202, 
        "estimatedNrItems" : 100, 
        "expression" : { 
          "type" : "compare ==", 
          "subNodes" : [ 
            { 
              "type" : "attribute access", 
              "name" : "_key", 
              "subNodes" : [ 
                { 
                  "type" : "attribute access", 
                  "name" : "source", 
                  "subNodes" : [ 
                    { 
                      "type" : "reference", 
                      "name" : "p", 
                      "id" : 0 
                    } 
                  ] 
                } 
              ] 
            }, 
            { 
              "type" : "value", 
              "value" : "26094" 
            } 
          ] 
        }, 
        "outVariable" : { 
          "id" : 3, 
          "name" : "3" 
        }, 
        "canThrow" : false 
      }, 
      { 
        "type" : "FilterNode", 
        "dependencies" : [ 
          4 
        ], 
        "id" : 5, 
        "estimatedCost" : 302, 
        "estimatedNrItems" : 100, 
        "inVariable" : { 
          "id" : 3, 
          "name" : "3" 
        } 
      }, 
      { 
        "type" : "CalculationNode", 
        "dependencies" : [ 
          5 
        ], 
        "id" : 6, 
        "estimatedCost" : 402, 
        "estimatedNrItems" : 100, 
        "expression" : { 
          "type" : "expand", 
          "subNodes" : [ 
            { 
              "type" : "iterator", 
              "subNodes" : [ 
                { 
                  "type" : "variable", 
                  "name" : "1_", 
                  "id" : 1 
                }, 
                { 
                  "type" : "attribute access", 
                  "name" : "vertices", 
                  "subNodes" : [ 
                    { 
                      "type" : "reference", 
                      "name" : "p", 
                      "id" : 0 
                    } 
                  ] 
                } 
              ] 
            }, 
            { 
              "type" : "attribute access", 
              "name" : "name", 
              "subNodes" : [ 
                { 
                  "type" : "reference", 
                  "name" : "1_", 
                  "id" : 1 
                } 
              ] 
            } 
          ] 
        }, 
        "outVariable" : { 
          "id" : 4, 
          "name" : "4" 
        }, 
        "canThrow" : false 
      }, 
      { 
        "type" : "ReturnNode", 
        "dependencies" : [ 
          6 
        ], 
        "id" : 7, 
        "estimatedCost" : 502, 
        "estimatedNrItems" : 100, 
        "inVariable" : { 
          "id" : 4, 
          "name" : "4" 
        } 
      } 
    ], 
    "rules" : [ 
      "move-calculations-up", 
      "move-filters-up", 
      "move-calculations-up-2", 
      "move-filters-up-2" 
    ], 
    "collections" : [ 
      { 
        "name" : "connections", 
        "type" : "read" 
      }, 
      { 
        "name" : "locations", 
        "type" : "read" 
      } 
    ], 
    "variables" : [ 
      { 
        "id" : 0, 
        "name" : "p" 
      }, 
      { 
        "id" : 1, 
        "name" : "1_" 
      }, 
      { 
        "id" : 2, 
        "name" : "2" 
      }, 
      { 
        "id" : 3, 
        "name" : "3" 
      }, 
      { 
        "id" : 4, 
        "name" : "4" 
      } 
    ], 
    "estimatedCost" : 502, 
    "estimatedNrItems" : 100 
  }, 
  "warnings" : [ ], 
  "stats" : { 
    "rulesExecuted" : 21, 
    "rulesSkipped" : 0, 
    "plansCreated" : 1 
  } 
}

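PATHS() builds all paths of the graph, and only afterwards are the results post-filtered by the FILTER on the _key attribute. This can create a huge intermediate result (all paths) before the non-matching ones are filtered out.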
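To make the cost difference concrete, here is a small Python sketch. It is a toy model only, not ArangoDB code: the edge list, the `out_index` dictionary, and both function names are made up for illustration, and it only models paths of length 1. It compares "build every path, then filter" with "build only the paths that start at a known vertex".

```python
from collections import defaultdict

# Hypothetical toy graph: each tuple is one (from, to) edge.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "a")]

def paths_then_filter(edges, start):
    """Build a length-1 path for every edge in the graph, then filter by source."""
    built = [[src, dst] for src, dst in edges]   # work grows with the whole graph
    kept = [p for p in built if p[0] == start]   # the post-filter step
    return kept, len(built)

def paths_from_start(out_index, start):
    """Build only the length-1 paths that begin at the start vertex."""
    built = [[start, dst] for dst in out_index[start]]
    return built, len(built)

# Outgoing-edge lookup table, standing in for an index on the edge source.
out_index = defaultdict(list)
for src, dst in edges:
    out_index[src].append(dst)

filtered, built_all = paths_then_filter(edges, "a")
direct, built_few = paths_from_start(out_index, "a")
print(filtered == direct, built_all, built_few)
```

Both functions return the same paths, but the first one materializes one path per edge in the whole graph. In the question that would be work proportional to the 123414 documents in connections in order to return 286 results.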

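If you only need to find the connected vertices at depth 1, I think one of the following would be more efficient:

  • Query using TRAVERSAL:

    This is more efficient because it does not build all paths in the graph, only those starting at the specified start vertex: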

    FOR p IN TRAVERSAL(locations, connections, "locations/26094", "outbound", { minDepth: 1, maxDepth: 1, paths: true })
      RETURN p.path.vertices[*].name
    
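  • Query the direct neighbors using NEIGHBORS:

    This may be slightly more efficient still, because it builds a smaller intermediate result. Note that it does not return the start vertex (26094), only the vertices directly connected to it: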

    FOR p IN NEIGHBORS(locations, connections, "26094", "outbound") 
      RETURN p.vertex.name
    
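  • Query the edges directly (without graph functions):

    Finally, you can query the edge collection directly. Again, this does not return the start vertex (26094), only the vertices directly connected to it: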

    FOR edge IN connections
      FILTER edge._from == "locations/26094"
      FOR vertex IN locations
        FILTER vertex._id == edge._to
        RETURN vertex.name
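
For readers who want to see what the last query computes, here is a rough Python equivalent. The documents below are hypothetical sample data shaped like ArangoDB documents (the names and the extra ids are invented), and `neighbor_names` is just an illustrative helper, not part of any ArangoDB driver.

```python
# Hypothetical sample data; only the _id/_from/_to/name attributes matter here.
locations = [
    {"_id": "locations/26094", "name": "Start"},
    {"_id": "locations/1", "name": "North"},
    {"_id": "locations/2", "name": "South"},
    {"_id": "locations/3", "name": "Two hops away"},
]
connections = [
    {"_from": "locations/26094", "_to": "locations/1"},
    {"_from": "locations/26094", "_to": "locations/2"},
    {"_from": "locations/1", "_to": "locations/3"},
]

def neighbor_names(start_id):
    """Names of the vertices reached by one outbound edge from start_id."""
    # Lookup table by _id, so each edge target is resolved without a scan.
    by_id = {doc["_id"]: doc for doc in locations}
    return [
        by_id[edge["_to"]]["name"]
        for edge in connections
        if edge["_from"] == start_id   # same condition as the FILTER on edge._from
    ]

names = neighbor_names("locations/26094")
print(names)
```

As in the AQL version, the start vertex itself is not part of the result, and vertices that are more than one edge away are not included either.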