Optimizing a seemingly simple Couchbase query for "items whose children satisfy"

Question

I'm developing a system that uses Couchbase to store our translations.

My bucket contains about 15,000 entries that look like this:

{
  "classifications": [
    {
      "documentPath": "Test Vendor/Test Project/Ordered",
      "position": 1
    }
  ],
  "id": "message-Test Vendor/Test Project:first",
  "key": "first",
  "projectId": "project-Test Vendor/Test Project",
  "translations": {
    "en-US": [
      {
        "default": {
          "owner": "414d6352-c26b-493e-835e-3f0cf37f1f3c",
          "text": "first"
        }
      }
    ]
  },
  "type": "message",
  "vendorId": "vendor-Test Vendor"
},

As an example, I want to find all messages that have a classification whose "documentPath" is "Test Vendor/Test Project/Ordered".

I use this query:

SELECT message.*
FROM couchlate message UNNEST message.classifications classification
WHERE classification.documentPath = "Test Vendor/Test Project/Ordered"
      AND message.type="message"
ORDER BY classification.position

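But I was surprised to find that the query takes 2 seconds to execute!

Looking at the query execution plan, it seems that Couchbase is looping over all the messages and only then filtering on "documentPath".

I'd like it to filter on "documentPath" first (since only 2 documentPaths actually match my query) and then find the messages.

I tried creating an index on "classifications", but it didn't change anything.

Is something wrong with my index setup, or should I structure my data differently to get fast results?

I'm using Couchbase 4.5 beta, if that matters.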

Answer 1

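Your query filters on the documentPath field, so an index on classifications doesn't actually help. You need to create an array index on the documentPath field itself, using the new array indexing syntax in Couchbase 4.5: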

CREATE INDEX ix_documentPath ON myBucket ( DISTINCT ARRAY c.documentPath FOR c IN classifications END ) ;

Then you can filter on documentPath with a query like this:

SELECT * FROM myBucket WHERE ANY c IN classifications SATISFIES c.documentPath = "your path here" END ;
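
Applied to the bucket from the question, the same idea might look like the sketch below. It assumes the bucket is named couchlate (as in the question's FROM clause) and reuses the index name ix_documentPath. Keeping the UNNEST alongside an ANY ... SATISFIES predicate is an assumption about how to preserve the ORDER BY on position while still giving the planner a predicate it can match to the array index; it has not been verified against the 4.5 beta.

```sql
/* Array index on each classification's documentPath
   (bucket name taken from the question). */
CREATE INDEX ix_documentPath ON couchlate
  ( DISTINCT ARRAY c.documentPath FOR c IN classifications END );

/* The ANY ... SATISFIES predicate is the part intended to match the array index;
   UNNEST is kept only so that rows can be ordered by position.
   The variable name c matches the one used in the index definition. */
SELECT message.*
FROM couchlate message UNNEST message.classifications classification
WHERE ANY c IN message.classifications
          SATISFIES c.documentPath = "Test Vendor/Test Project/Ordered" END
      AND classification.documentPath = "Test Vendor/Test Project/Ordered"
      AND message.type = "message"
ORDER BY classification.position;
```

If ordering by position is not required, the UNNEST clause and the classification.documentPath condition can be dropped, leaving only the ANY ... SATISFIES filter and the type check.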

Add EXPLAIN to the beginning of the query to see the execution plan and confirm that it is indeed using the index ix_documentPath.
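
For instance, the check could be run as shown below. This is only a sketch: the path literal is a placeholder, and the exact operator names in the plan output may differ between Couchbase versions.

```sql
/* Same query as above, prefixed with EXPLAIN so that the plan is
   returned instead of the matching documents. */
EXPLAIN SELECT * FROM myBucket
WHERE ANY c IN classifications SATISFIES c.documentPath = "your path here" END;
```

In the returned plan, an index scan that names ix_documentPath indicates the array index was chosen, while a primary scan over the bucket would suggest every document is still being visited.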

More details and examples here: http://developer.couchbase.com/documentation/server/4.5-dp/indexing-arrays.html

query-optimization

couchbase

n1ql