ES query to match all elements in array

Question

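I have these documents with a nested array that I want to filter with the query below.

I want ES to return every document in which all items have changes = 0, and only those. If a document has even one item in the list with changes = 1, it should be discarded.

Is there a way to achieve this starting from the query I have already written? Or should I use a script instead?

Documents: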

{
    "id": "abc",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 1
            }
        ]
    }
},
{
    "id": "def",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 0
            }
        ]
    }
}

Query:

GET trips_solutions/_search

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "id": {
                  "value": "abc"
                }
              }
            },
            {
              "nested": {
                "path": "trips",
                "query": {
                  "range": {
                    "trips.changes": {
                      "gt": -1,
                      "lt": 1
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }

Expected result:

{
    "id": "def",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 0
            }
        ]
    }
}

Elasticsearch version: 7.6.2

I have already read this answer, but it did not help me: https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873

Answer 1

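First, if you filter by id: abc, you obviously cannot get id: def back.

Second, because each object in a nested field is treated as a separate sub-document, you cannot query for documents in which all trips have changes equal to 0. The connection between the individual trips is lost: they "do not know about each other".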
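To make the sub-document point concrete, here is a rough Python model of the matching behaviour. It is not Elasticsearch code, and the helper name `nested_term_matches` is made up for this sketch. It assumes a nested query selects a parent as soon as any one of its nested objects matches, which is why a term query on trips.changes = 0 accepts both abc and def.

```python
# Toy model of nested matching; plain Python, no Elasticsearch client involved.
docs = {
    "abc": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}],
    "def": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}],
}


def nested_term_matches(trips, field, value):
    """Hypothetical helper: a parent matches if ANY nested object matches."""
    # Each trip is evaluated on its own, with no knowledge of its siblings.
    return any(trip.get(field) == value for trip in trips)


matching_ids = sorted(
    doc_id for doc_id, trips in docs.items()
    if nested_term_matches(trips, "changes", 0)
)
print(matching_ids)
```

Both ids come back in this model, so the nested term query alone cannot express "every trip has changes = 0".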

What you can do is return only the trips that matched the nested query, using inner_hits:

GET trips_solutions/_search
{
  "_source": false,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "inner_hits": {},
            "path": "trips",
            "query": {
              "term": {
                "trips.changes": {
                  "value": 0
                }
              }
            }
          }
        }
      ]
    }
  }
}

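The simplest solution is then to store this nested information on the parent object at index time and run a range/term query on the resulting array.

Edit:

Here is how to do that with copy_to, copying the values to the top level of the document: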

PUT trips_solutions
{
  "mappings": {
    "properties": {
      "trips_changes": {
        "type": "integer"
      },
      "trips": {
        "type": "nested",
        "properties": {
          "changes": {
            "type": "integer",
            "copy_to": "trips_changes"
          }
        }
      }
    }
  }
}

trips_changes will be an array of numbers. I am assuming they are integers, but more types are available.

Then index a few documents:
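As a rough picture of what copy_to does with this mapping, the sketch below gathers every trips.changes value of a document into one top-level list. It is plain Python rather than Elasticsearch, and `collect_trips_changes` is a hypothetical name. How Elasticsearch orders or stores the copied values internally is not modeled.

```python
def collect_trips_changes(doc):
    """Hypothetical helper mimicking the effect of copy_to on trips.changes."""
    # One value per nested trip ends up in a single top-level collection.
    return [trip["changes"] for trip in doc.get("trips", [])]


doc_abc = {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}]}
doc_def = {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}]}

abc_changes = collect_trips_changes(doc_abc)
def_changes = collect_trips_changes(doc_def)
print(abc_changes, def_changes)
```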

POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}

POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}

Finally, the query:

GET trips_solutions/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "trips",
            "query": {
              "term": {
                "trips.changes": {
                  "value": 0
                }
              }
            }
          }
        },
        {
          "script": {
            "script": {
              "source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
            }
          }
        }
      ]
    }
  }
}

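Note that we first filter with an ordinary nested term query to narrow the search context (scripts are slow, so this helps). We then check whether the accumulated top-level trips_changes contain any non-zero values and reject the documents that do.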
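The two-step logic can be mimicked in plain Python to check the expected outcome. This is only a model of the query semantics, under the assumption that trips_changes holds every copied trips.changes value. The function names are invented for the sketch.

```python
docs = {
    "abc": {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}]},
    "def": {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}]},
}


def passes_nested_term(doc):
    """Step 1 (cheap prefilter): at least one trip has changes == 0."""
    return any(trip["changes"] == 0 for trip in doc["trips"])


def passes_script(doc):
    """Step 2: mirrors the script clause, i.e. no non-zero copied value."""
    trips_changes = [trip["changes"] for trip in doc["trips"]]  # stand-in for copy_to
    return sum(1 for val in trips_changes if val != 0) == 0


hits = [
    doc_id for doc_id, doc in docs.items()
    if passes_nested_term(doc) and passes_script(doc)
]
print(hits)
```

Only def survives both steps, which matches the expected result in the question. In this model the prefilter also excludes a document with an empty trips list, which the second step alone would let through.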

elasticsearch

kibana

elastic-stack

kibana-7