Remove an object from array in ElasticSearch based on a query

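I want to remove an object from a nested structure in an Elasticsearch document. This is what my documents look like in the index 'submissions'. Based on a condition, I want to remove one object from every matching document.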

{
  "took": 21,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 11,
    "max_score": 1,
    "hits": [
      {
        "_index": "submissions",
        "_type": "_doc",
        "_id": "15_12069",
        "_score": 1,
        "_source": {
          "id": "15_12069",
          "account_id": 2,
          "survey_id": 15,
          "submission_id": 12069,
          "answers": [
            {
              "question_id": 142,     //
              "skipped": false,       //<------ remove object with question_id: 142
              "answer_txt": "product" //
            },
            {
              "question_id": 153,
              "skipped": false,
              "answer_txt": "happy"
            }
          ]
        }
      },
      {
        "_index": "submissions",
        "_type": "_doc",
        "_id": "15_12073",
        "_score": 1,
        "_source": {
          "id": "15_12073",
          "account_id": 2,
          "survey_id": 15,
          "submission_id": 12073,
          "answers": [
            {
              "question_id": 142,       //
              "skipped": false,         //<------ remove object with question_id: 142
              "answer_txt": "coherent"  //
            },
            {
              "question_id": 153,
              "skipped": false,
              "answer_txt": "cool"
            }
          ]
        }
      }
    ]
  }
}

I was trying the Update By Query API (_update_by_query) with a ctx._source.remove script, using this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "match": {
                  "account_id": 2
                }
              },
              {
                "match": {
                  "survey_id": 15
                }
              }
            ]
          }
        },
        {
          "nested": {
            "path": "answers",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "answers.question_id": 142
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

Any insights on this, or is there a better approach?

You can use the Update By Query API with a script, as follows.

Here is a working example with index mapping, index data, and the query.

Index mapping:

{
  "mappings": {
    "properties": {
      "answers": {
        "type": "nested"
      }
    }
  }
}

Index data:

{
  "id": "15_12069",
  "account_id": 2,
  "survey_id": 15,
  "submission_id": 12069,
  "answers": [
    {
      "question_id": 142,
      "skipped": false,
      "answer_txt": "product"
    },
    {
      "question_id": 153,
      "skipped": false,
      "answer_txt": "happy"
    }
  ]
}
{
  "id": "15_12073",
  "account_id": 2,
  "survey_id": 16,
  "submission_id": 12073,
  "answers": [
    {
      "question_id": 142,
      "skipped": false,
      "answer_txt": "coherent"
    },
    {
      "question_id": 153,
      "skipped": false,
      "answer_txt": "cool"
    }
  ]
}

Query:

POST /index/_update_by_query
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "match": {
                  "account_id": 2
                }
              },
              {
                "match": {
                  "survey_id": 15
                }
              }
            ]
          }
        },
        {
          "nested": {
            "path": "answers",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "answers.question_id": 142
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "script": {
    "source": "ctx._source.answers.removeIf(answer -> answer.question_id == params.remove_id);",
    "params": {
      "remove_id": 142
    }
  }
}
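If you need to run the same update for other accounts, surveys, or questions, the request body can be built programmatically. This is only a minimal sketch in Python: the helper name `build_remove_answer_request` is hypothetical, and the redundant inner `bool` wrappers from the request above are flattened into a single `must` list, which should match the same documents.

```python
import json


def build_remove_answer_request(account_id, survey_id, question_id):
    """Build an _update_by_query body that removes one answer object.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"account_id": account_id}},
                    {"match": {"survey_id": survey_id}},
                    {
                        # `answers` must be mapped as type "nested" for this clause.
                        "nested": {
                            "path": "answers",
                            "query": {
                                "match": {"answers.question_id": question_id}
                            },
                        }
                    },
                ]
            }
        },
        "script": {
            # Painless: drop every element of `answers` whose question_id matches.
            "source": (
                "ctx._source.answers.removeIf("
                "answer -> answer.question_id == params.remove_id);"
            ),
            "params": {"remove_id": question_id},
        },
    }


body = build_remove_answer_request(account_id=2, survey_id=15, question_id=142)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of `POST /index/_update_by_query`. Passing the ID through `params` rather than concatenating it into the script source keeps the script text identical across calls.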

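After running the query above, every document that satisfies all of the query conditions, i.e. "account_id": 2 AND "survey_id": 15 AND "answers.question_id": 142, has the object with question_id: 142 removed from its answers array.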

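So, in the first document (as indexed above), the object with "answers.question_id": 142 is removed, and the document now contains the following data (after running the query):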

{
  "_index": "64898361",
  "_type": "_doc",
  "_id": "1",
  "_version": 8,
  "_seq_no": 13,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "survey_id": 15,
    "submission_id": 12069,
    "account_id": 2,
    "answers": [
      {
        "answer_txt": "happy",
        "question_id": 153,
        "skipped": false
      }
    ],
    "id": "15_12069"
  }
}

The second document is unchanged, because it does not satisfy all of the query conditions (its survey_id is 16, not 15).
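To make the behaviour concrete, here is a small local simulation in plain Python. It does not talk to Elasticsearch at all; `matches` and `remove_answer` are hypothetical helpers that only mimic what the query and the `removeIf` script do to the two sample documents above.

```python
import copy


def matches(doc, account_id, survey_id, question_id):
    """Mimic the query: all three conditions must hold for the document."""
    return (
        doc["account_id"] == account_id
        and doc["survey_id"] == survey_id
        and any(a["question_id"] == question_id for a in doc["answers"])
    )


def remove_answer(doc, question_id):
    """Mimic the script: drop answers whose question_id equals question_id."""
    updated = copy.deepcopy(doc)
    updated["answers"] = [
        a for a in updated["answers"] if a["question_id"] != question_id
    ]
    return updated


docs = [
    {
        "id": "15_12069", "account_id": 2, "survey_id": 15,
        "submission_id": 12069,
        "answers": [
            {"question_id": 142, "skipped": False, "answer_txt": "product"},
            {"question_id": 153, "skipped": False, "answer_txt": "happy"},
        ],
    },
    {
        "id": "15_12073", "account_id": 2, "survey_id": 16,
        "submission_id": 12073,
        "answers": [
            {"question_id": 142, "skipped": False, "answer_txt": "coherent"},
            {"question_id": 153, "skipped": False, "answer_txt": "cool"},
        ],
    },
]

# Only documents selected by the query are passed through the script.
result = [
    remove_answer(d, 142) if matches(d, 2, 15, 142) else d
    for d in docs
]

for d in result:
    print(d["id"], [a["question_id"] for a in d["answers"]])
# 15_12069 [153]
# 15_12073 [142, 153]
```

The second document keeps its question_id 142 answer because the script only runs on documents the query selects.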