Elasticsearch 5.4 nested search: return only the nested data that matches

I'm running Elasticsearch 5.4.

Given this mapping:

PUT pizzas
{
  "mappings": {
    "pizza": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "types": {
          "type": "nested",
          "properties": {
            "topping": {
              "type": "keyword"
            },
            "base": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

And this data:

PUT pizzas/pizza/1
{
  "name": "meat",
  "types": [
    {
      "topping": "bacon",
      "base": "normal"
    },
    {
      "topping": "bacon",
      "base": "sour dough"
    },
    {
      "topping": "pepperoni",
      "base": "sour dough"
    }
  ]
}

If I run this query:

GET pizzas/_search
{
  "query": {
    "nested": {
      "path": "types",
      "query": {
        "bool": {
          "filter": {
            "term": {
              "types.topping": "bacon"
            }
          }
        }
      }
    }
  }
}

I get:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": [
      {
        "_index": "pizzas",
        "_type": "pizza",
        "_id": "1",
        "_score": 0,
        "_source": {
          "name": "meat",
          "types": [
            {
              "topping": "bacon",
              "base": "normal"
            },
            {
              "topping": "bacon",
              "base": "sour dough"
            },
            {
              "topping": "pepperoni",
              "base": "sour dough"
            }
          ]
        }
      }
    ]
  }
}

But what I really want is:

"hits": [
  {
    "_index": "pizzas",
    "_type": "pizza",
    "_id": "1",
    "_score": 0,
    "_source": {
      "name": "meat",
      "types": [
        {
          "topping": "bacon",
          "base": "normal"
        }
      ]
    }
  },
  {
    "_index": "pizzas",
    "_type": "pizza",
    "_id": "1",
    "_score": 0,
    "_source": {
      "name": "meat",
      "types": [
        {
          "topping": "bacon",
          "base": "sour dough"
        }
      ]
    }
  }
]

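I want this so that when a user searches for "bacon", they get a list of pizza options containing that topping to choose from.

Does Elasticsearch even support this? I could split the results programmatically, but I was hoping it was built in.

Thanks for your time.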

One possible way to solve this is to use a parent-child relationship (the _parent mapping) and split the pizzas from their types:

PUT pizzas
{
  "mappings": {
    "pizza": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "rating": {
          "type": "integer"
        }
      }
    },
    "type": {
      "_parent": {
        "type": "pizza"
      },
      "properties": {
        "topping": {
          "type": "keyword"
        },
        "base": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT pizzas/pizza/1
{
  "name": "meat",
  "rating": 5
}

PUT pizzas/type/1?parent=1
{
  "topping": "bacon",
  "base": "normal"
}

PUT pizzas/type/2?parent=1
{
  "topping": "bacon",
  "base": "sour dough"
}

PUT pizzas/type/3?parent=1
{
  "topping": "pepperoni",
  "base": "sour dough"
}

You can then search only the children and still see which parent each one belongs to.

Query:

GET pizzas/type/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "topping": "bacon"
        }
      }
    }
  }
}

Result:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
      {
        "_index": "pizzas",
        "_type": "type",
        "_id": "1",
        "_score": 0,
        "_routing": "1",
        "_parent": "1",
        "_source": {
          "topping": "bacon",
          "base": "normal"
        }
      },
      {
        "_index": "pizzas",
        "_type": "type",
        "_id": "2",
        "_score": 0,
        "_routing": "1",
        "_parent": "1",
        "_source": {
          "topping": "bacon",
          "base": "sour dough"
        }
      }
    ]
  }
}

In your code, you can then combine the data to rebuild the original structure you need.
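As a rough illustration of that combining step, here is a minimal Python sketch. It assumes the search response has already been parsed into a dict (for example by whatever client you use), and the function name group_children_by_parent is just something made up for this example. It groups the child hits by the "_parent" value that appears in the result above.

```python
from collections import defaultdict


def group_children_by_parent(response):
    """Group the _source of each child hit under its parent id."""
    grouped = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # Each child hit in the result above carries its parent id in "_parent".
        grouped[hit["_parent"]].append(hit["_source"])
    return dict(grouped)


# Trimmed copy of the child search result shown above.
response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_type": "type", "_id": "1", "_parent": "1",
             "_source": {"topping": "bacon", "base": "normal"}},
            {"_type": "type", "_id": "2", "_parent": "1",
             "_source": {"topping": "bacon", "base": "sour dough"}},
        ],
    }
}

grouped = group_children_by_parent(response)
for parent_id, types in grouped.items():
    print(parent_id, types)
```

The parent documents themselves (name, rating) would still have to be fetched separately if you need them.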

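Caveats

Changing the structure like this has a few problems:

One: ordinary sorting cannot be applied to child fields, which matters if you need to sort the parents by values in their children.

Two: if you also need to filter on other fields, you end up having to run a query such as: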

GET pizzas/pizza/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "rating": 5
        }
      },
      "must": {
        "has_child": {
          "type": "type",
          "query": {
            "bool": {
              "filter": {
                "term": {
                  "topping": "bacon"
                }
              }
            }
          }
        }
      }
    }
  }
}

followed by another query for the specific children, which then have to be re-attached to their parents.
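The re-attaching part of that two-query flow could look something like the Python sketch below. It assumes both responses are already parsed into dicts; attach_children and the sample data are hypothetical and only meant to show the join on "_id" and "_parent".

```python
def attach_children(parent_response, child_response, field="types"):
    """Return one dict per parent hit, with its matching children under `field`."""
    parents = {}
    for hit in parent_response["hits"]["hits"]:
        doc = dict(hit["_source"])  # shallow copy so the response is not mutated
        doc[field] = []
        parents[hit["_id"]] = doc
    for hit in child_response["hits"]["hits"]:
        parent = parents.get(hit["_parent"])
        if parent is not None:  # ignore children whose parent was filtered out
            parent[field].append(hit["_source"])
    return list(parents.values())


# Made-up sample data in the shape of the responses above.
parent_response = {
    "hits": {"hits": [{"_id": "1", "_source": {"name": "meat", "rating": 5}}]}
}
child_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_parent": "1",
             "_source": {"topping": "bacon", "base": "normal"}},
            {"_id": "2", "_parent": "1",
             "_source": {"topping": "bacon", "base": "sour dough"}},
            # Child of a pizza that did not pass the rating filter.
            {"_id": "9", "_parent": "2",
             "_source": {"topping": "bacon", "base": "thin"}},
        ]
    }
}

pizzas = attach_children(parent_response, child_response)
print(pizzas)
```

Children whose parent is not in the first result set are simply dropped, which is what you want when the parent query carries extra filters such as the rating.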

You can simply use "inner_hits" in the nested search to get back only the nested objects that actually matched:

Query:

GET pizzas/_search
{
  "query": {
    "nested": {
      "path": "types",
      "query": {
        "bool": {
          "filter": {
            "term": {
              "types.topping": "bacon"
            }
          }
        }
      },
      "inner_hits": {
        "size": 10
      }
    }
  }
}

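Note that "inner_hits" returns 3 results by default unless told to return a different number. The available options are listed in the Elasticsearch inner hits documentation.

There does not appear to be a way to leave size unlimited, so you just need to set it higher than the maximum number of inner_hits you will ever have.

Result: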

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": [
      {
        "_index": "pizzas",
        "_type": "pizza",
        "_id": "1",
        "_score": 0,
        "_source": {
          "name": "meat",
          "types": [
            {
              "topping": "bacon",
              "base": "normal"
            },
            {
              "topping": "bacon",
              "base": "sour dough"
            },
            {
              "topping": "pepperoni",
              "base": "sour dough"
            }
          ]
        },
        "inner_hits": {
          "types": {
            "hits": {
              "total": 2,
              "max_score": 0,
              "hits": [
                {
                  "_nested": {
                    "field": "types",
                    "offset": 1
                  },
                  "_score": 0,
                  "_source": {
                    "topping": "bacon",
                    "base": "sour dough"
                  }
                },
                {
                  "_nested": {
                    "field": "types",
                    "offset": 0
                  },
                  "_score": 0,
                  "_source": {
                    "topping": "bacon",
                    "base": "normal"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

In your code, you can then combine the hits and the inner_hits so that the only types returned are the relevant ones.
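For what it's worth, here is a small Python sketch of that last step. It assumes the response is already a parsed dict; split_by_inner_hits is a hypothetical helper, not anything Elasticsearch provides. It emits one hit per matching nested object, which is roughly the shape asked for in the question.

```python
def split_by_inner_hits(response, path="types"):
    """Return one hit per matching nested object, keeping the parent's other fields."""
    results = []
    for hit in response["hits"]["hits"]:
        inner = hit["inner_hits"][path]["hits"]["hits"]
        # The inner hits above are not in array order, so sort by nested offset.
        for nested in sorted(inner, key=lambda h: h["_nested"]["offset"]):
            # Copy the hit without its inner_hits section.
            flat = {k: v for k, v in hit.items() if k != "inner_hits"}
            # Replace the full nested array with the single matching object.
            flat["_source"] = dict(hit["_source"], **{path: [nested["_source"]]})
            results.append(flat)
    return results


# Trimmed copy of the result shown above.
response = {
    "hits": {
        "hits": [
            {
                "_index": "pizzas", "_type": "pizza", "_id": "1", "_score": 0,
                "_source": {
                    "name": "meat",
                    "types": [
                        {"topping": "bacon", "base": "normal"},
                        {"topping": "bacon", "base": "sour dough"},
                        {"topping": "pepperoni", "base": "sour dough"},
                    ],
                },
                "inner_hits": {"types": {"hits": {"total": 2, "hits": [
                    {"_nested": {"field": "types", "offset": 1},
                     "_source": {"topping": "bacon", "base": "sour dough"}},
                    {"_nested": {"field": "types", "offset": 0},
                     "_source": {"topping": "bacon", "base": "normal"}},
                ]}}},
            }
        ]
    }
}

flat_hits = split_by_inner_hits(response)
for flat in flat_hits:
    print(flat["_id"], flat["_source"])
```

Sorting by offset is a choice made here to keep the original array order; drop it if you would rather keep the order Elasticsearch returned.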