Elasticsearch - Return a subset of nested results
I'm on Elasticsearch 7.7 and use the official PHP client to interact with the server.
My issue was somewhat solved here: https://discuss.elastic.co/t/need-to-return-part-of-a-doc-from-a-search-query-filter-is-parent-child-the-way-to-go/64514/2
However "Types are deprecated in APIs in 7.0+" https://www.elastic.co/guide/en/elasticsearch/reference/7.x/removal-of-types.html
Here is my document:
{
"offering_id": "1190",
"account_id": "362353",
"service_id": "20087",
"title": "Quick Brown Mammal",
"slug": "Quick Brown Fox",
"summary": "Quick Brown Fox",
"header_thumb_path": "uploads/test/test.png",
"duration": "30",
"alter_ids": [
"59151",
"58796",
"58613",
"54286",
"51812",
"50052",
"48387",
"37927",
"36685",
"36554",
"28807",
"23154",
"22356",
"21480",
"220",
"1201",
"1192"
],
"premium": "f",
"featured": "f",
"events": [
{
"event_id": "9999",
"start_date": "2020-07-01 14:00:00",
"registration_count": "22",
"description": "boo"
},
{
"event_id": "9999",
"start_date": "2020-07-01 14:00:00",
"registration_count": "22",
"description": "xyz"
},
{
"event_id": "9999",
"start_date": "2020-08-11 11:30:00",
"registration_count": "41",
"description": "test"
}
]
}
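Note that an object may have one or many "events".
Searching on event data is the most common use case.
For example:
- Find events that start before 12 pm
- Find events with a description of "xyz"
- List events with a start date in the next 10 days.
I don't want to return any events that do not match the query!
So, for example: find events with a description of "xyz" for a given service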
{
"query": {
"bool": {
"must": {
"match": {
"events.description": "xyz"
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"service_id": 20087
}
}
]
}
}
}
}
}
I'd want the result to look like this:
{
"offering_id": "1190",
"account_id": "362353",
"service_id": "20087",
"title": "Quick Brown Mammal",
"slug": "Quick Brown Fox",
"summary": "Quick Brown Fox",
"header_thumb_path": "uploads/test/test.png",
"duration": "30",
"alter_ids": [
"59151",
"58796",
"58613",
"54286",
"51812",
"50052",
"48387",
"37927",
"36685",
"36554",
"28807",
"23154",
"22356",
"21480",
"220",
"1201",
"1192"
],
"premium": "f",
"featured": "f",
"events": [
{
"event_id": "9999",
"start_date": "2020-07-01 14:00:00",
"registration_count": "22",
"description": "xyz"
}
]
}
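However, it just returns the ENTIRE document, with all events.
Is it even possible to return only a subset of the data? Maybe with aggregations?
- Right now, we run an "extra" round of filtering on the result set in the application (PHP in this case) to strip out the event blocks that don't match the desired results.
- It would be nicer to have Elasticsearch return exactly what's needed, instead of post-processing the result to pull out the applicable events.
- I thought about restructuring the data to be "event"-based, but then I would be duplicating data, since every event would also carry the parent data.
This used to live in SQL, where there was a relation instead of nested data like this.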
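For reference, the "extra" filtering mentioned above amounts to something like the following. This is only an illustrative sketch in Python rather than the actual PHP; the helper name `keep_matching_events` and the trimmed-down sample document are made up for the example.

```python
def keep_matching_events(doc, predicate):
    """Return a copy of doc whose "events" list holds only the events accepted by predicate."""
    filtered = dict(doc)  # shallow copy; the parent fields are reused as-is
    filtered["events"] = [e for e in doc.get("events", []) if predicate(e)]
    return filtered


# Shortened version of the document shown above (most parent fields omitted).
doc = {
    "offering_id": "1190",
    "service_id": "20087",
    "events": [
        {"event_id": "9999", "start_date": "2020-07-01 14:00:00", "description": "boo"},
        {"event_id": "9999", "start_date": "2020-07-01 14:00:00", "description": "xyz"},
        {"event_id": "9999", "start_date": "2020-08-11 11:30:00", "description": "test"},
    ],
}

trimmed = keep_matching_events(doc, lambda e: e["description"] == "xyz")
print(trimmed["events"])
```

This is the step I would like Elasticsearch to do for me.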
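A subset of the nested data can be returned by combining a nested aggregation with a filter aggregation.
To learn more about these aggregations, see the official Elasticsearch documentation on the nested aggregation and the filter aggregation.
Index mapping: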
{
"mappings": {
"properties": {
"offering_id": {
"type": "integer"
},
"account_id": {
"type": "integer"
},
"service_id": {
"type": "integer"
},
"title": {
"type": "text"
},
"slug": {
"type": "text"
},
"summary": {
"type": "text"
},
"header_thumb_path": {
"type": "keyword"
},
"duration": {
"type": "integer"
},
"alter_ids": {
"type": "integer"
},
"premium": {
"type": "text"
},
"featured": {
"type": "text"
},
"events": {
"type": "nested",
"properties": {
"event_id": {
"type": "integer"
},
"registration_count": {
"type": "integer"
},
"description": {
"type": "text"
}
}
}
}
}
}
Search query:
{
"size": 0,
"aggs": {
"nested": {
"nested": {
"path": "events"
},
"aggs": {
"filter": {
"filter": {
"match": { "events.description": "xyz" }
},
"aggs": {
"total": {
"top_hits": {
"size": 10
}
}
}
}
}
}
}
}
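Note that the query above aggregates over every document in the index. To limit it to a given service, as the question asks, a top-level `query` can be added next to `aggs`. The sketch below builds such a request body as a plain Python dict; the function name `build_nested_agg_body` and its parameters are hypothetical, and the resulting dict would be passed as the search body to whichever client you use.

```python
import json


def build_nested_agg_body(description, service_id=None, size=10):
    """Build the nested + filter + top_hits aggregation body shown above.

    If service_id is given, a top-level term filter restricts the documents
    that the aggregation runs over (an addition, not part of the query above).
    """
    body = {
        "size": 0,  # no parent documents in "hits"; only the aggregation result
        "aggs": {
            "nested": {
                "nested": {"path": "events"},
                "aggs": {
                    "filter": {
                        "filter": {"match": {"events.description": description}},
                        "aggs": {"total": {"top_hits": {"size": size}}},
                    }
                },
            }
        },
    }
    if service_id is not None:
        body["query"] = {"bool": {"filter": [{"term": {"service_id": service_id}}]}}
    return body


body = build_nested_agg_body("xyz", service_id=20087)
print(json.dumps(body, indent=2))
```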
Search result:
"hits": [
{
"_index": "foo21",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "events",
"offset": 1
},
"_score": 1.0,
"_source": {
"event_id": "9999",
"start_date": "2020-07-01 14:00:00",
"registration_count": "22",
"description": "xyz"
}
}
]
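The hits above sit under the aggregation names used in the query (`nested`, then `filter`, then `total`). A minimal sketch of pulling the matching events out of such a response follows; the helper `events_from_agg` is hypothetical, and the sample response is hand-written to mirror the result above rather than captured from a server.

```python
def events_from_agg(response):
    """Collect the matching nested events from the nested/filter/top_hits aggregation."""
    hits = response["aggregations"]["nested"]["filter"]["total"]["hits"]["hits"]
    return [
        # Keep the parent document id and the event's position inside "events".
        {"parent_id": h["_id"], "offset": h["_nested"]["offset"], **h["_source"]}
        for h in hits
    ]


# Hand-written sample shaped like the search result above (assumed, not real output).
sample_response = {
    "aggregations": {
        "nested": {
            "doc_count": 3,
            "filter": {
                "doc_count": 1,
                "total": {
                    "hits": {
                        "hits": [
                            {
                                "_index": "foo21",
                                "_id": "1",
                                "_nested": {"field": "events", "offset": 1},
                                "_score": 1.0,
                                "_source": {
                                    "event_id": "9999",
                                    "start_date": "2020-07-01 14:00:00",
                                    "registration_count": "22",
                                    "description": "xyz",
                                },
                            }
                        ]
                    }
                },
            },
        }
    }
}

events = events_from_agg(sample_response)
print(events)
```

This approach returns only the events, so the parent fields (title, slug and so on) would have to be fetched separately using `parent_id`.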
Second approach, using a nested query with inner_hits:
{
"query": {
"bool": {
"must": [
{
"match": {
"service_id": "20087"
}
},
{
"nested": {
"path": "events",
"query": {
"bool": {
"must": [
{
"match": {
"events.description": "xyz"
}
}
]
}
},
"inner_hits": {
}
}
}
]
}
}
}
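With `inner_hits`, each parent hit still carries the full `events` array in `_source`, and the matching events come back separately under `inner_hits.events`. To get the shape asked for in the question, the two can be combined on the client. The sketch below assumes the default inner hits name (the nested path, `events`); the helper `merge_inner_hits` is hypothetical and the sample response is hand-written and shortened.

```python
def merge_inner_hits(response, path="events"):
    """Return parent documents whose nested list holds only the inner hits."""
    docs = []
    for hit in response["hits"]["hits"]:
        doc = dict(hit["_source"])  # copy, so the response itself is left untouched
        inner = hit.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])
        doc[path] = [ih["_source"] for ih in inner]
        docs.append(doc)
    return docs


# Hand-written, shortened sample of a response to the query above (assumed shape).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {
                    "offering_id": "1190",
                    "service_id": "20087",
                    "title": "Quick Brown Mammal",
                    "events": [
                        {"event_id": "9999", "description": "boo"},
                        {"event_id": "9999", "description": "xyz"},
                        {"event_id": "9999", "description": "test"},
                    ],
                },
                "inner_hits": {
                    "events": {
                        "hits": {
                            "hits": [
                                {
                                    "_nested": {"field": "events", "offset": 1},
                                    "_source": {"event_id": "9999", "description": "xyz"},
                                }
                            ]
                        }
                    }
                },
            }
        ]
    }
}

merged = merge_inner_hits(sample_response)
print(merged[0]["events"])
```

Two things to keep in mind: by default `inner_hits` returns only the top three matching nested objects per parent, so set `"size"` inside `inner_hits` if an offering can have more matching events; and adding `"_source": {"excludes": ["events"]}` to the request should avoid shipping the full, unfiltered `events` array back at all.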
You can also go through this SO answer: