Elasticsearch: Top k results per keyword
We have the following document type in Elasticsearch.
from elasticsearch_dsl import DocType, Keyword, Text

class Query(DocType):
    text = Text(analyzer='snowball', fields={'raw': Keyword()})
    src = Keyword()
Now we want the top k results for each src. How can we do that?
Example: suppose we index the following:
# src: place_order
Query(text="I want to order food", src="place_order")
Query(text="Take my order", src="place_order")
...
# src: payment
Query(text="How to pay ?", src="payment")
Query(text="Do you accept credit card ?", src="payment")
...
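Now, if a user enters the query "take my order please along with the credit card details" with k = 1, we should return the following two results:
[{"text": "Take my order", "src": "place_order"},
 {"text": "Do you accept credit card ?", "src": "payment"}]
Here, because k = 1, we return only a single result for each src.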
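You can try the top hits aggregation, which returns the top N matching documents for each bucket of an aggregation.
For the example in your post, the query could look like this: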
POST queries/query/_search
{
  "query": {
    "match": {
      "text": "take my order please along with the credit card details"
    }
  },
  "aggs": {
    "src types": {
      "terms": {
        "field": "src"
      },
      "aggs": {
        "best hit": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}
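The match query on text restricts the set of documents that the aggregation sees. The "src types" aggregation groups the matching documents by their src value, and "best hit" selects the single most relevant document in each bucket (change the size parameter to get the top k instead).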
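If you are issuing the request from Python, the same body can be built as a plain dict with k as a parameter. This is only a minimal sketch: the function name build_top_k_per_src_query is made up for illustration, and the aggregation names are simply the ones used in the query above.

```python
def build_top_k_per_src_query(user_text, k=1):
    """Build a search body: match on `text`, bucket by `src`,
    and keep the k highest-scoring hits inside each bucket.

    Hypothetical helper, not part of any library.
    """
    return {
        "query": {"match": {"text": user_text}},
        "aggs": {
            # One bucket per distinct `src` value among the matching documents.
            "src types": {
                "terms": {"field": "src"},
                "aggs": {
                    # Within each bucket, keep the k best hits by relevance score.
                    "best hit": {"top_hits": {"size": k}}
                },
            }
        },
    }


body = build_top_k_per_src_query(
    "take my order please along with the credit card details", k=1
)
```

The resulting dict is the JSON shown above, so it should be usable as the request body with whichever client you use. How exactly it is passed depends on the client version, so treat that part as an assumption to check against your setup.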
The result of the query looks like this:
{
  "hits": {
    "total": 3,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "queries",
        "_type": "query",
        "_id": "VD7QVmABl04oXt2HGbGB",
        "_score": 1.3862944,
        "_source": {
          "text": "Do you accept credit card ?",
          "src": "payment"
        }
      },
      {
        "_index": "queries",
        "_type": "query",
        "_id": "Uj7PVmABl04oXt2HlLFI",
        "_score": 0.8630463,
        "_source": {
          "text": "Take my order",
          "src": "place_order"
        }
      },
      {
        "_index": "queries",
        "_type": "query",
        "_id": "UT7PVmABl04oXt2HKLFy",
        "_score": 0.6931472,
        "_source": {
          "text": "I want to order food",
          "src": "place_order"
        }
      }
    ]
  },
  "aggregations": {
    "src types": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "place_order",
          "doc_count": 2,
          "best hit": {
            "hits": {
              "total": 2,
              "max_score": 0.8630463,
              "hits": [
                {
                  "_index": "queries",
                  "_type": "query",
                  "_id": "Uj7PVmABl04oXt2HlLFI",
                  "_score": 0.8630463,
                  "_source": {
                    "text": "Take my order",
                    "src": "place_order"
                  }
                }
              ]
            }
          }
        },
        {
          "key": "payment",
          "doc_count": 1,
          "best hit": {
            "hits": {
              "total": 1,
              "max_score": 1.3862944,
              "hits": [
                {
                  "_index": "queries",
                  "_type": "query",
                  "_id": "VD7QVmABl04oXt2HGbGB",
                  "_score": 1.3862944,
                  "_source": {
                    "text": "Do you accept credit card ?",
                    "src": "payment"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
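To get from this response to the flat list asked for in the question, you can walk the buckets and collect each top hit's _source. The sketch below assumes the response is available as a dict (or something that indexes like one); the helper name top_hits_per_src is hypothetical, and sample_response is a trimmed copy of the aggregations part of the result above.

```python
def top_hits_per_src(response, terms_agg="src types", hits_agg="best hit"):
    """Flatten a terms + top_hits aggregation into a list of _source dicts.

    Hypothetical helper; the default aggregation names match the query above.
    """
    results = []
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        # Each bucket holds up to k hits, already sorted by score.
        for hit in bucket[hits_agg]["hits"]["hits"]:
            results.append(hit["_source"])
    return results


# Trimmed copy of the response shown above (only the fields the helper reads).
sample_response = {
    "aggregations": {
        "src types": {
            "buckets": [
                {
                    "key": "place_order",
                    "doc_count": 2,
                    "best hit": {"hits": {"hits": [
                        {"_score": 0.8630463,
                         "_source": {"text": "Take my order", "src": "place_order"}}
                    ]}},
                },
                {
                    "key": "payment",
                    "doc_count": 1,
                    "best hit": {"hits": {"hits": [
                        {"_score": 1.3862944,
                         "_source": {"text": "Do you accept credit card ?", "src": "payment"}}
                    ]}},
                },
            ]
        }
    }
}

flat = top_hits_per_src(sample_response)
print(flat)
# [{'text': 'Take my order', 'src': 'place_order'},
#  {'text': 'Do you accept credit card ?', 'src': 'payment'}]
```

Note that the buckets come back ordered by doc_count (the terms aggregation default), not by the score of their best hit, so re-sort the list yourself if you need it in relevance order.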
Hope that helps!