Elasticsearch 搜索所有文档,按相关性分数排序
Elasticsearch search all the documents, in order of relavancy score
我在新闻标题文档列表中有一个复杂的匹配查询,
{
"bool" : {
"should" : [
{
"multi_match" : {
"query" : " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
"fields" : [
"article.description^1.0",
"article.title^1.0"
],
"type" : "best_fields",
"operator" : "OR",
"slop" : 0,
"prefix_length" : 0,
"max_expansions" : 50,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"fuzzy_transpositions" : true,
"boost" : 3.0
}
},
{
"match" : {
"article.author" : {
"query" : " PTI",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
},
{
"match" : {
"article.source.name" : {
"query" : " The Hindu",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
问题是 elasticsearch return 只有相关文档,我想要尽可能多的文档,按照相关分数的降序排列。 return 所有文档都可以,但应该按此顺序排序。我找不到更好的方法 elasticsearch returns 在新闻存储库中的 5-10 个文档上,而我有 1000 篇文章。
默认情况下 elasticsearch return 只有 10 个文档。如果要return超过10个文件,需要设置size
参数。
修改后的查询将是
{
"size": 1000, // note this
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
"fields": [
"article.description^1.0",
"article.title^1.0"
],
"type": "best_fields",
"operator": "OR",
"slop": 0,
"prefix_length": 0,
"max_expansions": 50,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 3.0
}
},
{
"match": {
"article.author": {
"query": " PTI",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
},
{
"match": {
"article.source.name": {
"query": " The Hindu",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
}
}
我在新闻标题文档列表中有一个复杂的匹配查询,
{
"bool" : {
"should" : [
{
"multi_match" : {
"query" : " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
"fields" : [
"article.description^1.0",
"article.title^1.0"
],
"type" : "best_fields",
"operator" : "OR",
"slop" : 0,
"prefix_length" : 0,
"max_expansions" : 50,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"fuzzy_transpositions" : true,
"boost" : 3.0
}
},
{
"match" : {
"article.author" : {
"query" : " PTI",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
},
{
"match" : {
"article.source.name" : {
"query" : " The Hindu",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
问题是 elasticsearch return 只有相关文档,我想要尽可能多的文档,按照相关分数的降序排列。 return 所有文档都可以,但应该按此顺序排序。我找不到更好的方法 elasticsearch returns 在新闻存储库中的 5-10 个文档上,而我有 1000 篇文章。
默认情况下 elasticsearch return 只有 10 个文档。如果要return超过10个文件,需要设置size
参数。
修改后的查询将是
{
"size": 1000, // note this
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
"fields": [
"article.description^1.0",
"article.title^1.0"
],
"type": "best_fields",
"operator": "OR",
"slop": 0,
"prefix_length": 0,
"max_expansions": 50,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 3.0
}
},
{
"match": {
"article.author": {
"query": " PTI",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
},
{
"match": {
"article.source.name": {
"query": " The Hindu",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
}
}