Elasticsearch autocomplete sort by length
I want to build autocomplete with Elasticsearch.
I have tried:
- naive prefix matching,
- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-phrase.html
- http://davewalk.net/2015/04/13/address-autocomplete-in-go-and-elasticsearch-part-1.html
but none of them behaves the way I expect.
Suppose I have this data:
PHP Programing
php prado framework
OOP PHP Programming
PHPMyAdmin
PHP
Php
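Whenever I query for PHP, the results come back in the order listed above.
How can I make PHP show up first instead of last?
Why does PHP Programming get a higher weight than PHP, which is equal to the query?
Note: I have already added a lowercase filter, so the query is treated as case-insensitive; that is why php, Php, and PHP all match the query.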
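I don't know exactly what you are doing, so more information would help.
That said, I got the following result. It is not a suggester example, but it shows how to sort by score: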
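import java.util.Arrays;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.Test;

// insert(...) and getClient() are test helpers that are not shown here.
@Test
public void es() throws Exception {
    insert("value", "foo foo");
    insert("value", "foo");
    insert("value", "fooa");
    insert("value", "fao");
    insert("value", "foo potato foo bar");
    insert("value", "foo potato bar");
    insert("value", "foo potato");
    insert("value", "foo vegetable");
    insert("value", "foo vegetable");

    // Crude wait so the index can refresh and the documents become searchable.
    Thread.sleep(1000);

    // Phrase query on "value", sorted by relevance score, highest first.
    SearchResponse searchResponse =
        getClient().prepareSearch()
            .setQuery(QueryBuilders.matchPhraseQuery("value", "foo"))
            .addSort(SortBuilders.scoreSort().order(SortOrder.DESC))
            .execute().actionGet();

    // Print each hit's value together with its score.
    Arrays.stream(searchResponse.getHits().getHits())
        .forEach(h -> System.out.println(h.getSource().get("value") + ": " + h.getScore()));
}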
Output:
foo: 1.1177831
foo foo: 0.98798996
foo potato foo bar: 0.790392
foo potato: 0.6986144
foo vegetable: 0.6986144
foo vegetable: 0.6986144
foo potato bar: 0.55889153
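As a side note on why the shorter values come first: the printed scores look consistent with classic TF-IDF scoring, where a single-term hit scores roughly sqrt(term frequency) × length norm, and the length norm shrinks as the field gets longer. The sketch below is plain Java arithmetic, not Elasticsearch code. The class name ScoreRatios and the norm lookup are hypothetical; the norm values (1.0, 0.625, 0.5) are inferred from the output above, on the assumption that norms are stored with low precision, and were not read from any API.

```java
class ScoreRatios {

    // Length norm per field length (in terms), inferred from the printed scores.
    // Unrounded 1/sqrt(length) would be 1.0, 0.707, 0.577, 0.5; the values
    // implied by the output are coarser. This lookup is an assumption.
    static double norm(int fieldLength) {
        switch (fieldLength) {
            case 1: return 1.0;
            case 2: return 0.625;
            case 3: return 0.5;
            case 4: return 0.5;
            default: throw new IllegalArgumentException("length not in the sample data");
        }
    }

    // Score of a hit relative to the one-term document "foo".
    static double relative(int termFrequency, int fieldLength) {
        return Math.sqrt(termFrequency) * norm(fieldLength);
    }

    public static void main(String[] args) {
        double base = 1.1177831; // printed score of "foo"
        System.out.printf("foo foo            %.5f%n", base * relative(2, 2));
        System.out.printf("foo potato foo bar %.5f%n", base * relative(2, 4));
        System.out.printf("foo potato         %.5f%n", base * relative(1, 2));
        System.out.printf("foo potato bar     %.5f%n", base * relative(1, 3));
    }
}
```

If that reading is right, ordering by score already favors short values, but only as a side effect of the similarity model, so it is not something to rely on for putting exact matches first.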
To get the behavior you want, use edge n-grams (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html) and search on a field that uses an edge n-gram analyzer. To rank exact matches above all other prefix matches, maintain an additional field that is not analyzed and use it in a should clause to boost its relevance (https://www.elastic.co/guide/en/elasticsearch/guide/current/query-scoring.html).
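A minimal sketch of what that could look like, written as request bodies held in Java strings (text blocks, so Java 15 or later). Everything named here is an assumption for illustration: the field title, its sub-field title.exact, the analyzer and normalizer names, the gram sizes, and the boost of 10. The mapping syntax assumes Elasticsearch 7.x or later, where the non-analyzed field is a keyword field; the lowercase normalizer keeps PHP, Php and php equal on that field.

```java
import java.util.Locale;

class AutocompleteRequests {

    // Body for creating the index: "title" is indexed as edge n-grams for
    // prefix matching, "title.exact" keeps the whole value as one lowercased term.
    static final String INDEX_BODY = """
        {
          "settings": {
            "analysis": {
              "tokenizer": {
                "autocomplete_tokenizer": {
                  "type": "edge_ngram",
                  "min_gram": 1,
                  "max_gram": 20,
                  "token_chars": ["letter", "digit"]
                }
              },
              "analyzer": {
                "autocomplete": {
                  "type": "custom",
                  "tokenizer": "autocomplete_tokenizer",
                  "filter": ["lowercase"]
                },
                "autocomplete_search": {
                  "type": "custom",
                  "tokenizer": "standard",
                  "filter": ["lowercase"]
                }
              },
              "normalizer": {
                "lowercase_normalizer": { "type": "custom", "filter": ["lowercase"] }
              }
            }
          },
          "mappings": {
            "properties": {
              "title": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "autocomplete_search",
                "fields": {
                  "exact": { "type": "keyword", "normalizer": "lowercase_normalizer" }
                }
              }
            }
          }
        }
        """;

    // Search body: the must clause does the prefix matching on the edge n-gram
    // field; the should clause adds score only when the whole value equals the input.
    static String searchBody(String userInput) {
        // Minimal escaping for this sketch; real code should use a JSON library.
        String text = userInput.replace("\\", "\\\\").replace("\"", "\\\"");
        String exact = text.toLowerCase(Locale.ROOT);
        return """
            {
              "query": {
                "bool": {
                  "must":   { "match": { "title": "%s" } },
                  "should": { "term": { "title.exact": { "value": "%s", "boost": 10 } } }
                }
              }
            }
            """.formatted(text, exact);
    }

    public static void main(String[] args) {
        System.out.println(INDEX_BODY);
        System.out.println(searchBody("PHP"));
    }
}
```

With the sample data from the question, every row should still match through the n-gram field, while only PHP and Php also match the should clause, which is what is meant to lift them to the top. The boost value would need tuning against real data.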