Elasticsearch full string match not working
I am using the elastic-builder npm package
Using esb.termQuery(Email, "test")
Mapping:
"CompanyName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
Database fields:
"Email": "test@mycompany.com",
"CompanyName": "my company"
Query JSON: { term: { CompanyName: 'my' } }
or { term: { Email: 'test' } }
Result:
"Email": "test@mycompany.com",
"CompanyName": "my company"
Expected:
No results. I need the whole string to match; the match here should behave like a 'like' or queryStringQuery.
I have 3 filters: prefix, exact match, and contains.
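For reference, the query bodies above are generated with elastic-builder roughly like this (a minimal sketch; only the field names come from the post, the surrounding request-building code is assumed):
const esb = require('elastic-builder');
// Current approach: term queries built with elastic-builder;
// .toJSON() produces the query bodies shown above.
const companyQuery = esb.termQuery('CompanyName', 'my').toJSON();  // { term: { CompanyName: 'my' } }
const emailQuery = esb.termQuery('Email', 'test').toJSON();        // { term: { Email: 'test' } }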
The standard analyzer is the default analyzer which is used if none is specified. It provides grammar based tokenization.
In your example, you probably have not specified any analyzer explicitly in the index mapping, so the text fields are analyzed with the default, which is the standard analyzer.
Refer to this for a detailed explanation.
If no analyzer is defined, the following tokens are generated.
POST /_analyze
{
  "analyzer": "standard",
  "text": "test@mycompany.com"
}
The tokens are:
{
  "tokens": [
    {
      "token": "test",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "mycompany.com",
      "start_offset": 5,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
If you want full-text search, you can define a custom analyzer with a lowercase token filter; the lowercase filter ensures that all letters are changed to lowercase before documents are indexed and before search.
The normalizer property of keyword fields is similar to analyzer except that it guarantees that the analysis chain produces a single token.
The uax_url_email tokenizer is like the standard tokenizer except that it recognises URLs and email addresses as single tokens.
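To verify how the uax_url_email tokenizer treats the email before reindexing, you can call the analyze API with it. A minimal sketch, assuming the official @elastic/elasticsearch Node client (7.x-style response object) and a local node; none of this code is from the original post:
const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });
// Tokenize the email with uax_url_email: it should come back as a single
// <EMAIL> token instead of being split into "test" and "mycompany.com".
async function checkTokenizer() {
  const { body } = await client.indices.analyze({
    body: { tokenizer: 'uax_url_email', text: 'test@mycompany.com' }
  });
  console.log(body.tokens);
}
checkTokenizer().catch(console.error);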
Index Mapping:
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "uax_url_email"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "CompanyName": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      },
      "Email": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
Index Data:
{
  "Email": "test@mycompany.com",
  "CompanyName": "my company"
}
Search Query:
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "CompanyName": "My Company"
          }
        },
        {
          "match": {
            "Email": "test"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Search Result:
"hits": [
{
"_index": "stof_64220291",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"Email": "test@mycompany.com",
"CompanyName": "my company"
}
}
]
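Translated back to elastic-builder (the library used in the question), the same bool/should query, plus one way to express the three required filters (prefix, exact match, contains) against the keyword-typed CompanyName field of the new mapping, could look roughly like this. This is a sketch, not part of the original answer; note that wildcard-based "contains" queries can be expensive on large indices:
const esb = require('elastic-builder');
// Equivalent of the bool/should search query shown above.
const searchBody = esb.requestBodySearch()
  .query(
    esb.boolQuery()
      .should([
        esb.matchQuery('CompanyName', 'My Company'),
        esb.matchQuery('Email', 'test')
      ])
      .minimumShouldMatch(1)
  )
  .toJSON();
// One way to build the three filters from the question against the
// CompanyName keyword field (my_normalizer lowercases indexed values,
// so lowercase the inputs to stay on the safe side):
const prefixFilter = esb.prefixQuery('CompanyName', 'my co');          // prefix
const exactFilter = esb.termQuery('CompanyName', 'my company');        // exact match
const containsFilter = esb.wildcardQuery('CompanyName', '*company*');  // contains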