Elasticsearch query and Kibana not working as expected
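I am trying to learn Elasticsearch and am using Kibana to visualize things. However, I can't seem to figure out what is wrong with my mapping and queries.
I am trying to store photo metadata (IPTC data). I have the following mapping: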
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {},
        "analyzer": {
          "keyword_analyzer": {
            "filter": ["lowercase", "asciifolding", "trim"],
            "char_filter": [],
            "type": "custom",
            "tokenizer": "keyword"
          },
          "edge_ngram_analyzer": {
            "filter": ["lowercase"],
            "tokenizer": "edge_ngram_tokenizer"
          },
          "edge_ngram_search_analyzer": {
            "tokenizer": "lowercase"
          }
        },
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": ["letter"]
          }
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "photo_added": {
          "type": "date",
          "index": true,
          "format": "yyyy-MM-dd' 'H:m:s"
        },
        "photo_id": {
          "type": "long",
          "index": true
        },
        "photo_owner": {
          "type": "long",
          "index": true
        },
        "project": {
          "type": "long",
          "index": true
        },
        "iptc": {
          "type": "nested",
          "properties": {
            "caption/abstract": {
              "type": "text",
              "index": true
            },
            "copyright notice": {
              "type": "text",
              "index": true
            },
            "keywords": {
              "type": "text",
              "index": true,
              "fields": {
                "keywordstring": {
                  "type": "text",
                  "analyzer": "keyword_analyzer"
                },
                "edgengram": {
                  "type": "text",
                  "analyzer": "edge_ngram_analyzer",
                  "search_analyzer": "edge_ngram_search_analyzer"
                },
                "completion": {
                  "type": "completion"
                },
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "object name": {
              "type": "text",
              "index": true
            },
            "province/state": {
              "type": "text",
              "index": true
            },
            "sub-location": {
              "type": "text",
              "index": true
            },
            "time created": {
              "type": "text",
              "index": true
            },
            "urgency": {
              "type": "text",
              "index": true
            },
            "writer/editor": {
              "type": "text",
              "index": true
            }
          }
        }
      }
    }
  }
}
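A side note on the edgengram sub-field in the mapping above: edge_ngram_tokenizer only indexes word prefixes of 2 to 5 letters. The Python sketch below approximates what edge_ngram_analyzer would emit for a piece of text. It is a simulation for illustration, not Elasticsearch code: the function name edge_ngrams is made up, and Python's str.isalpha stands in for the "letter" character class.

```python
def edge_ngrams(text, min_gram=2, max_gram=5):
    """Approximate edge_ngram_analyzer: split on non-letters, emit lowercased prefixes."""
    tokens = []
    word = []
    # The trailing space flushes the last word through the same code path.
    for ch in text + " ":
        if ch.isalpha():
            word.append(ch)
            continue
        if word:
            term = "".join(word).lower()
            # Prefixes anchored at the start of the word, min_gram..max_gram long.
            for size in range(min_gram, min(max_gram, len(term)) + 1):
                tokens.append(term[:size])
            word = []
    return tokens


print(edge_ngrams("Feyenoord Rotterdam"))
```

Under this configuration the longest indexed prefix has five letters, while edge_ngram_search_analyzer keeps the search term whole. A search term longer than five letters would therefore probably not match on this sub-field unless max_gram is raised.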
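The problem is: I want a query that searches the keywords and the caption for the presence of a search text. Whenever the search text is found in the keywords, the score should be boosted, because that indicates the photo is more relevant. So I came up with the following query (where value is the search text):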
GET /photos/_search
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "fuzzy": {
            "iptc.keywords": {
              "value": "value",
              "fuzziness": 1,
              "boost": 1
            }
          }
        },
        {
          "fuzzy": {
            "iptc.caption/abstract": {
              "value": "value",
              "fuzziness": 1
            }
          }
        }
      ]
    }
  }
}
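However, it does not find any matches, even though the value is in the documents. I also cannot seem to construct a simple match query that matches the exact text. For example: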
GET /photos/doc/_search?error_trace=true
{
  "query": {
    "match": {
      "iptc.caption/abstract": "exact value from one of the documents"
    }
  }
}
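This returns 0 results, even though the search text is exactly what is in the document. I don't know what to do. To make it worse (for me, as I'm going nearly bald from frustration), Kibana does not seem to behave either. I'm almost certain it is something really simple (the document dates are within the last 5 years), but filtering on an exact copy-pasted value returns 0 results, as shown in the screenshot.
I'm going crazy here. Does anyone know how to fix this or what I'm doing wrong?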
I found the solution in the Elastic documentation.
Because nested documents are indexed as separate documents, they can only be accessed within the scope of the nested query, the nested/reverse_nested aggregations, or nested inner hits.
So I built the following query, which works.
{
  "query": {
    "nested": {
      "path": "iptc",
      "query": {
        "bool": {
          "should": [
            {
              "dis_max": {
                "queries": [
                  {
                    "fuzzy": {
                      "iptc.keywords": {
                        "value": "Feyenoord",
                        "boost": 1
                      }
                    }
                  },
                  {
                    "fuzzy": {
                      "iptc.caption/abstract": {
                        "value": "Feyenoord",
                        "fuzziness": 1
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
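For anyone building this request from application code, here is a minimal Python sketch of the same idea: any query on iptc.* fields has to be wrapped in a nested clause with "path": "iptc". The helper names wrap_nested and build_photo_query, and the keyword_boost default of 2.0, are my own assumptions for illustration. The sketch only builds the request body as a dict and does not talk to a cluster. It also drops the bool/should wrapper, which holds a single clause here.

```python
import json


def wrap_nested(path, inner_query):
    """Scope a query to the nested documents stored under `path`."""
    return {"nested": {"path": path, "query": inner_query}}


def build_photo_query(search_text, keyword_boost=2.0, fuzziness=1):
    """Build a dis_max over keywords and caption, wrapped for the nested iptc field."""
    keywords_clause = {
        "fuzzy": {
            "iptc.keywords": {
                "value": search_text,
                "fuzziness": fuzziness,
                # Hypothetical default: a boost above 1 favors keyword hits.
                "boost": keyword_boost,
            }
        }
    }
    caption_clause = {
        "fuzzy": {
            "iptc.caption/abstract": {
                "value": search_text,
                "fuzziness": fuzziness,
            }
        }
    }
    dis_max = {"dis_max": {"queries": [keywords_clause, caption_clause]}}
    return {"query": wrap_nested("iptc", dis_max)}


# Print the body so it can be pasted into the Kibana console.
print(json.dumps(build_photo_query("feyenoord"), indent=2))
```

A boost of 1 is the default and leaves the score unchanged, so a value above 1 is needed if keyword matches should rank higher. fuzzy is a term-level query and does not analyze the search text, so lowercasing the input first keeps it closer to the lowercased terms a default text field indexes. The plain match query from the question can be scoped the same way, for example wrap_nested("iptc", {"match": {"iptc.caption/abstract": "some text"}}).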