python elasticsearch: how to query a string on all fields

I have the following Python code, which works fine and returns the 50 results I expect:

elastic = settings.ELASTIC
indexes = u'nginx-access-2769z-2018.11.26.16'
filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
]
range_for_search = {
    'gte': str(1543248611),
    'lte': str(1543249511),
    'format': 'epoch_second',
}
query_body = {
    'from': 0,
    'size': 50,
    'query': {
        'bool': {
            'must': filter_by_client,
            'filter': {'range': {'@timestamp': range_for_search}},
        },
    }
}
search_result = elastic.search(index=indexes, body=query_body)
results = [result['_source'] for result in search_result['hits']['hits']]

Now, if I add another filter, for example:

...
filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
    {'match': {'remote_address': '181.220.174.189'}}
]
...

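It also works great! It narrows the results down to 5.

My question is: how can I query that string on ALL fields? It doesn't matter to me whether the string is at the start or end of the field, whether it is uppercase, or whether the field is actually an integer/float rather than a string, ...

I have already tried using the "_all" keyword like this: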

...
filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
    {'match': {'_all': '181.220.174.189'}}
]
...

But it gave me 0 results. I am trying to reproduce the same behavior that happens through the Kibana interface.

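Using a copy_to field, as Nishant mentioned, is the best solution, but if you have no control over changing the mapping, you can see whether either of the approaches below helps.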
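For reference, here is a rough sketch of what the copy_to approach could look like if you do control the mapping. The catch-all field name `all_text` and the helper `build_mapping` are made up for illustration, mapping every source field as `keyword` is an assumption, and the typeless layout assumes Elasticsearch 7 or later (older versions nest `properties` under a document type).

```python
def build_mapping(source_fields, catch_all='all_text'):
    """Return a mapping body where every source field is copied into one text field."""
    properties = {
        # Hypothetical catch-all field; it is searched like any other text field.
        catch_all: {'type': 'text'},
    }
    for name in source_fields:
        # Assumption: treating each source field as a keyword is good enough here.
        properties[name] = {'type': 'keyword', 'copy_to': catch_all}
    return {'mappings': {'properties': properties}}


mapping_body = build_mapping(['client_id', 'remote_address'])

# A single match on the catch-all field then covers every copied field.
catch_all_filter = {'match': {'all_text': '181.220.174.189'}}
```

Keep in mind that copy_to only affects documents indexed after the mapping is in place, so existing indices would have to be reindexed.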

Using Query String Query

You can use the Query String Query; your query would look like this:

...
filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
    {'query_string': {'query': '181.220.174.189'}}
]
... 

One important note is that query_string searches all fields by default. The link I mentioned says the following:

The default field for query terms if no prefix field is specified. Defaults to the index.query.default_field index settings, which in turn defaults to *. * extracts all fields in the mapping that are eligible to term queries and filters the metadata fields.
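Put together with the body from the question, the whole request might look roughly like this. `build_query_body` is a hypothetical helper, and writing out `default_field` as `*` is optional: it only makes the default described in the quoted passage explicit, assuming the index setting has not been changed.

```python
def build_query_body(client_id, text, start_epoch, end_epoch, size=50):
    """Build a search body that looks for `text` in every eligible field."""
    return {
        'from': 0,
        'size': size,
        'query': {
            'bool': {
                'must': [
                    {'match_phrase': {'client_id': client_id}},
                    # '*' is the documented default; written out here for clarity.
                    {'query_string': {'query': text, 'default_field': '*'}},
                ],
                'filter': {
                    'range': {
                        '@timestamp': {
                            'gte': str(start_epoch),
                            'lte': str(end_epoch),
                            'format': 'epoch_second',
                        }
                    }
                },
            }
        },
    }


query_body = build_query_body('2769z', '181.220.174.189', 1543248611, 1543249511)
```

The result can be passed to `elastic.search(index=indexes, body=query_body)` exactly as in the question.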

I am also mentioning this because I would like you to understand the difference between query_string and a simple match before you decide to use query_string. On Match vs Query-String:

The match family of queries does not go through a "query parsing" process. It does not support field name prefixes, wildcard characters, or other "advanced" features. For this reason, chances of it failing are very small / non existent, and it provides an excellent behavior when it comes to just analyze and run that text as a query behavior (which is usually what a text search box does). Also, the phrase_prefix type can provide a great "as you type" behavior to automatically load search results.
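In practice, the "query parsing" step means that characters such as `:` or `(` in user input are read as query syntax and can make the request fail. If the text comes straight from a search box, escaping it first is one way around that. The sketch below is only an illustration: `escape_query_string` is a hypothetical helper, and the character list is my reading of the reserved characters in the query_string documentation, so check it against the version you run.

```python
import re

# Characters the query_string syntax treats specially. '<' and '>' are left out
# on purpose: they cannot be escaped and have to be removed instead.
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/&|])')


def escape_query_string(text):
    """Escape user input so query_string treats it as literal text."""
    text = text.replace('<', '').replace('>', '')
    # Prefix every reserved character with a backslash.
    return _RESERVED.sub(r'\\\1', text)


safe_clause = {'query_string': {'query': escape_query_string('181.220.174.189')}}
```

A plain IP address contains none of these characters, so it passes through unchanged; the helper only matters for inputs such as URLs or timestamps.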

Using Multi-Match

Below is another possible solution if you don't want to change the mapping; it uses the multi-match query:

...
filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
    {'multi_match': {'query': '181.220.174.189', 'fields': ['url', 'field_2']}}
]
...

Note how you need to explicitly list the fields to consider at query time. Be sure to validate and test it thoroughly, though.
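Since the question mentions fields that may really be integers or floats, it may be worth setting `lenient` as well, which tells Elasticsearch to ignore format errors such as a text value queried against a numeric field. A minimal sketch, where `build_multi_match` is a hypothetical helper and the field names are only examples taken from this thread:

```python
def build_multi_match(text, fields, lenient=True):
    """Build a multi_match clause over an explicit list of fields."""
    if not fields:
        raise ValueError('this sketch expects at least one field')
    return {
        'multi_match': {
            'query': text,
            'fields': list(fields),
            # Skip format errors (e.g. text against a numeric field)
            # instead of failing the whole request.
            'lenient': lenient,
        }
    }


filter_by_client = [
    {'match_phrase': {'client_id': '2769z'}},
    build_multi_match('181.220.174.189', ['remote_address', 'url']),
]
```

Field names in `fields` may also contain wildcards, which can shorten the list when the fields share a prefix.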

Let me know if it helps!