Elasticsearch - query dates without a specified timezone
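I have an index with the following mapping, which uses the standard format for dates. In the second record below, the time specified is actually a local time, but ES treats it as UTC.
Although ES internally converts all parsed datetimes to UTC, it evidently must also store the original string.
My question is whether (and how) it is possible to query for all records whose scheduleDT value does not explicitly specify a timezone.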
{
  "curator_v3": {
    "mappings": {
      "published": {
        "analyzer": "classic",
        "numeric_detection": true,
        "properties": {
          "Id": {
            "type": "string",
            "index": "not_analyzed",
            "include_in_all": false
          },
          "createDT": {
            "type": "date",
            "format": "dateOptionalTime",
            "include_in_all": false
          },
          "scheduleDT": {
            "type": "date",
            "format": "dateOptionalTime",
            "include_in_all": false
          },
          "title": {
            "type": "string",
            "fields": {
              "english": {
                "type": "string",
                "analyzer": "english"
              },
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              },
              "shingle": {
                "type": "string",
                "analyzer": "shingle"
              },
              "spanish": {
                "type": "string",
                "analyzer": "spanish"
              }
            },
            "include_in_all": false
          }
        }
      }
    }
  }
}
We use .NET as the client for Elasticsearch, and we have not been consistent about specifying the timezone for the scheduleDT field.
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 12,
    "successful": 12,
    "failed": 0
  },
  "hits": {
    "total": 32,
    "max_score": null,
    "hits": [
      {
        "_index": "curator_v3",
        "_type": "published",
        "_id": "29651227",
        "_score": null,
        "fields": {
          "Id": [
            "29651227"
          ],
          "scheduleDT": [
            "2015-11-21T22:17:51.0946798-06:00"
          ],
          "title": [
            "97 Year-Old Woman Cries Tears Of Joy After Finally Getting Her High School Diploma"
          ],
          "createDT": [
            "2015-11-21T22:13:32.3597142-06:00"
          ]
        },
        "sort": [
          1448165871094
        ]
      },
      {
        "_index": "curator_v3",
        "_type": "published",
        "_id": "210466413",
        "_score": null,
        "fields": {
          "Id": [
            "210466413"
          ],
          "scheduleDT": [
            "2015-11-22T12:00:00"
          ],
          "title": [
            "6 KC treats to bring to Thanksgiving"
          ],
          "createDT": [
            "2015-11-20T15:08:25.4282-06:00"
          ]
        },
        "sort": [
          1448193600000
        ]
      }
    ]
  },
  "aggregations": {
    "ScheduleDT": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 27,
      "buckets": [
        {
          "key": 1448165871094,
          "key_as_string": "2015-11-22T04:17:51.094Z",
          "doc_count": 1
        },
        {
          "key": 1448193600000,
          "key_as_string": "2015-11-22T12:00:00.000Z",
          "doc_count": 4
        }
      ]
    }
  }
}
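As a sanity check on the claim that ES treats the offset-less value as UTC, the sort keys in the response above can be decoded by hand. This is only a minimal Python sketch; the helper name `millis_to_utc_iso` is made up for illustration and is not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: turn an epoch-millisecond sort key into a UTC ISO string.
def millis_to_utc_iso(millis: int) -> str:
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer milliseconds are added as a timedelta to avoid float rounding.
    return (epoch + timedelta(milliseconds=millis)).isoformat(timespec="milliseconds")

# Sort key of the record indexed as "2015-11-22T12:00:00" (no offset).
print(millis_to_utc_iso(1448193600000))  # 2015-11-22T12:00:00.000+00:00
# Sort key of the record indexed as "2015-11-21T22:17:51.0946798-06:00".
print(millis_to_utc_iso(1448165871094))  # 2015-11-22T04:17:51.094+00:00
```

The first value comes back as noon UTC, unchanged from the string that was sent, whereas the second was shifted by its -06:00 offset. That is consistent with the offset-less string having been parsed as UTC.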
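You can do this by querying for documents whose scheduleDT field is less than 20 characters long (e.g. 2015-11-22T12:00:00). All date fields with a specified timezone will be longer.
This should do it: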
{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "doc.scheduleDT.value.size() < 20"
        }
      }
    }
  }
}
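The length rule can be tried offline against the sample values from the question. The Python sketch below is an illustration only. It also shows a stricter pattern-based check, because a value with fractional seconds but no offset (a hypothetical input, not one shown in the question) would be 20 characters or longer and slip past the length test. The name `has_explicit_timezone` is an assumption made for this example.

```python
import re

# A trailing "Z" or a numeric offset such as "-06:00", "+0530" or "+05".
_TZ_SUFFIX = re.compile(r"(Z|[+-]\d{2}(:?\d{2})?)$")

def has_explicit_timezone(value: str) -> bool:
    # Only inspect the part after "T", so the hyphens in the date
    # are never mistaken for a negative offset.
    _, sep, time_part = value.partition("T")
    if not sep:
        return False  # date-only value, no time and therefore no offset
    return bool(_TZ_SUFFIX.search(time_part))

samples = [
    "2015-11-21T22:17:51.0946798-06:00",  # from the question, with offset
    "2015-11-22T12:00:00",                # from the question, no offset
    "2015-11-22T12:00:00.000",            # hypothetical: fractions, no offset
]
for s in samples:
    print(s, len(s) < 20, has_explicit_timezone(s))
```

For the two values taken from the question both checks agree. They disagree only on the third, hypothetical value, where the length is 23 but no offset is present.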
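Note, however, that to make your queries easier to write, you should always try to convert all timestamps to UTC before indexing your documents.
Finally, make sure you have dynamic scripting enabled in order to run the above query.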
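The advice to normalise timestamps to UTC before indexing could look roughly like the Python sketch below (the asker's client is .NET, so this only illustrates the idea). It assumes the offset-less values were written in a fixed -06:00 local time, matching the offset seen on the other sample values. That is an assumption, and real code would more likely use a named time zone so that daylight saving is handled. The function name `naive_local_to_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: offset-less values were produced in a fixed -06:00 local time.
ASSUMED_LOCAL_TZ = timezone(timedelta(hours=-6))

def naive_local_to_utc(value: str) -> str:
    # Parse the offset-less string, attach the assumed local offset,
    # then convert to UTC and emit an explicit "Z" suffix.
    local = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=ASSUMED_LOCAL_TZ)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(naive_local_to_utc("2015-11-22T12:00:00"))  # 2015-11-22T18:00:00Z
```

Once every value carries an explicit UTC marker, the stored string and the parsed instant no longer disagree, and no length or pattern check is needed at query time.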
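Update
Actually, it works if you use _source directly in the script, because that returns the real value from the source, exactly as it was when the document was indexed (for a date field, doc values hold the parsed timestamp rather than the original string):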
{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "_source.scheduleDT.size() < 20"
        }
      }
    }
  }
}