Get Min and Max of a Range Without Stats Agg or Duplicate Aggs Error?
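I'm finding it hard to get pipeline aggregations to do what I want in practice.
I'll post what I have, but here's the idea:
- Set a date range and create one bucket per month for the last 10 months in that range. Got that.
- Get the min and max of the "magnitude" field for each bucket. I could only figure out how to do this with a "stats" agg, because if I try to do both as separate aggs, I get a duplicate error. However, I don't want the other stats. Can I avoid a stats agg for this?
- Sum the scores. How on earth do I do that? This one is kicking my tail. I don't know whether you can sum the _score field at all.
So here's the index I'm practicing with, based on the common earthquake concept: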
PUT _bulk
{ "index" : { "_index" : "earthquakes", "_id" : "1" } }
{ "date": "30-09-2020", "magnitude": "3.4", "lon": "74.12", "lat": "43.67" }
{ "index" : { "_index" : "earthquakes", "_id" : "2" } }
{ "date": "30-09-2020", "magnitude": "1.2", "lon": "78.02", "lat": "103.07" }
{ "index" : { "_index" : "earthquakes", "_id" : "3" } }
{ "date": "15-10-2020", "magnitude": "2.5", "lon": "178.02", "lat": "98.41" }
{ "index" : { "_index" : "earthquakes", "_id" : "4" } }
{ "date": "19-11-2020", "magnitude": "1.9", "lon": "14.67", "lat": "100.35" }
{ "index" : { "_index" : "earthquakes", "_id" : "5" } }
{ "date": "13-12-2020", "magnitude": "6.2", "lon": "123.93", "lat": "56.05" }
{ "index" : { "_index" : "earthquakes", "_id" : "6" } }
{ "date": "21-12-2020", "magnitude": "0.2", "lon": "130.31", "lat": "83.41" }
{ "index" : { "_index" : "earthquakes", "_id" : "7" } }
{ "date": "17-01-2021", "magnitude": "0.2", "lon": "10.31", "lat": "98.00" }
{ "index" : { "_index" : "earthquakes", "_id" : "8" } }
{ "date": "23-01-2021", "magnitude": "4.6", "lon": "112.31", "lat": "69.96" }
{ "index" : { "_index" : "earthquakes", "_id" : "9" } }
{ "date": "31-01-2021", "magnitude": "0.4", "lon": "79.43", "lat": "72.14" }
{ "index" : { "_index" : "earthquakes", "_id" : "10" } }
{ "date": "03-02-2021", "magnitude": "7.1", "lon": "120.80", "lat": "50.22" }
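The post does not show the index mapping. For `date_range` and `min`/`max` to work, `date` would need to be a date field that understands the day-month-year strings, and `magnitude` would need to be numeric. Below is a minimal sketch of such a mapping, written as a Python dict for illustration. The field types and the `dd-MM-yyyy` format are assumptions inferred from the sample data, not something the post confirms.

```python
import json

# Hypothetical mapping for the "earthquakes" index; the original post does not show one.
mapping = {
    "mappings": {
        "properties": {
            # The sample dates look like "30-09-2020", i.e. day-month-year.
            "date": {"type": "date", "format": "dd-MM-yyyy"},
            # Numeric types let min/max/stats work; quoted numbers such as "3.4"
            # are coerced to numbers by default.
            "magnitude": {"type": "float"},
            "lon": {"type": "float"},
            "lat": {"type": "float"},
        }
    }
}

# Print the body that would follow "PUT earthquakes" in the console.
print(json.dumps(mapping, indent=2))
```

A mapping like this would have to be created before the bulk request runs, since the type of an existing field cannot be changed afterwards.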
Here's what I have for the aggregation. Note: I set my hits size to 10 before trying to sum the _score field... but it didn't happen:
GET earthquakes/_search
{
  "size": 0,
  "aggs": {
    "range_mag": {
      "date_range": {
        "field": "date",
        "ranges": [
          {
            "from": "now-10M",
            "to": "now"
          }
        ]
      },
      "aggs": {
        "by_month_mag": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "month"
          },
          "aggs": {
            "stat_mag": {
              "stats": {
                "field": "magnitude"
              }
            }
          }
        }
      }
    }
  }
}
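^ This works, but to get the min and max it also returns data I don't need. I haven't included my attempt at the score sum because it's driving me crazy. Is there a better way to do what I'm trying to do?
Thanks either way. Of everything I can easily type out or read in the docs, aggregations are the one thing I thought I'd get but somehow got stuck on.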
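For min/max, you can run separate aggregations on the same field. Just give each one its own name, since two sibling aggregations with the same name are rejected, which is the likely cause of the duplicate error. To sum the scores, you can use:
"score_sum": {
  "sum": {
    "script": "_score"
  }
}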
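If the goal is a score total per month rather than one total for the whole search, the same `sum` aggregation could sit inside the monthly buckets next to the min and max aggregations. Here is a sketch in Python that builds such a request body; the helper name `monthly_aggs` is made up for illustration. Keep in mind that a search with no `query` clause matches every document with the same score (1.0), so the sum only becomes informative once a scoring query is added.

```python
import json

def monthly_aggs(field="magnitude"):
    """Build the sub-aggregations for one monthly bucket (hypothetical helper)."""
    return {
        "min_mag": {"min": {"field": field}},
        "max_mag": {"max": {"field": field}},
        # Same script as above, but evaluated once per bucket.
        "score_sum": {"sum": {"script": "_score"}},
    }

body = {
    "size": 0,
    "aggs": {
        "by_month_mag": {
            "date_histogram": {"field": "date", "calendar_interval": "month"},
            "aggs": monthly_aggs(),
        }
    },
}

# Print the JSON that would go after "GET earthquakes/_search".
print(json.dumps(body, indent=2))
```

Because each sub-aggregation has a distinct key in the dict, the request cannot contain two siblings with the same name.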
Final query:
GET earthquakes/_search
{
  "size": 0,
  "aggs": {
    "score_sum": {
      "sum": {
        "script": "_score"
      }
    },
    "range_mag": {
      "date_range": {
        "field": "date",
        "ranges": [
          {
            "from": "now-10M",
            "to": "now"
          }
        ]
      },
      "aggs": {
        "by_month_mag": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "month"
          },
          "aggs": {
            "min_mag": {
              "min": {
                "field": "magnitude"
              }
            },
            "max_mag": {
              "max": {
                "field": "magnitude"
              }
            }
          }
        }
      }
    }
  }
}
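To sanity-check the monthly buckets, the expected min and max can be recomputed locally from the sample documents. The sketch below does that in plain Python; it does not talk to Elasticsearch, the function name `min_max_by_month` is made up for illustration, and it ignores the `now-10M` range filter, so it covers all ten sample documents.

```python
from collections import defaultdict
from datetime import datetime

# (date, magnitude) pairs copied from the bulk request in the question.
docs = [
    ("30-09-2020", "3.4"), ("30-09-2020", "1.2"), ("15-10-2020", "2.5"),
    ("19-11-2020", "1.9"), ("13-12-2020", "6.2"), ("21-12-2020", "0.2"),
    ("17-01-2021", "0.2"), ("23-01-2021", "4.6"), ("31-01-2021", "0.4"),
    ("03-02-2021", "7.1"),
]

def min_max_by_month(rows):
    """Group magnitudes by calendar month and return {"YYYY-MM": (min, max)}."""
    buckets = defaultdict(list)
    for date_text, magnitude_text in rows:
        month = datetime.strptime(date_text, "%d-%m-%Y").strftime("%Y-%m")
        buckets[month].append(float(magnitude_text))
    return {month: (min(vals), max(vals)) for month, vals in sorted(buckets.items())}

expected = min_max_by_month(docs)
for month, (low, high) in expected.items():
    print(month, low, high)
```

The values of `min_mag` and `max_mag` in each monthly bucket of the response should match these pairs for the months that fall inside the queried range.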