Update records in ElasticSearch
I would like to update the logdate field for all records in a specific index. From what I have read so far, this does not seem to be possible? Am I right?
Here is a sample document:
{
  "_index": "logstash-01-2015",
  "_type": "ufdb",
  "_id": "AU__EvrALg15uxY1Wxf9",
  "_score": 1,
  "_source": {
    "message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
    "@version": "1",
    "@timestamp": "2015-09-24T11:17:57.389Z",
    "type": "ufdb",
    "file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
    "host": "PROXY-DEV",
    "offset": "3983281700",
    "logdate": "2015-08-14T04:50:05.000Z",
    "status": "PASS",
    "group": "level2",
    "clientip": "10.249.10.70",
    "category": "ads",
    "url": "http://ad.360yield.com/unpixel....",
    "method": "GET",
    "tags": [
      "_grokparsefailure"
    ]
  }
}
You can use the partial update API.
To test it, I created a simple index:
PUT /test_index
Then indexed a document:
PUT /test_index/doc/1
{
  "message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
  "@version": "1",
  "@timestamp": "2015-09-24T11:17:57.389Z",
  "type": "ufdb",
  "file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
  "host": "PROXY-DEV",
  "offset": "3983281700",
  "logdate": "2015-08-14T04:50:05.000Z",
  "status": "PASS",
  "group": "level2",
  "clientip": "10.249.10.70",
  "category": "ads",
  "url": "http://ad.360yield.com/unpixel....",
  "method": "GET",
  "tags": [
    "_grokparsefailure"
  ]
}
Now I can partially update the document:
POST /test_index/doc/1/_update
{
  "doc": {
    "logdate": "2015-09-25T12:20:00.000Z"
  }
}
If I retrieve the document:
GET /test_index/doc/1
I can see that the logdate property has been updated:
{
  "_index": "test_index",
  "_type": "doc",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
    "@version": "1",
    "@timestamp": "2015-09-24T11:17:57.389Z",
    "type": "ufdb",
    "file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
    "host": "PROXY-DEV",
    "offset": "3983281700",
    "logdate": "2015-09-25T12:20:00.000Z",
    "status": "PASS",
    "group": "level2",
    "clientip": "10.249.10.70",
    "category": "ads",
    "url": "http://ad.360yield.com/unpixel....",
    "method": "GET",
    "tags": [
      "_grokparsefailure"
    ]
  }
}
Here is the code I used to test it:
http://sense.qbox.io/gist/236bf271df6d867f5f0c87eacab592e41d3095cf
You are right, it is not possible.
There has been an open issue asking for Update by Query for a long time, and I am not sure it will be implemented any time soon, because it is very problematic for the underlying Lucene engine: it would require deleting all affected documents and re-indexing them.
An Update by Query Plugin is available on GitHub, but it is experimental and I have never tried it.
UPDATE 2018-05-02
The original answer is quite old by now. Update By Query is now supported.
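For example, setting logdate on every document in an index can now be done with a single request. This is only a sketch: the index name and the new date value are placeholders, and the "source" script key assumes a recent Elasticsearch version (older releases used "inline" instead):

```
POST /logstash-01-2015/_update_by_query
{
  "script": {
    "source": "ctx._source.logdate = '2018-05-02T00:00:00.000Z'",
    "lang": "painless"
  },
  "query": {
    "match_all": {}
  }
}
```

The match_all query can be replaced with a narrower query to update only a subset of the documents.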