send consecutive invalid json lines between valid json lines in a single filebeat message
I have a file that contains line-delimited JSON objects interleaved with non-JSON data (stderr stack traces).
{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:40", "retry": 1, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}
Traceback (most recent call last):
  File "a.py", line 12, in <module>
    foo()
  File "a.py", line 10, in foo
    bar()
  File "a.py", line 4, in bar
    raise Exception("This was unexpected")
Exception: This was unexpected
{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:43", "retry": 4, "level": "info", "event": "failed to download"}
With the following configuration I can ingest the valid JSON lines correctly, but the invalid JSON lines are being sent separately (one line per message).
filebeat.yml
filebeat.prospectors:
- input_type: log
  document_type: mytype
  json:
    message_key: event
    add_error_key: true
  paths:
    - /tmp/*.log

output:
  console:
    pretty: true
  file:
    path: "/tmp/filebeat"
    filename: filebeat
Output:
{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "failed to download",
    "level": "info",
    "retry": 2,
    "timestamp": "20170104T17:10:41"
  },
  "offset": 285,
  "source": "/tmp/test.log",
  "type": "mytype"
}
{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "Traceback (most recent call last):",
    "json_error": "Error decoding JSON: invalid character 'T' looking for beginning of value"
  },
  "offset": 320,
  "source": "/tmp/test.log",
  "type": "mytype"
}
I want to combine all the non-JSON lines, up to the next JSON line, into one message.
Using multiline, I tried the following:
filebeat.prospectors:
- input_type: log
  document_type: mytype
  json:
    message_key: event
    add_error_key: true
  paths:
    - /tmp/*.log
  multiline:
    pattern: '^{'
    negate: true
    match: after

output:
  console:
    pretty: true
  file:
    path: "/tmp/filebeat"
    filename: filebeat
But it does not seem to work. It applies the multiline rules to the value of the event key specified in json.message_key.
From the docs here I understand why that happens:
json.message_key - JSON key on which to apply the line filtering and multiline settings. This key must be top level and its value must be string, otherwise it is ignored. If no text key is defined, the line filtering and multiline features cannot be used.
Is there another way to combine consecutive non-JSON lines into one message? I would like to capture the entire stack trace before sending it to logstash.
Filebeat applies the multiline grouping after the JSON parsing, so your multiline pattern cannot be based on the characters that make up the JSON object (e.g. {).
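To see why the pattern never matches, here is a minimal Go sketch of that order of operations. This is not Filebeat code; the helper name multilineInputFor is made up for illustration. With the json prospector option, each line is decoded first, and the multiline pattern is then applied only to the string value under message_key, never to the raw line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// multilineInputFor mimics the json prospector option: decode the raw
// line first, then hand only the message_key value to the multiline
// check. (Illustrative helper, not a Filebeat internal.)
func multilineInputFor(rawLine, messageKey string) string {
	var fields map[string]interface{}
	if err := json.Unmarshal([]byte(rawLine), &fields); err == nil {
		if s, ok := fields[messageKey].(string); ok {
			return s // multiline sees the decoded value...
		}
	}
	return rawLine // ...or the raw line only when decoding fails
}

func main() {
	pattern := regexp.MustCompile(`^{`)
	line := `{"timestamp": "20170104T17:10:41", "event": "failed to download"}`
	input := multilineInputFor(line, "event")
	// The pattern is tested against "failed to download", not the raw
	// line, so '^{' never matches even though the line starts with '{'.
	fmt.Println(input, pattern.MatchString(input)) // prints: failed to download false
}
```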
There is another way to do JSON parsing in Filebeat such that the JSON parsing occurs after the multiline grouping, and therefore your pattern can contain JSON object characters. You need Filebeat 5.2 (soon to be released) because a target field was added to the decode_json_fields processor, so you can specify where the decoded JSON fields are added in the event.
filebeat.prospectors:
- paths: [input.txt]
  multiline:
    pattern: '^({|Traceback)'
    negate: true
    match: after

processors:
- decode_json_fields:
    when.regexp:
      message: '^{'
    fields: message
    target:
- drop_fields:
    when.regexp:
      message: '^{'
    fields: message
I tested the multiline pattern using the Go playground here.
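That playground test can be reproduced with a short standalone Go program (Filebeat's multiline patterns use Go's regexp syntax). The startsNewEvent helper name is mine, not Filebeat's; it just wraps the pattern from the config above.

```go
package main

import (
	"fmt"
	"regexp"
)

// multilinePattern mirrors the prospector-level pattern '^({|Traceback)'.
var multilinePattern = regexp.MustCompile(`^({|Traceback)`)

// startsNewEvent reports whether a line begins a new multiline event.
// With negate: true and match: after, every line for which this returns
// false is appended to the most recent line for which it returned true.
func startsNewEvent(line string) bool {
	return multilinePattern.MatchString(line)
}

func main() {
	lines := []string{
		`{"timestamp": "20170104T17:10:41", "retry": 2}`,
		"Traceback (most recent call last):",
		`  File "a.py", line 12, in <module>`,
		"    foo()",
	}
	for _, l := range lines {
		fmt.Printf("starts new event: %-5v %s\n", startsNewEvent(l), l)
	}
}
```

The JSON line and the Traceback line each start a new event; the indented frames are folded into the Traceback event, which is how the whole stack trace ends up in one message.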
Filebeat produces the following output (using the log lines you provided above as input). (I used a build from the master branch.)
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":95,"retry":0,"source":"input.txt","timestamp":"20170104T17:10:39","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":190,"retry":1,"source":"input.txt","timestamp":"20170104T17:10:40","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":285,"retry":2,"source":"input.txt","timestamp":"20170104T17:10:41","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"input_type":"log","message":"Traceback (most recent call last):\n File \"a.py\", line 12, in \u003cmodule\u003e\n foo()\n File \"a.py\", line 10, in foo\n bar()\n File \"a.py\", line 4, in bar\n raise Exception(\"This was unexpected\")\nException: This was unexpected","offset":511,"source":"input.txt","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":606,"retry":3,"source":"input.txt","timestamp":"20170104T17:10:42","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":702,"retry":4,"source":"input.txt","timestamp":"20170104T17:10:43","type":"log"}