Send consecutive invalid JSON lines between valid JSON lines in a single Filebeat message

I have a file that contains line-delimited JSON objects mixed with non-JSON data (stderr stack traces).

{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:40", "retry": 1, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}
Traceback (most recent call last):
  File "a.py", line 12, in <module>
    foo()
  File "a.py", line 10, in foo
    bar()
  File "a.py", line 4, in bar
    raise Exception("This was unexpected")
Exception: This was unexpected
{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:43", "retry": 4, "level": "info", "event": "failed to download"}

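With the following configuration, the valid JSON lines are picked up correctly, but the invalid JSON lines are sent separately, one event per line.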

filebeat.yml

filebeat.prospectors:
  - input_type: log
    document_type: mytype
    json:
      message_key: event
      add_error_key: true
    paths:
        - /tmp/*.log

output:
  console:
    pretty: true

  file:
    path: "/tmp/filebeat"
    filename: filebeat

Output:

{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "failed to download",
    "level": "info",
    "retry": 2,
    "timestamp": "20170104T17:10:41"
  },
  "offset": 285,
  "source": "/tmp/test.log",
  "type": "mytype"
}
{
  "@timestamp": "2017-01-04T12:03:36.659Z",
  "beat": {
    "hostname": "...", "name": "...", "version": "5.1.1"
  },
  "input_type": "log",
  "json": {
    "event": "Traceback (most recent call last):",
    "json_error": "Error decoding JSON: invalid character 'T' looking for beginning of value"
  },
  "offset": 320,
  "source": "/tmp/test.log",
  "type": "mytype"
}

I want to combine all consecutive non-JSON lines, up to the next JSON line, into a single message.

Using multiline, I tried the following:

filebeat.prospectors:
  - input_type: log
    document_type: mytype
    json:
      message_key: event
      add_error_key: true
    paths:
        - /tmp/*.log
    multiline:
      pattern: '^{'
      negate: true
      match: after

output:
  console:
    pretty: true

  file:
    path: "/tmp/filebeat"
    filename: filebeat

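But it doesn't seem to work. It applies the multiline rules to the value of the event key specified in json.message_key.

From the docs, I understand why this happens. The entry for json.message_key says: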

JSON key on which to apply the line filtering and multiline settings. This key must be top level and its value must be string, otherwise it is ignored. If no text key is defined, the line filtering and multiline features cannot be used.
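To make the consequence of that passage concrete, here is a rough Python sketch (not Filebeat code). It assumes the reading given above: the line is decoded first, and the multiline pattern is then tested against the value of the `event` key rather than against the raw line. The helper `text_seen_by_multiline` is a made-up name, and Python's `re` is only a stand-in for the Go regexp engine that Filebeat uses.

```python
import json
import re

# Same pattern as multiline.pattern in the config above.
PATTERN = re.compile(r"^{")


def text_seen_by_multiline(raw_line, message_key="event"):
    """Return the text the multiline pattern would be tested against.

    Hypothetical helper: it mimics json.message_key as described in the
    quoted docs, it is not how Filebeat is implemented.
    """
    try:
        decoded = json.loads(raw_line)
    except ValueError:
        # Not JSON: the output above shows the raw line under the message key.
        return raw_line
    value = decoded.get(message_key) if isinstance(decoded, dict) else None
    # The docs say the key must be top level and hold a string.
    return value if isinstance(value, str) else None


lines = [
    '{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}',
    "Traceback (most recent call last):",
]

starts_new = []
for line in lines:
    text = text_seen_by_multiline(line)
    matched = bool(text and PATTERN.match(text))
    starts_new.append(matched)
    print(repr(text), "-> starts a new event:", matched)
```

Under this reading, neither "failed to download" nor the traceback text begins with `{`, so the pattern never marks the start of a new event.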

Is there any other way to combine consecutive non-JSON lines into a single message?

I want to capture the whole stack trace before sending it to Logstash.

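Filebeat applies the multiline grouping after the JSON parsing, so the multiline pattern cannot be based on the characters that make up the JSON object (e.g. {).

There is another way to do JSON parsing in Filebeat such that the JSON parsing occurs after the multiline grouping, so your pattern can include the JSON object characters. You need Filebeat 5.2 (soon to be released) because a target field was added to the decode_json_fields processor, which lets you specify where the decoded JSON fields are added to the event.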

filebeat.prospectors:
- paths: [input.txt]
  multiline:
    pattern: '^({|Traceback)'
    negate:  true
    match:   after

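processors:
# Decode events whose message starts with "{" as JSON.
- decode_json_fields:
    when.regexp:
      message: '^{'
    fields: ["message"]
    # An empty target merges the decoded keys into the root of the event.
    target: ""
# Remove the raw JSON text once it has been decoded.
- drop_fields:
    when.regexp:
      message: '^{'
    fields: ["message"]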

I tested the multiline pattern using the Golang playground.
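For readers who want to try the grouping rule without Filebeat, the following is a minimal Python sketch of what `negate: true` with `match: after` is documented to do. It is an approximation: `group_multiline` is a hypothetical helper, Python's `re` replaces Go's regexp, and the sample lines are a shortened version of the log from the question.

```python
import re


def group_multiline(lines, pattern):
    """Group lines roughly like `negate: true` + `match: after`.

    A line that matches the pattern starts a new event; every line that
    does not match is appended to the event before it.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        if regex.search(line) or not events:
            events.append(line)        # start a new event
        else:
            events[-1] += "\n" + line  # continuation of the previous event
    return events


log_lines = [
    '{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}',
    "Traceback (most recent call last):",
    '  File "a.py", line 4, in bar',
    '    raise Exception("This was unexpected")',
    "Exception: This was unexpected",
    '{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}',
]

events = group_multiline(log_lines, r"^({|Traceback)")
for event in events:
    print(repr(event))
```

One thing the sketch makes visible: any stderr line that starts with neither `{` nor `Traceback` is attached to whatever event precedes it, so stray non-JSON output right after a JSON line would be appended to that JSON line.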

Filebeat produces the following output (using the log lines you provided above as input). (I used a build from the master branch.)

{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":95,"retry":0,"source":"input.txt","timestamp":"20170104T17:10:39","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":190,"retry":1,"source":"input.txt","timestamp":"20170104T17:10:40","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":285,"retry":2,"source":"input.txt","timestamp":"20170104T17:10:41","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"input_type":"log","message":"Traceback (most recent call last):\n  File \"a.py\", line 12, in \u003cmodule\u003e\n    foo()\n  File \"a.py\", line 10, in foo\n    bar()\n  File \"a.py\", line 4, in bar\n    raise Exception(\"This was unexpected\")\nException: This was unexpected","offset":511,"source":"input.txt","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":606,"retry":3,"source":"input.txt","timestamp":"20170104T17:10:42","type":"log"}
{"@timestamp":"2017-01-05T20:34:18.862Z","beat":{"hostname":"host.example.com","name":"host.example.com","version":"5.2.0-SNAPSHOT"},"event":"failed to download","input_type":"log","level":"info","offset":702,"retry":4,"source":"input.txt","timestamp":"20170104T17:10:43","type":"log"}