Extracting the last word/text from multiple matching lines in Logstash

Question

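I have a Logstash pipeline that ingests an entire file as a single event using the multiline codec. I want to find every matching line in that event and extract only the last word or text from each one. I have not been able to get any regular expression to work.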

File contents

some line extract this 875846 85746,857
some other line
some other line with more text
let's extract this 887362        24153,44737
some other final line

Matching requirement

Find all lines that contain "extract this" and retrieve the last word/text on each of them.

Expected output

{
    "patternmatch1" => [
        [0] [
            [0] "85746,857"
        ],
        [1] [
            [0] "24153,44737"
        ]
    ],
       "@timestamp" => 2020-01-14T11:15:34.304Z
}

Logstash pipeline

input {
    file{
        path => "C:/file.txt"
        start_position => "beginning"
        sincedb_path => NUL
        codec => multiline { 
            pattern => "^nomatching"
            negate => true
            what => previous
            auto_flush_interval => 1
            multiline_tag => ""
        }
    }
}
filter {
  ruby { code => 'event.set("patternmatch1",event.get("message").scan(/extract this([^\r]*)/))' }
}
output {   
  stdout { codec => rubydebug } 
}

Current output

{
    "patternmatch1" => [],
     "message" => "some line extract this 875846 85746,857\r\nsome other line\r\nsome other line with more text\r\nlet's extract this 887362        24153,44737\r\nsome other final line\r\n\r",
   "@timestamp" => 2020-01-14T11:44:50.140Z
}

Answer 1

You can use the following regular expression:

/extract this.*?(\d[\d,]*)\r?$/

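It matches:

extract this - the literal text
.*? - any zero or more characters other than line break characters, as few as possible
(\d[\d,]*) - Group 1 (what scan returns): a digit followed by zero or more digits or commas
\r? - an optional CR (carriage return)
$ - the end of a line.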
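
As a quick check outside Logstash, the pattern can be run in plain Ruby against the message text from the question. This is only a sketch: the method name extract_last_tokens is hypothetical, and the sample string is copied from the "Current output" shown in the question.

```ruby
# Hypothetical helper (not part of Logstash): returns what String#scan
# yields for the pattern above, i.e. one single-element array per match.
def extract_last_tokens(text)
  text.scan(/extract this.*?(\d[\d,]*)\r?$/)
end

# The multiline event's message, with CRLF line endings as in the question.
message = "some line extract this 875846 85746,857\r\n" \
          "some other line\r\n" \
          "some other line with more text\r\n" \
          "let's extract this 887362        24153,44737\r\n" \
          "some other final line\r\n\r"

p extract_last_tokens(message)
# prints [["85746,857"], ["24153,44737"]]
```

Because the pattern has one capturing group, scan returns an array of one-element arrays, which is the nested shape shown in the expected output.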

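Note that because the line endings in the file are CRLF, $ alone cannot match the end-of-line position: the carriage return still sits between the last digit and the line feed. Use \r?$ instead.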
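
The effect of the line endings can be seen in a small Ruby comparison. This is a sketch under stated assumptions: the helper last_tokens and the two-line sample text are made up for illustration, and only the two patterns come from the discussion above.

```ruby
# Hypothetical helper: flattens the group-1 captures returned by String#scan.
def last_tokens(text, pattern)
  text.scan(pattern).flatten
end

# Two of the matching lines from the question, once with CRLF and once with LF.
crlf_text = "some line extract this 875846 85746,857\r\n" \
            "let's extract this 887362        24153,44737\r\n"
lf_text = crlf_text.gsub("\r\n", "\n")

without_cr = /extract this.*?(\d[\d,]*)$/    # $ only
with_cr    = /extract this.*?(\d[\d,]*)\r?$/ # optional CR before $

p last_tokens(crlf_text, without_cr) # prints []
p last_tokens(crlf_text, with_cr)    # prints ["85746,857", "24153,44737"]
p last_tokens(lf_text, with_cr)      # prints ["85746,857", "24153,44737"]
```

Keeping the CR optional means the same pattern also works if the file is later saved with LF-only line endings.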

regex

logstash