Filter logs with grok in logstash
I am trying to filter the logs I receive with the help of grok. Below is a sample log:
INFO | jvm 1 | main | 2013/04/05 01:08:47.048 | [m[32mINFO [TaskExecutor-master-2443-ProcessTask [31111111112]] [b2cConfirmationAction] CRON JOB ID : 101AA1C, ACTION : ConfirmationAction , CUSTOMER ID : 000001111111 , EMAIL ADDRESS : abc@gmail.com , SCHEDULE : Every 1 week , MESSAGE : Execution started for action ConfirmationAction
I am using the grok debugger (https://grokdebug.herokuapp.com/) to test before updating the logstash conf file.
Below is my filter code:
%{LOGLEVEL:level}%{GREEDYDATA:greedydata}%{SPACE}%{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY}%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}%{GREEDYDATA:gd} \[(?:%{WORD:action})\]%{GREEDYDATA:cronjobresult}
Here is my output:
"level": [ [ "INFO" ] ], "greedydata": [ [ " | jvm 1 | main | 20" ] ], "SPACE": [ [ "", " " ] ], "YEAR": [ [ "13" ] ], "MONTHNUM": [ [ "04" ] ], "MONTHDAY": [ [ "05" ] ], "HOUR": [ [ "01" ] ], "MINUTE": [ [ "08" ] ], "SECOND": [ [ "47.048" ] ], "gd": [ [ " | \u001b[m\u001b[32mINFO [TaskExecutor-master-2443-ProcessTask [31111111112]]" ] ], "action": [ [ "b2cConfirmationAction" ] ], "cronjobresult": [ [ " CRON JOB ID : 101AA4A , ACTION : ConfirmationAction , CUSTOMER ID : 000001111111 , EMAIL ADDRESS : abc@gmail.com , SCHEDULE : Every 1 week , MESSAGE : Execution started for action ConfirmationAction" ] ] }
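My requirement is to get the values under cronjobresult, such as the cron job ID and the customer ID, as distinct, independent fields so that I can use them in Kibana. Right now I am not able to get them. Also, I have used GREEDYDATA twice; a better approach for this log would be appreciated.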
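You can simply extend your filter further and match the fields explicitly. For example, to match the cron job ID, you can write CRON JOB ID : %{BASE16NUM:Cron_job_id} in your filter.
If there is data in the log that you don't need, you can simply write .* instead of GREEDYDATA, and it will be skipped.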
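The difference between a named capture and a bare .* can be sketched outside grok as well. The snippet below is a rough Python analogue, not grok itself: the named groups stand in for %{PATTERN:field}, and [0-9A-Fa-f]+ and [0-9]+ are simplified assumptions in place of the real BASE16NUM and NUMBER definitions.

```python
import re

# Tail of the sample log line from the question.
line = ("[b2cConfirmationAction] CRON JOB ID : 101AA1C, ACTION : ConfirmationAction , "
        "CUSTOMER ID : 000001111111 , EMAIL ADDRESS : abc@gmail.com")

# Named groups play the role of %{PATTERN:field}.
# The bare .* in between matches text without storing it in any field.
# [0-9A-Fa-f]+ and [0-9]+ are simplified stand-ins for BASE16NUM and NUMBER.
pattern = re.compile(
    r"CRON JOB ID : (?P<Cron_job_id>[0-9A-Fa-f]+),"
    r".*CUSTOMER ID : (?P<Customer_id>[0-9]+)"
)

fields = pattern.search(line).groupdict()
print(fields)
```

Only the two named groups end up in the result; the ACTION part matched by .* is consumed but not kept.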
Here is the complete filter for your log:
%{LOGLEVEL:level}%{GREEDYDATA:greedydata}%{SPACE}%{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY}%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}%{GREEDYDATA:gd} \[(?:%{WORD:action})\] CRON JOB ID : %{BASE16NUM:Cron_job_id},.*CUSTOMER ID : %{NUMBER:Customer_id}.*EMAIL ADDRESS : %{EMAILADDRESS}.*SCHEDULE : %{GREEDYDATA:schedule}.*, MESSAGE : %{GREEDYDATA:Message}
Output:
{
"level": [
[
"INFO"
]
],
"greedydata": [
[
" | jvm 1 | main | 20"
]
],
"SPACE": [
[
"",
" "
]
],
"YEAR": [
[
"13"
]
],
"MONTHNUM": [
[
"04"
]
],
"MONTHDAY": [
[
"05"
]
],
"HOUR": [
[
"01"
]
],
"MINUTE": [
[
"08"
]
],
"SECOND": [
[
"47.048"
]
],
"gd": [
[
" | [m[32mINFO [TaskExecutor-master-2443-ProcessTask [31111111112]]"
]
],
"action": [
[
"b2cConfirmationAction"
]
],
"Cron_job_id": [
[
"101AA1C"
]
],
"Customer_id": [
[
"000001111111"
]
],
"BASE10NUM": [
[
"000001111111"
]
],
"EMAILADDRESS": [
[
"abc@gmail.com"
]
],
"local": [
[
"abc"
]
],
"remote": [
[
"gmail.com"
]
],
"schedule": [
[
"Every 1 week "
]
],
"Message": [
[
"Execution started for action"
]
]
}
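One detail in the output above is that greedydata ends in "20" and YEAR is only "13". This is the greedy match giving back as little as possible. The Python sketch below reproduces the effect, assuming Python's re backtracks here the same way grok's regex engine does; the character classes are simplified stand-ins for LOGLEVEL, YEAR, MONTHNUM and MONTHDAY, not the real grok definitions.

```python
import re

# Start of the sample log line from the question (the rest is elided here).
line = "INFO | jvm 1 | main | 2013/04/05 01:08:47.048 | ..."

# Simplified stand-ins: (?:\d\d){1,2} for YEAR, \d\d for MONTHNUM and MONTHDAY.
DATE = r"\s*(?P<year>(?:\d\d){1,2})[/-](?P<month>\d\d)[/-](?P<day>\d\d)"

# Greedy .* keeps as much as it can, so it swallows the "20" of "2013".
greedy = re.match(r"(?P<level>[A-Z]+)(?P<gd>.*)" + DATE, line).groupdict()

# Lazy .*? stops as early as it can, leaving the full year for the date part.
lazy = re.match(r"(?P<level>[A-Z]+)(?P<gd>.*?)" + DATE, line).groupdict()

print(repr(greedy["gd"]), greedy["year"])
print(repr(lazy["gd"]), lazy["year"])
```

In grok terms, the lazy variant would presumably correspond to using %{DATA} (defined as .*?) in place of the first %{GREEDYDATA}; that substitution is a suggestion, not something tested in the answer.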
Please note that I have used the EMAILADDRESS pattern from https://github.com/rgevaert/grok-patterns/blob/master/grok.d/postfix_patterns.
If you want to test it on https://grokdebug.herokuapp.com, you need to add
EMAILADDRESSPART [a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILADDRESSPART:local}@%{EMAILADDRESSPART:remote}
as custom patterns by checking the "add custom patterns" option.
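To illustrate how the two custom patterns compose, here is a minimal Python sketch of grok-style expansion. It is only an approximation under stated assumptions: the expand helper and the email field name are hypothetical, unnamed references become non-capturing groups, and real grok runs on a different regex engine with more syntax than this handles.

```python
import re

# The two custom patterns from the answer, copied verbatim.
PATTERNS = {
    "EMAILADDRESSPART": r"[a-zA-Z0-9_.+-=:]+",
    "EMAILADDRESS": r"%{EMAILADDRESSPART:local}@%{EMAILADDRESSPART:remote}",
}

# Matches %{NAME} or %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")

def expand(expr: str) -> str:
    """Recursively replace %{NAME} / %{NAME:field} with plain regex (hypothetical helper)."""
    def repl(m: re.Match) -> str:
        name, field = m.group(1), m.group(2)
        body = expand(PATTERNS[name])
        # A field name becomes a named group; without one the text is matched but not kept.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return GROK_REF.sub(repl, expr)

# "email" is an illustrative field name; the answer uses %{EMAILADDRESS} without one.
regex = re.compile(expand(r"EMAIL ADDRESS : %{EMAILADDRESS:email}"))
match = regex.search(
    "CUSTOMER ID : 000001111111 , EMAIL ADDRESS : abc@gmail.com , SCHEDULE : Every 1 week"
)
print(match.groupdict())
```

The nested local and remote fields appear alongside the outer one, which mirrors the "local" and "remote" entries in the debugger output.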