Custom GROK filter - Logstash -> Elasticsearch
I have a log that is being captured and sent to Logstash. The format of the log is:
22304999 5 400.OUTPUT_SERVICE.510 submit The limit has been exceeded. Please use a different option. 2.54.44.221 /api/output/v3/contract/:PCID/order /api/output/v3/contract/:pcid/order https://www.example.org/output/ PUT 400 2017-09-28T15:50:57.843176Z
I'm trying to create a custom grok filter that adds field names (headers) to each value before the event is sent to Elasticsearch.
This is what I'm aiming for:
SessionID => "22304999"
HitNumber => "5"
FactValue => "400.OUTPUT_SERVICE.510"
DimValue1 => "submit"
ErrMessage => "The limit has been exceeded. Please use a different option."
IP => "2.54.44.221"
TLT_URL => "/api/output/v3/contract/:PCID/order"
URL => "/api/output/v3/contract/:pcid/order"
Refferer => "https://www.example.org/output/"
Method => "PUT"
StatsCode => "400"
ReqTime => "2017-09-28T15:50:57.843176Z"
I'm new to this, so I'm just trying to understand how to apply and test it. For example, I would start with an empty filter:
filter {
  grok {
    match => { "message" => "" }
  }
}
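My first question is about match => { "message" => "" }
Is the message just a single log line? What is the definition of 'message'?
The fields in my log are separated by tabs, and each tab starts a new field. Does that make what I'm trying to achieve easier? Instead of looking for a pattern, can I just look for the next tab?
Failing that, could someone provide an example for one of my fields? I should be able to work out the rest.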
Regex: (?<SessionID>\S+)\s+(?<HitNumber>\S+)\s+(?<FactValue>\S+)\s+(?<DimValue1>\S+)\s+(?<ErrMessage>.+)\s+(?<IP>(?:\d{1,3}\.){3}\d{1,3})\s+(?<TLT_URL>\S+)\s+(?<URL>\S+)\s+(?<Refferer>\S+)\s+(?<Method>\S+)\s+(?<StatsCode>\S+)\s+(?<ReqTime>\S+)
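Details:
(?<>) - named capture group
\S - matches any non-whitespace character
\d - matches a digit
{n,m} - matches between n and m times
+ - matches one or more times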
Output:
{
"SessionID": [
[
"22304999"
]
],
"HitNumber": [
[
"5"
]
],
"FactValue": [
[
"400.OUTPUT_SERVICE.510"
]
],
"DimValue1": [
[
"submit"
]
],
"ErrMessage": [
[
"The limit has been exceeded. Please use a different option."
]
],
"IP": [
[
"2.54.44.221"
]
],
"TLT_URL": [
[
"/api/output/v3/contract/:PCID/order"
]
],
"URL": [
[
"/api/output/v3/contract/:pcid/order"
]
],
"Refferer": [
[
"https://www.example.org/output/"
]
],
"Method": [
[
"PUT"
]
],
"StatsCode": [
[
"400"
]
],
"ReqTime": [
[
"2017-09-28T15:50:57.843176Z"
]
]
}
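To try the regex inside Logstash itself rather than in a regex tester, a small throwaway pipeline can be used. This is a minimal sketch, not a production config: the file name test.conf is hypothetical, and it assumes the raw log line arrives in the `message` field, which is where inputs such as stdin or file place the text of each line by default. That is also why the `match` option is keyed on "message".

```
# test.conf (hypothetical file name)
input {
  stdin { }    # paste one log line into the terminal and press Enter
}

filter {
  grok {
    # "message" is the event field that holds the raw log line
    match => {
      "message" => "(?<SessionID>\S+)\s+(?<HitNumber>\S+)\s+(?<FactValue>\S+)\s+(?<DimValue1>\S+)\s+(?<ErrMessage>.+)\s+(?<IP>(?:\d{1,3}\.){3}\d{1,3})\s+(?<TLT_URL>\S+)\s+(?<URL>\S+)\s+(?<Refferer>\S+)\s+(?<Method>\S+)\s+(?<StatsCode>\S+)\s+(?<ReqTime>\S+)"
    }
  }
}

output {
  stdout { codec => rubydebug }    # print the parsed event with all its fields
}
```

Running `bin/logstash -f test.conf` and pasting the sample line should print an event containing the named fields. If the pattern does not match, grok tags the event with `_grokparsefailure`, which makes a failed match easy to spot.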
If you are testing a solution, you can always use this site:
I made this grok pattern for your question:
%{INT:SessionID}\s*%{INT:HitNumber}\s*%{NOTSPACE:FactValue}\s*%{NOTSPACE:DimValue1}\s*%{GREEDYDATA:ErrMessage}\s*%{IP:IP}\s*%{NOTSPACE:TLT_URL}\s*%{NOTSPACE:URL}\s*%{NOTSPACE:Referer}\s*%{NOTSPACE:Method}\s*%{INT:StatsCode}\s*%{TIMESTAMP_ISO8601:ReqTime}
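Since the question says the fields are tab-separated, the pattern can also be anchored on the tabs themselves, which keeps the free-text error message from swallowing neighbouring fields. The following is a sketch under a few assumptions: the separators really are single tab characters, the error message itself never contains a tab, and the field names follow the target list in the question (including its spelling of Refferer). The `:int` suffix is grok's built-in conversion and is optional.

```
filter {
  grok {
    match => {
      # \t is interpreted by the regex engine as a tab; [^\t]* means "everything up to the next tab"
      "message" => "^%{INT:SessionID:int}\t%{INT:HitNumber:int}\t%{NOTSPACE:FactValue}\t%{NOTSPACE:DimValue1}\t(?<ErrMessage>[^\t]*)\t%{IP:IP}\t%{NOTSPACE:TLT_URL}\t%{NOTSPACE:URL}\t%{NOTSPACE:Refferer}\t%{NOTSPACE:Method}\t%{INT:StatsCode:int}\t%{TIMESTAMP_ISO8601:ReqTime}$"
    }
  }
}
```

Anchoring with ^ and $ makes a line with missing or extra fields fail the match outright instead of producing shifted values, so such lines end up tagged `_grokparsefailure` where they can be inspected.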