Splunk search regex: filtering timestamp and userId

Question

I want to extract the timestamp together with the UserId from the line below, and then group the results:

    2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] c.t.t.o.s.s.UserPrepaidService       : Validating the user with UserID:1111 systemID:sys111

That line comes from the full log below:

    2020-10-12 12:30:22.538  INFO 1 --- [ener-4] c.t.t.o.s.service.UserService        :    AccountDetails":[{"snumber":"2222","sdetails":[{"sId":"0474889018","sType":"Java","plan":[{"snumber":"sdds22"}]}]}]}
    2020-10-12 12:30:22.538  INFO 1 --- [ener-4] c.t.t.o.s.service.ReceiverService        : Received userType is:Normal
    2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] c.t.t.o.s.s.UserPrepaidService       : Validating the user with UserID:1111 systemID:sys111 
    2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] c.t.t.o.s.util.CommonUtil                : The  Code is valid for userId: 1111 systemId: sys111
    2020-10-12 12:30:22.577  INFO 1 --- [enerContainer-4] c.t.t.o.s.r.Dao        : Saving user into dB ..... with User-ID:1111

.....

(the same lines repeat)

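Below is my SPL search command. It returns only the user IDs from that specific line.

But I also want the timestamp of that line, and I want to group the results with timechart.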

    index="tis" logGroup="/ecs/logsmy" "logEvents{}.message"="*Validating the user with UserID*" | spath output=myfield path=logEvents{}.message | rex field=myfield "(?<=Validating the user with UserID:)(?<userId>[0-9]+)(?= systemID:)" |  table userId | dedup userId | stats count values(userId) by userId

This is what I have tried:

    (^(?<dtime>\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\.\d+) )(?<=Validating the user with UserID:)(?<userId>[0-9]+)(?= systemID:)

But it gives me all the timestamps, not just the one from the line I mentioned above.

Answer 1

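You placed the lookarounds immediately after the timestamp pattern, but the regex first has to advance to the position where the lookbehind is true. Directly after the timestamp, the text to the left is the timestamp itself, not "Validating the user with UserID:", so the lookbehind cannot succeed there.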
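A minimal Python sketch of that failure, using the sample line from the question. Python's `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, so the pattern is adapted accordingly. The names `LINE`, `TIMESTAMP`, `ORIGINAL` and `text_before_lookbehind` are illustrative only, and Splunk's own regex engine may differ in details.

```python
import re

# Sample line from the question (leading indentation removed).
LINE = (
    "2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] "
    "c.t.t.o.s.s.UserPrepaidService       : "
    "Validating the user with UserID:1111 systemID:sys111"
)

TIMESTAMP = r"\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\.\d+"

# The asker's pattern, with (?<name>...) rewritten as Python's (?P<name>...).
ORIGINAL = re.compile(
    r"(^(?P<dtime>" + TIMESTAMP + r") )"
    r"(?<=Validating the user with UserID:)(?P<userId>[0-9]+)(?= systemID:)"
)


def text_before_lookbehind(line):
    """Return the text that sits to the left of the lookbehind position."""
    prefix = re.match(r"^" + TIMESTAMP + r" ", line)
    needle = "Validating the user with UserID:"
    return line[: prefix.end()][-len(needle):]


# The lookbehind is tested right after the timestamp, where it cannot hold.
original_match = ORIGINAL.search(LINE)
print(original_match)                       # None
print(repr(text_before_lookbehind(LINE)))   # '2020-10-12 12:30:22.540 '
```

The second print shows what the lookbehind actually sees: the timestamp and a space, rather than the "Validating ..." prefix it requires.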

If you need both values, you can match the literal text Validating the user with UserID: and systemID: instead of using lookarounds.

If there are leading whitespace characters, you can match them with \s* or, to exclude line breaks, with [^\S\r\n]*

    ^\s*(?<dtime>\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\.\d+).*\bValidating the user with UserID:(?<userId>[0-9]+) systemID:
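As a rough check outside Splunk, the pattern above can be run against a few of the sample log lines in Python. This is only a sketch: the named groups are again rewritten as `(?P<name>...)`, the names `LOG`, `PATTERN` and `extract` are made up for the example, and the log is matched line by line rather than through `rex`.

```python
import re

# A few lines from the question's log, with their original indentation.
LOG = """\
    2020-10-12 12:30:22.538  INFO 1 --- [ener-4] c.t.t.o.s.service.ReceiverService        : Received userType is:Normal
    2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] c.t.t.o.s.s.UserPrepaidService       : Validating the user with UserID:1111 systemID:sys111
    2020-10-12 12:30:22.540  INFO 1 --- [enerContainer-4] c.t.t.o.s.util.CommonUtil                : The  Code is valid for userId: 1111 systemId: sys111
    2020-10-12 12:30:22.577  INFO 1 --- [enerContainer-4] c.t.t.o.s.r.Dao        : Saving user into dB ..... with User-ID:1111
"""

# The answer's pattern: literal text is matched instead of using lookarounds.
PATTERN = re.compile(
    r"^\s*(?P<dtime>\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\.\d+)"
    r".*\bValidating the user with UserID:(?P<userId>[0-9]+) systemID:"
)


def extract(log_text):
    """Return (timestamp, user id) pairs for the 'Validating' lines only."""
    rows = []
    for line in log_text.splitlines():
        m = PATTERN.match(line)
        if m:
            rows.append((m.group("dtime"), m.group("userId")))
    return rows


rows = extract(LOG)
print(rows)  # [('2020-10-12 12:30:22.540', '1111')]
```

Only the "Validating the user" line yields a pair. The other lines also start with a timestamp, but they lack the literal text that the rest of the pattern requires, so they do not match at all.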

