Sed or awk: find text between two strings, plus an additional identifier
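I want to search a file and extract the data between two strings. I can do this with sed, but I also need it to pull only the lines that carry a specific field value. Example: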
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
---SYSLOG DATA
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
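My sed statement extracts the data between the HOOK_EV and HOOK_EV_OFF strings, but I want it to pull only the data for a specific SID number. At the moment it extracts everything between the two strings, regardless of SID. So in the example above, I only want the data for SID:1630710955 between the HOOK_EV and HOOK_EV_OFF strings.
Can sed do all of this?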
sed -n '/HOOK_EV$/,/HOOK_EV_OFF$/ {/SID:1630710955/p}'
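As a quick check, the sketch below runs that sed command on a small sample built from the lines in the question. The file name `sample.log` and the temporary directory are assumptions for illustration, and a `;` is added before the closing brace because some non-GNU sed implementations require it. Note that the address range includes the HOOK_EV and HOOK_EV_OFF lines themselves, so they are printed too, since they carry the same SID.

```shell
# Build a small sample file (sample.log is a hypothetical name).
sed_tmp=$(mktemp -d)
cat > "$sed_tmp/sample.log" <<'EOF'
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
EOF

# Within the HOOK_EV..HOOK_EV_OFF range, print only lines mentioning the SID.
sed_out=$(sed -n '/HOOK_EV$/,/HOOK_EV_OFF$/ {/SID:1630710955/p;}' "$sed_tmp/sample.log")
printf '%s\n' "$sed_out"
```

On this sample the command prints three lines: the HOOK_EV line, the matching line in between, and the HOOK_EV_OFF line. The last line of the file has the right SID but falls outside the range, so it is skipped.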
awk
One-liner:
awk -v sid=1630710955 '/HOOK_EV_OFF$/{flag=0;next}{if(flag && $0 ~ "SID:"sid){print}}/HOOK_EV$/{flag=1;next}' infile
Explanation:
awk -v sid=1630710955 '/HOOK_EV_OFF$/{flag=0;next} # Final pattern found --> turn off the flag and read the next line
{if(flag && $0 ~ "SID:"sid){print}} # If the flag is set and the line matches the SID pattern, print it
/HOOK_EV$/{flag=1;next} # Initial pattern found --> turn on the flag and read the next line
' infile
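The same flag logic can be wrapped in a small shell function. This is only a sketch: the function name `extract_sid` and the sample file are assumptions, not part of the answer. It also swaps the regex test for a literal `index()` match on the bracketed SID, because `$0 ~ "SID:"sid` is a regex match and SID 1630710955 would also match a longer SID such as 16307109550.

```shell
# extract_sid SID FILE  (hypothetical helper name, for illustration only)
extract_sid() {
    awk -v sid="$1" '
        /HOOK_EV_OFF$/ { flag = 0; next }   # end marker: stop printing
        flag && index($0, "[SID:" sid "]")  # inside a block: literal match on the bracketed SID
        /HOOK_EV$/     { flag = 1; next }   # start marker: print from the next line on
    ' "$2"
}

# Sample input (hypothetical file), including a longer SID that shares the same prefix.
awk_tmp=$(mktemp -d)
cat > "$awk_tmp/sample.log" <<'EOF'
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:16307109550]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
EOF

awk_out=$(extract_sid 1630710955 "$awk_tmp/sample.log")
printf '%s\n' "$awk_out"
```

Unlike the sed range, this prints only the line between the markers, because both marker rules end with `next` before or instead of reaching the print rule.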
For dynamic SID extraction, you can use:
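awk '/HOOK_EV_OFF$/{flag=0;SID="";next} # End marker: clear the flag and the stored SID
flag && $NF==SID # Inside a block: print lines whose last field equals the stored SID
/HOOK_EV$/{flag=1;SID=$(NF-1);next} # Start marker: set the flag and store the SID field
' infile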
With this input file:
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710965]
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710967]
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV_OFF
The output will be:
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710965]
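The dynamic version depends on how awk splits each line on whitespace: on a HOOK_EV line the bracketed SID is the second-to-last field, `$(NF-1)`, while on an ordinary line it is the last field, `$NF`. A small sketch to confirm this, using two lines from the input above (the shell variable names are just for illustration):

```shell
# One start-marker line and one ordinary line from the input file.
hook_line='2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV'
data_line='2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]'

# On the marker line the SID sits just before the trailing HOOK_EV.
hook_sid=$(printf '%s\n' "$hook_line" | awk '{print $(NF-1)}')
# On an ordinary line the SID is the last field.
data_sid=$(printf '%s\n' "$data_line" | awk '{print $NF}')

echo "$hook_sid"
echo "$data_sid"
```

Both commands print the same bracketed token, which is why the string comparison `$NF==SID` selects only the lines belonging to the block's own SID. It also means the approach assumes the SID is always the final field on the lines in between.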