How to write a bash script to filter some data in logs
I have a log file in this format.
-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
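For each name in the log file, I want to extract the person's name and years lived if the years lived is greater than a given value (say, 2). The file will also contain repeated names with different details.
Output: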
name:abc
yearsLived:5
name:xyz
yearsLived:3
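I tried to do this with the grep and cut commands. The problem is that as soon as I run grep or cut, I lose the other part, either the name or the address. How can I solve this?
Here's an attempt (it needs GNU awk, since the three-argument form of match() is a gawk extension):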
awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file
Edit: to accommodate the updated sample log lines and desired output:
awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records
Edit 2: Oops, I meant to add a comment: to change the threshold for the number of years matched, change the second 2 in years[2] > 2. Hope that helps.
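If editing the one-liner each time the threshold changes is inconvenient, the threshold can be passed in with awk's -v option. The following is only a sketch of that idea, not the answerer's code: the function name filter_years and the variable min are made up for illustration, and it reads the file line by line with the POSIX two-argument match() instead of relying on gawk's array form.

```shell
#!/bin/sh
# filter_years MIN FILE: print name and yearsLived for every record whose
# yearsLived value is greater than MIN.
# The function name and the variable "min" are assumptions for this sketch.
filter_years() {
    awk -v min="$1" '
        /^name=/ { name = substr($0, 6) }    # keep the text after "name="
        match($0, /yearsLived:[0-9]+/) {
            # skip the 11 characters of "yearsLived:" to reach the digits
            years = substr($0, RSTART + 11, RLENGTH - 11) + 0
            if (years > min)
                printf "name:%s\nyearsLived:%d\n", name, years
        }
    ' "$2"
}
```

It would be called as filter_years 2 records. Because the name is remembered from the most recent name= line, repeated names are handled record by record.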
Use awk like this:
awk '$0~/^name/{split($0,a,"=")}{if($0~/yearsLived:[3-9]/){split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9]}}' 'my_file'
Breaking down the one-liner:
Create a text file named awkscript and add the following code:
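#!/bin/awk -f
# On every line that starts with "name", split it on "=" into array 'a',
# so that a[2] holds the most recent name.
$0~/^name/{
    split($0,a,"=")
}
# On every line where yearsLived starts with a digit from 3 to 9
# (that is, greater than 2 for single-digit values), print the name and years lived.
{
    if($0~/yearsLived:[3-9]/){
        # Splitting on ":" or "/" puts the yearsLived value in b[9] for this log format.
        split($0,b,":|/")
        print "name:",a[2] "\nyearsLived: "b[9]
    }
}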
Now run the awk script in your shell like this:
awk -f 'awkscript' 'my_file'
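One limitation of the pattern yearsLived:[3-9] is that it only looks at the first digit, so a value such as 12 would be skipped, and b[9] depends on the exact number of fields before yearsLived. A possible variation, sketched under the assumption that the value always follows the literal text yearsLived: and with a made-up function name, finds the field by its label and compares it as a number:

```shell
#!/bin/sh
# years_over_two FILE: same idea as the script above, but the yearsLived
# value is located by its label and compared numerically.
# The function name is hypothetical.
years_over_two() {
    awk '
        /^name=/ { split($0, a, "=") }          # a[2] holds the name
        /yearsLived:/ {
            n = split($0, b, ":|/")             # same separators as the answer
            for (i = 1; i < n; i++)
                if (b[i] ~ /yearsLived$/ && b[i + 1] + 0 > 2)
                    print "name:" a[2] "\nyearsLived:" b[i + 1]
        }
    ' "$1"
}
```

It would be called as years_over_two my_file. The + 0 forces a numeric comparison, so multi-digit values are compared by value rather than by their first character.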