How to write a bash script to filter some data in logs

I have a log file in this format.

-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

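For each name in the log file, if yearsLived is greater than a given number of years (say 2), I want to extract the person's name and the years lived. The file will also contain repeated names with different details.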

Output:

name:abc
yearsLived:5
name:xyz
yearsLived:3

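I tried to do this with the grep and cut commands. The problem I am facing is that as soon as I run grep or cut, I lose the other part, either the name or the address. How can I solve this?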

Here is an attempt (it relies on GNU awk, since the three-argument form of match() is a gawk extension):

awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file

Edit: to accommodate the updated log sample and the desired output:

awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records

Edit 2: Oops, I meant to add a comment: to change the threshold for the number of years to match, change the second 2 in years[2] > 2. Hope this helps.
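
For anyone without GNU awk, here is a minimal sketch of the same idea that should run on any POSIX awk. It takes the threshold as an argument instead of hard-coding it. The function name filter_years is made up for illustration, and the code assumes each name= line is followed by its address= line, as in the sample.

```shell
#!/bin/sh
# Sketch only: filter_years is a hypothetical helper name, not part of the answer above.
# Usage: filter_years MIN_YEARS LOGFILE
filter_years() {
    awk -v min="$1" '
        # Remember the most recent name; it belongs to the address line that follows.
        /^name=/ { name = substr($0, 6) }
        # On an address line, locate "yearsLived:<digits>" with two-argument match().
        /^address=/ && match($0, /yearsLived:[0-9]+/) {
            # 11 is the length of the label "yearsLived:".
            years = substr($0, RSTART + 11, RLENGTH - 11)
            # Adding 0 forces a numeric comparison.
            if (years + 0 > min + 0)
                print "name:" name "\nyearsLived:" years
        }
    ' "$2"
}
```

Calling filter_years 2 records prints a name: and yearsLived: pair for every record above the threshold. Because the comparison is numeric, values of 10 or more are handled as well.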

Likewise, using awk:
awk '$0~/^name/{split($0,a,"=")}{if($0~/yearsLived:([3-9]|[1-9][0-9]+)/){split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9]}}' 'my_file'

Breaking down the one-liner

Create a text file named awkscript and add the following code:

#!/bin/awk -f
# Remember the name from each "name=..." line in array a (a[2] holds the value).
$0 ~ /^name/ {
    split($0, a, "=")
}
# On lines where yearsLived is greater than 2, print the saved name and the years lived.
{
    if ($0 ~ /yearsLived:([3-9]|[1-9][0-9]+)/) {
        split($0, b, ":|/")    # with the sample layout, b[9] is the yearsLived value
        print "name:", a[2] "\nyearsLived: " b[9]
    }
}

Now run the awk script from your shell like this:

 awk -f 'awkscript'  'my_file'
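
If you would rather stay in the shell itself, as the grep and cut attempt in the question suggests, the rough sketch below carries the name from one line to the next in a read loop. It uses only POSIX features, so it also runs under bash. The function name filter_years_sh is invented for this example, and it assumes the same layout as the sample log.

```shell
#!/bin/sh
# Sketch only: filter_years_sh is a hypothetical name used for illustration.
# Usage: filter_years_sh MIN_YEARS LOGFILE
filter_years_sh() {
    min=$1
    name=
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            name=*)
                name=${line#name=}            # keep the value after "name="
                ;;
            address=*yearsLived:[0-9]*)
                years=${line#*yearsLived:}    # drop everything up to the label
                years=${years%%[!0-9]*}       # keep only the leading digits
                if [ "$years" -gt "$min" ]; then
                    printf 'name:%s\nyearsLived:%s\n' "$name" "$years"
                fi
                ;;
        esac
    done < "$2"
}
```

This keeps the name and the address together because both are handled in the same loop, which is what separate grep and cut calls cannot do. For large logs the awk versions will likely be faster.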