Transform a null field with awk

I have a string like this:

Topic: test1    TopicId: IMjrpzIARVyMPVgxRC1dsA PartitionCount: 1   ReplicationFactor: 2    Configs: message.format.version=2.8-IV1,message.timestamp.type=CreateTime,min.insync.replicas=1
Topic: test2    TopicId: KFkR8ukRQXif1nvjQAcwZA PartitionCount: 1   ReplicationFactor: 1    Configs: cleanup.policy=compact
Topic: d9nvrvth TopicId: 5Ec71za3TV-KznAnG8nV0Q PartitionCount: 6   ReplicationFactor: 3    Configs: message.format.version=2.3-IV1,cleanup.policy=delete,max.message.bytes=2097164,min.compaction.lag.ms=0,message.timestamp.type=CreateTime,min.insync.replicas=2,segment.bytes=104857600,segment.ms=604800000,retention.ms=604800000,message.timestamp.difference.max.ms=9223372036854775807,delete.retention.ms=86400000,retention.bytes=-1

I only want to select 2 fields (cleanup.policy and retention.ms), but sometimes these fields are not present in the string. When a field is missing, I want to set a default value.

I use this awk command:

awk '
            match($0,/Topic:[^\t]*/){
            topic=substr($0,RSTART+6,RLENGTH-6)
            match($0,/retention\.ms[^,]*/)
            retention=substr($0,RSTART+13,RLENGTH-13)
            if ( length(retention == 0) retention = "1 week"
            match($0,/cleanup\.policy[^,]*/)
            clean=substr($0,RSTART+15,RLENGTH-15)
            if ( length(clean == 0) clean = "delete"
            print topic","retention,","clean }'

But the problem is that it always gives me the same values.

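Some issues with OP's current awk code:

  • no attempt is made to capture the values of the retention.ms and cleanup.policy attributes
  • /retention\.ms/ matches both retention.ms and delete.retention.ms, so match() will find whichever one appears first in the Configs: section
  • print is printing the literal strings "retention" and "clean" rather than the contents of the variables retention and clean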
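The second point is easy to reproduce. The sketch below is only an illustration: the input line is made up (it puts delete.retention.ms first), and the anchored pattern, which requires a space or comma before the attribute name, is one possible workaround rather than something taken from the OP's code.

```shell
# Hypothetical input line: delete.retention.ms appears before retention.ms
line='Topic: demo Configs: delete.retention.ms=86400000,retention.ms=604800000'

# Unanchored regex: match() reports the leftmost hit, which lies inside delete.retention.ms
loose=$(printf '%s\n' "$line" | awk '{
    if (match($0, /retention\.ms=[^,]*/))
        print substr($0, RSTART + 13, RLENGTH - 13)   # skip the 13 chars of "retention.ms="
}')

# Anchored regex (assumption): the attribute name must follow a space or a comma
strict=$(printf '%s\n' "$line" | awk '{
    if (match($0, /[ ,]retention\.ms=[^,]*/))
        print substr($0, RSTART + 14, RLENGTH - 14)   # skip the delimiter plus "retention.ms="
}')

echo "loose=$loose strict=$strict"
```

On this line the unanchored pattern returns the value of delete.retention.ms, while the anchored one returns the value of retention.ms.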

One awk idea:

awk '
$1 == "Topic:" { topic=$2
                 retention="1 week"                    # set default value
                 clean="delete"                        # set default value

                 n=split($NF,a,/[,=]/)                 # split last field on dual delimiters "," and "=";
                                                       # odd indexed entries are attributes, even indexed entries are values

                 for (i=1;i<=n;i+=2) {                 # loop through list of attributes
                     if (a[i]=="retention.ms")         # if we have an attribute match then ...
                        retention=a[i+1]               # save value
                     if (a[i]=="cleanup.policy")       # if we have an attribute match then ...
                        clean=a[i+1]                   # save value
                 }

                 print topic, retention, clean
               }
' topic.dat

This generates:

test1 1 week delete
test2 1 week compact
d9nvrvth 604800000 delete
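
The OP's print statement suggests comma-separated output was the goal. A minimal sketch of that variant, assuming the same split-based approach and setting OFS to a comma; the sample lines are copies of the input with the Configs lists shortened, and the shell variable names sample and csv are only for illustration:

```shell
# Abbreviated copies of the sample input lines (Configs shortened for readability)
sample='Topic: test1 TopicId: IMjrpzIARVyMPVgxRC1dsA PartitionCount: 1 ReplicationFactor: 2 Configs: min.insync.replicas=1
Topic: test2 TopicId: KFkR8ukRQXif1nvjQAcwZA PartitionCount: 1 ReplicationFactor: 1 Configs: cleanup.policy=compact
Topic: d9nvrvth TopicId: 5Ec71za3TV-KznAnG8nV0Q PartitionCount: 6 ReplicationFactor: 3 Configs: cleanup.policy=delete,retention.ms=604800000,delete.retention.ms=86400000'

csv=$(printf '%s\n' "$sample" | awk -v OFS=',' '
$1 == "Topic:" {
    topic     = $2
    retention = "1 week"          # default when retention.ms is absent
    clean     = "delete"          # default when cleanup.policy is absent

    n = split($NF, a, /[,=]/)     # odd entries are attributes, even entries are values
    for (i = 1; i < n; i += 2) {
        if (a[i] == "retention.ms")   retention = a[i+1]
        if (a[i] == "cleanup.policy") clean     = a[i+1]
    }
    print topic, retention, clean # fields are joined with OFS
}')

printf '%s\n' "$csv"
```

Because the comparison is an exact string match on each attribute name, delete.retention.ms is not mistaken for retention.ms here.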