How to grep multiple strings from a file and print them grouped by pattern

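I am trying to get a list of errors from a host log file. Because the file is huge, the search prints a lot of data and it is hard to see which errors are logged and which ones repeat.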

    0x45bae19d6bc0 IO type 16648 (READ) isOrdered:NO isSplit:NO isEncr:NO since 7990 msec status I/O error
        Throttled: 82 IO failed on disk e3d17cdb-3190-9e21-ea45-4cff39420501, Wake up 0x45ba3a34f9c0 with status I/O error
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10432 microseconds to 5392073 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10444 microseconds to 10822733 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 10822733 microseconds to 2163435 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 2163435 microseconds to 426054 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10465 microseconds to 925119 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10469 microseconds to 1904014 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10472 microseconds to 3936215 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10479 microseconds to 8517984 microseconds.
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10490 microseconds to 17358740 microseconds.
        0x45bae0fefe40 IO type 16648 (READ) isOrdered:NO isSplit:NO isEncr:NO since 48543 msec status I/O error
        Throttled: 82 IO failed on disk e3d17cdb-3190-ea45-4cff39420501, Wake up 0x45da36318840 with status I/O error
        naa.5000c500ba661f performance has improved. I/O latency reduced from 17358740 microseconds to 3372968 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 3372968 microseconds to 674458 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10677 microseconds to 1353205 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 1353205 microseconds to 268942 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10682 microseconds to 419051 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10682 microseconds to 872847 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10684 microseconds to 1770518 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10687 microseconds to 3640051 microseconds.
        0x45dae4fe25c0 IO type 16648 (READ) isOrdered:NO isSplit:NO isEncr:NO since 15991 msec status I/O error
        Throttled: 82 IO failed on disk e3d17cdb-3190--ea45-4cff39420501, Wake up 0x45da362677c0 with status I/O error
        0x45dae4fe2340 IO type 16648 (READ) isOrdered:NO isSplit:NO isEncr:NO since 24806 msec status I/O error
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
        cpu10:36926358)MemSchedAdmit: 471: Admission failure in path: vm.36926352/vmmanon.36926352
        cpu23:36926381)MemSchedAdmit: 471: Admission failure in path: vm.36926375/vmmanon.36926375
        Throttled: 82 IO failed on disk e3d17cdb-3190-9e21-ea45-4cff39420501, Wake up 0x45ba3abe8880 with status I/O error
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10696 microseconds to 7557465 microseconds.
        Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10711 microseconds to 15202991 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 15202991 microseconds to 2944264 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 2944264 microseconds to 577176 microseconds.
        naa.5000c500bb7a661f performance has improved. I/O latency reduced from 577176 microseconds to 112712 microseconds.

I am expecting the output below. I have searched in many places but have not found a suitable solution; I hope this can be done with awk or sed.

    egrep -i "latency|I/O error|Failure" error.log

    Failure

    cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
    cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
    cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure
    cpu3:2099278)Migrate: 448: Error reading from pending connection: Failure

    IO Errors

    cpu5:2098752)WARNING: LSOM: RCIOCompletionLoop:93: Throttled: 82 IO failed on disk e3d17cdb-3190-9e21-ea45-4cff39420501, Wake up 0x45da362677c0 with status I/O error
    cpu6:2097866)LSOMCommon: IORETRYCompleteIO:470: Throttled: 0x45dae4fe2340 IO type 16648 (READ) isOrdered:NO isSplit:NO isEncr:NO since 24806 msec status I/O error
    cpu2:2098752)WARNING: LSOM: RCIOCompletionLoop:93: Throttled: 82 IO failed on disk e3d17cdb-3190-9e21-ea45-4cff39420501, Wake up 0x45ba3abe8880 with status I/O error
    cpu9:2099365 opID=add9908b)WARNING: ScsiDeviceIO: 12028: READ CAPACITY on device “naa.5000c500bb7a661f” from Plugin “HPP” failed. I/O error

    LAtency

    cpu5:2097866)WARNING: ScsiDeviceIO: 1596: Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10682 microseconds to 419051 microseconds.
    cpu19:2097867)WARNING: ScsiDeviceIO: 1596: Device naa.5000c500bb7a661f performance has deteriorated. I/O latency increased from average value of 10682 microseconds to 872847 microseconds

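Assumptions:

  • If multiple patterns match a line, we'll show that line in each matching output group
  • Group headings are an exact reprint of the search patterns (i.e., group headers are not reformatted the way the question does it, where the search pattern I/O error becomes the group header IO Errors)
  • There is no need to match only whole words (e.g., failure will match failure, failures, nonfailures, stufffailuresXYZ)
  • Within an output group we want to maintain the input ordering of the lines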
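
To make the third assumption concrete, here is a quick check. It is only a sketch: the sample words and the helper name count_matches are made up for illustration and are not taken from the log. It counts how many of five words contain the given string when matched case-insensitively as a substring, the same way the egrep -i in the question matches:

```shell
# count_matches STRING
# Hypothetical helper: counts how many of five made-up sample words
# contain STRING as a case-insensitive substring.
count_matches() {
    printf '%s\n' failure Failures nonfailures stufffailuresXYZ fail |
        grep -i -c -F -- "$1"
}

count_matches failure    # prints 4; only the last word, "fail", is not counted
```

If whole-word matching were wanted instead, grep -w would be the usual way to get it, but the rest of this answer assumes substring matching.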

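The question's current input and expected output do not match, so until that is resolved we'll use a small(er) set of input data for demonstration purposes: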

$ cat test.log
you can ignore this line
you should match this line on abcLaTeNcYxyz
yeah, match this line on Failures and throttled
you can ignore this line
more matches for i/o error and latency
single match on I/O error
couple more matches on failures
couple more matches on failure
ignore this line, too

Adding a non-matching string (no-match) to the mix:

$ patterns='latency|I/O error|Failure|throttled|no-match'

One GNU awk idea (GNU awk is needed for arrays of arrays and PROCINFO["sorted_in"]):

awk -v plist="${patterns}" '
BEGIN   { IGNORECASE=1
          delete groups

          n=split(plist,arr,"|")                     # break plist up into components
          for (i=1;i<=n;i++) {
              ptns[arr[i]]                           # assign as indices of ptns[] array for easier processing
              groups[arr[i]][0]                      # place holder to allow us to print an empty group
          }
        }

        { for (ptn in ptns)                          # loop through list of patterns and ...
              if ($0 ~ ptn)                          # if found then ...
                 groups[ptn][c++]=$0                 # save in groups[] array
        }

END     { PROCINFO["sorted_in"]="@ind_str_asc"
          for (ptn in ptns) {
              printf "\n######### %s\n\n", ptn
              PROCINFO["sorted_in"]="@ind_num_asc"   # sort the c++ values in ascending order => maintain input ordering
              for (i in groups[ptn])
                  if (groups[ptn][i] != "")
                     print groups[ptn][i]
          }
        }
' test.log

This generates:

######### Failure

yeah, match this line on Failures and throttled
couple more matches on failures
couple more matches on failure

######### I/O error

more matches for i/o error and latency
single match on I/O error

######### latency

you should match this line on abcLaTeNcYxyz
more matches for i/o error and latency

######### no-match


######### throttled

yeah, match this line on Failures and throttled
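
If GNU awk is not available, roughly the same grouping can be sketched in plain awk. This is a variation under the same assumptions, not the approach above: the wrapper name print_groups is made up for illustration, patterns are treated as literal strings rather than regular expressions, and groups are printed in the order they appear in the pattern list instead of being sorted.

```shell
# print_groups PATTERNS FILE
# Hypothetical wrapper: PATTERNS is a '|'-separated list of literal strings.
print_groups() {
    awk -v plist="$1" '
    BEGIN { n = split(plist, ptn, "|") }          # break the list into ptn[1..n]

          { line = tolower($0)                    # case-insensitive comparison
            for (i = 1; i <= n; i++)
                if (index(line, tolower(ptn[i]))) # literal substring test
                    out[i] = out[i] $0 "\n"       # append => input order is kept
          }

    END   { for (i = 1; i <= n; i++)              # groups in pattern-list order
                printf "\n######### %s\n\n%s", ptn[i], out[i]
          }
    ' "$2"
}
```

It would be called as print_groups "$patterns" test.log. Like the GNU awk version, it reads the file once and holds every matching line in memory until the end.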