Parsing stdout chunks into arrays in bash or ruby

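I'm trying to find the most efficient way to turn racadm (Dell chassis/iDRAC) stdout log entries into individual arrays or a JSON array, so that I can evaluate each entry one at a time. The output always has the same fields. The output below is very typical: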

$ racadm chassislog view -c Storage -b PDR
SeqNumber       = 11700
Message ID      = PDR17
Category        = Storage
AgentID         = CMC
Severity        = Information
Timestamp       = 2020-03-21 00:02:06
Message Arg   1 = Physical Disk 0:0:15
FQDD            = Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1
Message         = Global hot spare assigned to Physical Disk 0:0:15.
--------------------------------------------------------------------------------
SeqNumber       = 11699
Message ID      = PDR26
Category        = Storage
AgentID         = CMC
Severity        = Information
Timestamp       = 2020-03-21 00:02:04
Message Arg   1 = Physical Disk 0:0:3
FQDD            = Disk.Bay.3:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1
Message         = Physical Disk 0:0:3 is online.
--------------------------------------------------------------------------------
SeqNumber       = 11696
Message ID      = PDR71
Category        = Storage
AgentID         = CMC
Severity        = Information
Timestamp       = 2020-03-21 00:02:01
Message Arg   1 = Physical Disk 0:0:15
Message Arg   2 = Physical Disk 0:0:3
FQDD            = Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1
Message         = Copyback completed from Physical Disk 0:0:15 to Physical Disk 0:0:3.
--------------------------------------------------------------------------------
SeqNumber       = 11670
Message ID      = PDR70
Category        = Storage
AgentID         = CMC
Severity        = Information
Timestamp       = 2020-03-20 21:45:47
Message Arg   1 = Physical Disk 0:0:15
Message Arg   2 = Physical Disk 0:0:3
FQDD            = Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1
Message         = Copyback started from Physical Disk 0:0:15 to Physical Disk 0:0:3.
--------------------------------------------------------------------------------
SeqNumber       = 11667
Message ID      = PDR8
Category        = Storage
AgentID         = CMC
Severity        = Information
Timestamp       = 2020-03-20 21:45:44
Message Arg   1 = Physical Disk 0:0:3
FQDD            = Disk.Bay.3:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1
Message         = Physical Disk 0:0:3 is inserted.
--------------------------------------------------------------------------------

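I would really like to read the whole output into an associative array so that I can step through each entry in a for loop over the events. I'm looking for guidance in ruby (chef) or bash.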

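Not bash, since a shell is for handling files and starting commands, but GNU awk, which is often mistaken for a part of the shell and is a simple yet powerful programming language. "Step through each entry in a for loop of events" isn't really a requirement, so here is a small sample: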

$ gawk -v item="Message Arg   2" '  # queried item as parameter 
BEGIN {
    RS="\n-+$\n"                    # record is separated by a bunch of -:s
    FS="\n"                         # a line is a field within a record
}
{
    for(nf=1;nf<=NF;nf++) {         # loop all lines in a record
        split($nf,t,/ *= */)        # split lines by = and surrounding space
        a[NR][t[1]]=t[2]            # hash to a 2 dimensional array indexed by
    }                               # record no. and the item, value as value
}
END {                               # after lines are hashed, make queries
    for(nr in a)                    # for each record in hash
        if(item in a[nr])           # if queried item is found in it
            printf "%d: %s = %s\n", nr,item,a[nr][item]  # output
}' file

Output for the queried item Message Arg 2:

3: Message Arg   2 = Physical Disk 0:0:3
4: Message Arg   2 = Physical Disk 0:0:3

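Here is an alternative END block for the case "when a condition I'm looking for matches in Message, I'd like to reference the corresponding FQDD":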

$ gawk -v item=Message -v cond=started -v output=FQDD '
BEGIN {
    RS="\n-+$\n"                    # record is separated by a bunch of -:s
    FS="\n"                         # a line is a field within a record
}
{
    for(nf=1;nf<=NF;nf++) {         # loop all lines in a record
        split($nf,t,/ *= */)        # split lines by = and surrounding space
        a[NR][t[1]]=t[2]            # hash to a 2 dimensional array indexed by
    }                               # record no. and the item, value as value
}
END {
    for(nr in a)                    # for each record in hash
        if((item in a[nr]) && a[nr][item]~cond)  # item exists and matches cond
            printf "%d: %s = %s\n", nr,output,a[nr][output]  # print output field
}' file

Output now:

4: FQDD = Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1

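I.e., if the variable item is found in a[nr] and the value of that array element matches cond, print the value of a[nr][output] (here a[nr]["FQDD"]) from the same record. In SQL that would be SELECT output FROM file WHERE item LIKE '%cond%'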
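For comparison, the same SELECT ... WHERE idea can be sketched in Python. This is only an illustration, not part of the awk answer: the function name select and the two hand-copied records are assumptions made for the example, and the records are taken to be already parsed into dictionaries.

```python
import re

def select(records, item, cond, output):
    """Return (record number, value of `output`) for records whose `item` matches `cond`."""
    hits = []
    for nr, rec in enumerate(records, start=1):   # number records from 1, like NR in awk
        if item in rec and re.search(cond, rec[item]):
            hits.append((nr, rec.get(output)))
    return hits

# Two records copied by hand from the sample log (illustration only).
records = [
    {"Message": "Copyback completed from Physical Disk 0:0:15 to Physical Disk 0:0:3.",
     "FQDD": "Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1"},
    {"Message": "Copyback started from Physical Disk 0:0:15 to Physical Disk 0:0:3.",
     "FQDD": "Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1"},
]

for nr, value in select(records, "Message", "started", "FQDD"):
    print("%d: FQDD = %s" % (nr, value))
```

Like the awk version, cond is treated as a regular expression rather than a literal substring.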

This perl one-liner converts input like the above into an array of JSON objects, which you can then process in any JSON-aware tool.

racadm chassislog view -c Storage -b PDR | \
perl -MJSON::PP -lne 'if (/([^=]*?)\s*=\s*(.*)/) { $obj{$1} = $2 }
                      elsif (/^-+$/) { push @records, { %obj }; undef %obj }
                      END { push @records, { %obj } if %obj;
                            print encode_json(\@records) }'

Output (after pretty-printing):

[
  {
    "Timestamp": "2020-03-21 00:02:06",
    "Message ID": "PDR17",
    "Category": "Storage",
    "Message": "Global hot spare assigned to Physical Disk 0:0:15.",
    "AgentID": "CMC",
    "Severity": "Information",
    "SeqNumber": "11700",
    "FQDD": "Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1",
    "Message Arg   1": "Physical Disk 0:0:15"
  },
  {
    "Category": "Storage",
    "Message ID": "PDR26",
    "Timestamp": "2020-03-21 00:02:04",
    "SeqNumber": "11699",
    "Message": "Physical Disk 0:0:3 is online.",
    "Severity": "Information",
    "AgentID": "CMC",
    "Message Arg   1": "Physical Disk 0:0:3",
    "FQDD": "Disk.Bay.3:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1"
  },
  {
    "FQDD": "Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1",
    "Message Arg   2": "Physical Disk 0:0:3",
    "Message Arg   1": "Physical Disk 0:0:15",
    "Severity": "Information",
    "AgentID": "CMC",
    "Message": "Copyback completed from Physical Disk 0:0:15 to Physical Disk 0:0:3.",
    "SeqNumber": "11696",
    "Timestamp": "2020-03-21 00:02:01",
    "Category": "Storage",
    "Message ID": "PDR71"
  },
  {
    "Message Arg   1": "Physical Disk 0:0:15",
    "FQDD": "Disk.Bay.15:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1",
    "Message Arg   2": "Physical Disk 0:0:3",
    "SeqNumber": "11670",
    "Message": "Copyback started from Physical Disk 0:0:15 to Physical Disk 0:0:3.",
    "Severity": "Information",
    "AgentID": "CMC",
    "Category": "Storage",
    "Message ID": "PDR70",
    "Timestamp": "2020-03-20 21:45:47"
  },
  {
    "Timestamp": "2020-03-20 21:45:44",
    "Message ID": "PDR8",
    "Category": "Storage",
    "Message": "Physical Disk 0:0:3 is inserted.",
    "AgentID": "CMC",
    "Severity": "Information",
    "SeqNumber": "11667",
    "FQDD": "Disk.Bay.3:Enclosure.Internal.0-0:RAID.ChassisIntegrated.1-1",
    "Message Arg   1": "Physical Disk 0:0:3"
  }
]
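As one possible way to process that JSON further, here is a minimal Python sketch. The helper name load_events and the shortened sample string are assumptions for illustration; in practice the text would come from the pipeline above. Note that every value in the JSON is a string, so SeqNumber is converted to an integer before sorting.

```python
import json

def load_events(json_text):
    """Parse the JSON array and return the events ordered by sequence number."""
    events = json.loads(json_text)
    # SeqNumber arrives as a string, so convert it before comparing.
    return sorted(events, key=lambda event: int(event["SeqNumber"]))

# Shortened stand-in for the one-liner's output (illustration only).
sample = ('[{"SeqNumber": "11700", "Message ID": "PDR17"},'
          ' {"SeqNumber": "11699", "Message ID": "PDR26"}]')

for event in load_events(sample):
    print("%s %s" % (event["SeqNumber"], event["Message ID"]))
```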

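Using Shawn's one-liner as a pattern, a coworker eventually found a Python 2.7-compatible way to do what we wanted. The code is below and provides exactly the functionality I needed.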

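import re
import sys
from pprint import pprint

# Assumption: the racadm output is piped in on stdin; the original snippet
# expected it in a string variable named `test`.
test = sys.stdin.read()

regex1 = re.compile(r'([^=]*?)\s*=\s*(.*)')   # "key = value" line
regex2 = re.compile(r'^-+$')                  # separator line of dashes
current_entry = {}
entries = []
for line in test.split('\n'):
    match = regex1.match(line)
    if match:
        # use the captured groups so a '=' inside the value is kept intact
        key, value = match.group(1), match.group(2).strip()
        current_entry[key] = value
    elif regex2.match(line):
        entries.append(current_entry)
        current_entry = {}
if current_entry:   # keep a final entry that has no trailing separator
    entries.append(current_entry)
pprint(entries)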
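Once the entries are in a list of dictionaries, stepping through them one at a time is an ordinary for loop. The sketch below is only an illustration: the helper name describe is hypothetical, the two entries are copied by hand from the sample log, and the timestamp format is inferred from that sample.

```python
from datetime import datetime

# Two hand-copied entries standing in for the parsed list (illustration only).
entries = [
    {"SeqNumber": "11670", "Message ID": "PDR70",
     "Timestamp": "2020-03-20 21:45:47",
     "Message": "Copyback started from Physical Disk 0:0:15 to Physical Disk 0:0:3."},
    {"SeqNumber": "11667", "Message ID": "PDR8",
     "Timestamp": "2020-03-20 21:45:44",
     "Message": "Physical Disk 0:0:3 is inserted."},
]

def describe(entry):
    """Turn one entry into (sequence number, timestamp, message ID)."""
    # Format inferred from the sample output, e.g. "2020-03-20 21:45:47".
    when = datetime.strptime(entry["Timestamp"], "%Y-%m-%d %H:%M:%S")
    return int(entry["SeqNumber"]), when, entry["Message ID"]

for entry in entries:
    seq, when, message_id = describe(entry)
    print("%d %s %s" % (seq, when.isoformat(), message_id))
```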