Identify line in hierarchical file bash script

Question

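I need to identify the header of a list of configuration entries in a file. The header is unpredictable and can be any string, but it is always the line that starts further to the left than the other lines (not counting "exit"):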

Here is an example:

    vpls 2662 customer 1 v-vpls vlan 2662 create
        description "RES_2662"
        mac-move
            allow-res-res
            allow-reg-res
        exit
        stp
            shutdown
        exit
        ingress
            qos 2
        exit
        sap lt:1/1/1:2662 create
            description "RES_2662"
            enable-stats
            no shutdown
        exit
        sap lag-1:2662 create
            no shutdown
        exit
        no shutdown
    exit
    vpls 2663 customer 1 v-vpls vlan 2663 create
        description "RES_2663"
        mac-move
            allow-res-res
            allow-reg-res
        exit
        stp
            shutdown
        exit
        ingress
            qos 2
        exit
        sap lt:1/1/1:2663 create
            description "RES_2663"
            enable-stats
            no shutdown
        exit
        sap lag-1:2663 create
            no shutdown

In this case, I need to be able to identify the two lines that start with: vpls 266X customer 1 v-vpls vlan 266X create. The script should know that these are the lines I am looking for.

The output does not always have spaces on the left, as in this example:

port vlan-port:1/1/1/3/7/4/4:824
  admin-up
  severity no-value
exit
port vlan-port:1/1/1/3/7/4/4:1224
  admin-up
  severity no-value
exit

In this case, the desired lines are: port vlan-port:x/x/x/x/x/x/x/x

I don't know whether this can be done with grep/sed/awk.

Thanks for your help.

Answer 1

I suspect there is a better way to do this, but here is my first idea. You can start with something like this and improve on it.

minl=$(awk '{match($0, /^ */);if (NR==1 || RLENGTH<minl) {minl=RLENGTH}} END{print minl}' test.txt)
sed -n "/^[ ]\{${minl}\}[^ ]/p" test.txt | grep -v "exit"

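The first line uses awk to get the minimum number of leading spaces found on any line of the file.

The second line uses sed to print the lines that start with exactly the number of spaces computed by the first line. I pipe the result through grep -v "exit" to get rid of the exit lines... you may need to be stricter with that check if valid output lines could contain the text "exit".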
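
The stricter "exit" check mentioned above could look like the following sketch. It assumes leading whitespace consists of spaces only, and it treats a line as a closing line only when "exit" is its sole field. The helper name top_level_lines is hypothetical and not part of the original answer.

```shell
# Hypothetical helper: print the least-indented lines of a file,
# skipping blank lines and lines whose only field is "exit".
# Assumption: indentation is made of spaces only (no tabs).
top_level_lines() {
    file=$1
    # Pass 1: smallest count of leading spaces among the candidate lines.
    minl=$(awk 'NF && !(NF == 1 && $1 == "exit") {
                    match($0, /^ */)
                    if (!seen++ || RLENGTH < min) min = RLENGTH
                }
                END { if (seen) print min }' "$file")
    # Nothing to print when the file has no candidate lines at all.
    [ -n "$minl" ] || return 0
    # Pass 2: print candidate lines indented by exactly that many spaces.
    awk -v minl="$minl" 'NF && !(NF == 1 && $1 == "exit") {
                    match($0, /^ */)
                    if (RLENGTH == minl) print
                }' "$file"
}
```

Calling top_level_lines test.txt keeps a line such as "exit-policy x", because only a line consisting of the single word "exit" is filtered out.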

Answer 2

Another possible solution

# gets the number of leading spaces + 1
# (LC_ALL=C forces byte-wise ordering so leading spaces take part in the sort)
n=$(LC_ALL=C sort -r file.txt | sed -nE '1s/(^ *).*/\1/p' | wc -c | tr -d ' ')
# filter the file
grep -vE "^ {$n,}|^ *exit" file.txt

Answer 3

The following will work using any awk in any shell on every Unix box, and it preserves the input line order in the output in case that matters:

$ cat tst.awk
$1 != "exit" {
    match($0,/^ */)
    if ( (min == "") || (RLENGTH <= min) ) {
        min = RLENGTH
        lines[min,++cnt[min]] = $0
    }
}
END {
    for (i=1; i<=cnt[min]; i++) {
        print lines[min,i]
    }
}

$ awk -f tst.awk file
    vpls 2662 customer 1 v-vpls vlan 2662 create
    vpls 2663 customer 1 v-vpls vlan 2663 create

Answer 4

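Assumptions:

leading whitespace consists only of spaces (i.e., no tabs and no non-printing characters)

One awk idea: maintain an array of the lines that have the (current) minimum number of leading spaces, and reset that array whenever we find a line with fewer (i.e., a new minimum number of) leading spaces: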

awk '
BEGIN   { min = 9999999 }

/^$/    { next }                  # skip blank lines

/exit/  { if (NF==1) next }       # skip lines with single field "exit"

        { n = match($0,/[^ ]/)    # find index of first non-space

          if ( n < min ) {        # if a new minimum is found then ...
             delete arr           # delete the array and ...
             i = 1                # reset the array index and ...
             min = n              # reset the min
          }

          if ( n == min )         # if current row matches with "min" then ...
             arr[i++] = $0        # save the row in our array; increment the index
        }

END     { for (j=1;j<i;j++)       # loop through entries in array
             print arr[j]
        }
' file.dat

For the OP's first set of data, this generates:

    vpls 2662 customer 1 v-vpls vlan 2662 create
    vpls 2663 customer 1 v-vpls vlan 2663 create

For the OP's second set of data, this generates:

port vlan-port:1/1/1/3/7/4/4:824
port vlan-port:1/1/1/3/7/4/4:1224
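
If the spaces-only assumption stated at the top of this answer does not hold, one possible workaround is to convert tabs to spaces before measuring the indentation. The sketch below is not part of the original answer: top_level_lines_expanded is a hypothetical helper name, the tab width of 8 is an assumption about the input, and the printed lines contain spaces where the input had tabs.

```shell
# Hypothetical helper: same idea as the awk script in this answer, but the
# input first goes through expand(1) so that each tab becomes spaces.
# Assumption: tab stops every 8 columns.
top_level_lines_expanded() {
    expand -t 8 "$1" | awk '
        NF == 0                 { next }     # skip blank lines
        NF == 1 && $1 == "exit" { next }     # skip lone "exit" lines
        {
            n = match($0, /[^ ]/)            # index of first non-space
            if (min == "" || n < min) {      # new minimum: start over
                min = n
                cnt = 0
            }
            if (n == min) arr[++cnt] = $0    # keep lines at the minimum
        }
        END { for (j = 1; j <= cnt; j++) print arr[j] }
    '
}
```

Resetting the counter instead of deleting the array keeps the script within what older awk implementations accept, since entries beyond the counter are never printed.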

Answer 5

Try this perl and awk combination:

$ perl -ne ' /(^.\s*)/ and !/^\s*exit/ and print length($1), $_ ' fernando.txt | sort -n | awk ' { f=$1; p=NR==1?f:p; sub(/^[0-9]+/,"",$0);if(f==p)  print } '
    vpls 2662 customer 1 v-vpls vlan 2662 create
    vpls 2663 customer 1 v-vpls vlan 2663 create
