Get matching pairs from text file

Question

I have a text file, test1.txt:

1   first_match
2   not_needed_line1
3   not_needed_line2
4   not_needed_line3
5   second_match
6   not_needed_line4
7   not_needed_line5
8   not_needed_line6
9   not_needed_line7
10  not_needed_line8
11  first_match
12  second_match
13  not_needed_line9
14  not_needed_line10
15  not_needed_line11
16  second_match
17  not_needed_line12
18  not_needed_line13
19  second_match
20  not_needed_line14
21  second_match
22  not_needed_line15
23  not_needed_line16
24  first_match
25  not_needed_line17
26  not_needed_line18
27  second_match

I want to extract the pairs of lines containing "first_match" and "second_match", and prepend the file name test1.txt to every line of the result.

In this example, that would be lines:

#1 and #5
#11 and #12
#24 and #27

Note that lines #16, #19 and #21 are not included, because they lack the first line of the pair, "first_match".

I found an awk (GNU Awk 3.1.6) script that extracts all the lines between each pair:

/first_match/{printf FILENAME " - "; f=1} f; /second_match/{f=0}

The result is:

test1.txt - 1   first_match
2   not_needed_line1
3   not_needed_line2
4   not_needed_line3
5   second_match
test1.txt - 11  first_match
12  second_match
test1.txt - 24  first_match
25  not_needed_line17
26  not_needed_line18
27  second_match

Questions:

How do I get only the pairs of lines containing "first_match" and "second_match"?

test1.txt - 1   first_match
test1.txt - 5   second_match
test1.txt - 11  first_match
test1.txt - 12  second_match
test1.txt - 24  first_match
test1.txt - 27  second_match

How do I get only the second line of each pair, "second_match"?

test1.txt - 5   second_match
test1.txt - 12  second_match
test1.txt - 27  second_match

Answer 1

This prints the pairs, then lists the unpaired "second_match" lines under "onlys" at the end of the output:

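gawk '
/first_match/{
  cnt=1        # a first_match is waiting for its second_match
  old=$0       # remember the most recent first_match line
}
/second_match/{
  if (cnt==1){
    print FILENAME, "-", old
    print FILENAME, "-", $0
  }else{
    only[++o]=FILENAME" - "$0   # second_match with no first_match before it
  }
  cnt=0
}
END{
  print "\nonlys"
  for(i=1;i<=o;i++)
    print only[i]
}' test1.txt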
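Answer 1 shows no output, so here is a minimal sketch of what it prints. The shortened sample file, the A/B/C labels and the scratch directory are made up for illustration, and plain `awk` is used on the assumption that nothing in the script is gawk-specific.

```shell
# Hypothetical shortened sample: "second_match B" has no first_match before it.
tmpdir=$(mktemp -d)
printf '%s\n' 'first_match A' 'not_needed' 'second_match A' \
    'second_match B' 'first_match C' 'second_match C' > "$tmpdir/test1.txt"

# Run from inside the scratch directory so FILENAME is just "test1.txt".
result=$(cd "$tmpdir" && awk '
/first_match/{ cnt=1; old=$0 }
/second_match/{
  if (cnt==1){
    print FILENAME, "-", old
    print FILENAME, "-", $0
  }else{
    only[++o]=FILENAME" - "$0
  }
  cnt=0
}
END{
  print "\nonlys"
  for(i=1;i<=o;i++)
    print only[i]
}' test1.txt)

printf '%s\n' "$result"
```

The A and C pairs are printed first, followed by a blank line, the "onlys" heading, and the unpaired "second_match B" line.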

Answer 2

To print both the "first_match" and "second_match" lines of each pair:

awk '
    /first_match/ && !f {print FILENAME, "-", NR, $0; f=1}
    /second_match/ && f {print FILENAME, "-", NR, $0; f=0}
' test1.txt

Output:

test1.txt - 1 first_match
test1.txt - 5 second_match
test1.txt - 11 first_match
test1.txt - 12 second_match
test1.txt - 24 first_match
test1.txt - 27 second_match

To print only the "second_match" line of each pair:

awk '
    /first_match/ && !f {f=1}
    /second_match/ && f {print FILENAME, "-", NR, $0; f=0}
' test1.txt

Output:

test1.txt - 5 second_match
test1.txt - 12 second_match
test1.txt - 27 second_match

[Edit]
As Ed Morton pointed out, the "both" version above prints first_match even when there is no corresponding second_match. Here is a strict version:

awk '
  /first_match/ && !f {l1 = FILENAME " - " NR " " $0; f=1}
  /second_match/ && f {print l1; print FILENAME, "-", NR, $0; f=0}
' test1.txt
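The difference only shows up when a first_match is never closed, which does not happen in the question's sample. A minimal sketch comparing the two versions on a made-up file, test2.txt, whose last first_match has no second_match after it:

```shell
# Hypothetical input: line 3 is a first_match that is never followed by second_match.
tmpdir=$(mktemp -d)
printf '%s\n' first_match second_match first_match not_needed > "$tmpdir/test2.txt"

# Loose version: prints first_match as soon as it is seen.
loose=$(cd "$tmpdir" && awk '
  /first_match/ && !f {print FILENAME, "-", NR, $0; f=1}
  /second_match/ && f {print FILENAME, "-", NR, $0; f=0}
' test2.txt)

# Strict version: holds first_match in l1 until a second_match completes the pair.
strict=$(cd "$tmpdir" && awk '
  /first_match/ && !f {l1 = FILENAME " - " NR " " $0; f=1}
  /second_match/ && f {print l1; print FILENAME, "-", NR, $0; f=0}
' test2.txt)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The loose version prints three lines, including the dangling first_match on line 3. The strict version prints only the complete pair from lines 1 and 2.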
