sed headaches: inserting lines upon singular matches in file (NOT per line)

Question

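After more than eight hours of searching, I give up and am creating a new question for this. The task is simple, but I am struggling to get it working, and I seem to have been through every other solution on SO. I need two things: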

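1.) Insert a line before the line where the FIRST MATCH of "PBS" occurs in the whole file. It should happen only once in the entire file. For some reason, every solution I have tried ends up repeating the insertion for every occurrence in the file; I suspect this is because sed works line by line.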

So this needs to happen. Original file:

stuff here  
stuff here  
PBS -N  
PBS -V  
stuff here

becomes:

stuff here  
stuff here  
**inserted line**  
PBS -N  
PBS -V  
stuff here

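2.) Append a line after the line where the LAST MATCH of "PBS" occurs in the whole file. As before: it should happen only once in the entire file.

So this needs to happen: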

stuff here  
stuff here  
PBS -N  
PBS -V  
stuff here

becomes:

stuff here  
stuff here  
PBS -N  
PBS -V  
**inserted line**  
stuff here

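All the solutions I have seen online (I have about twenty tabs open at this point) suggest this should be relatively easy. I am not ashamed to say that sed is hurting my self-esteem at this point... Thanks to anyone who can help.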

Answer 1

Here are three approaches, two using sed and one using awk.

Using sed alone

To insert once before the first occurrence:

$ sed ':a;$!{N;ba}; s/PBS/inserted line\nPBS/' file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

To insert once after the last occurrence:

$ tac file | sed ':a;$!{N;ba}; s/PBS/inserted line\nPBS/' | tac
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

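How it works

:a;$!{N;ba};

This reads the whole file into the pattern space at once. (If the file is huge, you will want to look at one of the other approaches.)

s/PBS/inserted line\nPBS/

This performs the substitution. Without the g flag, s replaces only the first match in the pattern space, which is why the line is inserted exactly once.

tac

In general, there is no way to know which occurrence of PBS is the last one until we have read the whole file. tac, however, reverses the order of the lines, so the last occurrence becomes the first.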
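
To make the two building blocks above more concrete, here is a small sketch that dumps what the slurp loop leaves in the pattern space and what tac does to the line order. It assumes GNU sed and GNU coreutils, as the commands in this answer do; the three-line input and the variable names are made up for illustration.

```shell
# Assumes GNU sed (labels separated by ';') and GNU tac.

# Slurp three sample lines into the pattern space, then dump it with `l`,
# which shows embedded newlines as \n and marks the end with $.
slurped=$(printf 'a\nb\nc\n' | sed -n ':a;$!{N;ba}; l')
printf '%s\n' "$slurped"      # prints a\nb\nc$

# tac reverses the order of the lines, so the last line comes out first.
reversed=$(printf 'a\nb\nc\n' | tac)
printf '%s\n' "$reversed"
```

Because the whole file ends up as one string, a single s command without the g flag can only ever touch one occurrence.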

Using awk

The main advantage of awk is that it makes variables easy to use. Here, we create a flag f that is set to true once we reach the first occurrence of PBS:

$ awk '/PBS/ && !f {print "inserted line"; f=1} 1'  file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

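To insert after the last occurrence, we could use the tac solution above. For variety, this approach reads the file twice. On the first pass, it keeps track of the line number of the last PBS. On the second, it prints what needs printing: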

$ awk 'NR==FNR{if (/PBS/)n=FNR;next} 1{print} n==FNR {print "inserted line"}'  file file
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

These awk solutions process the file one line at a time. If the file is very large, this helps limit memory use.
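
The two awk one-liners above could be wrapped into reusable shell functions along these lines. This is only a sketch: the function names insert_before_first and insert_after_last, the variable names, and the sample file are all hypothetical, and the pattern is passed with awk -v, which interprets backslash escapes in the value.

```shell
# insert_before_first PATTERN TEXT FILE   (hypothetical helper)
insert_before_first() {
    awk -v pat="$1" -v text="$2" '
        $0 ~ pat && !seen { print text; seen = 1 }   # fires once, on the first match
        { print }                                    # pass every line through
    ' "$3"
}

# insert_after_last PATTERN TEXT FILE   (hypothetical helper, reads FILE twice)
insert_after_last() {
    awk -v pat="$1" -v text="$2" '
        NR == FNR { if ($0 ~ pat) last = FNR; next } # pass 1: remember the last match
        { print }                                    # pass 2: print every line
        FNR == last { print text }                   # and add the text after the last match
    ' "$3" "$3"
}

# Try both helpers on a small sample file.
sample=$(mktemp)
printf 'stuff here\nPBS -N\nPBS -V\nstuff here\n' > "$sample"
first_out=$(insert_before_first 'PBS' 'inserted line' "$sample")
last_out=$(insert_after_last 'PBS' 'inserted line' "$sample")
printf '%s\n' "$first_out" '---' "$last_out"
rm -f "$sample"
```

If the pattern never matches, both functions simply print the file unchanged.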

Using grep and sed

Another approach is to use grep to tell us the line number we need to act on. To insert before the first occurrence:

$ sed "$(grep -n PBS file | cut -d: -f1 | head -n1)"' s/PBS/inserted line\nPBS/' file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

This inserts after the last one:

$ sed  "$(grep -n PBS file | cut -d: -f1 | tail -n1)"' s/.*PBS.*/&\ninserted line/' file
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

This approach does not require reading the whole file into memory at once.
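
To see what the command substitution hands to sed, the sketch below prints the intermediate line numbers for the sample file from the question before using one of them as a sed address. The variable names and the guard against an empty result are additions for illustration; the \n in the replacement assumes GNU sed.

```shell
sample=$(mktemp)
printf 'stuff here\nstuff here\nPBS -N\nPBS -V\nstuff here\n' > "$sample"

# grep -n prefixes each matching line with "N:", and cut keeps only N.
first=$(grep -n PBS "$sample" | cut -d: -f1 | head -n1)
last=$(grep -n PBS "$sample" | cut -d: -f1 | tail -n1)
printf 'first=%s last=%s\n' "$first" "$last"      # prints first=3 last=4

# Guard: with no match the address would be empty and the s command
# would be tried on every line instead of a single one.
if [ -n "$last" ]; then
    result=$(sed "${last}"'s/.*PBS.*/&\ninserted line/' "$sample")
    printf '%s\n' "$result"
fi
rm -f "$sample"
```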

Answer 2

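@John1924's answer is good. In this case you can also do the task in a less efficient way, for example:

print only the lines before the first PBS
echo the line to insert
print only the lines from the first PBS onward (inclusive)

For example, when ./pbsfile contains the following: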

line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

the steps above can be done with:

pbsfile="./pbsfile"

(
# delete everything from the 1st PBS line to the end,
# so only the lines before the 1st PBS remain
sed  '/PBS/,$d' "$pbsfile"

# echo the needed line
echo "THIS SHOULD BE INSERTED BEFORE 1st PBS"

# print only the lines from the 1st PBS line to the end
sed -n '/PBS/,$p' "$pbsfile"

)

which produces:

line 1
line 2
THIS SHOULD BE INSERTED BEFORE 1st PBS
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

The same works for the last PBS: just reverse the file before and after sed, for example (tail -r is the BSD/macOS way to reverse lines; on GNU systems tac does the same):

pbsfile="./pbsfile"

(
tail -r "$pbsfile" | sed -n '/PBS/,$p' | tail -r
echo "THIS SHOULD BE INSERTED AFTER THE LAST PBS"
tail -r "$pbsfile" | sed  '/PBS/,$d' | tail -r
)

which produces:

line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
THIS SHOULD BE INSERTED AFTER THE LAST PBS
line 4
line 5

Again, this is offered only as an "alternative solution" (not an efficient one).

Answer 3

Another sed approach:

sed '/PBS/ {
  # insert the new line
  i\
inserted line
  # then loop over the rest of the file, implicitly printing each line
  :a; n; ba
}' file
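
The effect of the :a; n; ba loop can be checked by running the same insertion with and without it. The sketch below is an illustration only: the sample file and variable names are made up, the commands are written one per line, and the behaviour of n at end of input (print the pattern space and exit) is that of GNU sed.

```shell
sample=$(mktemp)
printf 'stuff here\nPBS -N\nPBS -V\nstuff here\n' > "$sample"

# Without the loop, the i command runs on every matching line.
per_match=$(sed '/PBS/i\
inserted line' "$sample")

# With the loop, n prints the current line and fetches the next one without
# restarting the script, so /PBS/ is never tested again after the first hit.
once=$(sed '/PBS/{
i\
inserted line
:a
n
ba
}' "$sample")

printf '%s\n' "$per_match" | grep -c 'inserted line'   # prints 2
printf '%s\n' "$once" | grep -c 'inserted line'        # prints 1
rm -f "$sample"
```

The first command reproduces the duplicated insertion described in the question; the second inserts once.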

For the last match, this version does not need tac:

sed '
  # read the whole file into pattern space
  :a; $!{N;ba}
  # then use greedy matching (.*) to get to the *last* PBS,
  # and [^\n]* to extend the match to the end of that line.
  s/.*PBS[^\n]*/&\ninserted line/
' file

Answer 4

sed is the wrong tool for this kind of job; it is meant for simple substitutions on individual lines. Just use awk:

$ cat tst.awk
NR  == FNR { if (/PBS/) hits[++numHits] = NR; next }
FNR == hits[1] { print "inserted line before" }
{ print }
FNR == hits[numHits] { print "inserted line after" }

$ awk -f tst.awk file file
stuff here
stuff here
inserted line before
PBS -N
PBS -V
inserted line after
stuff here

Answer 5

Here is an awk that reads the file only once:

cat file
line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

awk '/PBS/ {last=NR;if (!f) {first=NR;f=1}} {a[NR]=$0} END {for (i=1;i<=NR;i++) {if (i==first) a[i]="new line before\n"a[i];if (i==last) a[i]=a[i]"\nnew line after";print a[i]}}' file
line 1
line 2
new line before
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
new line after
line 4
line 5

How it works:

awk '                                       # Start
/PBS/ {                                     # Does the line contain "PBS"?
    last=NR                                 # Set last to the current line number
    if (!f) {                               # Is flag "f" still false?
        first=NR                            # Yes, set first to the current line number
        f=1}}                               # and set flag "f"
    {
    a[NR]=$0}                               # Store every line in array "a"
END {
    for (i=1;i<=NR;i++) {                   # Loop through all lines
        if (i==first)                       # Is the line number equal to the first hit?
            a[i]="new line before\n"a[i]    # Add data before the line
        if (i==last)                        # Is the line number equal to the last hit?
            a[i]=a[i]"\nnew line after"     # Add data after the line
        print a[i]}}                        # Print the line
' file
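
The script above keeps every line in the array a until END. If that is a concern, a single-pass variant could hold back only the lines seen since the most recent match. The sketch below is a different technique from the one in this answer, not a rewrite of it; the function name insert_around and its argument order are hypothetical.

```shell
# insert_around PATTERN BEFORE AFTER FILE   (hypothetical helper)
insert_around() {
    awk -v pat="$1" -v before="$2" -v after="$3" '
        $0 ~ pat {
            if (!seen) { print before; seen = 1 }  # first match: insert before it
            printf "%s", held                      # held lines precede this match, release them
            held = ""
            print
            next
        }
        seen { held = held $0 "\n"; next }         # after a match: hold until we know more
        { print }                                  # before the first match: pass through
        END {
            if (seen) print after                  # goes right after the last matching line
            printf "%s", held                      # then the lines that followed it
        }
    ' "$4"
}

sample=$(mktemp)
printf 'line 1\nPBS -N first\nline 3\nPBS -V last\nline 4\n' > "$sample"
around_out=$(insert_around 'PBS' 'new line before' 'new line after' "$sample")
printf '%s\n' "$around_out"
rm -f "$sample"
```

Memory use is bounded by the longest run of lines that follows a match, which in the worst case is still most of the file.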

Answer 6

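To get sed to do this properly you have to get around its line-at-a-time operation and then reimplement it with raw regexes. Not hard, just a bit fiddly: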

sed -E 'H;$!d;g
        s/\n[^\n]*PBS/\ninsert before first PBS-containing line&/
        s/.*PBS[^\n]*/&\ninsert after last PBS-containing line/;
        s/.//
'

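H;$!d;g slurps the whole file into the hold buffer with an extra newline in front: H appends the current line to the hold buffer preceded by a \n, and $!d deletes the line if it is not the last one, so g (and everything after it) runs only on the last line and retrieves the hold buffer.

So s/\n[^\n]*PBS finds the newline before the first PBS-containing line, because there is now a newline before every line; s/.*PBS[^\n]*/ finds everything up to the last PBS plus the rest of that line, stopping before the next newline; and s/.// strips the artificial newline we stuck at the front to make the first-occurrence search work.

Note that you can make the first-occurrence insertion work for an arbitrary n by appending the number as a flag to the substitution: s/\n[^\n]*PBS/\netc&/4 for the fourth occurrence.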
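
A small sketch of the two points above: what the pattern space looks like after H;$!d;g, and the numeric flag selecting the second PBS-containing line. It assumes GNU sed (where . matches a newline and \n is accepted in the replacement); the three-line input and the variable names are made up.

```shell
# Dump the pattern space after H;$!d;g with `l`: note the leading \n.
held=$(printf 'PBS 1\nPBS 2\nPBS 3\n' | sed -n 'H;$!d;g;l')
printf '%s\n' "$held"         # prints \nPBS 1\nPBS 2\nPBS 3$

# Insert before the second PBS-containing line using the numeric flag 2,
# then strip the artificial leading newline.
second=$(printf 'PBS 1\nPBS 2\nPBS 3\n' | sed -E 'H;$!d;g
    s/\n[^\n]*PBS/\ninserted before 2nd&/2
    s/.//')
printf '%s\n' "$second"
```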
