How can I delete the last word in the current line, but only if a pattern occurs on the next line?

Question

The file contains:

some line DELETE_ME
some line this_is_the_pattern

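If this_is_the_pattern occurs on the next line, delete the last word of the current line (DELETE_ME in this example).

How can I do this with sed or awk? My understanding is that sed is better suited to this task than awk, because awk is meant for operating on data stored in a tabular format. Please let me know if my understanding is incorrect.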

Answer 1

$ awk '/this_is_the_pattern/{sub(/[^[:space:]]+$/, "", last)} NR>1{print last} {last=$0} END{print last}' file
some line
some line this_is_the_pattern

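How it works

This script uses a single variable, last, which holds the previous line of the file. In short, if the current line contains the pattern, the last word is removed from last. Otherwise, last is printed unchanged.

In detail, taking each command in turn:

/this_is_the_pattern/{sub(/[^[:space:]]+$/, "", last)}
If the current line contains the pattern, remove the last word from the previous line, which is held in last.

NR>1{print last}
For every line after the first, print the previous line.

last=$0
Save the current line in the variable last.

END{print last}
Print the last line of the file.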
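To see the one-line lookbehind at work on more than two lines, here is a minimal sketch that wraps the same awk script in a shell function. The function name strip_before_pattern and the sample input are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical wrapper around the awk script above; reads the named files,
# or standard input when no file is given.
strip_before_pattern() {
    awk '/this_is_the_pattern/ { sub(/[^[:space:]]+$/, "", last) }
         NR > 1 { print last }
         { last = $0 }
         END { print last }' "$@"
}

# sed appends a | to each line so that trailing whitespace is visible.
printf '%s\n' 'keep this line' \
              'some line DELETE_ME' \
              'some line this_is_the_pattern' \
              'tail line' |
    strip_before_pattern | sed 's/$/|/'
# keep this line|
# some line |
# some line this_is_the_pattern|
# tail line|
```

Note that the regex removes only the word itself, so the space that preceded DELETE_ME stays on the line.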

Answer 2

 awk 'NR>1 && /this_is_the_pattern/ {print t;}
      NR>1 && !/this_is_the_pattern/ {print f;}
      {f=$0;$NF="";t=$0}
      END{print f}' input-file

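Note that this alters the whitespace on every line whose last field is deleted: assigning to $NF makes awk rebuild the line from its fields, squeezing each run of whitespace into a single space.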
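The squeezing can be checked in isolation. The sketch below compares assigning to $NF with editing the line through sub(); the helper names drop_last_field and drop_last_word are made up for this illustration.

```shell
# Hypothetical helpers; both read lines on standard input.
# Assigning to a field makes awk rebuild $0 with single-space separators.
drop_last_field() { awk '{ $NF = ""; print }'; }
# sub() edits $0 in place, so the remaining spacing is left alone.
drop_last_word()  { awk '{ sub(/[^[:space:]]+$/, ""); print }'; }

# sed appends a | to each line so that trailing whitespace is visible.
printf 'some    line   DELETE_ME\n' | drop_last_field | sed 's/$/|/'
# some line |
printf 'some    line   DELETE_ME\n' | drop_last_word  | sed 's/$/|/'
# some    line   |
```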

You can simplify this to:

awk 'NR>1 { print( /this_is_the_pattern/? t:f)}
      {f=$0;$NF="";t=$0}
      END{print f}' input-file

You can avoid the whitespace-squeezing problem with:

awk 'NR>1 { print( /this_is_the_pattern/? t:f)}
      {f=$0;sub(" [^ ]*$","");t=$0}
      END{print f}' input-file

Answer 3

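You can use tac to read the file backwards, so that you see the pattern first. Then set a flag and delete the last word of the next line you see. Finally, pipe the result through tac again to restore the original order.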

tac file | awk '/this_is_the_pattern/{f=1;print;next} f==1{sub(/ [^ ]+$/, "");print;f=0;next} 1' | tac
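One case the flag approach can miss is a run of consecutive pattern lines, where a pattern line is itself followed by another pattern line. The following is a sketch of a variant, not the answer's own code: it records the match before editing the line. It assumes GNU tac is installed, and the function name is hypothetical.

```shell
# Hypothetical variant of the tac approach; reads the named files,
# or standard input when no file is given. Requires tac (GNU coreutils).
strip_before_pattern_tac() {
    tac "$@" | awk '
        { hit = /this_is_the_pattern/ }   # test the line before it is edited
        flag { sub(/ [^ ]+$/, "") }       # the line after this one (in file order) matched
        { print; flag = hit }             # print every line, remember the match
    ' | tac
}

printf '%s\n' 'x A' 'y this_is_the_pattern B' 'z this_is_the_pattern' 'w' |
    strip_before_pattern_tac
# x
# y this_is_the_pattern
# z this_is_the_pattern
# w
```

Because the whole input passes through tac twice, this reads the entire file before producing output, which the single-pass buffer approaches avoid.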

Answer 4

Using the hold buffer to keep the previous line in memory:

sed -n 'H;1h;1!{x;/\nPAGE/ s/[^ ]*\(\n\)/\1/;P;s/.*\n//;h;$p;}' YourFile

Using a loop, but the same concept:

sed -n ':cycle
N;/\nPAGE/ s/[^ ]*\(\n\)/\1/;P;s/.*\n//;$p;b cycle' YourFile

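In both cases, the last word of the previous line is deleted, and the pattern is searched for across 2 consecutive lines. PAGE is the pattern being searched for here; as written, /\nPAGE/ matches only when the next line begins with it.

The script works on the last 2 lines read: it tests whether the pattern is in the second one and, if so, deletes the word; it then prints the first line, removes it, and loops.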
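For the sample file in the question, the same two-line window can be sketched with sed's N, P and D commands. This swaps in the common `$!N;P;D` idiom rather than reproducing either script above, and it assumes the pattern may appear anywhere on the next line, hence the `.*` after `\n`. The function name is hypothetical.

```shell
# Hypothetical helper; reads the named files, or standard input when
# no file is given.
strip_before_pattern_sed() {
    # $!N  append the next input line, unless this is already the last line
    # s    if the second line contains the pattern, drop the last word of
    #      the first line but keep the newline via the \1 backreference
    # P    print the first line of the window
    # D    delete the first line and restart with the remaining line
    sed -e '$!N' \
        -e '/\n.*this_is_the_pattern/ s/[^ ]*\(\n\)/\1/' \
        -e 'P' -e 'D' "$@"
}

printf '%s\n' 'some line DELETE_ME' 'some line this_is_the_pattern' |
    strip_before_pattern_sed
```

As with the scripts above, only the word is removed, so the space before it remains at the end of the line.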

Answer 5

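The idiomatic awk solution is simply to keep a buffer of the previous line (or N lines in the general case), so that you can test the current line and then modify and/or print the buffer accordingly: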

$ awk '
    NR>1 {
        if (/this_is_the_pattern/) {
            sub(/[^[:space:]]+$/,"",prev)
        }
        print prev
    }
    { prev = $0 }
    END { print prev }
' file
some line
some line this_is_the_pattern
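If the pattern should not be hard-coded, the same buffer idea can be wrapped in a shell function that takes the regex as an argument. This is a sketch under a few assumptions that go beyond the answer: the function name is made up, the regex also removes the whitespace in front of the deleted word, and END prints nothing for empty input.

```shell
# Hypothetical helper: strip_last_word_before PATTERN [FILE...]
# PATTERN is an awk regex passed with -v, so any backslash in it
# has to be doubled.
strip_last_word_before() {
    pat=$1
    shift
    awk -v pat="$pat" '
        NR > 1 {
            # Current line matches: trim the last word of the buffered line,
            # together with the whitespace in front of it.
            if ($0 ~ pat) sub(/[[:space:]]*[^[:space:]]+$/, "", prev)
            print prev
        }
        { prev = $0 }                    # buffer the current line
        END { if (NR > 0) print prev }   # flush the buffer unless input was empty
    ' "$@"
}

printf '%s\n' 'some line DELETE_ME' 'some line this_is_the_pattern' |
    strip_last_word_before 'this_is_the_pattern'
# some line
# some line this_is_the_pattern
```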

