Unix: extract the lines between two lines and store them in respective files

Question

I have a file, allreport.txt, that looks like this:

11:22:33:456        Script started                  Running: first_script
11:22:34:456        GetData - Read                  Read 12 Bytes
11:22:34:456        SetData - Write                 Write 12 Bytes
11:32:33:456        Script started                  Running: second_script
11:32:34:456        GetData - Read                  Read 12 Bytes
11:32:34:456        SetData - Write                 Write 12 Bytes
11:42:33:456        Script started                  Running: third_script
11:42:34:456        GetData - Read                  Read 12 Bytes
11:42:34:456        SetData - Write                 Write 12 Bytes
11:52:33:456        Script started                  Running: fourth_script

I need to extract the lines that sit between the "Running: *_script" lines. I tried the following:

grep 'Running:' allreport.txt | sed 's/[^ ]* //' | cut -d":" -f2 | tr '\n' ' ' | awk -v col=1 '/$col/,/$($col+1)/ allreport.txt  > $col'

After running the command, however, I see no output, and no result files are created.

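How can I achieve this? The expected output is a set of files named first_script, second_script, and so on, each containing the log of its own run. For example, first_script should contain only the following lines: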

11:22:34:456        GetData - Read                  Read 12 Bytes
11:22:34:456        SetData - Write                 Write 12 Bytes

Likewise, second_script should contain the following lines, and so on:

11:32:34:456        GetData - Read                  Read 12 Bytes
11:32:34:456        SetData - Write                 Write 12 Bytes

Answer 1

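The immediate problem seems to be that you have some shell script code inside the quotes of your Awk script... but more fundamentally, your script is overly complex and weird.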

# Each "Running:" line starts a new output file; as written, the files are named 1, 2, 3, ...
awk '/Running:/ { close(c); c++; next }
    { print >c }' allreport.txt
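To illustrate the quoting point: inside single quotes the shell expands nothing, and awk itself does not interpolate variables into a `/.../` regex literal. The following is a minimal sketch, not part of the original answer; the file `sample.txt` and the names `name` and `want` are made up for illustration. It shows one common way to hand a shell value to awk with `-v` and match against it.

```shell
# Sample data: a shortened copy of the log from the question.
printf '%s\n' \
  '11:22:33:456  Script started  Running: first_script' \
  '11:22:34:456  GetData - Read  Read 12 Bytes' \
  '11:32:33:456  Script started  Running: second_script' \
  '11:32:34:456  SetData - Write  Write 12 Bytes' > sample.txt

name=second_script   # hypothetical shell variable

# Inside single quotes the shell leaves $name alone, so awk sees the literal
# regex /$name/, which matches none of the sample lines.
awk '/$name/' sample.txt > unexpanded.txt

# -v copies the shell value into the awk variable "want";
# "$0 ~ want" then uses the variable's contents as a regular expression.
awk -v want="$name" '$0 ~ want { print $NF }' sample.txt > matched.txt

cat matched.txt
```

The first command writes an empty file, while the second selects the line that mentions second_script and prints its last field.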

Answer 2

$ awk 'sub(/.*Running: /,""){ close(out); out=$0; next } { print > out }' allreport.txt

$ head *script
==> first_script <==
11:22:34:456        GetData - Read                  Read 12 Bytes
11:22:34:456        SetData - Write                 Write 12 Bytes

==> second_script <==
11:32:34:456        GetData - Read                  Read 12 Bytes
11:32:34:456        SetData - Write                 Write 12 Bytes

==> third_script <==
11:42:34:456        GetData - Read                  Read 12 Bytes
11:42:34:456        SetData - Write                 Write 12 Bytes

Answer 3

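Based only on the samples you have shown, please try the following. As output, it creates 3 files named first_script, second_script, and third_script, containing the corresponding sample lines.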

awk -F': ' '/Running/{close(outFile);outFile=$2;next} {print > (outFile)}' Input_file

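Explanation: Set the field separator to ": " (a colon followed by a space), then check whether the line contains Running; if it does, set the output file name to the second field. If the line does not contain Running, print it to the current output file. Also make sure to close the previous output file each time the name changes, to avoid a "too many open files" error.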
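The explanation above can be tried end to end with the sketch below. It is an illustration under stated assumptions rather than the answer's exact code: the file name `allreport_demo.txt`, the output directory `demo_out`, and the awk variable `dir` are invented here, and the guard that skips lines appearing before the first marker is an addition.

```shell
# Recreate part of the sample log from the question.
cat > allreport_demo.txt <<'EOF'
11:22:33:456        Script started                  Running: first_script
11:22:34:456        GetData - Read                  Read 12 Bytes
11:22:34:456        SetData - Write                 Write 12 Bytes
11:32:33:456        Script started                  Running: second_script
11:32:34:456        GetData - Read                  Read 12 Bytes
11:32:34:456        SetData - Write                 Write 12 Bytes
11:52:33:456        Script started                  Running: fourth_script
EOF

mkdir -p demo_out   # hypothetical output directory

awk -F': ' -v dir="demo_out" '
    /Running: / {                        # marker line: switch output file
        if (outFile != "") close(outFile)
        outFile = dir "/" $2             # $2 is the text after ": "
        next                             # do not copy the marker itself
    }
    outFile != "" { print > outFile }    # skip lines seen before any marker
' allreport_demo.txt

ls demo_out
```

Because a file is only created when something is printed to it, a marker with no log lines after it (fourth_script here) produces no file.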

Answer 4

You can also try this awk:

awk '$(NF-1) == "Running:" {close(fn); fn = $NF; next} {print > fn}' file

Answer 5

You can use the csplit utility to split the file based on its content:

csplit allrep.txt '/[[:space:]]Running:[[:space:]]/' '{*}'
# produces xx00 - xxNN files based on that match

Some versions do not support '{*}'. In that case you need to supply the number of splits inside {}. You can do that like this:

csplit allrep.txt '/[[:space:]]Running:[[:space:]]/' "{$(awk '/[[:space:]]Running:[[:space:]]/{cnt++} END{print cnt-1}' allrep.txt)}"

If you want the script names in the file names, I would do something like this in awk:

awk 'BEGIN{fn="0000 - Header"}
{sub(/\r$/,"")}   # the file you uploaded has \r\n endings
$(NF-1)=="Running:" {close(fn); fn=sprintf("%04d - %s.txt", ++fc, $NF)}
{print >fn}
' allrep.txt

Prefixing each name with a number 1) handles any duplicate names and 2) lets you see the order.
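To see the effect of the numeric prefix, here is a small sketch run on made-up input in which the same script name appears twice and the lines end in CRLF. The file `dup_demo.txt`, the directory `dup_out`, and the awk variable `dir` are assumptions for illustration. The `NF >= 2` guard is also an addition; it is meant to keep `$(NF-1)` from referring to a negative field on an empty line.

```shell
# Hypothetical input: CRLF line endings and a script name that repeats.
printf '%s\r\n' \
  'log header line' \
  '11:22:33:456  Script started  Running: first_script' \
  '11:22:34:456  GetData - Read  Read 12 Bytes' \
  '11:32:33:456  Script started  Running: first_script' \
  '11:32:34:456  SetData - Write  Write 12 Bytes' > dup_demo.txt

mkdir -p dup_out   # hypothetical output directory

awk -v dir="dup_out" '
BEGIN { fn = dir "/0000 - Header" }          # lines before the first marker
{ sub(/\r$/, "") }                           # strip the trailing carriage return
NF >= 2 && $(NF-1) == "Running:" {           # marker: start a new numbered file
    close(fn)
    fn = sprintf("%s/%04d - %s.txt", dir, ++fc, $NF)
}
{ print > fn }                               # the marker line stays in its file
' dup_demo.txt

ls dup_out
```

The two runs of first_script land in separate files whose prefixes differ, so neither overwrites the other, and the lines before the first marker go to the header file.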
