Using awk command to go through a text file and incrementing counters for morning, afternoon, and night sections of the script

Question

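I am using awk to go through a text file. The file contains information about finished scripts followed by a completion marker (morning), then more finished scripts followed by a completion marker (afternoon), then more finished scripts followed by a completion marker (night).

I am trying to keep track of the finished scripts for each block.

My approach is...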

awk '
    /Completed/ {next} # finished morning block

    /Finished/ {mornCount+=1} # count finishes in morning block

    /Completed/ {next}

    /Finished/ {afterCount+=1}

    /Completed/ {exit}

    /Finished/ {nightCount+=1}

    END{ 
        print "procedures completed this morning: " mornCount 
        print "procedures completed this afternoon: " afterCount
        print "procedures completed at night: " nightCount
    } 
' file.txt

But the values I get for morning, afternoon, and night are all the same.

start 
start 
start 
Finished
Finished
Finished
Completed
start 
start 
Finished
Finished 
Completed 
start 
Finished 
Complete

So I want 3, 2 and 1 as my output: the finished counts for morning, afternoon, and night.

Answer 1

This should work even if zero tasks were completed in a section:

awk -v FS="\n" -v RS="Complete" '
    # A multi-character RS is treated as a regex by gawk and mawk;
    # POSIX leaves the behavior unspecified. Matching is case-sensitive.
    BEGIN { morn = 0; after = 0; night = 0 }
    NR == 1 { for (i = 1; i <= NF; i++) { if ($i ~ /Finished/) { morn++ } } }
    NR == 2 { for (i = 1; i <= NF; i++) { if ($i ~ /Finished/) { after++ } } }
    NR == 3 { for (i = 1; i <= NF; i++) { if ($i ~ /Finished/) { night++ } } }
    END {
        print "procedures completed this morning: " morn
        print "procedures completed this afternoon: " after
        print "procedures completed at night: " night
}' file.txt

Answer 2

This should work

awk 'BEGIN{split("morning afternoon night",a)}
     /Finished/{x++}
     /Complete/{print a[++y]":"x+0;x=0}' file

The BEGIN block is executed at the start of the script. The split simply creates an array like

a[1] = morning 
a[2] = afternoon 
a[3] = night

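Each time Finished is seen, x is incremented.
When a Completed line is seen, y is incremented and the value at that position in a is printed together with the value of x.
x is reset.
Repeat.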

Output for the example

morning:3
afternoon:2
night:1
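
If the output should use the wording from the question, the steps above can be adapted roughly as follows. This is only a sketch: the function name count_sections and the label array are made up for illustration, the counts are kept in an array and printed in END, and /Complete/ is used as the section marker so that the last line of the sample (which lacks the trailing "d") still closes the night section.

```shell
# Hypothetical helper (not from the original answers): count "Finished" lines
# per section, where a line containing "Complete" closes the current section.
count_sections() {
    awk '
        BEGIN { n = split("this morning,this afternoon,at night", label, ",") }
        /Finished/ { count[sec + 1]++ }    # sec = number of sections closed so far
        /Complete/ { sec++ }
        END {
            # "+ 0" forces a numeric 0 for sections with no Finished lines
            for (i = 1; i <= n; i++)
                print "procedures completed " label[i] ": " (count[i] + 0)
        }
    ' "$@"
}

# Sample input from the question, fed on standard input.
printf '%s\n' start start start Finished Finished Finished Completed \
              start start Finished Finished Completed \
              start Finished Complete | count_sections
```

Called with a file name instead (count_sections file.txt), the function reads that file. Sections beyond the third are counted but not printed.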

Answer 3

TXR Lisp:

(mapdo (do put-line `@1: @2`)
       '#"morning afternoon night"
       [mapcar (op count-if (op match-regex @1 #/Finished/))
               (partition (get-lines)
                          (op where (op match-regex @1 #/Complete/)))])

$ txr count.tl < data.txt
morning: 3
afternoon: 2
night: 1

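Get the lines of the file as a list of strings. Partition the list into a list of lists, splitting it at the places which match /Complete/. Then count the occurrences of matches for /Finished/ in each piece, and map the counts together with the section names through a function which turns them pairwise into output.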
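
For readers who do not use TXR, the same partition-then-count idea can be approximated in awk. This is a rough analogue rather than a translation: the function name partition_counts is hypothetical, it emits one bare count per piece without the section names, and a trailing piece that has no closing /Complete/ line is not reported.

```shell
# Hypothetical helper: print one "Finished" count per piece, where each piece
# is closed by a line matching /Complete/.
partition_counts() {
    awk '
        /Complete/ { print n + 0; n = 0; next }   # piece closed: emit its count
        /Finished/ { n++ }                        # count matches in this piece
    ' "$@"
}

# Sample input from the question; prints one count per line.
printf '%s\n' start start start Finished Finished Finished Completed \
              start start Finished Finished Completed \
              start Finished Complete | partition_counts
```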

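The linear description of the process above can be expressed by rearranging it into a functional pipeline with the opip macro. Since opip returns a function, it has to be called; for clarity, why not use the rarely used call function instead of the [pipeline] syntax.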

(let ((pipeline (opip (get-lines)
                      (partition @1 (op where (op match-regex @1 #/Complete/)))
                      (mapcar (op count-if (op match-regex @1 #/Finished/)))
                      (mapdo (do put-line `@1: @2`) '#"morning afternoon night"))))
  (call pipeline))

Without the temporary pipeline variable, and with call replaced by square brackets:

[(opip (get-lines)
       (partition @1 (op where (op match-regex @1 #/Complete/)))
       (mapcar (op count-if (op match-regex @1 #/Finished/)))
       (mapdo (do put-line `@1: @2`) '#"morning afternoon night"))]

A solution in the TXR text extraction pattern language, with a dash of Lisp:

@(collect)
@  (collect)
@{f /Finished.*/}
@  (until)
@/Complete.*/
@  (end)
@(end)
@(output)
morning: @(length [f 0])
afternoon: @(length [f 1])
night: @(length [f 2])
@(end)

$ txr count.txr data.txt 
morning: 3
afternoon: 2
night: 1

Encoding the section names as a list and iterating over them:

@(collect)
@  (collect)
@{f /Finished.*/}
@  (until)
@/Complete.*/
@  (end)
@(end)
@(bind sec #"morning afternoon night")
@(output)
@  (repeat :vars (f))
@sec: @(length f)
@  (end)
@(end)

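Note: the :vars (f) is needed because the @(output) processor does not walk the Lisp code looking for variable references, so it cannot automatically see f the way it sees variables that appear directly in the template. There is no good way to do that, since Lisp code can express free variable references that @(output) should not see.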
