Bash script using awk doesn't read the entire line, just the first column
I have a file to process, with columns separated by tabs:
$ cat system.log
2 camila create db
3 andrew create table
5 greg update table
6 nataly update view
7 greg delete table
9 camila update table
11 nataly create view
12 peter link table
14 andrew update view
15 greg update db
I want these lines displayed in the following form:
Entry No. 7: camila (action: create db)
To do that, I wrote the following bash script:
#!/bin/bash
filename=$1
while read line; do
  printf $line | awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }'
done < $filename
However, this is what I get:
$ ./log_parser.sh system.log
Entry No. 2 : (action: )
Entry No. 3 : (action: )
Entry No. 5 : (action: )
Entry No. 6 : (action: )
Entry No. 7 : (action: )
Entry No. 9 : (action: )
Entry No. 11 : (action: )
Entry No. 12 : (action: )
Entry No. 14 : (action: )
Entry No. 15 : (action: )
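Why is only the first column processed, and how can I process the whole line?
You must quote your variables to prevent word splitting. Consider what happens when $line evaluates to the string 2 camila create db. In that case, printf $line is equivalent to printf 2 camila create db, which invokes printf with 4 arguments. printf takes the first argument, 2, as its format string; since that format contains no conversion specifiers, the remaining arguments are ignored and only the string 2 is written. To pass a single argument you could write printf "$line", but that is also incorrect, because the first argument to printf is the format string and you do not want the input used as a format string. Instead you should write printf '%s' "$line". But don't do that either: while read; printf | awk is an antipattern. Just let awk read the input.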
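The word splitting described above can be observed directly. This is a minimal sketch, assuming a sample line with real tab characters; the helper count_args and the variable names are hypothetical and exist only for illustration.

```shell
# Sample record with real tab characters between the three columns.
line=$(printf '2\tcamila\tcreate db')

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $line)    # split on spaces and tabs: 2, camila, create, db
quoted=$(count_args "$line")    # quoting keeps the whole line as one argument

echo "unquoted: $unquoted, quoted: $quoted"

# With '%s\n' as the format, the line is passed as data and the tabs survive,
# so awk sees three tab-separated fields.
fields=$(printf '%s\n' "$line" | awk -F '\t' '{ print NF }')
echo "fields seen by awk: $fields"
```

The unquoted call receives four arguments and the quoted call receives one, which is why awk only ever saw the first column in the original script.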
Failing to wrap $line in double quotes subjects it to word splitting, so the \t characters end up replaced by spaces, which in turn breaks awk -F'\t'.
Consider:
$ line=$(head -1 system.log)
# double quoting ${line} maintains the \t characters:
$ echo "${line}" | od -c
0000000 2 \t c a m i l a \t c r e a t e
0000020 d b \n
0000023
# no (double) quoting of ${line} replaces the \t with spaces:
$ echo ${line} | od -c
0000000 2 c a m i l a c r e a t e
0000020 d b \n
0000023
The problem is compounded by how printf handles the unquoted ${line}, e.g.:
$ printf ${line}
2
$ printf "${line}"
2 camila create db
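If the loop has to stay, for example because each line is also needed for other shell-level work, a quoted version might look like the sketch below. Splitting the fields with read itself, and the variable names num, user, action and result, are assumptions made for illustration and are not part of the original script.

```shell
# A literal tab character to use as the field separator for read.
tab=$(printf '\t')

# Two sample records with real tabs, standing in for system.log.
result=$(printf '2\tcamila\tcreate db\n3\tandrew\tcreate table\n' |
while IFS="$tab" read -r num user action; do
  # The format string is fixed; the fields are passed as quoted data.
  printf 'Entry No. %s: %s (action: %s)\n' "$num" "$user" "$action"
done)

echo "$result"
```

Because IFS contains only a tab here, the space inside "create db" does not split the third field.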
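As for the while loop as a whole: assuming its only purpose is to send the modified file contents to stdout (i.e., you are not using ${line} for other bash-level operations), you can replace the entire thing with a single awk call, e.g.:
$ awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }' system.log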
Entry No. 2 : camila (action: create db )
Entry No. 3 : andrew (action: create table )
Entry No. 5 : greg (action: update table )
Entry No. 6 : nataly (action: update view )
Entry No. 7 : greg (action: delete table )
Entry No. 9 : camila (action: update table )
Entry No. 11 : nataly (action: create view )
Entry No. 12 : peter (action: link table )
Entry No. 14 : andrew (action: update view )
Entry No. 15 : greg (action: update db )
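Note: the extra spaces in the output come from how the print command is built. Separating the arguments with , makes awk insert its output field separator (OFS, a space by default) between them. Removing the commas, so that the strings are concatenated instead, produces:
$ awk -F '\t' '{ print "Entry No. " $1 ": " $2 " (action: " $3 ")" }' system.log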
Entry No. 2: camila (action: create db)
Entry No. 3: andrew (action: create table)
Entry No. 5: greg (action: update table)
Entry No. 6: nataly (action: update view)
Entry No. 7: greg (action: delete table)
Entry No. 9: camila (action: update table)
Entry No. 11: nataly (action: create view)
Entry No. 12: peter (action: link table)
Entry No. 14: andrew (action: update view)
Entry No. 15: greg (action: update db)
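As an alternative to string concatenation, awk's own printf gives explicit control over the layout. The following is a sketch under the assumption that the input has the same three tab-separated columns; the variables sample and formatted are illustrative names only.

```shell
# Two sample records with real tabs, standing in for system.log.
sample=$(printf '2\tcamila\tcreate db\n3\tandrew\tcreate table')

# awk's printf takes a format string, so no OFS is inserted between the fields.
formatted=$(printf '%s\n' "$sample" |
  awk -F '\t' '{ printf "Entry No. %s: %s (action: %s)\n", $1, $2, $3 }')

echo "$formatted"
```

With a real file, the same awk program can be given the file name directly, as in the commands above, instead of reading from a pipe.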