AWK: pattern-scanning debug script not working
I have the following table:
cat test.txt
c_az_1858 2020-01-15 -5.50 Parking Serv Parking Serv
c_az_1859 2020-01-15 -80.56 Avery Johnson Avery Johnso 592242
c_az_1860 2020-01-15 100.00 Wayne Alexander Flin 7 Pikarere S Titahi Bay
c_az_1861 2020-01-15 51.75 Setefano P M Crew Cuts Lawns
c_az_1862 2020-01-13 -5.50 Parking Serv Parking Serv
c_az_1863 2020-01-13 -3.00 Parking Serv Parking Serv
c_az_1864 2020-01-13 57.50 0520/5200000000/002 Apu Cresent
c_az_1865 2020-01-13 46.00 Becta Ltd Taylormallon Lawns
c_az_1866 2020-01-13 28.75 Strata Title Adminis Crewcut Gard De Payment
c_az_1867 2020-01-13 19.17 D S & S A Tapp David Tapp Weekly Lawn
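I am trying to run a series of search patterns against this file so that each matching line is printed with the search pattern that matched it in front. The patterns are matched against column $4. Like this: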
Park: c_az_1858 2020-01-15 -5.50 Parking Serv Parking Serv
ayn : c_az_1860 2020-01-15 100.00 Wayne Alexander Flin 7 Pikarere S Titahi Bay
o P: c_az_1861 2020-01-15 51.75 Setefano P M Crew Cuts Lawns
Park: c_az_1862 2020-01-13 -5.50 Parking Serv Parking Serv
Park: c_az_1863 2020-01-13 -3.00 Parking Serv Parking Serv
S A: c_az_1867 2020-01-13 19.17 D S & S A Tapp David Tapp Weekly Lawn
To do this I wrote the following script:
#!/usr/bin/env bash
awk '
BEGIN{
FS = OFS = "\t"
x="ayn|o P|S A|Park"
}
{
for (i in x) {
if ($4 ~ i) {
print x[i] ": " , i
}
}
}
' test.txt
When I run this, I get the following error message:
awk: cmd. line:7: (FILENAME=test.txt FNR=1) fatal: attempt to use scalar `x' as an array
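Is x a scalar? How can I rewrite this so that it works? Any help is much appreciated.
In the current code, the following assigns a string to the variable x: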
x="ayn|o P|S A|Park"
The patterns can instead be assigned to an array one at a time, like this:
# assign as array values
x[1]="ayn" ; x[2]="o P" ; x[3]="S A" ; x[4]="Park"
# assign as array indices (no need to assign a value)
x["ayn"] ; x["o P"] ; x["S A"] ; x["Park"]
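If the patterns are supplied as a delimited string, we can use the split() function to break the string into separate pieces and store them as array values.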
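To make the behaviour of split() concrete, here is a minimal sketch. It assumes an awk that treats a one-character separator such as "|" literally (gawk does); the variable names n and out are only for illustration.

```shell
#!/usr/bin/env bash
# Minimal sketch: split() returns the number of pieces and fills arr[1..n].
out=$(awk 'BEGIN {
    n = split("ayn|o P|S A|Park", arr, "|")   # "|" is a literal one-character separator here
    for (i = 1; i <= n; i++)                  # walk the numeric indices 1..n in order
        printf "%d=%s\n", i, arr[i]
}')
printf '%s\n' "$out"
```

The patterns end up as the array values, keyed by 1, 2, 3, 4, which is why a later loop has to test arr[i] rather than i.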
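A few changes to OP's current code:
- allow the search patterns to be passed from the shell into an awk variable
- split the search patterns into separate array elements
Modified code: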
patterns='ayn|o P|S A|Park'
awk -v ptns="${patterns}" '
BEGIN { FS = OFS = "\t"
split(ptns,arr,"|") # split ptns into array arr[] based on "|" delimiter
for (i in arr)
x[arr[i]] # convert arr[] values to x[] indices
}
{ for (i in x)
if ($4 ~ i) # compare $4 with the array indices
print i ": " $0
}
' test.txt
Or we can just use the result of split() and make sure we match $4 against the values in the array (rather than the array indices), e.g.:
patterns='ayn|o P|S A|Park'
awk -v ptns="${patterns}" '
BEGIN { FS = OFS = "\t"
split(ptns,arr,"|") # split ptns into array arr[] based on "|" delimiter
}
{ for (i in arr)
if ($4 ~ arr[i]) # compare $4 with the array values
print arr[i] ": " $0
}
' test.txt
Both of these generate:
Park: c_az_1858 2020-01-15 -5.50 Parking Serv Parking Serv
ayn: c_az_1860 2020-01-15 100.00 Wayne Alexander Flin 7 Pikarere S Titahi Bay
o P: c_az_1861 2020-01-15 51.75 Setefano P M Crew Cuts Lawns
Park: c_az_1862 2020-01-13 -5.50 Parking Serv Parking Serv
Park: c_az_1863 2020-01-13 -3.00 Parking Serv Parking Serv
S A: c_az_1867 2020-01-13 19.17 D S & S A Tapp David Tapp Weekly Lawn
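One caveat: awk does not define the order in which for (i in arr) visits the elements, so a line that matches more than one pattern could print its matches in any order. A possible variation, sketched below, keeps the count returned by split() and loops over 1..n instead. The three sample rows and their tab positions are an assumption about how test.txt is laid out, and sample and out are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: the tab positions in these sample rows are an assumption about test.txt.
sample=$(mktemp)
printf 'c_az_1858\t2020-01-15\t-5.50\tParking Serv\tParking Serv\n' > "$sample"
printf 'c_az_1859\t2020-01-15\t-80.56\tAvery Johnson\tAvery Johnso 592242\n' >> "$sample"
printf 'c_az_1867\t2020-01-13\t19.17\tD S & S A Tapp\tDavid Tapp Weekly Lawn\n' >> "$sample"

patterns='ayn|o P|S A|Park'
out=$(awk -v ptns="$patterns" '
BEGIN { FS = OFS = "\t"
        n = split(ptns, arr, "|")     # n = number of patterns
}
{ for (i = 1; i <= n; i++)            # fixed order: arr[1], arr[2], ...
      if ($4 ~ arr[i])                # compare $4 with the array values
          print arr[i] ": " $0
}
' "$sample")
rm -f "$sample"
printf '%s\n' "$out"
```

With this sample, only the first and third rows are printed, prefixed with Park and S A respectively.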
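Another option is to turn the pipe-delimited string into a regular expression by replacing the double quotes with forward slashes; the pipes in the pattern then act as alternation.
You can then check column 4 for a match and print the first matching part followed by the whole line.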
awk '
BEGIN{FS=OFS="\t"}
match($4, /ayn|o P|S A|Park/) {
print substr($4, RSTART, RLENGTH) ":", $0
}
' test.txt
Output
Park: c_az_1858 2020-01-15 -5.50 Parking Serv Parking Serv
ayn: c_az_1860 2020-01-15 100.00 Wayne Alexander Flin 7 Pikarere S Titahi Bay
o P: c_az_1861 2020-01-15 51.75 Setefano P M Crew Cuts Lawns
Park: c_az_1862 2020-01-13 -5.50 Parking Serv Parking Serv
Park: c_az_1863 2020-01-13 -3.00 Parking Serv Parking Serv
S A: c_az_1867 2020-01-13 19.17 D S & S A Tapp David Tapp Weekly Lawn
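If the patterns should still come from the shell rather than being hard-coded, match() also accepts a string variable as a dynamic regex. The following is a rough sketch of that combination, not part of either answer above; the two sample rows, their tab positions, and the names re, sample and out are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the tab positions in these sample rows are an assumption about test.txt.
sample=$(mktemp)
printf 'c_az_1860\t2020-01-15\t100.00\tWayne Alexander Flin\t7 Pikarere S Titahi Bay\n' > "$sample"
printf 'c_az_1864\t2020-01-13\t57.50\t0520/5200000000/002\tApu Cresent\n' >> "$sample"

patterns='ayn|o P|S A|Park'
out=$(awk -v re="$patterns" '
BEGIN { FS = "\t" }
match($4, re) {                                   # re is treated as a dynamic regex
    print substr($4, RSTART, RLENGTH) ": " $0     # matched text, then the whole line
}
' "$sample")
rm -f "$sample"
printf '%s\n' "$out"
```

Because the whole string is a single alternation, each line is printed at most once, with the leftmost match in $4 as its prefix. Note that awk processes backslash escapes in -v assignments, so patterns containing backslashes would need extra care.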