awk: print lines that DO NOT match patterns in a file, looking at a specific column

Question

I have an idFile:

and a famFile:

1006 1006001 1006016 1006017 1
1006 1006006 1006016 1006017 1
1006 1006007 0       0       2
1006 1006008 1006007 1006006 2
1006 1006010 1006016 1006017 2
1006 1006011 1006016 1006017 1
1006 1006016 0       0       2
1006 1006017 0       0       1
1007 1007001 1007950 1007015 2
1007 1007002 1007014 1007015 2
......

I need to grep all lines from famFile whose second column does not match any line in idFile.

This command:

awk 'BEGIN { while(getline <"idFile") id[$0]=1; }
id[$2] ' famFile

returns all the matches:

1006 1006006 1006016 1006017 1
1006 1006008 1006007 1006006 2
1006 1006011 1006016 1006017 1
1007 1007002 1007014 1007015 2
......

But how do I modify the command to get the complement of the matches?

Answer 1

$ awk 'NR==FNR{a[$0];next} !($2 in a)' idFile famFile
1006 1006001 1006016 1006017 1
1006 1006007 0       0       2
1006 1006010 1006016 1006017 2
1006 1006016 0       0       2
1006 1006017 0       0       1
1007 1007001 1007950 1007015 2

Explained:

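$ awk '
NR==FNR {                  # true only while reading the first file (idFile)
    a[$0]                  # store the whole line (the id) as a key in array a
    next                   # skip to the next record of idFile
}
!($2 in a)                 # famFile: if the second field is not a key in a, output the record
' idFile famFile           # mind the file order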
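
To try this end to end, here is a minimal sketch. The contents of idFile are not shown in the question, so the four ids below are an assumption inferred from the matches the asker reported; famFile uses the ten sample rows from the question. The temporary directory and the variable names (workdir, result, result_getline) are only for illustration. The second command is the asker's original BEGIN/getline approach with the test negated, which should give the same lines.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch real files (illustrative only).
workdir=$(mktemp -d)

# ASSUMPTION: idFile holds one id per line; these ids are inferred from the
# matches listed in the question, not taken from the real file.
cat > "$workdir/idFile" <<'EOF'
1006006
1006008
1006011
1007002
EOF

# The ten sample rows of famFile from the question.
cat > "$workdir/famFile" <<'EOF'
1006 1006001 1006016 1006017 1
1006 1006006 1006016 1006017 1
1006 1006007 0       0       2
1006 1006008 1006007 1006006 2
1006 1006010 1006016 1006017 2
1006 1006011 1006016 1006017 1
1006 1006016 0       0       2
1006 1006017 0       0       1
1007 1007001 1007950 1007015 2
1007 1007002 1007014 1007015 2
EOF

# Two-file idiom: load ids from the first file, then print rows of the
# second file whose column 2 is not among them.
result=$(awk 'NR==FNR { a[$0]; next } !($2 in a)' \
    "$workdir/idFile" "$workdir/famFile")

# Same idea in the style of the original command: read idFile in BEGIN,
# then negate the membership test. The "> 0" guards against getline
# returning -1 (unreadable file), which would otherwise loop forever.
result_getline=$(awk -v idfile="$workdir/idFile" '
    BEGIN { while ((getline line < idfile) > 0) id[line] = 1 }
    !($2 in id)
' "$workdir/famFile")

printf '%s\n' "$result"

rm -rf "$workdir"
```

If idFile might carry trailing whitespace or extra columns, keying the array on $1 instead of $0 is a more forgiving choice.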
