awk the column of one file from another file

Question

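I'm having some trouble with awk. I have two files, and I'm trying to use the first file to look up a column of the second file and find all the matching lines.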

File 1:

File 2:

apples  peaches 3  
apples  peaches 9  
oranges  pears 7  
apricots  figs 1

Expected output:

apples peaches 3  
apricots figs 1

awk -F"|" '
           FNR==NR {f1[$1];next}
           ($3 in f1)
          ' file1 file2 > output.txt

Answer 1

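I'm not clear on the format of file2 (e.g., are the fields separated by spaces or tabs?), or whether a line in file2 can contain more than 3 whitespace-delimited strings (e.g., apples black raspberries 6), so picking a delimiter for file2 requires more details. Having said that...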

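The sample files contain no pipes ('|'), so the current code (using -F"|") lumps the entire line into the awk variable $1.

We can make this easier by recognizing that we are only interested in the last field of file2.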
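To see the field-separator point for yourself, here is a small sketch (the variable names are just for illustration) that feeds one of the sample lines to awk twice, once with -F"|" and once with the default whitespace splitting:

```shell
# One of the sample lines from file2; note that it contains no "|" character.
line='apples  peaches 3'

# With -F"|" there is nothing to split on, so awk sees a single field ($1 == whole line).
with_pipe=$(printf '%s\n' "$line" | awk -F'|' '{ print NF ": [" $1 "]" }')

# With the default separator awk splits on runs of blanks, so $NF is the trailing number.
default_fs=$(printf '%s\n' "$line" | awk '{ print NF ": [" $NF "]" }')

echo "$with_pipe"    # 1: [apples  peaches 3]
echo "$default_fs"   # 3: [3]
```

Because $NF always refers to the last field, it keeps working even when a line has more than three whitespace-separated strings.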

Adding an entry to file2:

$ cat file2
apples  peaches 3
apples  peaches 9
oranges  pears 7
apricots  figs 1
apples black raspberries 2

A couple of small changes to the current awk code:

awk 'FNR==NR {f1[$1]; next} $(NF) in f1' file1 file2

This generates:

apples  peaches 3
apricots  figs 1
apples black raspberries 2
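For anyone who wants to try this end to end, here is a self-contained sketch. The contents of file1 are not shown in the question, so the three keys below (1, 2, 3, one per line) are an assumption chosen to be consistent with the output above; the array name keys is also just illustrative.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# ASSUMPTION: file1 is not shown in the question; one lookup key per line is a guess.
printf '%s\n' 1 2 3 > file1

cat > file2 <<'EOF'
apples  peaches 3
apples  peaches 9
oranges  pears 7
apricots  figs 1
apples black raspberries 2
EOF

# First file (FNR==NR): remember each key as an array index, then skip to the next line.
# Second file: print every line whose last field ($NF) is one of the remembered keys.
matches=$(awk 'FNR==NR { keys[$1]; next } $NF in keys' file1 file2)
echo "$matches"
```

The lines come out in file2 order, since awk only uses file1 as a lookup table and never sorts anything.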

Answer 2

This is more of a side note; I would still recommend using awk, as described.

You can use the join command:

join -1 1 -2 3 <(sort -k1,1n file1) <(sort -k3,3n file2)

The example above uses join with the help of the shell and the sort command:

Explanation of the command:

join
  -1 1                 # Join based on column 1 of file 1 ...
  -2 3                 # and column 3 of file 2
  <(sort -k1,1n file1) # sort file 1 based on column 1
  <(sort -k3,3n file2) # sort file 2 based on column 3

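The <() construct is known as process substitution and is provided by the shell in which you type the command. The output of the command inside the parentheses is treated like a file and can be passed as an argument to our join command, so we do not need to create intermediate sorted files.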
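A runnable sketch of the join approach follows, with a few assumptions spelled out. The file1 contents (3 and 1) are a guess, since the question does not show them. The sort calls here use a plain lexical sort instead of the numeric -n sort above, because join expects its inputs in that order; for single-digit keys the two orders agree. The bash -c wrapper is only there so the snippet also runs when the calling shell has no <() support.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# ASSUMPTION: file1 is not shown in the question; these two keys are a guess.
printf '%s\n' 3 1 > file1
printf '%s\n' 'apples  peaches 3' 'apples  peaches 9' \
              'oranges  pears 7' 'apricots  figs 1' > file2

# Process substitution is a bash feature, hence the explicit bash -c.
# Each <(...) is handed to join as if it were a file holding the sorted data.
joined=$(bash -c 'join -1 1 -2 3 <(sort -k1,1 file1) <(sort -k3,3 file2)')
echo "$joined"
```

Note that join prints the join field first and orders the lines by key, so the result looks like "1 apricots figs" rather than the original file2 layout. That is one reason the awk version is the closer match to the expected output.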
