Merge files with awk

Question

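I am trying to merge file1 and file2 by matching the number in column 2 of file2 against the number in column 1 of file1. Both files are mainly comma-separated, but in file2 the number I want to match is also delimited by . characters.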

This is file1.

1,Mary,24 Fuller Rd
2,Fred,19 St Johns
3,Jonathan,8 Poplar Drive
4,Susan,116 Shepherds Way
5,Michael,4 Nerthern Court

And file2

Dawning,Order.5.DHL
Hawkins,Order.3.FedEx
Jacob,Order.2.Yodel
Plateu,Order.4.DPD
Martins,Order.1.Hermes

My approach is to extract the key from file2 with split. This works on a single file, but with multiple files the behaviour is strange and not what I expect.

awk -F, '{split($2,i,".")}{ print i[2]}' file2
5
3
2
4
1

awk -F, 'NR==FNR{split($2,i,"."); next}{ print i[2]}' file2 file1
1
1
1
1
1
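
A likely reason for the repeated 1: split() refills the array i on every line of file2, so by the time file1 is read only the pieces of the last line (Order.1.Hermes) are left. The sketch below keeps the split-based approach but stores each surname under its own number. The scratch directory and the array names part and name are illustrative assumptions, not part of the original attempts.

```shell
# Hypothetical scratch copies of the two sample files.
dir=$(mktemp -d)
printf '%s\n' '1,Mary,24 Fuller Rd' '2,Fred,19 St Johns' \
    '3,Jonathan,8 Poplar Drive' '4,Susan,116 Shepherds Way' \
    '5,Michael,4 Nerthern Court' > "$dir/file1"
printf '%s\n' 'Dawning,Order.5.DHL' 'Hawkins,Order.3.FedEx' \
    'Jacob,Order.2.Yodel' 'Plateu,Order.4.DPD' \
    'Martins,Order.1.Hermes' > "$dir/file2"

# While reading file2, split() still extracts the number, but the surname is
# saved in a second array keyed by that number, so no line is overwritten.
merged=$(awk -F, '
    NR == FNR { split($2, part, "."); name[part[2]] = $1; next }
    $1 in name { print $2, name[$1], $NF }
' "$dir/file2" "$dir/file1")

printf '%s\n' "$merged"
rm -rf "$dir"
```

The `$1 in name` guard simply skips any file1 row that has no matching order in file2.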

I only get something close to the expected result if I remove split, but then I cannot extract the matching number.

awk -F, 'NR==FNR{array[$2]}END{ for (i in array) print i}' file2 file1
Order.4.DPD
Order.3.FedEx
Order.1.Hermes
Order.5.DHL
Order.2.Yodel

I have tried many other things that also failed, but listing them all would bloat the question, so please ask if more information is needed.

My expected result is this

Mary Martins 24 Fuller Rd
Fred Jacob 19 St Johns
Jonathan Hawkins 8 Poplar Drive
Susan Plateu 116 Shepherds Way
Michael Dawning 4 Nerthern Court

Column 2 in file2 and column 1 in file1 match on the number, so I want to print $2 and $NF from file1 and $1 from file2.

Here are some of my many failed attempts

awk -F, 'NR==FNR {M=$1; array[$2]; next}{($1 in array)}END{ for (i in array) print $2, M, $NF}' file2 file1
awk -F, 'NR==FNR {M=$1; array[$2]; for (i in array) split(i,a,"."); next} $1==a[2]{print $2,M, $3}' file2 file1
awk -F, 'NR==FNR {M=$1; array[$2]; next}END { for (i in array) split(i,a,".")}($1~a[2]){ print $2,M}' file2 file1

I added the perl tag because I am interested in a Perl solution too, but I would mainly like to solve this with awk if possible.

Thanks.

Answer 1

$ awk -F'[.,]' 'NR==FNR{a[$3]=$1; next} {print $2, a[$1], $3}' file2 FS=, file1
Mary Martins 24 Fuller Rd
Fred Jacob 19 St Johns
Jonathan Hawkins 8 Poplar Drive
Susan Plateu 116 Shepherds Way
Michael Dawning 4 Nerthern Court

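-F'[.,]' uses either . or , as the field separator for file2
NR==FNR{a[$3]=$1; next} saves the first field of each file2 line, using the third field as the key
FS=, changes the field separator to , alone for file1, so periods in file1 are not treated as separators
print $2, a[$1], $3 prints the required data (the default OFS is a single space character)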
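
To illustrate why FS=, sits between the two file names, here is a small sketch that runs the same program with and without it. The file1 row with a period in the address ("24 St. Fuller Rd") is a made-up example, not part of the question's data, and the scratch directory is likewise an assumption.

```shell
dir=$(mktemp -d)
# Hypothetical file1 row whose address contains a period.
printf '%s\n' '1,Mary,24 St. Fuller Rd' > "$dir/file1"
printf '%s\n' 'Martins,Order.1.Hermes' > "$dir/file2"

# Without FS=, the [.,] separator also applies to file1,
# so the address is cut at the period.
without_fs=$(awk -F'[.,]' 'NR==FNR{a[$3]=$1; next} {print $2, a[$1], $3}' \
    "$dir/file2" "$dir/file1")

# With FS=, between the file names, awk switches to comma-only
# splitting just before it opens file1.
with_fs=$(awk -F'[.,]' 'NR==FNR{a[$3]=$1; next} {print $2, a[$1], $3}' \
    "$dir/file2" FS=, "$dir/file1")

printf '%s\n' "$without_fs"   # Mary Martins 24 St
printf '%s\n' "$with_fs"      # Mary Martins 24 St. Fuller Rd
rm -rf "$dir"
```

Assignments given as command-line operands are applied when awk reaches them in the argument list, which is why the position of FS=, matters.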

Answer 2

perl -F'[.,]' -ane '$ARGV eq "file2" ? $r{$F[2]} = $F[0] : print "$F[1] $r{$F[0]} $F[2]"' file2 file1

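-F'[.,]' uses either . or , as the field separator for both file2 and file1
in file2, $F[0] = last name and $F[2] = the id we want to match against the id in file1
in file1, $F[0] = id, $F[1] = first name, $F[2] = address
-ane: -n wraps the code in a while loop over the input lines, -a auto-splits $_ into @F, and -e takes the code from the command line
This example uses Perl's ternary operator, which looks like CONDITION ? EVALUATE_IF_CONDITION_WAS_TRUE : EVALUATE_IF_CONDITION_WAS_FALSE
$ARGV holds the name of the current file while its lines are being read
$ARGV eq "file2" is the condition
If it is true (the line was read from "file2"), we store the id as the key and the last name as the value in a Perl hash (%r)
Once the lines of file2 are exhausted, the condition becomes false because $ARGV is then 'file1'
Because %r already holds the mapping from file2, we can look up the last name with $r{$F[0]}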
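
For comparison, the same file-name test can be written in awk, where FILENAME plays the role of $ARGV. This is only a sketch under the assumption that the files are literally named file1 and file2 in the current directory. The scratch directory and the array names part and r are illustrative.

```shell
start=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' '1,Mary,24 Fuller Rd' '2,Fred,19 St Johns' \
    '3,Jonathan,8 Poplar Drive' '4,Susan,116 Shepherds Way' \
    '5,Michael,4 Nerthern Court' > file1
printf '%s\n' 'Dawning,Order.5.DHL' 'Hawkins,Order.3.FedEx' \
    'Jacob,Order.2.Yodel' 'Plateu,Order.4.DPD' \
    'Martins,Order.1.Hermes' > file2

# FILENAME == "file2" mirrors the Perl condition $ARGV eq "file2":
# lines from file2 fill the lookup table, lines from file1 are printed.
by_name=$(awk -F, '
    FILENAME == "file2" { split($2, part, "."); r[part[2]] = $1; next }
    { print $2, r[$1], $3 }
' file2 file1)

printf '%s\n' "$by_name"
cd "$start" && rm -rf "$dir"
```

Unlike NR==FNR, the file-name test still behaves sensibly when file2 is empty, at the cost of hard-coding the name.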
