Replace variables based upon a value in another file

Question

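I'm new to awk and want to replace the IIDs in file 1 using the conversion in file 2. Both files are txt files.

I have one file (file 1) that looks like this (only the first two columns are shown) and has 2060 rows: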

FID IID
1   RQ00001-2
2   RQ00002-0
3   RQ00004-9
4   RQ00005-4   
5   RQ00006-5

I have another file that shows the conversion of each IID to another format. That file looks like this:

id Id
468768 RQ00001-2
468769 RQ00006-5
468770 RQ00005-4
468771 RQ00002-0
468772 RQ00004-9

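So I basically want to use file 2 as a conversion table and replace each IID in file 1 with the matching id from file 2. For example, the first data row of file 1 should end up as 1 468768.

I know I can do this with awk, but I'm not sure how. Any help would be appreciated.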

Answer 1

Note: OP's original question included:

awk 'FNR==NR{a[$1]=$2;next} {print $1,$1 in a?a[$1]:$2}' OFS="\t" Input_file2 Input_file1
But I have no idea what this means    ^ and I don't think it's applicable for my problem.

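$1 in a ? a[$1] : $2 is the awk ternary operator.

In this case it reads as: if $1 in a then output a[$1] else output $2

For this particular case it says: if $1 (the first field of the second file) is an index in the a[] array, print the contents of the array entry a[$1]; otherwise print the contents of $2 (the second field of the second file). In other words, the ternary operator decides whether you keep the current field #2 value or replace it with the corresponding value from the array.

That said, I think there is a problem with the current awk code ...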
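As a small illustration of the ternary combined with in, here is a minimal sketch. The one-entry array is hard-coded for the demo, and RQ99999-9 is a made-up IID that has no mapping; neither comes from OP's files.

```shell
# Hypothetical data: one IID with a mapping, one (made-up) IID without.
out=$(printf '%s\n' 'RQ00001-2' 'RQ99999-9' |
  awk 'BEGIN { a["RQ00001-2"] = "468768" }   # one-entry lookup array
       # ($1 in a) tests for the key without creating it;
       # the ternary then picks the mapped value or falls back to $1
       { print (($1 in a) ? a[$1] : $1) }')
printf '%s\n' "$out"
```

The first input line has a mapping, so its replacement 468768 is printed; the second has none, so RQ99999-9 passes through unchanged.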

Assumptions:

if file #1 / field #2 matches file #2 / field #2 then ...
replace file #1 / field #2 with file #2 / field #1

One idea for modifying the awk code:

awk -v OFS="\t" '                          # just my personal preference to list variables first; OP can leave after script and before files

        # process 1st input file

FNR==NR { if ( FNR>1 )                     # skip 1st line "id Id"
             a[$2]=$1                      # 2nd file: 2nd field is index, 1st field is replacement value
          next
        }

        # process 2nd input file

        { print $1, $2 in a ? a[$2] : $2 }  # if 2nd field is an index in array a[] then replace it with said array value else keep current 2nd field
' Input_file2 Input_file1

 # eliminating comments and pesky white space for a 'compact' one-liner:

 awk -v OFS="\t" 'FNR==NR {if(FNR>1)a[$2]=$1;next}{print $1,$2 in a?a[$2]:$2}' Input_file2 Input_file1

Both of these generate:

FID     IID
1       468768
2       468771
3       468772
4       468770
5       468769
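To try this on the sample data from the question, a minimal sketch follows. It assumes mktemp is available and works in a scratch directory; the file names Input_file1 and Input_file2 follow the answer, and the ternary is parenthesized only for readability.

```shell
# Work in a scratch directory so no real files are touched (assumes mktemp exists).
cd "$(mktemp -d)" || exit 1

# File 1: the FID / IID rows shown in the question.
printf '%s\n' 'FID IID' '1   RQ00001-2' '2   RQ00002-0' '3   RQ00004-9' \
              '4   RQ00005-4' '5   RQ00006-5' > Input_file1

# File 2: the id / Id conversion table shown in the question.
printf '%s\n' 'id Id' '468768 RQ00001-2' '468769 RQ00006-5' '468770 RQ00005-4' \
              '468771 RQ00002-0' '468772 RQ00004-9' > Input_file2

# Build the lookup from file 2 (skipping its header), then swap field 2 of file 1.
result=$(awk -v OFS="\t" '
  FNR==NR { if (FNR>1) a[$2]=$1; next }
          { print $1, (($2 in a) ? a[$2] : $2) }
' Input_file2 Input_file1)
printf '%s\n' "$result"
```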

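Notes:

OP mentioned wanting to replace the values in file #1; OP will need to capture the output of this awk script in another (temporary) file and then overwrite the original file with this new (temporary) file; it is up to OP to decide whether a backup copy of file #1 should be made first
OP mentioned file #1 has more than 2 columns; assuming the number of columns could be 'large' and/or dynamic, OP can make the following change to the code ...

Modify the code to replace file #1 / field #2 and then print the line: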

# change from:

{ print $1, $2 in a ? a[$2] : $2 }

# change to:

{ $2 = $2 in a ? a[$2] : $2; print }   # overwrite value in field #2 in current line and then print current line
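Putting the two notes together, here is a hedged sketch of rewriting file #1 through a temporary file while keeping any extra columns. The third column (PHENO), its values, and the .bak backup name are made up for illustration, and mktemp is assumed to be available.

```shell
# Scratch directory with small sample files (PHENO is a hypothetical extra column).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'FID IID PHENO' '1 RQ00001-2 0.5' '2 RQ00002-0 1.2' > Input_file1
printf '%s\n' 'id Id' '468768 RQ00001-2' '468771 RQ00002-0' > Input_file2

cp Input_file1 Input_file1.bak     # optional backup of the original
tmp=$(mktemp)                      # temporary file that receives the awk output

# Overwrite field 2 when a mapping exists, then print the whole (rebuilt) line.
awk -v OFS="\t" '
  FNR==NR { if (FNR>1) a[$2]=$1; next }
          { $2 = (($2 in a) ? a[$2] : $2); print }
' Input_file2 Input_file1 > "$tmp" &&
mv "$tmp" Input_file1              # replace the original only if awk succeeded

cat Input_file1
```

Assigning to $2 makes awk rebuild the line with OFS, so every output line, including the header, ends up tab-separated.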
