Creating n arrays dynamically in awk

Question

I want to call an awk program with n files, but I don't know in advance how large n is.

awk -f program.awk test1 test2 test3 ... testn

In the program I would like to create one array for each file test1, test2, and so on.

    BEGIN{
        files = ARGC-1;
        for (i=1; i<=files; ++i){
                # here I would like to create arr1, arr2 ... arrn
        }
     }

Any idea how to solve this?

Thanks

Answer 1

If you have GNU awk (for arrays of arrays):

awk '
FNR==1 { fd++ }                 # first line of a new file increment variable fd (1st dimension index for the lines[] array)
       { lines[fd][FNR]=$0 }    # simple example of saving each line in array
' test1 test2 test3 ... testn
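
To make the storage above easier to picture, here is a minimal sketch that also reads the data back in an END block. It assumes GNU awk 4.0 or later is installed as gawk; the temporary directory, the two sample files, and their contents are made up for the demo.

```shell
# Hypothetical sample data: two small files in a temporary directory.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/test1"
printf 'b1\nb2\nb3\n' > "$dir/test2"

if command -v gawk >/dev/null 2>&1; then
    out=$(gawk '
    FNR==1 { fd++ }                    # new file: advance the first-dimension index
           { lines[fd][FNR] = $0 }     # store each line under [file index][line number]
    END {
        # lines[f] is itself an array, so length(lines[f]) is the line count of file f
        for (f = 1; f <= fd; f++)
            for (i = 1; i <= length(lines[f]); i++)
                print f, i, lines[f][i]
    }
    ' "$dir/test1" "$dir/test2")
else
    out="gawk not installed"           # arrays of arrays are a gawk extension
fi

printf '%s\n' "$out"
rm -rf "$dir"
```

Each output line shows the file index, the line number, and the stored line, so lines[2][3] is the third line of the second file.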

If you don't have GNU awk, you can emulate a multi-dimensional array, for example:

awk '
FNR==1 { fd++ }
       { lines[fd FS FNR]=$0 }  # concatenate fd and FNR to form index for single-dimensional array
' test1 test2 test3 ... testn

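Note: one drawback of this approach is that, to access all records of a given file (for example, fd=1), you have to scan every index, split it on FS, and then test whether the first piece equals fd.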
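
The scan described in the note can be avoided by keeping a per-file line count, so that the records of one file can be looked up directly. The sketch below is one possible way to do that, not part of the original answer: it uses the standard awk comma subscript (which joins the parts with SUBSEP) instead of FS, and the count array, the want variable, and the sample files are names invented for the demo.

```shell
# Hypothetical sample data: two small files in a temporary directory.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/test1"
printf 'b1\nb2\nb3\n' > "$dir/test2"

out=$(awk '
FNR==1 { fd++ }                    # new file: advance the file index
       { lines[fd, FNR] = $0       # "fd, FNR" joins the two values with SUBSEP
         count[fd] = FNR }         # remember how many lines file fd has so far
END {
    want = 2                       # hypothetical: the file whose records we want
    for (i = 1; i <= count[want]; i++)
        print lines[want, i]       # direct lookup, no scan over all indices
}
' "$dir/test1" "$dir/test2")

printf '%s\n' "$out"
rm -rf "$dir"
```

This works in any POSIX awk, at the cost of one extra array holding the line counts.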

Answer 2

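Here is what I would suggest in GNU awk. Say we have file names such as test1, test2, test3, test4, and so on. For my example I created 8 dummy files, test1 through test8.

Here I build an array and print the file names, just to show that GNU awk gives you each file's index out of the box, in a built-in variable named ARGIND.

Here is a code snippet for this (written and tested in GNU awk):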

awk '
NR==1{
  print ARGC-1
}
FNR==1{
  arr[ARGIND","FILENAME]
  nextfile
}
END{
  PROCINFO["sorted_in"]="@ind_num_asc"
  for(i in arr){
    split(i,getVal,",")
    print "Index is:"getVal[1]", Filename is:"getVal[2]
  }
}
' test*

Sample output:

Index is:1, Filename is:test1
Index is:2, Filename is:test2
Index is:3, Filename is:test3
Index is:4, Filename is:test4
Index is:5, Filename is:test5
Index is:6, Filename is:test6
Index is:7, Filename is:test7
Index is:8, Filename is:test8

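I am only showing you how to get both pieces of information: the index that gawk assigns to each file through ARGIND, in the order it processes the files, and the name of that file. Once you understand the idea, you can build on it.

Note 1: I used NR==1{print ARGC-1} only to show how to print the number of arguments (that is, the number of files) passed to the awk program at run time.

Note 2: PROCINFO["sorted_in"]="@ind_num_asc" makes the for-in loop visit the indices in ascending numeric order: 1, 2, 3, and so on.

Explanation: the same awk program with detailed comments.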

awk '                                       ##Start of the awk program.
NR==1{                                      ##True only for the very first line of the very first file.
  print ARGC-1                              ##Print the number of files passed in, which is ARGC-1.
}
FNR==1{                                     ##True for the first line of every file.
  arr[ARGIND","FILENAME]                    ##Create an array element indexed by ARGIND, a comma, and the file name.
  nextfile                                  ##Skip the rest of this file and move on to the next one.
}
END{                                        ##Start of the END block.
  PROCINFO["sorted_in"]="@ind_num_asc"      ##Make for-in loops visit indices in ascending numeric order.
  for(i in arr){                            ##Traverse arr.
    split(i,getVal,",")                     ##Split the index on the comma into the getVal array.
    print "Index is:"getVal[1]", Filename is:"getVal[2] ##Print the index and file name from the first 2 elements of getVal.
  }
}
' test*                                     ##Pass all the test files to this awk program.
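
For an awk without ARGIND, a similar index-to-name mapping can be approximated with a counter that is bumped on the first line of each file. This is a sketch under that assumption rather than an exact equivalent: the counter only advances for files that contain at least one line, whereas ARGIND is the position of the current file in ARGV. The name array, the fd counter, and the three sample files are invented for the demo.

```shell
# Hypothetical sample data: three one-line files in a temporary directory.
dir=$(mktemp -d)
printf 'x\n' > "$dir/test1"
printf 'y\n' > "$dir/test2"
printf 'z\n' > "$dir/test3"

out=$(cd "$dir" && awk '
FNR==1 { name[++fd] = FILENAME }   # first line of each file: record its name under the next index
END {
    for (i = 1; i <= fd; i++)      # counting loop, so no sorted_in setting is needed
        print "Index is:" i ", Filename is:" name[i]
}
' test1 test2 test3)

printf '%s\n' "$out"
rm -rf "$dir"
```

The sketch leaves out nextfile, so it reads every line of every file; that keeps it usable with older awk versions that lack nextfile.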
