How to repeat a sequence of numbers to the end of a column?

Question

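I have a data file that needs a new identifier column cycling from 1 to 5. The end goal is to split the data into five separate files with no leftover file (split leaves a leftover file).

Data: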

aa
bb
cc
dd
ff
nn
ww
tt
pp

With the identifier column:

aa 1
bb 2
cc 3
dd 4
ff 5
nn 1
ww 2
tt 3
pp 4

Not sure whether this can be done with seq? Afterwards the data would be split with:

awk '$2 == 1 {print $0}'
awk '$2 == 2 {print $0}'
awk '$2 == 3 {print $0}'
awk '$2 == 4 {print $0}'
awk '$2 == 5 {print $0}'

Answer 1

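Perl to the rescue:

perl -pe 's/$/" " . $. % 5/e' < input > output

Note that this prints 0 instead of 5 on every fifth line.

$. is the current line number.
% is the modulo operator.
The /e modifier tells the substitution to evaluate the replacement part as code.

That is, the end of the line ($) is replaced with a space concatenated (.) with the line number modulo 5.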

Answer 2

$ awk '{print $0, ((NR-1)%5)+1}' file
aa 1
bb 2
cc 3
dd 4
ff 5
nn 1
ww 2
tt 3
pp 4

Of course, you don't need that identifier column to create the 5 separate files. All you need is:

awk '{print > ("file_" ((NR-1)%5)+1)}' file
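To see the round-robin split in action, here is a small self-contained sketch on the question's sample data. The names data.txt and part_1 to part_5 are assumptions made for this illustration, and the filename expression is fully parenthesized as a precaution so the redirection target is unambiguous.

```shell
# Sample input matching the question's data (the name data.txt is an assumption).
printf '%s\n' aa bb cc dd ff nn ww tt pp > data.txt

# Line N goes to part_K where K = ((N-1) % 5) + 1, so K cycles 1..5.
# The whole filename expression is parenthesized so the redirection
# target is parsed as a single string.
awk '{ print > ("part_" (((NR - 1) % 5) + 1)) }' data.txt

# Show how many lines landed in each output file.
wc -l part_1 part_2 part_3 part_4 part_5
```

With the nine sample lines, part_1 to part_4 each receive two lines and part_5 receives one, so there is no leftover file.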

It looks like you're happy with the Perl solution, which outputs 1-4 and then 0 rather than 1-5, so FYI here is the equivalent in awk:

$ awk '{print $0, NR%5}' file
aa 1
bb 2
cc 3
dd 4
ff 0
nn 1
ww 2
tt 3
pp 4

Answer 3

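I'm going to offer a Perl solution even though the question wasn't tagged with it, because Perl is well suited to this problem.

If I understand what you want to do, you have a single file that you want to split into 5 separate files based on the position of each line in the data file: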

the first line in the data file goes to file 1
the second line in the data file goes to file 2 
the third line in the data file goes to file 3 
...

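Since you already have each line's position in the file, you don't really need the identifier column (though you could pursue that solution if you wanted).

Instead, you can open 5 filehandles and simply alternate which handle you write to: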

use strict;
use warnings; 

my $datafilename = shift @ARGV; 

# open filehandles and store them in an array 
my @fhs;
foreach my $i ( 0 .. 4 ) {
   open my $fh, '>', "${datafilename}_$i"
      or die "$!";
   $fhs[$i] = $fh;
}

# open the datafile 
open my $datafile_fh, '<', $datafilename 
   or die "$!";

my $row_number = 0;
while ( my $datarow = <$datafile_fh> ) {
   print { $fhs[$row_number++ % @fhs] } $datarow;
}

# close resources
foreach my $fh ( @fhs ) {
   close $fh; 
}

