How do you set a length and justify an Array field in Perl?

Question

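I am breaking down a text file and setting it up into a new file. The code works, but I know the format isn't lining up properly; I am new to Perl and Google searches haven't helped. Can you set the individual field lengths of an array after you have built the array?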

while (my $line = <INFILE1>) 
{   
    chomp $line;
    my @tokens = split /\t/, $line;
    $numOfElements = 0;
    $counter = 0;
    foreach $element (@tokens)
    {
        $counter = $counter + 1;
    }

    foreach $element (@tokens)
    {
        if ($element eq "" or $element eq " ")
        {
        }
        else
        {
            push @shiftedElements, $element;
            $numOfElements = $numOfElements + 1;
        }
    }

    my @finalElementLine = ($numOfElements); # used to prevent array size from not matching up with the elements in the new array
    push @finalElementLine, @shiftedElements; # fills the new array
    $printToFile = " $finalElementLine[1] |   $finalElementLine[2]   |   $finalElementLine[$numOfElements]   |   $finalElementLine[$numOfElements-4]  |  $finalElementLine[$numOfElements-3] | $finalElementLine[$numOfElements-2]  $finalElementLine[$numOfElements-1]\n";

    my $OUTFILE;
    open $OUTFILE, '>>', $newFile;
    print { $OUTFILE } $printToFile;
    close $OUTFILE;
}

Answer 1

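I am not sure that I fully understand the question; please clarify if needed.

The width of a field being printed can be controlled by printf, or you can form a string of the desired length with sprintf.

For the whole output to line up nicely you first need to find the length of the longest string in each column, or at least of the longest string overall. That isn't quite possible in what you show, since you print one line at a time.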

my $maxlen = '...';  # decide on or precompute the maximum field width

my $printToFile = join ' |  ', 
    map { sprintf "%${maxlen}s", $_ } @finalElementLine;

map formats a string of length $maxlen out of each element, padding each with spaces as needed. It returns that list, which is then joined into a scalar like the one used in the question.

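If you want to line the fields up on the left, use sprintf "%-${maxlen}s", $_. I use the s conversion (for strings) since no details are given. Please see the documentation and adjust as needed.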
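As a small illustration of the two conversions, here is a sketch with made-up sample fields and an arbitrary width of 8 (neither comes from the question's data):

```perl
use strict;
use warnings;

# Hypothetical sample fields and width, for illustration only
my @fields = ('id', 'name', 'city');
my $maxlen = 8;

# Right-justified: each field is padded with spaces on the left
my $right = join ' | ', map { sprintf "%${maxlen}s", $_ } @fields;

# Left-justified: the '-' flag pads on the right instead
my $left = join ' | ', map { sprintf "%-${maxlen}s", $_ } @fields;

# Brackets make the padding at the ends visible
print "[$right]\n";
print "[$left]\n";
```

Both strings have the same total length; only the side on which the padding goes differs.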

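For a reliable estimate of the maximum field width you need to have all the lines first. If there isn't too much data, you can store each processed line as an arrayref in an array and print everything at the end. With a few other simplifications: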

use warnings;
use strict;
use List::Util qw(max);

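my $file    = '...';  # input file
my $newFile = '...';  # output file, as in the question
open my $fh, '<', $file or die "Can't open $file: $!";

my @rows;  # one arrayref per processed line
while (my $line = <$fh>) 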
{   
    chomp $line;
    my @tokens = split /\t/, $line;   

    # Run the explicit loop if other processing is needed, or:
    my @shiftedElements = grep { $_ ne '' and $_ ne ' ' } @tokens;
    my $numOfElements = @shiftedElements;

    # UNCLEAR -- is the first element below necessary?
    # "used to prevent array size from 
    #  not matching up with the elements in the new array"
    my @finalElementLine = ($numOfElements, @shiftedElements);

    push @rows, \@finalElementLine;
}
close $fh;

my $maxlen = max map { length } map { @$_ } @rows;  # for all fields in all rows

open my $OUTFILE, '>>', $newFile or die "Can't open for appending: $!";
foreach my $rline (@rows) 
{
    my $printToFile = join ' |  ', 
        map { sprintf "%${maxlen}s", $_ } @$rline;
    print $OUTFILE $printToFile, "\n";
}
close $OUTFILE;

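This prints all fields with the same width. If some fields are much longer than others, that isn't optimal; in that case, work out the field width for each column separately and use it when printing. That makes the printing a bit messier, so do it only if necessary. Since I don't have your data this hasn't been tested, so please work out any remaining details.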
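One way the per-column variant could look is sketched below. The sample rows are invented, the names @width and @lines are introduced only for this example, and left-justification is an arbitrary choice. It uses the `*` width specifier of sprintf, which takes the width from the argument list.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical rows standing in for the processed lines stored in @rows
my @rows = (
    [ 'id', 'name',        'city' ],
    [ '17', 'Alexandrina', 'Oslo' ],
    [ '4',  'Bo' ],                    # rows may have fewer fields
);

# Width of each column: the longest field found at that index
my @width;
for my $row (@rows) {
    for my $i (0 .. $#$row) {
        $width[$i] = max( $width[$i] // 0, length $row->[$i] );
    }
}

# Format every field with its own column's width (left-justified here)
my @lines = map {
    my $row = $_;
    join ' | ', map { sprintf "%-*s", $width[$_], $row->[$_] } 0 .. $#$row;
} @rows;

print "$_\n" for @lines;
```

Each column is then only as wide as its own longest entry, so one very long field no longer stretches every other column.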

A few comments

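When an array is assigned to a scalar, the scalar gets the number of elements in the array
$counter isn't used, so I removed it. To restore it: my $counter = @tokens;
The condition in grep can be shortened by using a regex
Each line (@finalElementLine) is stored in @rows as an array reference
$maxlen: form the list of all fields in all rows, then take their lengths, then take the maximum
Each element $rline of @rows is dereferenced by @$rline into a list for map
If $numOfElements is not actually needed, the whole loop simplifies greatly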
```
push @rows, [ grep { not /^(?:| )$/ } @tokens ];
```
If you can exclude any amount of whitespace (and not only a single space), then use
grep { not /^\s*$/ }   # not only whitespace (or nothing)   -- or --
grep { /\S/ }          # a non-space character (at least one)
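To see how the three filters differ, here is a short comparison on invented tokens (the list and the variable names are assumptions made for this example):

```perl
use strict;
use warnings;

# Hypothetical tokens: empty, one space, several spaces, a tab, real data
my @tokens = ('a', '', ' ', '   ', "\t", 'b c');

# Drops only the empty string and a single space
my @exact = grep { not /^(?:| )$/ } @tokens;

# Both of these drop anything that is empty or all whitespace
my @no_blank  = grep { not /^\s*$/ } @tokens;
my @has_nonws = grep { /\S/ } @tokens;

# Number of tokens each filter keeps
printf "%d %d %d\n", scalar @exact, scalar @no_blank, scalar @has_nonws;
```

The first filter keeps the three-space and tab-only tokens, while the other two keep only 'a' and 'b c'.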

If $numOfElements isn't needed, the whole processing can be condensed to

my @rows = map { 
    chomp;                             # strip the newline, as the while loop did
    my @r = grep { /\S/ } split /\t/; 
    @r ? \@r : (); 
} <$fh>;

While this properly replaces the while loop, this kind of squeezing may not be suitable for production code.

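In list context <$fh> returns all lines of the file, and map transforms them into the output list, which is assigned to @rows. Inside the map, each line is split on tabs and the empty or space-only elements are filtered out of that list. An arrayref is returned, or an empty list () if @r ended up with no elements.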

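An empty list returned from map's block is flattened into the output list along with the other elements, so it effectively disappears from the output. This is a trick for making map do grep's job of filtering things out.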
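The filtering effect can be seen on a few invented lines held in an array instead of a filehandle (the @lines data is an assumption for this sketch, and chomp is included because the sample lines carry newlines):

```perl
use strict;
use warnings;

# Hypothetical input lines; the middle one has only blank fields
my @lines = ("a\t\tb\n", "\t \t\n", "c\n");

my @rows = map {
    chomp;                               # remove the trailing newline
    my @r = grep { /\S/ } split /\t/;    # keep fields with a non-space
    @r ? \@r : ();                       # empty list: line contributes nothing
} @lines;

print scalar(@rows), " rows kept\n";
print join(',', @$_), "\n" for @rows;
```

Three lines go in but only two arrayrefs come out, because the blank-only line yields an empty list that vanishes when map's results are flattened.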
