Perl

Question

I have this string:

$str="     a, b,    c>d:e,  f,    g ";

The string may contain spaces and/or tabs.

I split the string in Perl:

my (@COLUMNS) = split(/[\s\t,]+/, $str);

But this creates an empty leading entry at position [0].

@COLUMNS=[

    a
    b
    c>d:e
    f
    g
]

I want this:

@COLUMNS=[
    a
    b
    c>d:e
    f
    g
]

Answer 1

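A very common solution is to transform the values returned by split. In this case you want to remove any leading or trailing whitespace, commonly called a trim operation. With this approach you don't have to worry about whitespace in the split operation at all: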

use strict; 
use warnings; 

my $str="     a, b,    c>d:e,  f,    g ";
my @columns = map { s/^\s*|\s*$//gr } split(/,/, $str);
print join(',', @columns), "\n";

Another solution, mentioned by @toolic above, is to remove all whitespace beforehand:

use strict; 
use warnings; 

my $str="     a, b,    c>d:e,  f,    g ";
$str =~ s/\s+//g; # remove all occurrences of 1 or more spaces
my @columns = split(/,/, $str);
print join(',', @columns), "\n";

Both solutions above produce this output:

a,b,c>d:e,f,g
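
The two approaches agree on this input, but they differ once a field contains internal whitespace. A minimal sketch, assuming a hypothetical input (not from the question) with a two-word field:

```perl
use strict;
use warnings;

# Hypothetical input for illustration: one field has an internal space.
my $str = "  a, two words ,  b ";

# Approach 1: split on commas, then trim each field (/r needs Perl 5.14+).
my @trimmed = map { s/^\s+|\s+$//gr } split /,/, $str;

# Approach 2: delete all whitespace from a copy, then split on commas.
(my $squeezed = $str) =~ s/\s+//g;
my @stripped = split /,/, $squeezed;

print join('|', @trimmed), "\n";    # a|two words|b
print join('|', @stripped), "\n";   # a|twowords|b
```

Trimming keeps the space inside a field, while removing all whitespace up front collapses it, so the second approach is only safe when fields never contain whitespace.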

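More about the /r modifier:

/r is a modifier that makes a substitution non-destructive. The original string is left unmodified; instead, a copy is created, modified, and returned. This is useful because, in scalar context, the s/// operator normally returns the number of substitutions made rather than the modified string. /r is only available in Perl 5.14 and later. For earlier versions, the equivalent statement is: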

my $original = "some_string";
(my $copy = $original) =~ s/$search_pattern/$replace_pattern/;

And used in a map:

map { 
   (my $temp = $_) =~ s/$search_pattern/$replace_pattern/; $temp 
} split /$delimiter/, $original;

For example:

my $string = 'abc'; 
my $num_substitutions = $string =~ s/a/d/; # 1 

my $string = 'abc';
my $new_string = $string =~ s/a/d/r; # dbc
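
For Perl versions before 5.14, the copy-then-substitute idiom above can be applied to the question's string. A minimal sketch, with the placeholder patterns filled in by a trim regex chosen here for illustration:

```perl
use strict;
use warnings;

my $str = "     a, b,    c>d:e,  f,    g ";

# Copy-then-substitute: no /r needed, so this also runs on Perl older than 5.14.
my @columns = map {
    (my $temp = $_) =~ s/^\s+|\s+$//g;   # trim the copy, leave $_ untouched
    $temp;                                # the block's value is the trimmed copy
} split /,/, $str;

print join(',', @columns), "\n";   # a,b,c>d:e,f,g
```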

Answer 2

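I suggest using a global regex match to find all subsequences of characters that are neither commas nor whitespace.

It produces the same output as your split(/[\s\t,]+/, $str), except that the resulting list has no empty elements. (Note that \t is redundant, because \s also matches tabs.)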

use strict;
use warnings 'all';

my $str = "     a, b,    c>d:e,  f,    g ";

my @columns = $str =~ /[^\s,]+/g;

use Data::Dump;
dd \@columns;

Output

["a", "b", "c>d:e", "f", "g"]
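
To see where the two techniques differ, here is a small sketch that runs both on the question's string. The variable names are illustrative only:

```perl
use strict;
use warnings;

my $str = "     a, b,    c>d:e,  f,    g ";

# split: the separator run at the start of the string yields an empty first field.
my @from_split = split /[\s,]+/, $str;

# global match: only runs of non-separator characters are returned.
my @from_match = $str =~ /[^\s,]+/g;

printf "split: %d fields, first is '%s'\n", scalar @from_split, $from_split[0];
printf "match: %d fields, first is '%s'\n", scalar @from_match, $from_match[0];
```

split discards trailing empty fields by default but keeps a leading one, which is why only position [0] is affected in the question.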

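Note that, like your split, this approach ignores any empty fields: something like a,,,b returns [ 'a', 'b' ] rather than [ 'a', '', '', 'b' ]. Also, columns containing whitespace are split, so a,two words,b yields [ 'a', 'two', 'words', 'b' ] rather than [ 'a', 'two words', 'b' ]. Only you know whether these cases can occur.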
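
Both caveats can be checked directly. A brief sketch, using the two example inputs from the paragraph above and illustrative variable names:

```perl
use strict;
use warnings;

# Empty fields vanish, because only non-empty runs can match.
my @sparse = 'a,,,b' =~ /[^\s,]+/g;

# A field with internal whitespace is broken into separate items.
my @spaced = 'a,two words,b' =~ /[^\s,]+/g;

print scalar(@sparse), " items: @sparse\n";   # 2 items: a b
print scalar(@spaced), " items: @spaced\n";   # 4 items: a two words b
```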

If this approach could produce wrong results, it is better to simply split on commas and write a subroutine to trim the resulting fields:

use strict; 
use warnings 'all';

sub trim(;$);

my $str="     a  ,, ,two words ,,, b";
my @columns = map trim, split /,/, $str;

use Data::Dump;
dd \@columns;


sub trim(;$) {
    (my $trimmed = $_[0] // $_) =~ s/\A\s+|\s+\z//g;
    $trimmed;
}

Output

["a", "", "", "two words", "", "", "b"]

Perl - Split string on comma. Ignore whitespace
