Perl PDL: Search if a vector is in an array or in a matrix

Question

I am trying to do something like a grep on a PDL matrix or on an array of vectors:

my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi=pdl(1,2,3);
print("OK") if (grep { $_ eq $titi} @toto);

I also tried:

my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi=pdl(1,2,3);
print("OK") if (grep { $_ eq $titi} PDL::Matrix->pdl(\@toto));

Neither works.

Any help, please.

Answer 1

You can use eq_pdl from Test::PDL:

use PDL;
use Test::PDL qw( eq_pdl );
my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi = pdl(4,5,6);
print("OK\n") if (grep { eq_pdl( $_, $titi) } @toto);

Output:

OK

Answer 2

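Disclaimer: I don't know anything about PDL. I read the source code to figure this out.

There is a function PDL::all() that you can combine with the overloaded comparison operator ==. The == operator compares element by element, and all() reduces that result to a single true/false value.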

use PDL;
my $foo = pdl(1,2,3);
my $bar = pdl(4,5,6);
my $qrr = pdl(1,2,3);

print "OK 1" if PDL::all( $foo == $bar );
print "OK 2" if PDL::all( $foo == $qrr );

I'm still looking for the documentation.
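Applied to the data from the question, the same idea might look like the sketch below. The helper names in_list and in_matrix are made up for illustration and are not part of PDL. The sketch assumes that each candidate is a plain 1-D vector and that, in the matrix case, the candidates are the rows of a 2-D ndarray.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper (not part of PDL): true if $needle equals any
# ndarray in the Perl list @haystack, element for element.
sub in_list {
    my ($needle, @haystack) = @_;
    my $shape = join ',', $needle->dims;
    for my $cand (@haystack) {
        # Skip candidates of a different shape so == never has to broadcast.
        next unless join(',', $cand->dims) eq $shape;
        return 1 if all($cand == $needle);
    }
    return 0;
}

# Hypothetical helper: the same test against a 2-D ndarray whose rows are
# the candidate vectors. == broadcasts $needle over the rows, andover
# reduces each row to a single 0/1, and any() checks whether a row matched.
sub in_matrix {
    my ($needle, $matrix) = @_;
    return any(andover($matrix == $needle)) ? 1 : 0;
}

my @toto = (pdl(1,2,3), pdl(4,5,6));
my $titi = pdl(1,2,3);
print "OK (list)\n"   if in_list($titi, @toto);
print "OK (matrix)\n" if in_matrix($titi, pdl(@toto));
```

Both versions scan every candidate, so they are fine for small inputs but do not take advantage of a sorted haystack.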

Answer 3

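The way to do this efficiently, in a way that scales, is to use PDL::VectorValued::Utils, with two ndarrays (the "haystack" being an ndarray, not a Perl array of ndarrays). The little function vv_in is not shown being copy-pasted into the perldl CLI, because that would make it harder to copy-paste from this answer: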

sub vv_in {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  die "needle must have 1 dim less than haystack"
    if $needle->ndims != $haystack->ndims - 1;
  my $ign = $needle->dummy(1)->zeroes;
  PDL::_vv_intersect_int($needle->dummy(1), $haystack, $ign, my $nc=PDL->null);
  $nc;
}
pdl> p $titi = pdl(1,2,3)
[1 2 3]
pdl> p $toto = pdl([1,2,3], [4,5,6])
[
 [1 2 3]
 [4 5 6]
]
pdl> p $notin = pdl(7,8,9)
[7 8 9]
pdl> p vv_in($titi, $toto)
[1]
pdl> p vv_in($notin, $toto)
[0]

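Note that for efficiency, the $haystack needs to be sorted already (use qsortvec). The dummy "inflates" the $needle into a vector-set containing one vector, and then vv_intersect returns two ndarrays:

either the intersecting vector-set (here always a single vector), or a set of zeroes (probably a shortcoming of the routine; it should instead be of shape vectorlength,0, i.e. an empty ndarray)
the number of vectors found (here either 0 or 1)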

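The "internal" (_vv_intersect_int) version is used because, as of PDL::VectorValued 1.0.15, the public version has some wrapping Perl code that does not permit broadcasting (an issue has been filed).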
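On the sorting requirement mentioned above: if the haystack does not arrive sorted, it can be sorted once up front. A minimal sketch, assuming the candidate vectors are the rows of a 2-D ndarray (the variable names are only illustrative):

```perl
use strict;
use warnings;
use PDL;

# Rows deliberately out of order.
my $unsorted = pdl([4,5,6], [1,2,3], [1,2,0]);

# qsortvec (from PDL::Ufunc, loaded by "use PDL") sorts the vectors
# along dimension 1 into lexicographic order.
my $sorted = $unsorted->qsortvec;

print $sorted;
```

The sorted copy can then be passed as $haystack to the functions in this answer.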

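Note that vv_in will "broadcast" (formerly known, confusingly, as "threading") over multiple sets of input vectors and input haystacks. This can be used to search for several vectors at once: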

sub vv_in_multi {
  my ($needles, $haystack) = @_;
  die "needles must have same number of dims as haystack"
    if $needles->ndims != $haystack->ndims;
  vv_in($needles, $haystack->dummy(-1));
}
pdl> p vv_in_multi(pdl($titi,$notin), $toto)
[1 0]

Answer 4

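Thanks to Ed for the VectorValued shout-out above (and for the bug report). On reflection, it occurred to me that if $toto is sorted (a la qsortvec(), as it is in your example), you can use vsearchvec(), also from PDL::VectorValued::Utils, which is typically faster than vv_intersect (logarithmic vs. linear):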

sub vv_in_vsearch {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  my $found = $needle->vsearchvec($haystack);
  return all($haystack->dice_axis(1,$found) == $needle);
}
pdl> $titi = pdl(1,2,3)
pdl> $tata = pdl(4,5,6)
pdl> $toto = pdl([1,2,3], [4,5,6])
pdl> $notin = pdl(7,8,9)
pdl> p vv_in_vsearch($titi, $toto)
1
pdl> p vv_in_vsearch($tata, $toto)
1
pdl> p vv_in_vsearch($notin, $toto)
0

(Full disclosure: I wrote and maintain PDL::VectorValued)
