Unicode elementwise string comparison in numpy

Question

I have a question about equality comparison between numpy arrays of strings and plain strings. Say I define the following array:

import numpy as np
x = np.array(['yes', 'no', 'maybe'])

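I can then test it for equality against another string, and the single string is compared elementwise against the array (following, I think, the broadcasting rules described here: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html ?):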

'yes' == x
#op : array([ True, False, False], dtype=bool)

x == 'yes'
#op : array([ True, False, False], dtype=bool)

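However, if I compare against a unicode string I get different behaviour: the elementwise comparison only happens when the array is on the left-hand side, whereas with the string on the left-hand side only a single comparison is made.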

x == u'yes'
#op : array([ True, False, False], dtype=bool)

u'yes' == x
#op : False

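I can't find details of this behaviour in the numpy docs. Can someone explain, or point me to documentation on, why comparison with unicode strings behaves differently?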

Answer 1

The relevant piece of information is this part of Python's coercion rules:

For objects x and y, first x.__op__(y) is tried. If this is not implemented or returns NotImplemented, y.__rop__(x) is tried.
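The fallback described in that rule can be observed without numpy. The following is only a minimal sketch: Elementwise is a hypothetical stand-in class (not part of numpy) whose __eq__ compares every item against the other operand, roughly the way an array would.

```python
class Elementwise(object):
    """Hypothetical stand-in for an array of strings (not a numpy type)."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        # Compare each stored item against the other operand.
        return [item == other for item in self.items]


x = Elementwise(['yes', 'no', 'maybe'])

# Calling the left operand's method directly shows that str declines
# to compare itself with an unknown type.
direct = 'yes'.__eq__(x)

# The == operator sees NotImplemented and falls back to x.__eq__('yes').
result = ('yes' == x)

print(direct)
print(result)
```

Because the left operand declines, the right operand's __eq__ decides the outcome, which is why the order of the operands does not matter in this case.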

With your numpy array x, in the case where the left-hand side is a str ('yes' == x):

'yes'.__eq__(x) returns NotImplemented, so the
comparison resolves to x.__eq__('yes'), which performs numpy's elementwise comparison.

However, when the left-hand side is a unicode (u'yes' == x):

u'yes'.__eq__(x) simply returns False.

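The reason for the different __eq__ behaviours is that str.__eq__() simply returns NotImplemented if its argument is not a str, whereas unicode.__eq__() first tries to convert its argument to unicode and returns NotImplemented only if that conversion fails. In this case the numpy array is convertible to unicode, so u'yes' == x is essentially u'yes' == unicode(x).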
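The two behaviours can be imitated with plain classes. This is a sketch under stated assumptions, not the actual implementation of str or unicode: StrLike, UnicodeLike and Elementwise are hypothetical names invented for illustration, and the coercion step uses str() as a stand-in for the unicode conversion the answer describes.

```python
class Elementwise(object):
    """Hypothetical stand-in for a numpy array of strings."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return [item == other for item in self.items]

    def __str__(self):
        return '[' + ' '.join(repr(i) for i in self.items) + ']'


class StrLike(object):
    """Hypothetical: declines foreign types, as described for str."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, StrLike):
            other = other.text
        if not isinstance(other, str):
            return NotImplemented  # let the other operand decide
        return self.text == other


class UnicodeLike(object):
    """Hypothetical: coerces the other operand first, as described for unicode."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        try:
            if isinstance(other, UnicodeLike):
                other_text = other.text
            else:
                other_text = str(other)  # stand-in for unicode(other)
        except Exception:
            return NotImplemented  # only reached if the conversion fails
        return self.text == other_text


x = Elementwise(['yes', 'no', 'maybe'])

deferred = (StrLike('yes') == x)      # left side declines, x.__eq__ runs
coerced = (UnicodeLike('yes') == x)   # left side answers by itself
swapped = (x == UnicodeLike('yes'))   # x.__eq__ runs first

print(deferred)
print(coerced)
print(swapped)
```

In this imitation, as in the question, putting the array-like object on the left-hand side gives the elementwise result regardless of which string type is on the right.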

Tags: python, arrays, unicode, numpy, python-2.x