Why is the dtype shown (even if it's the native one) when using floor division with NumPy?

Question

Normally the dtype is hidden when it is equivalent to the native type:

>>> import numpy as np
>>> np.arange(5)
array([0, 1, 2, 3, 4])
>>> np.arange(5).dtype
dtype('int32')

>>> np.arange(5) + 3
array([3, 4, 5, 6, 7])

But somehow that doesn't apply to floor division or modulo:

>>> np.arange(5) // 3
array([0, 0, 0, 1, 1], dtype=int32)
>>> np.arange(5) % 3
array([0, 1, 2, 0, 1], dtype=int32)

Why is there a difference?

Python 3.5.4, NumPy 1.13.1, Windows 64-bit

Answer 1

It boils down to a difference in dtype, as a view shows:

In [186]: x = np.arange(10)
In [187]: y = x // 3
In [188]: x
Out[188]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [189]: y
Out[189]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3], dtype=int32)
In [190]: x.view(y.dtype)
Out[190]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
In [191]: y.view(x.dtype)
Out[191]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])

Although the dtype descr is the same for both, some attribute must differ. But which one?

In [192]: x.dtype.descr
Out[192]: [('', '<i4')]
In [193]: y.dtype.descr
Out[193]: [('', '<i4')]

In [204]: x.dtype.type
Out[204]: numpy.int32
In [205]: y.dtype.type
Out[205]: numpy.int32
In [207]: x.dtype.type is y.dtype.type
Out[207]: False

In [243]: np.core.numeric._typelessdata
Out[243]: [numpy.int32, numpy.float64, numpy.complex128]
In [245]: x.dtype.type in np.core.numeric._typelessdata
Out[245]: True
In [246]: y.dtype.type in np.core.numeric._typelessdata
Out[246]: False

So y's dtype.type looks the same as x's on the surface, but it is a different object with a different id:

In [261]: id(np.int32)
Out[261]: 3045777728
In [262]: id(x.dtype.type)
Out[262]: 3045777728
In [263]: id(y.dtype.type)
Out[263]: 3045777952
In [282]: id(np.intc)
Out[282]: 3045777952

After adding this extra type to the list, y no longer displays its dtype:

In [267]: np.core.numeric._typelessdata.append(y.dtype.type)
In [269]: y
Out[269]: array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])

So y.dtype.type is np.intc (and np.intp), while x.dtype.type is np.int32 (and np.int_).

So, to make an array that does display its dtype, use np.intc.

In [23]: np.arange(10,dtype=np.int_)
Out[23]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [24]: np.arange(10,dtype=np.intc)
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)

To turn this behaviour off, append np.intc to np.core.numeric._typelessdata.
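If editing a private list feels fragile, another option is to cast the result back to the default integer type. This is only a minimal sketch: it assumes np.int_ is the scalar type your NumPy build treats as the default integer, and the name y_default is purely illustrative. Whether y itself shows a dtype depends on the platform and the NumPy version.

```python
import numpy as np

x = np.arange(10)
y = x // 3

# Cast back to NumPy's default integer scalar type; the values are unchanged.
y_default = y.astype(np.int_)

print(repr(y))
print(repr(y_default))
# The scalar type of the cast array is the very same object as np.int_.
print(y_default.dtype.type is np.int_)
```

This only touches public API, so it does not rely on the layout of np.core.numeric.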

Answer 2

You actually have multiple distinct 32-bit integer dtypes here. This is probably a bug.

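NumPy has (accidentally?) created multiple distinct signed 32-bit integer types, probably corresponding to C int and long. Both of them display as numpy.int32, but they are actually different objects. At the C level, I believe the type objects are PyIntArrType_Type and PyLongArrType_Type, generated in the NumPy C source.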
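To see whether your own build has such "equal but distinct" integer types, something like the following could be used. It is a sketch, not part of NumPy: equal_but_distinct is a hypothetical helper, and it assumes the single-character type codes 'b', 'h', 'i', 'l', 'q' map to the signed C types char, short, int, long and long long.

```python
import itertools
import numpy as np

# Type codes for the signed C integers: signed char, short, int, long, long long.
C_INT_CODES = "bhilq"

def equal_but_distinct():
    """Return pairs of codes whose dtypes compare equal but whose scalar types differ."""
    pairs = []
    for a, b in itertools.combinations(C_INT_CODES, 2):
        da, db = np.dtype(a), np.dtype(b)
        # dtype equality checks equivalence; `is` checks scalar-type object identity.
        if da == db and da.type is not db.type:
            pairs.append((a, b))
    return pairs

for a, b in equal_but_distinct():
    print(a, b, np.dtype(a), np.dtype(a).type, np.dtype(b).type)
```

Which pairs are reported, if any, depends on the platform's C integer sizes and on the NumPy version.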

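dtype objects have a type attribute that corresponds to the type object of scalars of that dtype. It is this type attribute that NumPy inspects when deciding whether to print dtype information in an array's repr: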

_typelessdata = [int_, float_, complex_]
if issubclass(intc, int):
    _typelessdata.append(intc)


if issubclass(longlong, int):
    _typelessdata.append(longlong)

...

def array_repr(arr, max_line_width=None, precision=None, suppress_small=None):
    ...
    skipdtype = (arr.dtype.type in _typelessdata) and arr.size > 0

    if skipdtype:
        return "%s(%s)" % (class_name, lst)
    else:
        ...
        return "%s(%s,%sdtype=%s)" % (class_name, lst, lf, typename)

For numpy.arange(5) and numpy.arange(5) + 3, .dtype.type is numpy.int_; for numpy.arange(5) // 3 or numpy.arange(5) % 3, .dtype.type is the other 32-bit signed integer type.
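A quick way to check this on a given installation is to print the C-level type code next to the dtype. This is a sketch only; the dictionary name results is illustrative, and on many platforms and NumPy versions all four rows will come out identical.

```python
import numpy as np

a = np.arange(5)
results = {"a": a, "a + 3": a + 3, "a // 3": a // 3, "a % 3": a % 3}

for label, arr in results.items():
    # dtype.char is the C-level type code ('i' for int, 'l' for long, 'q' for long long).
    print(f"{label:8} dtype={arr.dtype} char={arr.dtype.char!r} "
          f"is_int_={arr.dtype.type is np.int_}")
```

If a row shows the same dtype name but a different char or is_int_=False, that array carries the "other" integer type described above.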

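As for why + and // have different output dtypes: they use different type resolution routines (one for //, another for +). The type resolution for // looks for a ufunc inner loop that takes types the inputs can be safely converted to, while the type resolution for + applies NumPy type promotion to the arguments and picks the loop that matches the resulting type.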
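The two strategies can be imitated roughly in Python. This is only an illustration of the description above, not NumPy's actual C code: first_safe_loop is a hypothetical helper that walks the public ufunc.types list in order, and np.result_type stands in for the promotion step.

```python
import numpy as np

def first_safe_loop(ufunc, *dtypes):
    """Hypothetical helper: first entry of ufunc.types all inputs can be safely cast to."""
    for sig in ufunc.types:
        in_codes, _, _out_codes = sig.partition("->")
        if len(in_codes) != len(dtypes):
            continue
        try:
            ok = all(np.can_cast(dt, np.dtype(code), casting="safe")
                     for dt, code in zip(dtypes, in_codes))
        except (TypeError, ValueError):
            # Skip any type code this NumPy build cannot turn into a dtype.
            ok = False
        if ok:
            return sig
    return None

c_long = np.dtype("l")  # C long, the default integer in the question's setup
# Linear search, as described for //:
print(first_safe_loop(np.floor_divide, c_long, c_long))
# Promotion of the argument types, as described for +:
print(np.result_type(c_long, c_long).char)
```

On a platform where C int and C long are both 32 bits, the search would be expected to stop at the 'i' loop, which is listed before 'l', while promotion keeps 'l'. That matches the difference seen in the question.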
