Why could the following operands not be broadcast together?

The arrays have the following shapes: dists: (500,5000), train: (5000,), test: (500,)

Why do the first two statements raise an error, while the third one runs fine?

  1. dists += train + test

Error: ValueError: operands could not be broadcast together with shapes (5000,) (500,)

  2. dists += train.reshape(-1,1) + test.reshape(-1,1)

Error: ValueError: operands could not be broadcast together with shapes (5000,1) (500,1)

  3. dists += train + test.reshape(-1,1)

This one works fine!

Why does this happen?

It has to do with NumPy's broadcasting rules. Quoting the NumPy manual:

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when

  1. they are equal, or
  2. one of them is 1
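
The quoted rule can be sketched in plain Python. This is only an illustration, not NumPy's actual implementation: `broadcast_shape` is a hypothetical helper name, and it assumes that a missing leading dimension is treated as length 1.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of shapes a and b, or raise ValueError.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    result = []
    # Compare dimensions starting from the trailing one and working forward.
    # A shape that runs out of dimensions is padded with 1 (assumption stated above).
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or x == 1 or y == 1:
            # Compatible: keep the dimension that is not being stretched.
            result.append(y if x == 1 else x)
        else:
            raise ValueError(f"incompatible dimensions: {x} and {y}")
    return tuple(reversed(result))


print(broadcast_shape((5000,), (500, 1)))  # (500, 5000)
```

Calling it with `(5000,)` and `(500,)`, or with `(5000, 1)` and `(500, 1)`, raises `ValueError`, which mirrors the two failing statements in the question.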

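The first statement raises an error because NumPy compares only the shapes: (5000,) and (500,) each have a single dimension, and 5000 and 500 are neither equal nor 1, so the arrays cannot be broadcast together.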

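In the second statement, train.reshape(-1,1) has shape (5000,1) and test.reshape(-1,1) has shape (500,1). The trailing dimensions (both of length 1) are equal, so that part is fine, but NumPy then checks the other dimension, and 5000 != 500, so broadcasting fails here.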

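In the third case, your operands have shapes (5000,) and (500,1). Here NumPy does allow broadcasting: the trailing dimensions are 5000 and 1, which are compatible because one of them is 1, and the missing leading dimension of the 1D array is treated as 1. The 1D array is stretched across the rows and the (500,1) array is stretched along its length-1 trailing dimension, giving a (500,5000) result that matches dists.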
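
To check the three cases yourself, something like the following should work. The array contents are made-up placeholders (only the shapes come from the question), and `try_add` is a hypothetical helper that reports the result shape or `None` when broadcasting fails.

```python
import numpy as np

# Placeholder data; only the shapes match the question.
dists = np.zeros((500, 5000))
train = np.arange(5000, dtype=float)  # shape (5000,)
test = np.arange(500, dtype=float)    # shape (500,)


def try_add(a, b):
    """Return the shape of a + b, or None if the shapes cannot be broadcast."""
    try:
        return (a + b).shape
    except ValueError:
        return None


shape1 = try_add(train, test)                                # (5000,) + (500,)
shape2 = try_add(train.reshape(-1, 1), test.reshape(-1, 1))  # (5000,1) + (500,1)
shape3 = try_add(train, test.reshape(-1, 1))                 # (5000,) + (500,1)
print(shape1, shape2, shape3)  # None None (500, 5000)

# The third form matches the shape of dists, so the in-place add succeeds.
dists += train + test.reshape(-1, 1)
```

After the last line, `dists[i, j]` holds `train[j] + test[i]`, which is usually what you want when combining per-row and per-column terms.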

FWIW, shapes and broadcasting rules can be a bit tricky at times, and I have often been confused by similar things.