Neural Network (operands could not be broadcast together with shapes (1,713) (713,18))

Question

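I am currently taking the Deep Learning Specialization from Deeplearning.ai on Coursera, and I am working on the first assignment, which requires implementing a neural network with a logistic regression mindset. The assignment implements the neural network as a logistic regression function for unstructured data (images). I completed the assignment successfully and got all the expected outputs. However, I am now trying to use the same neural network code on structured data, and I am running into a broadcasting error. Part of the code is as follows: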

Dataset code

import numpy as np
import pandas as pd

path_train = r'C:\Users\Ahmed Ismail Khalid\Desktop\Research Paper\Research Paper Feature Sets\Balanced Feature Sets\Balanced Train combined scores.csv'
path_test = r'C:\Users\Ahmed Ismail Khalid\Desktop\Research Paper\Research Paper Feature Sets\Balanced Feature Sets\Balanced Test combined scores.csv'

df_train = pd.read_csv(path_train)
#df_train = df_train.to_numpy()

df_test = pd.read_csv(path_test)
#df_test = df_test.to_numpy()

x_train = df_train.iloc[:,1:19]
x_train = x_train.to_numpy()
x_train = x_train.T

y_train = df_train.iloc[:,19]
y_train = y_train.to_numpy()
y_train = y_train.reshape(y_train.shape[0],1)
y_train = y_train.T

x_test = df_test.iloc[:,1:19]
x_test = x_test.to_numpy()
x_test = x_test.T

y_test = df_test.iloc[:,19]
y_test = y_test.to_numpy()
y_test = y_test.reshape(y_test.shape[0],1)
y_test = y_test.T

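# Number of examples is the second dimension after the transpose
m_train = x_train.shape[1]
m_test = x_test.shape[1]

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))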
print ("train_set_x shape: " + str(x_train.shape))
print ("train_set_y shape: " + str(y_train.shape))
print ("test_set_x shape: " + str(x_test.shape))
print ("test_set_y shape: " + str(y_test.shape))

Dataset code output

Number of training examples: df_train = 713
Number of testing examples: df_test = 237
x_train shape: (18, 713)
y_train shape: (1, 713)
x_test shape: (18, 237)
y_test shape: (1, 237)

Propagate function code

def propagate(w,b,X,Y) :

    m = X.shape[1]

    A = sigmoid((w.T * X) + b)

    cost = (- 1 / m) * np.sum(np.dot(Y,np.log(A)) + np.dot((1 - Y), np.log(1 - A)))

    dw = (1 / m) * np.dot((X,(A - Y)).T)
    db = (1 / m) * np.sum(A - Y)

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost

Optimize and model functions

def optimize(w,b,X,Y,num_iterations,learning_rate,print_cost) :

    costs = []

    for i in range(num_iterations) :

        # Cost and gradient calculation
        grads, cost = propagate(w,b,X,Y)

        # Retrieve derivatives from gradients
        dw = grads['dw']
        db = grads['db']

        # Update w and b
        w = w - learning_rate * dw
        b = b - learning_rate * db

        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 training iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))

        params = {"w": w,
                  "b": b}

        grads = {"dw": dw,
                 "db": db}

        return params, grads, costs

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False) :

    # initialize parameters with zero
    w, b = initialize_with_zeros(X_train.shape[0])

    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w,b,X_train,Y_train,num_iterations,learning_rate,print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict train/test set examples (≈ 2 lines of code)
    Y_prediction_train = predict(w,b,X_train)
    Y_prediction_test = predict(w,b,X_test)

    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(abs(Y_prediction_test - Y_test)) * 100))


    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train" : Y_prediction_train,
         "w" : w,
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}

    return d

Model function output

Cost after iteration 0: 0.693147
train accuracy: -0.1402524544179613 %
test accuracy: 0.4219409282700326 %

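When I run the code, I get ValueError: operands could not be broadcast together with shapes (1,713) (713,18) at A = sigmoid((w.T * X) + b). I am quite new to neural networks and NumPy, so I can't figure out what the problem is. Any and all help would be appreciated. The entire .ipynb file with all the code can be downloaded from here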

Thanks

Answer 1

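The * operator is elementwise multiplication, and your arrays have incompatible shapes. You need matrix multiplication, which you can do with np.matmul() or the @ operator: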

A = sigmoid(w.T @ X + b)
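To make the difference concrete, here is a minimal sketch using the sizes from the question (18 features, 713 examples). The arrays are placeholder data, not the asker's dataset, and the exact error text depends on the shapes actually passed in.

```python
import numpy as np

n_features, m = 18, 713  # sizes taken from the question

rng = np.random.default_rng(0)
w = np.zeros((n_features, 1))          # weights, one column
X = rng.normal(size=(n_features, m))   # placeholder data, one example per column

# w.T has shape (1, 18) and X has shape (18, 713).
# Elementwise * needs broadcast-compatible shapes, which these are not.
try:
    w.T * X
    elementwise_failed = False
except ValueError as err:
    elementwise_failed = True
    print("elementwise failed:", err)

# Matrix multiplication contracts the shared dimension (18) instead.
Z = w.T @ X                            # equivalent to np.matmul(w.T, X)
print(Z.shape)
```

The matrix product has one entry per example, which is the shape the sigmoid output needs to have.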

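A lot of machine learning, and neural networks in particular, is about keeping the shapes of things straight. Check the shapes of w, X, and Y. They should be (features, 1), (features, m), and (1, m) respectively, where features is 18 and m is 713 in your case.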
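One way to run that check is a small guard called before training. The helper name check_shapes is hypothetical and not part of the course code. The second call uses the shapes implied by the error message, (1,713) for w.T and (713,18) for X, which is what you would see if X arrived as (examples, features) rather than (features, examples). That reading is an inference from the message, not something the question confirms.

```python
import numpy as np

def check_shapes(w, X, Y):
    """Hypothetical helper: verify w is (features, 1), X is (features, m), Y is (1, m)."""
    n_features, m = X.shape
    if w.shape != (n_features, 1):
        raise ValueError(f"w has shape {w.shape}, expected {(n_features, 1)}")
    if Y.shape != (1, m):
        raise ValueError(f"Y has shape {Y.shape}, expected {(1, m)}")
    return n_features, m

# Shapes recommended above: passes and returns (features, m).
print(check_shapes(np.zeros((18, 1)), np.zeros((18, 713)), np.zeros((1, 713))))

# Shapes implied by the error message: X is (713, 18), so Y no longer lines up.
try:
    check_shapes(np.zeros((713, 1)), np.zeros((713, 18)), np.zeros((1, 713)))
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print("shape problem:", err)
```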

You should then also be able to verify that the shape of A matches that of Y.
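Putting the pieces together, a possible shape-consistent version of propagate might look like the sketch below. This goes beyond what the answer states: only the matrix product in A comes from the answer. The elementwise cost and the dw line are assumptions based on the standard logistic regression formulas, since the question's np.dot(Y, np.log(A)) would multiply two (1, m) arrays and np.dot((X,(A - Y)).T) calls .T on a tuple. The sigmoid here is a stand-in for the course helper.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the course's sigmoid helper (assumed definition).
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    m = X.shape[1]                       # number of examples

    A = sigmoid(w.T @ X + b)             # (1, m), matrix product as in the answer
    assert A.shape == Y.shape            # A must line up with the labels

    # Cross-entropy cost, elementwise over the m examples (assumed formula).
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

    dw = (X @ (A - Y).T) / m             # (features, 1), same shape as w
    db = np.sum(A - Y) / m               # scalar

    assert dw.shape == w.shape
    return {"dw": dw, "db": db}, float(cost)

# Placeholder data with the sizes from the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(18, 713))
Y = rng.integers(0, 2, size=(1, 713)).astype(float)
w, b = np.zeros((18, 1)), 0.0

grads, cost = propagate(w, b, X, Y)
print(grads["dw"].shape, cost)
```

With zero-initialized w and b, every activation is 0.5, so the first cost equals ln 2 ≈ 0.693147, which agrees with the "Cost after iteration 0" line in the question.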

python

numpy

broadcast

neural-network

logistic-regression