Neural network for univariate regression with numpy gives only linear results
My goal is to create a neural network with a single hidden layer (with ReLU activation) that is able to approximate a simple univariate square-root function. I have implemented the network with numpy and also did gradient checking; everything seems to be fine except for the result: for some reason I only ever obtain linear approximations, like this: [noisy sqrt approx]
I have tried changing the hyperparameters, without success. Any ideas?
import numpy as np
step_size = 1e-6
input_size, output_size = 1, 1
h_size = 10
train_size = 500
x_train = np.abs(np.random.randn(train_size, 1) * 1000)
y_train = np.sqrt(x_train) + np.random.randn(train_size, 1) * 0.5
#initialize weights and biases
Wxh = np.random.randn(input_size, h_size) * 0.01
bh = np.zeros((1, h_size))
Why = np.random.randn(h_size, output_size) * 0.01
by = np.zeros((1, output_size))
for i in range(300000):
    #forward pass
    h = np.maximum(0, np.dot(x_train, Wxh) + bh)
    y_est = np.dot(h, Why) + by
    loss = np.sum((y_est - y_train)**2) / train_size
    dy = 2 * (y_est - y_train) / train_size
    print("loss: ", loss)
    #backprop at output
    dWhy = np.dot(h.T, dy)
    dby = np.sum(dy, axis=0, keepdims=True)
    dh = np.dot(dy, Why.T)
    #backprop ReLU non-linearity
    dh[h <= 0] = 0
    #backprop Wxh and bh
    dWxh = np.dot(x_train.T, dh)
    dbh = np.sum(dh, axis=0, keepdims=True)
    #gradient descent parameter update
    Wxh += -step_size * dWxh
    bh += -step_size * dbh
    Why += -step_size * dWhy
    by += -step_size * dby
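For reference, the gradient checking mentioned above can be done with a central-difference estimate. The helper below is a minimal sketch of my own (the name numerical_grad, the epsilon, and the loss_fn wrapper are my choices, not part of the original code); it compares the numerical gradient of the loss against the analytic gradient from the last iteration:

def numerical_grad(loss_fn, param, eps=1e-5):
    # central-difference estimate of d(loss)/d(param), one element at a time
    grad = np.zeros_like(param)
    it = np.nditer(param, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = param[idx]
        param[idx] = old + eps
        loss_plus = loss_fn()
        param[idx] = old - eps
        loss_minus = loss_fn()
        param[idx] = old            # restore the original value
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
        it.iternext()
    return grad

def loss_fn():
    # recompute the forward pass and loss with the current parameters
    h = np.maximum(0, np.dot(x_train, Wxh) + bh)
    y_est = np.dot(h, Why) + by
    return np.sum((y_est - y_train)**2) / train_size

# e.g. check the output-layer weights; the difference should be close to zero
print(np.max(np.abs(numerical_grad(loss_fn, Why) - dWhy)))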
Edit:
The problem seems to have been the lack of normalisation and the fact that the data was not zero-centred. After applying these transformations to the training data I managed to obtain the following result: [noisy sqrt2]
I can get your code to produce a sort of piecewise linear approximation:
if I zero-centre and normalise your input and output ranges:
# normalise range and domain
x_train -= x_train.mean()
x_train /= x_train.std()
y_train -= y_train.mean()
y_train /= y_train.std()
The plot is produced as follows:
x = np.linspace(x_train.min(),x_train.max(),3000)
y = np.dot(np.maximum(0, np.dot(x[:,None], Wxh) + bh), Why) + by
import matplotlib.pyplot as plt
plt.plot(x,y)
plt.show()
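If you want the plot back in the original units, you can undo the standardisation. This inverse transform is my own addition, not part of the answer; it assumes the means and standard deviations were saved before the in-place updates above (e.g. as x_mean, x_std, y_mean, y_std):

# map the standardised curve back to the original scale
x_orig = x * x_std + x_mean
y_orig = y * y_std + y_mean
plt.plot(x_orig, y_orig, label="network")
# reference curve; the maximum guards against tiny negative values at the low end
plt.plot(x_orig, np.sqrt(np.maximum(x_orig, 0)), label="sqrt(x)")
plt.legend()
plt.show()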