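Changing activation function from Sigmoid to Tanh?
I am trying to change my neural network from using sigmoid activation for the hidden and output layers to the tanh function.
I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?
This is the output calculation: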
public void calcOutput()
{
    if (!isBias)
    {
        float sum = 0;
        float bias = 0;
        //System.out.println("Looking through " + connections.size() + " connections");
        for (int i = 0; i < connections.Count; i++)
        {
            Connection c = (Connection) connections[i];
            Node from = c.getFrom();
            Node to = c.getTo();
            // Is this connection moving forward to us
            // Ignore connections that we send our output to
            if (to == this)
            {
                // This isn't really necessary
                // But I am treating the bias individually in case I need to at some point
                if (from.isBias) bias = from.getOutput()*c.getWeight();
                else sum += from.getOutput()*c.getWeight();
            }
        }
        // Output is result of the activation function (tanh, previously sigmoid)
        output = Tanh(bias+sum);
    }
}
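It worked great the way I trained it before, but now I want to train it to give 1 or -1 as output.
When I change
output = Sigmoid(bias+sum);
to
output = Tanh(bias+sum);
the results are all messed up...
Sigmoid: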
public static float Sigmoid(float x)
{
    return 1.0f / (1.0f + (float) Mathf.Exp(-x));
}
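Tanh:
public float Tanh(float x)
{
    //return (float)(Mathf.Exp(x) - Mathf.Exp(-x)) / (Mathf.Exp(x) + Mathf.Exp(-x));
    //return (float)(1.7159f * System.Math.Tanh(2/3 * x));
    return (float)System.Math.Tanh(x);
}
As you can see, I tried the different formulas I found for tanh, but none of the outputs make sense: I get -1 where I ask for 0, 0.76159 where I ask for 1, or it just keeps flipping between a positive and a negative number when I ask for -1, plus other mismatches...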
-EDIT- Updated with the currently working code (the calcOutput above has been changed to what I use now):
public float[] train(float[] inputs, float[] answer)
{
    float[] result = feedForward(inputs);
    deltaOutput = new float[result.Length];
    for(int ii=0; ii<result.Length; ii++)
    {
        deltaOutput[ii] = 0.66666667f * (1.7159f - (result[ii]*result[ii])) * (answer[ii]-result[ii]);
    }
    // BACKPROPAGATION
    for(int ii=0; ii<output.Length; ii++)
    {
        ArrayList connections = output[ii].getConnections();
        for (int i = 0; i < connections.Count; i++)
        {
            Connection c = (Connection) connections[i];
            Node node = c.getFrom();
            float o = node.getOutput();
            float deltaWeight = o*deltaOutput[ii];
            c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
        }
    }
    // ADJUST HIDDEN WEIGHTS
    for (int i = 0; i < hidden.Length; i++)
    {
        ArrayList connections = hidden[i].getConnections();
        //Debug.Log(connections.Count);
        float sum = 0;
        // Sum output delta * hidden layer connections (just one output)
        for (int j = 0; j < connections.Count; j++)
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from hidden layer to next layer (output)?
            if (c.getFrom() == hidden[i])
            {
                for(int k=0; k<deltaOutput.Length; k++)
                    sum += c.getWeight()*deltaOutput[k];
            }
        }
        // Then adjust the incoming weights based on:
        // Above sum * derivative of sigmoid output function for hidden neurons
        for (int j = 0; j < connections.Count; j++)
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from previous layer (input) to hidden layer?
            if (c.getTo() == hidden[i])
            {
                float o = hidden[i].getOutput();
                float deltaHidden = o * (1 - o); // Derivative of sigmoid(x)
                deltaHidden *= sum;
                Node node = c.getFrom();
                float deltaWeight = node.getOutput()*deltaHidden;
                c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
            }
        }
    }
    return result;
}
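You should be using the derivative of the sigmoid function somewhere in your backpropagation code. You need to replace that as well, with the derivative of the tanh function, which is 1 - (tanh(x))^2.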
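Since the neuron already stores o = tanh(x) as its output, the derivative can be computed from the output alone as 1 - o*o, in the same spot where the sigmoid version uses o * (1 - o). Here is a minimal sketch of that idea; the class and method names (`TanhActivation`, `DerivativeFromOutput`, `HiddenDelta`) are made up for illustration and are not part of your code:

```csharp
using System;

public static class TanhActivation
{
    // Forward pass: plain tanh, output lies in (-1, 1).
    public static float Tanh(float x)
    {
        return (float)Math.Tanh(x);
    }

    // Derivative of tanh expressed through the neuron's output o = tanh(x):
    // d/dx tanh(x) = 1 - tanh(x)^2 = 1 - o^2.
    public static float DerivativeFromOutput(float o)
    {
        return 1f - o * o;
    }

    // Hypothetical helper: delta of a hidden neuron, given its output and the
    // weighted sum of the deltas coming back from the next layer.
    public static float HiddenDelta(float o, float sum)
    {
        return DerivativeFromOutput(o) * sum;
    }

    public static void Main()
    {
        float o = Tanh(0.5f);
        Console.WriteLine(DerivativeFromOutput(o));
    }
}
```

In a loop like the hidden-weight update in the question, this would presumably take the place of the line marked "Derivative of sigmoid(x)".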
Your code looks like C#. This is what I get:
Console.WriteLine(Math.Tanh(0)); // prints 0
Console.WriteLine(Math.Tanh(-1)); // prints -0.761594155955765
Console.WriteLine(Math.Tanh(1)); // prints 0.761594155955765
Console.WriteLine(Math.Tanh(0.234)); // prints 0.229820548214317
Console.WriteLine(Math.Tanh(-4)); // prints -0.999329299739067
This matches the plot of tanh.
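I think you are misreading your results: you are getting the correct answer for 1. Are you sure you get -1 for tanh(0)?
If you are sure there is a problem, please post more code.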
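One more thing worth checking is the commented-out scaled variant, `1.7159f * System.Math.Tanh(2/3 * x)`. In C#, `2/3` is integer division and evaluates to 0, so that expression returns 0 for every input. Below is a rough sketch of the scaled form with float constants, together with its derivative written in terms of the output; the class name `ScaledTanh` and the constants `A` and `B` are my own naming, not something from your code:

```csharp
using System;

public static class ScaledTanh
{
    const float A = 1.7159f;   // output scale
    const float B = 2f / 3f;   // input scale; float literals avoid integer division

    // y = A * tanh(B * x)
    public static float Activate(float x)
    {
        return A * (float)Math.Tanh(B * x);
    }

    // Derivative in terms of the output y = A * tanh(B * x):
    // dy/dx = A * B * (1 - tanh^2(B * x)) = (B / A) * (A * A - y * y)
    public static float DerivativeFromOutput(float y)
    {
        return (B / A) * (A * A - y * y);
    }

    public static void Main()
    {
        Console.WriteLine(2 / 3);         // prints 0 (integer division)
        Console.WriteLine(Activate(1f));
    }
}
```

Note that whichever forward function you keep, the derivative used in training has to belong to that same function: for plain tanh it is 1 - o*o, for the scaled form it is the expression above. The factor `0.66666667f * (1.7159f - result*result)` in your train method is not exactly either of them, so that may be worth revisiting.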