Changing activation function from Sigmoid to Tanh?

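I am trying to change my neural network from using sigmoid activation for the hidden and output layers to the tanh function. I am confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation? This is the output calculation: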

public void calcOutput() 
{
    if (!isBias) 
    {
        float sum = 0;
        float bias = 0;
        //System.out.println("Looking through " + connections.size() + " connections");
        for (int i = 0; i < connections.Count; i++) 
        {
            Connection c = (Connection) connections[i];
            Node from = c.getFrom();
            Node to = c.getTo();
            // Is this connection moving forward to us
            // Ignore connections that we send our output to
            if (to == this) 
            {
                // This isn't really necessary
                // But I am treating the bias individually in case I need to at some point
                if (from.isBias) bias = from.getOutput()*c.getWeight();
                else sum += from.getOutput()*c.getWeight();
            }
        }
        // Output is the result of the tanh function
        output = Tanh(bias+sum);
    }
}

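It worked great for the way I trained it before, but now I want to train it to give 1 or -1 as output. When I change output = Sigmoid(bias+sum); to output = Tanh(bias+sum); the results are all messed up...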

Sigmoid:

public static float Sigmoid(float x) 
{
    return 1.0f / (1.0f + (float) Mathf.Exp(-x));
}

Tanh:

public float Tanh(float x)
{
    //return (float)(Mathf.Exp(x) - Mathf.Exp(-x)) / (Mathf.Exp(x) + Mathf.Exp(-x));
    //return (float)(1.7159f * System.Math.Tanh(2.0f/3.0f * x)); // note: 2/3 is integer division in C# and evaluates to 0
    return (float)System.Math.Tanh(x);
}

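As you can see, I tried different formulas I found for tanh, but none of the outputs make sense: I get -1 where I ask for 0, 0.76159 where I ask for 1, or it keeps flipping between positive and negative numbers when I ask for -1, along with other mismatches...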

-EDIT- Updated with the current working code (I changed calcOutput above to what I use now):

public float[] train(float[] inputs, float[] answer) 
{
    float[] result = feedForward(inputs);
    deltaOutput = new float[result.Length];

    for(int ii=0; ii<result.Length; ii++)
    {
        deltaOutput[ii] = 0.66666667f * (1.7159f - (result[ii]*result[ii]))  * (answer[ii]-result[ii]);
    }

    // BACKPROPAGATION

    for(int ii=0; ii<output.Length; ii++)
    {
        ArrayList connections = output[ii].getConnections();
        for (int i = 0; i < connections.Count; i++) 
        {
            Connection c = (Connection) connections[i];
            Node node = c.getFrom();
            float o = node.getOutput();
            float deltaWeight = o*deltaOutput[ii];
            c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
        }
    }

    // ADJUST HIDDEN WEIGHTS
    for (int i = 0; i < hidden.Length; i++) 
    {
        ArrayList connections = hidden[i].getConnections();
        //Debug.Log(connections.Count);
        float sum  = 0;
        // Sum output delta * hidden layer connections (just one output)
        for (int j = 0; j < connections.Count; j++) 
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from hidden layer to next layer (output)?
            if (c.getFrom() == hidden[i]) 
            {
                for(int k=0; k<deltaOutput.Length; k++)
                    sum += c.getWeight()*deltaOutput[k];
            }
        }    
        // Then adjust the weights coming in based:
        // Above sum * derivative of sigmoid output function for hidden neurons
        for (int j = 0; j < connections.Count; j++) 
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from previous layer (input) to hidden layer?
            if (c.getTo() == hidden[i]) 
            {
                float o = hidden[i].getOutput();
                float deltaHidden = o * (1 - o);  // Derivative of sigmoid(x)
                deltaHidden *= sum;   
                Node node = c.getFrom();
                float deltaWeight = node.getOutput()*deltaHidden;
                c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
            }
        } 
    }
    return  result;
}

"I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?"

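You should be using the derivative of the sigmoid function somewhere in your backpropagation code. You need to replace that as well, with the derivative of the tanh function, which is 1 - (tanh(x))^2.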
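
Since the neuron already stores y = tanh(x) as its output, the derivative can be computed from the output alone as 1 - y*y, in the same way the sigmoid derivative is computed as o * (1 - o). The following is only a minimal sketch of that idea; the class name and the helpers HiddenDelta and OutputDelta are hypothetical and not part of the code in the question:

```csharp
using System;

public static class TanhActivation
{
    // Plain tanh activation, equivalent to the Tanh(x) in the question.
    public static float Tanh(float x)
    {
        return (float)Math.Tanh(x);
    }

    // Derivative of tanh written in terms of the neuron's output y = tanh(x):
    // d/dx tanh(x) = 1 - tanh(x)^2 = 1 - y^2.
    public static float TanhDerivativeFromOutput(float y)
    {
        return 1.0f - y * y;
    }

    // Hypothetical helper: delta of a hidden neuron, given its tanh output and
    // the weighted sum of the deltas propagated back from the next layer.
    public static float HiddenDelta(float hiddenOutput, float backpropagatedSum)
    {
        return TanhDerivativeFromOutput(hiddenOutput) * backpropagatedSum;
    }

    // Hypothetical helper: delta of an output neuron for a given target value.
    public static float OutputDelta(float output, float target)
    {
        return TanhDerivativeFromOutput(output) * (target - output);
    }
}
```

If the scaled variant 1.7159 * tanh(2x/3) is used instead of plain tanh, the same reasoning gives a derivative of (2/3) * (1.7159 - y*y / 1.7159) in terms of the output y.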

Your code looks like C#. I get:

Console.WriteLine(Math.Tanh(0));     // prints 0
Console.WriteLine(Math.Tanh(-1));    // prints -0.761594155955765
Console.WriteLine(Math.Tanh(1));     // prints 0.761594155955765
Console.WriteLine(Math.Tanh(0.234)); // prints 0.229820548214317
Console.WriteLine(Math.Tanh(-4));    // prints -0.999329299739067

This matches the plot of tanh.

I think you are misreading the results: the value you get for 1 is correct. Are you sure you get -1 for tanh(0)?
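
One way to double-check what the activation returns is to compare the hand-written exponential form from the question against Math.Tanh. This is only a sketch: the class and method names are hypothetical, and it uses System.Math with float casts as a stand-in for Mathf.Exp so that it runs outside Unity.

```csharp
using System;

public static class TanhCheck
{
    // Exponential form of tanh, evaluated in float precision.
    public static float TanhFromExp(float x)
    {
        float ePos = (float)Math.Exp(x);
        float eNeg = (float)Math.Exp(-x);
        return (ePos - eNeg) / (ePos + eNeg);
    }

    // Prints both versions side by side for a few sample inputs.
    public static void PrintComparison()
    {
        float[] samples = { -4f, -1f, 0f, 0.234f, 1f, 100f };
        foreach (float x in samples)
        {
            Console.WriteLine(x + ": Math.Tanh = " + Math.Tanh(x)
                + ", exponential form = " + TanhFromExp(x));
        }
    }
}
```

For moderate inputs the two agree up to float rounding. For large |x| the float exponential overflows to infinity and the ratio becomes NaN, which is one reason to prefer Math.Tanh over the hand-written formula.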

If you are sure there is a problem, please post more code.