Extracting output before the softmax layer, then manually calculating softmax gives a different result

Question

I have a trained model that classifies RGB values into 1000 categories.

#Model architecture
model = Sequential()
model.add(Dense(512,input_shape=(3,),activation="relu"))
model.add(BatchNormalization())
model.add(Dense(512,activation="relu"))
model.add(BatchNormalization())
model.add(Dense(1000,activation="relu"))
model.add(Dense(1000,activation="softmax"))

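I want to be able to extract the output before the softmax layer so that I can run analyses on different samples of categories within the model. I want to apply softmax to each sample and analyse the result with a function called getinfo().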

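Model: Initially, I feed the X_train data into model.predict to get a vector of 1000 probabilities for each input. I run getinfo() on this array to get the desired result.
Pop1: I then use model.pop() to remove the softmax layer. I get new predictions from the popped model and apply scipy.special.softmax. However, getinfo() produces a completely different result on this array.
Pop2: I wrote my own softmax function to validate the second result, and I got an almost identical answer to Pop1.
Pop3: However, when I simply run getinfo() on the output of the popped model, with no softmax function, I get the same result as for the initial model.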

import numpy as np
import scipy.stats
import scipy.special
from keras.models import load_model

data = np.loadtxt("allData.csv",delimiter=",")
model = load_model("model.h5")

def getinfo(data):
    objects = scipy.stats.entropy(np.mean(data, axis=0), base=2)
    print(('objects_mean',objects))
    colours_entropy = []
    for i in data:
        e = scipy.stats.entropy(i, base=2)
        colours_entropy.append(e)
    colours = np.mean(np.array(colours_entropy))
    print(('colours_mean',colours))
    info = objects - colours
    print(('objects-colours',info))
    return info

def softmax_max(data):
    # calculate softmax whilst subtracting each row's max value (axis=1)
    sm = []
    for row in data:
        # shift by the row's own maximum for numerical stability;
        # softmax is unchanged by adding a constant to every entry
        e = np.exp(row - np.max(row))
        s = np.sum(e)
        sm.append(e/s)
    sm = np.asarray(sm)
    return sm

#model
preds = model.predict(X_train)
getinfo(preds)

#pop1
model.pop()
preds1 = model.predict(X_train)
sm1 = scipy.special.softmax(preds1,axis=1)
getinfo(sm1)

#pop2
sm2 = softmax_max(preds1)
getinfo(sm2)

#pop3
getinfo(preds1)

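I expected to get the same output from Model, Pop1 and Pop2, but a different answer for Pop3, since I did not compute softmax there. I wonder whether the problem is that I compute softmax after model.predict? And am I getting the same result for Model and Pop3 because softmax constrains the values to between 0 and 1, so that the results are mathematically equivalent as far as getinfo() is concerned?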

If that is the case, how do I perform softmax before model.predict?

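I have been going around in circles with this, so any help or insight would be much appreciated. Please let me know if anything is unclear. Thank you!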

Answer 1

model.pop() does not take effect immediately. You need to run model.compile() again to recompile the new model that no longer includes the last layer.

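Without recompiling, you are essentially running model.predict() twice in a row on the exact same model, which explains why Model and Pop3 give the same result. Pop1 and Pop2 give strange results because they are calculating the softmax of a softmax.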
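The "softmax of a softmax" effect can be reproduced without Keras. The following is a minimal plain-Python sketch, not the asker's code: the logits are made-up values, and softmax and entropy_bits are illustrative helpers. Because probabilities lie between 0 and 1, a second softmax can make one class at most e times more likely than another, so the distribution flattens and its entropy moves towards log2 of the number of classes.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(p):
    """Shannon entropy, in bits, of a probability vector."""
    return -sum(q * math.log(q, 2) for q in p if q > 0)

logits = [8.0, 2.0, 0.5, -1.0]   # hypothetical pre-softmax activations
probs = softmax(logits)          # what the unmodified model outputs
twice = softmax(probs)           # what Pop1 and Pop2 effectively compute

print(entropy_bits(probs))       # low: one class dominates
print(entropy_bits(twice))       # much higher: distribution is flattened
print(math.log(len(logits), 2))  # upper bound on the entropy for 4 classes
```

This is consistent with getinfo() reporting very different entropies for Pop1 and Pop2 than for Model, even though all three start from the same predictions.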

In addition, your model does not have softmax as a separate layer, so pop removes the entire last Dense layer. To fix this, add softmax as its own layer, like so:

model.add(Dense(1000))           # softmax removed from this layer...
model.add(Activation('softmax')) # ...and added to its own layer
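To see why the split matters, here is a toy plain-Python stand-in for a layer stack. It is an illustration only and does not use Keras: dense, run, the weights and the input are all hypothetical. A layer with a fused activation is modelled as one callable, so popping it discards the weights together with the softmax, while the split version loses only the softmax and still yields the logits.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def dense(weights, bias):
    """Return a function computing one linear layer for a single input vector.
    `weights` holds one weight vector per output unit."""
    def apply(x):
        return [sum(xi * w for xi, w in zip(x, unit)) + b
                for unit, b in zip(weights, bias)]
    return apply

def run(layers, x):
    """Feed x through each layer in order."""
    for layer in layers:
        x = layer(x)
    return x

# Hypothetical 2-input, 3-class output layer with made-up weights.
logit_layer = dense([[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]], [0.0, 0.1, -0.1])

fused = [lambda x: softmax(logit_layer(x))]  # like Dense(3, activation="softmax")
split = [logit_layer, softmax]               # like Dense(3) + Activation("softmax")

x = [0.2, 0.7]
full_output = run(split, x)

fused.pop()  # removes the weights and the softmax together
split.pop()  # removes only the softmax

print(run(fused, x))   # the logits are gone; only the layer's input is left
logits = run(split, x)
print(logits)          # pre-softmax values, ready for manual analysis
print(softmax(logits) == full_output)
```

With the split arrangement, applying softmax by hand to the popped model's output reproduces the original probabilities, which is the behaviour the question expected from Pop1 and Pop2.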

machine-learning

entropy

python-2.7

keras

softmax