Keras binary classification: different datasets, same prediction results
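I have two label values to predict, -1 or 1. Training with LSTM or Dense layers looks good, but the prediction is always the same for different prediction datasets, and changing the layers to Dense does not change the prediction. Maybe I am doing something wrong.
Here is the code: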
// set up data arrays
float[,,] training_data = new float[training.Count(), 12, 200];
float[,,] testing_data = new float[testing.Count(), 12, 200];
float[,,] predict_data = new float[1, 12, 200];
IList<float> training_labels = new List<float>();
IList<float> testing_labels = new List<float>();
// Load Data and add to arrays
...
...
/////////////////////////
NDarray train_y = np.array(training_labels.ToArray());
NDarray train_x = np.array(training_data);
NDarray test_y = np.array(testing_labels.ToArray());
NDarray test_x = np.array(testing_data);
NDarray predict_x = np.array(predict_data);
train_y = Util.ToCategorical(train_y, 2);
test_y = Util.ToCategorical(test_y, 2);
//Build functional model
var model = new Sequential();
model.Add(new Input(shape: new Keras.Shape(12, 200)));
model.Add(new BatchNormalization());
model.Add(new LSTM(128, activation: "tanh", recurrent_activation: "sigmoid", return_sequences: false));
model.Add(new Dropout(0.2));
model.Add(new Dense(32, activation: "relu"));
model.Add(new Dense(2, activation: "softmax"));
model.Compile(optimizer: new SGD(), loss: "binary_crossentropy", metrics: new string[] { "accuracy" });
model.Summary();
var history = model.Fit(train_x, train_y, batch_size: 1, epochs: 1, verbose: 1, validation_data: new NDarray[] { test_x, test_y });
var score = model.Evaluate(test_x, test_y, verbose: 2);
Console.WriteLine($"Test loss: {score[0]}");
Console.WriteLine($"Test accuracy: {score[1]}");
NDarray predicted=model.Predict(predict_x, verbose: 2);
Console.WriteLine($"Prediction: {predicted[0][0]*100}");
Console.WriteLine($"Prediction: {predicted[0][1]*100}");
Here is the output:
483/483 [==============================] - 9s 6ms/step - loss: 0.1989 - accuracy: 0.9633 - val_loss: 0.0416 - val_accuracy: 1.0000
4/4 - 0s - loss: 0.0416 - accuracy: 1.0000
Test loss: 0.04155446216464043
Test accuracy: 1
1/1 - 0s
Prediction: 0.0010418787496746518
Prediction: 99.99896287918091
Using the same prediction data in ML.NET gives different results, but the accuracy in ML.NET is only 0.6, which is why I need deep learning.
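I don't have C# set up to reproduce your code, but I see one small issue you may want to consider (I'm not sure whether it causes the problem). Based on the code above, I think you are training with the wrong loss function. Since you set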
Util.ToCategorical(train_y, 2);
model.Add(new Dense(2, activation: "softmax"));
your loss function should be 'categorical_crossentropy' rather than 'binary_crossentropy', because you convert the labels (-1, 1) to one-hot encoded vectors and use a softmax activation in the last layer.
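To make the pairing of one-hot targets, softmax, and categorical cross-entropy concrete, here is a minimal pure-Python sketch. The helper names `softmax` and `categorical_crossentropy` are mine, written for illustration only; they are not the Keras implementations.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """-sum(y_i * log(p_i)); only the true class contributes for a one-hot target."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))

probs = softmax([0.0, 0.0])                         # two equal raw scores
loss = categorical_crossentropy([0.0, 1.0], probs)  # target: second class
print(probs, loss)
```

With two equal scores, each class gets probability 0.5 and the loss is log(2), the value expected from a model that has learned nothing yet.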
However, as you said, your labels are -1 and 1; so if you treat this as a binary classification problem, the setup should look like this:
// Util.ToCategorical(train_y, 2); // no transformation
model.Add(new Dense(1, activation: "sigmoid"));
model.Compile(..., loss: "binary_crossentropy");
Reference.
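For the single-output variant, binary cross-entropy expects targets of 0 or 1, so the original -1/1 labels would need to be remapped first. The following is a rough pure-Python sketch of that idea; `to_binary_target`, `sigmoid`, and `binary_crossentropy` are illustrative helpers of my own, not library calls, and the raw score of 0.0 is a made-up value.

```python
import math

def to_binary_target(label):
    """Map the original labels -1/1 to the 0/1 targets binary cross-entropy expects."""
    return 1.0 if label == 1 else 0.0

def sigmoid(z):
    """Squash a raw score into a probability for the positive class."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(target, prob, eps=1e-7):
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

labels = [-1, 1, 1, -1]
targets = [to_binary_target(y) for y in labels]
prob = sigmoid(0.0)  # hypothetical raw model output for one sample
losses = [binary_crossentropy(t, prob) for t in targets]
print(targets, prob, losses)
```

At prediction time, a probability above 0.5 would then be read as label 1 and anything else as label -1.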
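Update
Here is some working demo code for better understanding. But first, a small note. Say we have a training dataset whose labels start from a negative value, for example [-2, -1, 0, 1]. To convert these integer values to one-hot encoded vectors, we can use either tf.keras.utils.to_categorical or pd.get_dummies. One small difference between the two is that with tf.keras.utils.to_categorical the integer labels must start from 0; that is not the case for pd.get_dummies. In short: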
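import numpy as np
import pandas as pd
import tensorflow as tf

a = np.random.randint(-1, 1, size=80)  # values in [-1, 1), i.e. -1 or 0
a
array([-1, -1, 0, 0, 0, ...])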
pd.get_dummies(a).astype('float32').values[:5]
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
tf.keras.utils.to_categorical(a+1, num_classes = 2)[:5]
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
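Why labels should start from 0 can be illustrated with a small pure-Python sketch of index-based one-hot encoding. This assumes the encoder simply sets the column at position `label`, as NumPy-style indexing would; I have not verified that Keras.NET's Util.ToCategorical behaves exactly this way, so treat `one_hot_by_index` as a hypothetical stand-in rather than the library's code.

```python
def one_hot_by_index(labels, num_classes):
    """Index-based one-hot encoding: set position `label` of each row to 1."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # a negative label counts from the end of the row
        rows.append(row)
    return rows

raw = [-1, 1]
collided = one_hot_by_index(raw, 2)     # both labels land in the last column
shifted = [(y + 1) // 2 for y in raw]   # remap -1 -> 0 and 1 -> 1
separated = one_hot_by_index(shifted, 2)
print(collided)
print(separated)
```

Under that assumption, -1 and 1 both end up as the same vector with two classes. If this is what happens in the question, every training target would be identical, which would be consistent with a model that reports perfect accuracy and always predicts the same class.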
OK, now here is some working demo code.
img = tf.random.normal([80, 32], 0, 1, tf.float32)
tar = pd.get_dummies(np.random.randint(-1, 1, # mine: [-1, 1) - yours: [-1, 1]
size=80)).astype('float32').values
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim = 32,
kernel_initializer ='normal',
activation= 'relu'))
model.add(tf.keras.layers.Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=10, verbose=2)
Epoch 1/10
3/3 - 0s - loss: 0.7610 - accuracy: 0.4375
Epoch 2/10
3/3 - 0s - loss: 0.7425 - accuracy: 0.4375
....
Epoch 8/10
3/3 - 0s - loss: 0.6694 - accuracy: 0.5125
Epoch 9/10
3/3 - 0s - loss: 0.6601 - accuracy: 0.5750
Epoch 10/10
3/3 - 0s - loss: 0.6511 - accuracy: 0.5750
Inference
loss, acc = model.evaluate(img, tar); print(loss, acc)
pred = model.predict(img); print(pred[:5])
3ms/step - loss: 0.6167 - accuracy: 0.7250
0.6166597604751587 0.7250000238418579
# probabilities of the predicted labels -1 and 0
[[0.35116166 0.64883834]
[0.5542663 0.4457338 ]
[0.28023133 0.71976864]
[0.5024315 0.49756846]
[0.41029742 0.5897026 ]]
Now, if we do this:
print(pred[0])
pred[0].argmax(-1) # expect: -1, 0 as our label
[0.35116166 0.64883834]
1
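It gives 0.35x and 0.64x for the target labels -1 and 0, respectively. However, when we apply .argmax to the predicted probabilities, it returns the zero-based index of the highest value. That is why training labels should start from zero, so I think in your case it is better to convert [-1, 1] to [0, 1].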
Finally, as you said, you want the predicted label and the corresponding confidence score; for that, we can use tf.math.top_k with k = num_of_class.
top_k_values, top_k_indices = tf.math.top_k(pred, k=2)
for values, indices in zip(top_k_values, top_k_indices):
    # Subtract 1 from each index to match the target labels (-1, 0).
    # This would not be necessary if the labels were transformed
    # from (-1, 0) to (0, 1) up front, i.e. starting from zero.
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[0] - 1, values.numpy()[0] * 100)
    )
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[1] - 1, values.numpy()[1] * 100)
    )
    print()
    break  # remove for full results
For class 0, model confidence 64.88%
For class -1, model confidence 35.12%
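For the labels in the question (-1 and 1), subtracting 1 from the index would not recover the right label, so an explicit lookup from output index to label may be clearer. Here is a small pure-Python sketch; `index_to_label` and `ranked_predictions` are hypothetical names, and it assumes the labels were remapped so that index 0 means -1 and index 1 means 1.

```python
# Hypothetical lookup: output index 0 -> label -1, output index 1 -> label 1
index_to_label = [-1, 1]

def ranked_predictions(probs):
    """Return (label, probability) pairs, most confident first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(index_to_label[i], probs[i]) for i in order]

ranked = ranked_predictions([0.35116166, 0.64883834])
for label, p in ranked:
    print(f"For class {label}, model confidence {p * 100:.2f}%")
```

The first entry of `ranked` plays the role of argmax, and the remaining entries give the confidence for the other class.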
Verifying the score order
# pick first samples: input and label
model(img)[0].numpy(), tar[0]
(array([0.35116166, 0.64883834], dtype=float32),
array([0., 1.], dtype=float32))
Here, index 0 corresponds to label -1 and index 1 corresponds to label 0.
Again, it is better to transform (-1, 0) to (0, 1) at the start.