Custom loss function in keras? Correntropy. Math/implementation issues

I am working from a paper that implements an autoencoder with a custom loss function to process vibration signals.

I am having trouble implementing it in Keras. The authors use "maximum correntropy" as the loss function to make the model robust to background noise in the signals.

Here is the description:

The Gaussian kernel is the most popular Mercer kernel in correntropy, defined as

where r (the sigma in the code below) is the kernel size. A new autoencoder loss function can then be designed by maximizing the following function:
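The equation images from the paper did not survive in this post; judging from the kernel and loss code below, they are presumably the Gaussian kernel and the mean-correntropy objective:

```latex
% Gaussian (Mercer) kernel with kernel size \sigma
G_\sigma(e) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{e^2}{2\sigma^2}\right)

% Objective: maximize the mean kernel value over the N reconstruction
% errors (equivalently, minimize its negative as the loss)
J = \max \frac{1}{N}\sum_{i=1}^{N} G_\sigma\!\left(y_i - \hat{y}_i\right)
```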

Since I have never implemented a custom loss function before, I am struggling with the math in Python. The kernel is used inside the loss function I need to implement. This is what I have:

dataset.npz

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

file = np.load('./data/CWRU_48k_load_1_CNN_data.npz')  # NumPy archive

data = file['data'].reshape(len(file['data']), 1024)
labels = file['labels']
category_labels = np.unique(labels)
labels = pd.Categorical(labels, categories = category_labels).codes

train_data, test_data, train_labels, test_labels = train_test_split(data, labels, test_size = int(data.shape[0]*0.2), random_state = 100, stratify = labels)

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Data shape. Sample Len: 1024. Outputs/Classifications: 10
print(train_data.shape, train_labels.shape, test_data.shape, test_labels.shape)
#(3680, 1024) (3680, 10) (920, 1024) (920, 10)

act_func = 'relu'
out_func = 'softmax'
k_inic = 'glorot_uniform'  

def create_model(shape=[512, 100], loss_func='mse'):
    model = Sequential()

    for shape_size in shape:
        model.add(Dense(shape_size, activation=act_func, kernel_initializer=k_inic))

    model.add(Dense(10, activation=out_func, kernel_initializer=k_inic))
    model.compile(loss=loss_func, optimizer=keras.optimizers.Adam(), metrics=["accuracy"])
    model.build(input_shape=(None, 1024))

    return model

BATCH_SIZE = 45
EPOCHS = 200
VALIDATION_SPLIT = 0.05

# Design Mercer Kernel
def kernel(x, sigma=1):
    return (1/(K.sqrt(2*np.pi)*sigma))*K.exp((-(x*x)/(2*sigma*sigma)))

# Use Mercer Kernel on Maximum Correntropy for loss function
def correntropy(y_true, y_pred):
    sum_score = 0.0
    for i in range(len(y_true)):
        sum_score = kernel(y_true[i] - y_pred[i])
    sum_score = sum_score/len(y_true)
    return -sum_score

# Create AutoEncoder model with my custom loss function
model = create_model(shape=[512, 100], loss_func=correntropy)
history = model.fit(train_data, train_labels, epochs = EPOCHS, batch_size = BATCH_SIZE, validation_data=(test_data, test_labels), 
                        callbacks = callbacks.callbacks, verbose = 0)

res = model.evaluate(test_data, test_labels, batch_size = BATCH_SIZE, verbose = 0)[1]

But I get this error:

AttributeError: in user code:

    /home/user/.local/lib/python3.8/site-packages/keras/engine/training.py:853 train_function  *
        return step_function(self, iterator)
    /tmp/ipykernel_95935/2003563015.py:26 correntropy  *
        sum_score = kernel(y_true[i] - y_pred[i])
    /tmp/ipykernel_95935/2239884018.py:20 kernel  *
        return (1/(K.sqrt(2*np.pi)*sigma))*K.exp((-(x*x)/(2*sigma*sigma)))
    /home/user/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    /home/user/.local/lib/python3.8/site-packages/keras/backend.py:2539 sqrt
        zero = _constant_to_tensor(0., x.dtype.base_dtype)

    AttributeError: 'float' object has no attribute 'dtype'

The error seems to be inside kernel, but how do I fix it so that it works with tensors?

print(y_true)
print(y_pred)
>> Tensor("IteratorGetNext:1", shape=(None, 10), dtype=float32)
>> Tensor("sequential_161/dense_491/Softmax:0", shape=(None, 10), dtype=float32)

I noticed three main things in your code:

  1. You are mixing math functions from different packages (K, np). Stick to native TensorFlow functions wherever possible (e.g. tf.math.reduce_sum). There are plenty of them; check the documentation for an overview.
  2. Custom loss functions should be converted into TensorFlow graph-compatible functions, which is as simple as putting the tf.function decorator in front of them. See here.
  3. Loops usually do not work well. Vectorize your functions wherever possible.

Overall, I think something like this should work (untested):

import numpy as np
import tensorflow as tf

tf_2pi = tf.constant(tf.sqrt(2 * np.pi), dtype=tf.float32)

@tf.function
def kernel(x, sigma=1):
    return (1 / (tf_2pi * sigma)) * tf.exp((-(x * x) / (2 * sigma * sigma)))


@tf.function
def correntropy(y_true, y_pred):
    return -tf.math.reduce_mean(kernel(y_true - y_pred))
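As a quick sanity check (my own addition, not part of the original answer), the vectorized loss can be evaluated eagerly on dummy tensors. It returns a scalar, and with sigma = 1 a perfect prediction gives the minimum value −1/√(2π) ≈ −0.3989, since the kernel peaks at zero error:

```python
import numpy as np
import tensorflow as tf

tf_2pi = tf.constant(tf.sqrt(2 * np.pi), dtype=tf.float32)

@tf.function
def kernel(x, sigma=1):
    # Gaussian (Mercer) kernel, applied elementwise to the tensor x
    return (1 / (tf_2pi * sigma)) * tf.exp(-(x * x) / (2 * sigma * sigma))

@tf.function
def correntropy(y_true, y_pred):
    # Negative mean kernel value: minimizing this maximizes correntropy
    return -tf.math.reduce_mean(kernel(y_true - y_pred))

y_true = tf.constant([[0.0, 1.0], [1.0, 0.0]])
y_pred = tf.constant([[0.1, 0.9], [0.8, 0.2]])

loss = correntropy(y_true, y_pred)
print(float(loss))  # scalar, slightly above -1/sqrt(2*pi)

perfect = correntropy(y_true, y_true)
print(float(perfect))  # minimum value, -1/sqrt(2*pi) ≈ -0.3989
```

Because both functions only use TensorFlow ops on whole tensors, the same code works when Keras passes symbolic batch tensors during model.fit.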