How to use tf.nn.embedding_lookup_sparse in TensorFlow?

Question

We have tried tf.nn.embedding_lookup and it works. But it needs dense input data, and now we need tf.nn.embedding_lookup_sparse for sparse input.

I have written the following code, but it produces some errors.

import tensorflow as tf
import numpy as np

example1 = tf.SparseTensor(indices=[[4], [7]], values=[1, 1], shape=[10])
example2 = tf.SparseTensor(indices=[[3], [6], [9]], values=[1, 1, 1], shape=[10])

vocabulary_size = 10
embedding_size = 1
var = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0])
#embeddings = tf.Variable(tf.ones([vocabulary_size, embedding_size]))
embeddings = tf.Variable(var)

embed = tf.nn.embedding_lookup_sparse(embeddings, example2, None)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())

    print(sess.run(embed))

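The error log is shown below.

Now I have no idea how to fix this and use the method correctly. Any comments would be appreciated.

After digging into the unit tests of safe_embedding_lookup_sparse, I am even more confused about why I get this result when sparse weights are given, and especially why we get something like embedding_weights[0][3], where 3 does not appear in the code above.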

Answer 1

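tf.nn.embedding_lookup_sparse() combines embeddings using segmentation, which requires the indices of the SparseTensor to start at 0 and increase by 1. That is why you get this error: the indices in your code start at 3 and skip values.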

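Instead of boolean values, your sparse tensor only needs to hold, as its values, the index of every row you want to retrieve from the embeddings. Here is your code, adjusted: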

import tensorflow as tf
import numpy as np

example = tf.SparseTensor(indices=[[0], [1], [2]], values=[3, 6, 9], dense_shape=[3])

vocabulary_size = 10
embedding_size = 1
var = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0])
embeddings = tf.Variable(var)

embed = tf.nn.embedding_lookup_sparse(embeddings, example, None)

with tf.Session() as sess:
    # tf.initialize_all_variables() is deprecated in TensorFlow 1.x
    sess.run(tf.global_variables_initializer())
    print(sess.run(embed)) # prints [  9.  36.  81.]
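
To see why this prints 9, 36 and 81, the segmentation step can be emulated in plain NumPy. This is only a sketch, not TensorFlow's implementation: emulate_sparse_lookup_sum is a hypothetical helper, and it sums per segment for simplicity, which gives the same result as the default "mean" combiner whenever each segment holds a single id.

```python
import numpy as np

def emulate_sparse_lookup_sum(embeddings, indices, ids):
    """Hypothetical helper: gather embedding rows by id, then sum per segment.

    indices[:, 0] plays the role of the segment id (the output row);
    ids are the SparseTensor values (rows of the embedding table).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    segment_ids = np.asarray(indices)[:, 0]
    ids = np.asarray(ids)
    num_segments = segment_ids.max() + 1
    out = np.zeros((num_segments,) + embeddings.shape[1:])
    # np.add.at accumulates correctly even when a segment id repeats.
    np.add.at(out, segment_ids, embeddings[ids])
    return out

table = np.arange(10.0) ** 2  # same values as `var` in the question

# One id per segment: segments 0, 1, 2 hold ids 3, 6, 9.
one_each = emulate_sparse_lookup_sum(table, [[0], [1], [2]], [3, 6, 9])
print(one_each)
```

Each segment id selects one output row, and each value selects one row of the table, so the three outputs are table[3], table[6] and table[9].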

In addition, you can use the indices of the tf.SparseTensor() to combine word embeddings with one of the combiners that tf.nn.embedding_lookup_sparse() allows:

"sum" computes the weighted sum of the embedding results for each row.

"mean" is the weighted sum divided by the total weight.

"sqrtn" is the weighted sum divided by the square root of the sum of the squares of the weights.
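
The three definitions above can be written out for a single output row as a small NumPy sketch. It assumes explicit per-id weights; combine_row and the example weights are hypothetical names and values chosen for illustration, not part of the TensorFlow API.

```python
import numpy as np

def combine_row(embeddings, ids, weights, combiner="mean"):
    """Hypothetical helper: combine the embeddings of one row's ids."""
    embeddings = np.asarray(embeddings, dtype=float)
    weights = np.asarray(weights, dtype=float)
    looked_up = embeddings[np.asarray(ids)]
    # Reshape weights so they broadcast over any embedding dimensions.
    w = weights.reshape((-1,) + (1,) * (looked_up.ndim - 1))
    weighted_sum = (looked_up * w).sum(axis=0)
    if combiner == "sum":
        return weighted_sum
    if combiner == "mean":
        return weighted_sum / weights.sum()
    if combiner == "sqrtn":
        return weighted_sum / np.sqrt((weights ** 2).sum())
    raise ValueError("unknown combiner: %r" % combiner)

table = np.arange(10.0) ** 2   # 0, 1, 4, 9, ...
ids, weights = [1, 2], [1.0, 3.0]

total = combine_row(table, ids, weights, "sum")    # 1*1 + 4*3
mean = combine_row(table, ids, weights, "mean")    # total / (1 + 3)
sqrtn = combine_row(table, ids, weights, "sqrtn")  # total / sqrt(1 + 9)
print(total, mean, sqrtn)
```

When no weights are passed, every weight is treated as 1, so "sum" is a plain sum and "mean" a plain average.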

For example:

example = tf.SparseTensor(indices=[[0], [0]], values=[1, 2], dense_shape=[2])
...
embed = tf.nn.embedding_lookup_sparse(embeddings, example, None, combiner='sum')
...
print(sess.run(embed)) # prints [ 5.]
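
If your data starts out in the form used in the question (the positions of the 1s in a length-10 vector, one vector per example), it can be converted into the (row, id) layout described above. The helper below is a hypothetical sketch in plain Python; its three return values are meant to be passed as indices, values and dense_shape to tf.SparseTensor().

```python
def multi_hot_to_sparse_ids(rows):
    """Hypothetical helper.

    rows: list of lists, each holding the positions of the 1s in one example.
    Returns (indices, values, dense_shape) with one sparse row per example.
    """
    indices, values = [], []
    for row, positions in enumerate(rows):
        for col, position in enumerate(positions):
            indices.append([row, col])  # row = example, col = slot in that row
            values.append(position)     # the id to look up in the embeddings
    max_len = max((len(p) for p in rows), default=0)
    dense_shape = [len(rows), max_len]
    return indices, values, dense_shape

# example1 and example2 from the question, as one batch of two rows
indices, values, dense_shape = multi_hot_to_sparse_ids([[4, 7], [3, 6, 9]])
print(indices)
print(values)
print(dense_shape)
```

With combiner='sum', a lookup on this batch should yield 16 + 49 = 65 for the first row and 9 + 36 + 81 = 126 for the second.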
