Word2Vec Tutorial: Tensorflow TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'
TensorFlow version: 1.2.1
Python version: 3.5
Operating system: Windows 10
Another poster on Stack Overflow asked this same question, and he appears to be working from the same Udacity Word2Vec tutorial. So maybe I'm being dense, but this example's code is so cluttered and convoluted that I cannot tell what actually fixed his problem.
The error occurs on the call to tf.reduce_mean:
loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                               train_labels, num_sampled, vocabulary_size))
Just before the call to tf.reduce_mean, the key variables have the following data types:
train_dataset.dtype
>> tf.int32
train_labels.dtype
>> tf.int32
valid_dataset.dtype
>> tf.int32
embeddings.dtype
>> tf.float32_ref
softmax_weights.dtype
>> tf.float32_ref
softmax_biases.dtype
>> tf.float32_ref
embed.dtype
>> tf.float32
I have tried every permutation of data types in the definitions of the variables train_dataset, train_labels, and valid_dataset: making them all int64, all float32, all float64, and combinations of integers and floats. Nothing worked. I did not try changing the data types of softmax_weights and softmax_biases, because I worry that doing so could break the optimization algorithm. Don't those need to be floats to support the calculus performed during backpropagation? (TensorFlow is generally a very opaque black box whose documentation is close to useless, so I can suspect things but never confirm them.)
Program flow at the time of the error:
After the call to reduce_mean, control passes to sampled_softmax_loss() in the file nn_impl.py, which in turn calls _compute_sampled_logits():
logits, labels = _compute_sampled_logits(
    weights=weights,
    biases=biases,
    labels=labels,
    inputs=inputs,
    num_sampled=num_sampled,
    num_classes=num_classes,
    num_true=num_true,
    sampled_values=sampled_values,
    subtract_log_q=True,
    remove_accidental_hits=remove_accidental_hits,
    partition_strategy=partition_strategy,
    name=name)
At this point I inspect the data types of the incoming arguments and find the following:
weights.dtype
>> tf.float32_ref
biases.dtype
>> tf.float32_ref
labels.dtype
>> tf.float32
inputs.dtype
>> tf.int32
The exception occurs on the very next step, and I am dumped into the StreamWrapper class in the file ansitowin32.py. In the end, I get the following traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
489 as_ref=input_arg.is_ref,
--> 490 preferred_dtype=default_dtype)
491 except TypeError as err:
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
740 if ret is None:
--> 741 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
742
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
613 "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614 % (dtype.name, t.dtype.name, str(t)))
615 return t
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
34 loss = tf.reduce_mean(
35 tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36 train_labels, num_sampled, vocabulary_size))
37
38 # Optimizer.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1266 remove_accidental_hits=remove_accidental_hits,
1267 partition_strategy=partition_strategy,
-> 1268 name=name)
1269 sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
1270 logits=logits)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
1005 row_wise_dots = math_ops.multiply(
1006 array_ops.expand_dims(inputs, 1),
-> 1007 array_ops.reshape(true_w, new_true_w_shape))
1008 # We want the row-wise dot plus biases which yields a
1009 # [batch_size, num_true] tensor of true_logits.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
284
285 def multiply(x, y, name=None):
--> 286 return gen_math_ops._mul(x, y, name)
287
288
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
1375 A `Tensor`. Has the same type as `x`.
1376 """
-> 1377 result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
1378 return result
1379
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
524 "%s type %s of argument '%s'." %
525 (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526 inferred_from[input_arg.type_attr]))
527
528 types = [values.dtype]
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
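The immediate failure is inside math_ops.multiply: TensorFlow's Mul op requires both operands to share a dtype, and here an int32 tensor is being multiplied by a float32 tensor. A minimal, self-contained sketch (hypothetical tensors, TF 1.x) that reproduces the same message:

import tensorflow as tf

a = tf.constant([1, 2, 3], dtype=tf.int32)           # plays the role of 'x'
b = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)   # plays the role of 'y'

try:
    bad = tf.multiply(a, b)  # graph construction fails right here
except TypeError as err:
    print(err)  # Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.

# Casting one operand makes the op legal; the mechanism of the error is the same
# as in the traceback above, where int32 data reached a slot meant for float32.
ok = tf.multiply(tf.cast(a, tf.float32), b)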
The complete program is as follows:
# These are all the modules we'll be using later.
# Make sure you can import them before proceeding further.
# %matplotlib inline
from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE
print("Working directory = %s\n" % os.getcwd())
def read_data(filename):
  """Extract the first file enclosed in a zip file as a list of words"""
  with zipfile.ZipFile(filename) as f:
    data = tf.compat.as_str(f.read(f.namelist()[0])).split()
  return data
filename = 'text8.zip'
words = read_data(filename)
print('Data size %d' % len(words))
vocabulary_size = 50000
def build_dataset(words):
  count = [['UNK', -1]]
  count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
  dictionary = dict()
  # Loop through the keys of the count collection dictionary
  # (apparently, zeroing out counts)
  for word, _ in count:
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0  # count of unknown words
  for word in words:
    if word in dictionary:
      index = dictionary[word]
    else:
      index = 0  # dictionary['UNK']
      unk_count = unk_count + 1
    data.append(index)
  count[0][1] = unk_count
  reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
  return data, count, dictionary, reverse_dictionary
data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words # Hint to reduce memory.
data_index = 0
def generate_batch(batch_size, num_skips, skip_window):
  global data_index
  assert batch_size % num_skips == 0
  assert num_skips <= 2 * skip_window
  batch = np.ndarray(shape=(batch_size), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  span = 2 * skip_window + 1  # [ skip_window target skip_window ]
  buffer = collections.deque(maxlen=span)
  for _ in range(span):
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  for i in range(batch_size // num_skips):
    target = skip_window  # target label at the center of the buffer
    targets_to_avoid = [ skip_window ]
    for j in range(num_skips):
      while target in targets_to_avoid:
        target = random.randint(0, span - 1)
      targets_to_avoid.append(target)
      batch[i * num_skips + j] = buffer[skip_window]
      labels[i * num_skips + j, 0] = buffer[target]
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels
print('data:', [reverse_dictionary[di] for di in data[:8]])
for num_skips, skip_window in [(2, 1), (4, 2)]:
  data_index = 0
  batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
  print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
  print('  batch:', [reverse_dictionary[bi] for bi in batch])
  print('  labels:', [reverse_dictionary[li] for li in labels.reshape(8)])
batch_size = 128
embedding_size = 128 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()
with graph.as_default(), tf.device('/cpu:0'):
  # Input data.
  train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
  train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
  valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

  # Variables.
  embeddings = tf.Variable(
      tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
  softmax_weights = tf.Variable(
      tf.truncated_normal([vocabulary_size, embedding_size],
                          stddev=1.0 / math.sqrt(embedding_size)))
  softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))

  # Model.
  # Look up embeddings for inputs.
  embed = tf.nn.embedding_lookup(embeddings, train_dataset)
  # Compute the softmax loss, using a sample of the negative labels each time.
  loss = tf.reduce_mean(
      tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                                 train_labels, num_sampled, vocabulary_size))

  # Optimizer.
  # Note: The optimizer will optimize the softmax_weights AND the embeddings.
  # This is because the embeddings are defined as a variable quantity and the
  # optimizer's `minimize` method will by default modify all variable quantities
  # that contribute to the tensor it is passed.
  # See docs on `tf.train.Optimizer.minimize()` for more details.
  optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)

  # Compute the similarity between minibatch examples and all embeddings.
  # We use the cosine distance:
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(
      normalized_embeddings, valid_dataset)
  similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
I ran into the same problem, and it looks like two of the arguments passed to the loss function have been swapped.
If you look at the TensorFlow documentation for 'sampled_softmax_loss' (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss):
sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=None,
    remove_accidental_hits=True,
    partition_strategy='mod',
    name='sampled_softmax_loss'
)
The third expected argument is 'labels' and the fourth is 'inputs'. In the code as supplied, these two arguments appear to have been transposed. I am a little puzzled as to how that is possible; perhaps the order used to be different in an older version of TF. In any case, swapping those two arguments fixes the problem.
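For reference, here is the corrected loss line from the program above, written with keyword arguments so the two parameters can no longer be confused (a sketch using the variable names defined in the posted graph):

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights,
                               biases=softmax_biases,
                               labels=train_labels,   # int32 word ids, shape [batch_size, 1]
                               inputs=embed,          # float32 embedding lookups
                               num_sampled=num_sampled,
                               num_classes=vocabulary_size))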