Which method is called when executing embedding_layer(tf.constant([1, 2, 3]))?

The following code is taken from this link:

https://www.tensorflow.org/text/guide/word_embeddings

import tensorflow as tf

# Embed a 1,000 word vocabulary into 5 dimensions.
embedding_layer = tf.keras.layers.Embedding(1000, 5)
print("embedding_layer: {}".format(embedding_layer))

result = embedding_layer(tf.constant([1, 2, 3]))
print("result: {}".format(result.numpy()))

embedding_layer: <keras.layers.embeddings.Embedding object at 0x7ffb180b17f0>
result: [[-0.04678862 -0.03500976 -0.04254207 -0.0452533   0.04933525]
 [-0.0366199  -0.01814463  0.04166402  0.02388224  0.03472105]
 [ 0.02966919  0.04294082  0.00715581  0.0376732   0.00529655]]

When executing "embedding_layer(tf.constant([1, 2, 3]))", which method of the tf.keras.layers.Embedding class is called?

Is the __init__ method called?

I ask because the following code raises the error below:

embedding_layer = tf.keras.layers.Embedding(tf.constant([1, 2, 3]))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_4537/3254652836.py in <cell line: 10>()
      8 print("result: {}".format(result.numpy()))
      9 
---> 10 embedding_layer = tf.keras.layers.Embedding(tf.constant([1, 2, 3]))

TypeError: __init__() missing 1 required positional argument: 'output_dim'

Here is the source code of Embedding:

class Embedding(Layer):
    
  def __init__(self,
               input_dim,
               output_dim,
               embeddings_initializer='uniform',
               embeddings_regularizer=None,
               activity_regularizer=None,
               embeddings_constraint=None,
               mask_zero=False,
               input_length=None,
               **kwargs):
    if 'input_shape' not in kwargs:
      if input_length:
        kwargs['input_shape'] = (input_length,)
      else:
        kwargs['input_shape'] = (None,)
    if input_dim <= 0 or output_dim <= 0:
      raise ValueError('Both `input_dim` and `output_dim` should be positive, '
                       'found input_dim {} and output_dim {}'.format(
                           input_dim, output_dim))
    if (not base_layer_utils.v2_dtype_behavior_enabled() and
        'dtype' not in kwargs):
      # In TF1, the dtype defaults to the input dtype which is typically int32,
      # so explicitly set it to floatx
      kwargs['dtype'] = backend.floatx()
    # We set autocast to False, as we do not want to cast floating- point inputs
    # to self.dtype. In call(), we cast to int32, and casting to self.dtype
    # before casting to int32 might cause the int32 values to be different due
    # to a loss of precision.
    kwargs['autocast'] = False
    super(Embedding, self).__init__(**kwargs)

    self.input_dim = input_dim
    self.output_dim = output_dim
    self.embeddings_initializer = initializers.get(embeddings_initializer)
    self.embeddings_regularizer = regularizers.get(embeddings_regularizer)
    self.activity_regularizer = regularizers.get(activity_regularizer)
    self.embeddings_constraint = constraints.get(embeddings_constraint)
    self.mask_zero = mask_zero
    self.supports_masking = mask_zero
    self.input_length = input_length

The call method is what runs (through the base Layer's __call__) when you execute:

result = embedding_layer(tf.constant([1, 2, 3]))

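Note that the Embedding layer has to be constructed before it can be used, which is why tf.keras.layers.Embedding(tf.constant([1, 2, 3])) fails: that expression invokes __init__, which expects input_dim and output_dim, not the tensor you want to embed. In the source you posted, __init__ only stores the vocabulary size and the embedding dimension you defined (1000 and 5 in your example). The lookup table itself is created in the build method, which runs the first time the layer is called. Unless you specify otherwise, each 5-dimensional vector is drawn from a uniform distribution. I recommend checking how the embeddings are created in the build method.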
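
To make the difference between constructing the layer and calling it concrete, here is a minimal pure-Python sketch. SketchLayer and SketchEmbedding are hypothetical stand-ins written for this answer, not Keras classes, and the real Layer.__call__ does considerably more work. The sketch assumes the table is created lazily in build, and it assumes a uniform range of -0.05 to 0.05, which is consistent with the values printed in the question.

```python
import random


class SketchLayer:
    """Hypothetical stand-in for a Keras-style base layer (not the real class)."""

    def __init__(self):
        self.built = False

    def __call__(self, inputs):
        # Calling an instance, e.g. layer(x), goes through __call__, not __init__.
        if not self.built:
            self.build()
            self.built = True
        return self.call(inputs)


class SketchEmbedding(SketchLayer):
    """Hypothetical embedding layer that mimics the construct/build/call split."""

    def __init__(self, input_dim, output_dim):
        super().__init__()
        # Only configuration is stored here; no lookup table exists yet.
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.embeddings = None
        self.trace = []  # records which methods ran, for illustration only

    def build(self):
        self.trace.append("build")
        # One row per vocabulary index, drawn from an assumed uniform range.
        self.embeddings = [
            [random.uniform(-0.05, 0.05) for _ in range(self.output_dim)]
            for _ in range(self.input_dim)
        ]

    def call(self, inputs):
        self.trace.append("call")
        # An embedding lookup is just row selection by integer index.
        return [self.embeddings[i] for i in inputs]


layer = SketchEmbedding(1000, 5)   # runs __init__ only
first = layer([1, 2, 3])           # runs __call__, then build, then call
second = layer([3])                # runs __call__, then call (already built)
print(layer.trace)                 # ['build', 'call', 'call']
print(len(first), len(first[0]))   # 3 5
```

Passing the indices to the constructor instead, as in SketchEmbedding([1, 2, 3]), fails for the same reason as in the question: __init__ expects two arguments and receives only one.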