TensorFlow 2.0, error in keras.applications (as_list() is not defined on an unknown TensorShape)

There are several questions on SO about this error:

ValueError: as_list() is not defined on an unknown TensorShape.

as well as a few related issues on GitHub: 1, 2

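However, I have not found a consistent explanation of why this message appears, nor a solution to my specific problem. The whole pipeline used to work with tf2.0.0-alpha; now, after installing with Conda (conda install tensorflow=2.0 python=3.6), the pipeline is broken.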

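In a nutshell, I use a generator that returns image data and pass it to the tf.data.Dataset.from_generator() method. This works fine until I call the model.fit() method, which results in the following error.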

Train for 750 steps, validate for 100 steps
Epoch 1/5
  1/750 [..............................] - ETA: 10sTraceback (most recent call last):
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/tmo/Projects/casa/image/src/train.py", line 148, in <module>
    Trainer().train_vgg16()
  File "/Users/tmo/Projects/casa/image/src/train.py", line 142, in train_vgg16
    validation_steps=100)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 324, in fit
    total_epochs=epochs)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 503, in _call
    self._initialize(args, kwds, add_initializers_to=initializer_map)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 408, in _initialize
    *args, **kwds))
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 358, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 66, in distributed_function
    model, input_iterator, mode)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 112, in _prepare_feed_values
    inputs, targets, sample_weights = _get_input_from_iterator(inputs)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 149, in _get_input_from_iterator
    distribution_strategy_context.get_strategy(), x, y, sample_weights)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/distribute/distributed_training_utils.py", line 308, in validate_distributed_dataset_inputs
    x_values_list = validate_per_replica_inputs(distribution_strategy, x)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/distribute/distributed_training_utils.py", line 356, in validate_per_replica_inputs
    validate_all_tensor_shapes(x, x_values)
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/keras/distribute/distributed_training_utils.py", line 373, in validate_all_tensor_shapes
    x_shape = x_values[0].shape.as_list()
  File "/usr/local/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 1171, in as_list
    raise ValueError("as_list() is not defined on an unknown TensorShape.")
ValueError: as_list() is not defined on an unknown TensorShape.
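
For readers unfamiliar with the message, the last frame of the traceback can be reproduced in isolation. The following is a minimal sketch, not taken from the pipeline above: it assumes only that TensorFlow 2.x is installed, and the variable names are illustrative. It shows that as_list() works on a shape with unknown dimensions but raises once the rank itself is unknown.

```python
import tensorflow as tf

# A shape whose rank is known, even though the batch dimension is not.
partial = tf.TensorShape([None, 224, 224, 3])
partial_dims = partial.as_list()  # unknown dimensions come back as None

# A shape whose rank is unknown altogether.
unknown = tf.TensorShape(None)

try:
    unknown.as_list()
    message = None
except ValueError as err:
    # Same message as the last line of the traceback above.
    message = str(err)

print(partial_dims)
print(message)
```

So the check in validate_all_tensor_shapes presumably received a tensor whose static rank was unknown, not merely one with a None batch dimension.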

Here is the code that loads and reshapes each image:

    def preprocess_image(self, image):
        """Decode a JPEG, resize it to self.hw and scale pixels to [0, 1]."""
        image = tf.image.decode_jpeg(image, channels=3)
        image = tf.image.resize(image, self.hw)
        image /= 255.0  # normalize to [0,1] range

        image.set_shape([224, 224, 3])

        return image

This is applied in a generator (e.g. training_generator) that runs through a list of images and yields the preprocessed results:

    def make_ts_dataset(self):

        AUTOTUNE = tf.data.experimental.AUTOTUNE

        BATCH_SIZE = 32
        image_count_training = len(self.X_train)
        image_count_validation = len(self.X_test)

        training_generator = GetTensor(hw=self.hw, train=True).make_tensor

        training_image_ds = tf.data.Dataset.from_generator(training_generator, tf.float32, [224, 224, 3])
        training_price_ds = tf.data.Dataset.from_tensor_slices(tf.cast(self.y_train, tf.float32))

        validation_generator = GetTensor(hw=self.hw, test=True).make_tensor

        validation_image_ds = tf.data.Dataset.from_generator(validation_generator, tf.float32, [224, 224, 3])
        validation_price_ds = tf.data.Dataset.from_tensor_slices(tf.cast(self.y_test, tf.float32))

        training_ds = tf.data.Dataset.zip((training_image_ds, training_price_ds))
        validation_ds = tf.data.Dataset.zip((validation_image_ds, validation_price_ds))

        training_ds = training_ds.shuffle(buffer_size=int(round(image_count_training)))
        training_ds = training_ds.repeat()
        training_ds = training_ds.batch(BATCH_SIZE)
        training_ds = training_ds.prefetch(buffer_size=AUTOTUNE)

        validation_ds = validation_ds.shuffle(buffer_size=int(round(image_count_validation)))
        validation_ds = validation_ds.repeat()
        validation_ds = validation_ds.batch(BATCH_SIZE)
        validation_ds = validation_ds.prefetch(buffer_size=AUTOTUNE)

        # Sanity check: print the shapes of one batch.
        for image_batch, label_batch in training_ds.take(1):
            print(label_batch.shape, image_batch.shape)

        return training_ds, validation_ds

At every point the shapes look correct, i.e. (32,) (32, 224, 224, 3)
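
One caveat about this kind of check: printing the shape of a batch pulled from the dataset shows the runtime shape, which is always concrete, whereas the failing call inspects the static shape. The sketch below illustrates the difference. It assumes TensorFlow 2.x, and fake_images is a hypothetical stand-in for the real generator, not part of the original code. The element_spec property reports what the dataset knows statically.

```python
import numpy as np
import tensorflow as tf


def fake_images():
    """Hypothetical stand-in for the real image generator."""
    for _ in range(4):
        yield np.zeros((224, 224, 3), dtype=np.float32)


# Without output_shapes the dataset has no static shape information.
no_shape = tf.data.Dataset.from_generator(
    fake_images, output_types=tf.float32)

# With output_shapes, as in make_ts_dataset above, the rank is known.
with_shape = tf.data.Dataset.from_generator(
    fake_images, output_types=tf.float32, output_shapes=[224, 224, 3])

batched_no_shape = no_shape.batch(2)
batched_with_shape = with_shape.batch(2)

# Static shapes: what graph-building code gets to see.
print(batched_no_shape.element_spec.shape)    # rank unknown
print(batched_with_shape.element_spec.shape)  # batch dim unknown, rest known

# Runtime shape of an actual batch: concrete in both cases.
runtime_shape = next(iter(batched_no_shape)).shape
print(runtime_shape)
```

If element_spec reports a known rank for the real training_ds, the dataset itself is probably not where the unknown shape comes from.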

I am initializing the model with the pretrained weights from VGG16:

    def train_vgg16(self):

        training_ds, validation_ds = Trainer.make_ts_dataset(self)

        base_vgg = keras.applications.vgg16.VGG16(include_top=False, 
                                                  weights='imagenet', 
                                                  input_shape=(224, 224, 3))
        base_vgg.trainable = False

        print(base_vgg.summary())

        vgg_with_base = keras.Sequential([
            base_vgg,
            tf.keras.layers.GlobalMaxPooling2D(),
            tf.keras.layers.Dense(1024, activation=tf.nn.relu),
            tf.keras.layers.Dense(1024, activation=tf.nn.relu),
            tf.keras.layers.Dense(512, activation=tf.nn.relu),
            tf.keras.layers.Dense(1)])

        print(base_vgg.summary())

        vgg_with_base.compile(optimizer='adam',
                              loss='mse',
                              metrics=['mape'])

        vgg_with_base.fit(training_ds,
                          epochs=5,
                          validation_data=validation_ds,
                          steps_per_epoch=750,
                          validation_steps=100)

However, training never starts, because x_shape = x_values[0].shape.as_list() fails.

Edit (11/12/19):

After some troubleshooting, I found that the error originates in the keras.applications layer.

        base_vgg = keras.applications.vgg16.VGG16(include_top=False, 
                                                  weights='imagenet', 
                                                  input_shape=(224, 224, 3))

Removing base_vgg from the model and starting training works fine.

Explicitly defining the shape in keras.applications.vgg16.VGG16, using Input from tensorflow.keras.layers, solved my problem.

    from tensorflow.keras.layers import Input

    base_vgg = keras.applications.vgg16.VGG16(include_top=False,
                                              weights='imagenet',
                                              input_tensor=Input(shape=(224, 224, 3)),
                                              input_shape=(224, 224, 3))

Although I still think this is closer to a bug than a feature.
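
For anyone who wants to check the shapes of this workaround without downloading the ImageNet weights, here is a rough sketch. It assumes TensorFlow 2.x and, unlike the code above, passes weights=None so the base is randomly initialized. It only confirms that the base model carries a fully defined static shape; it does not reproduce the original failure.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input

# Assumption: weights=None skips the ImageNet download for this shape check;
# the original code uses weights='imagenet'.
base_vgg = tf.keras.applications.vgg16.VGG16(
    include_top=False,
    weights=None,
    input_tensor=Input(shape=(224, 224, 3)))
base_vgg.trainable = False

# Static shapes recorded on the model.
print(base_vgg.input_shape)
print(base_vgg.output_shape)

# Forward pass on a dummy batch of two images.
features = base_vgg(tf.zeros([2, 224, 224, 3]))
print(features.shape)
```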

I fixed this error by calling tf.cast(label, tf.float32) after importing the mask.