gcloud ml-engine returns error on large files

Question

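I have a trained model that takes a fairly large input. I usually pass it as a numpy array of shape (1,473,473,3). When I write that out as JSON, I end up with a file of about 9.2MB. Even if I convert it to a base64 encoding for the JSON file, the input is still rather large.

ml-engine predict rejects my request when I send the JSON file, with the following error: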

(gcloud.ml-engine.predict) HTTP request failed. Response: {
"error": {
    "code": 400,
    "message": "Request payload size exceeds the limit: 1572864 bytes.",
    "status": "INVALID_ARGUMENT"
  }
}

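It looks like I can't send anything larger than 1.5MB to ML-engine. Is that really a hard limit? How do others get around it when doing online prediction on large inputs? Do I have to spin up a compute-engine instance, or will I run into the same issue there?

Edit:

I am starting from a Keras model and trying to export it for TensorFlow Serving. I load my Keras model into a variable named 'model' and have a directory defined as "export_path". I build the TensorFlow Serving model like this: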

signature = predict_signature_def(inputs={'input': model.input},
                                outputs={'output': model.output})
builder = saved_model_builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
    }
)
builder.save()

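What would the input look like for this signature_def? Would the JSON just be something like {'input': 'https://storage.googleapis.com/projectid/bucket/filename'}, where the file is the (1,473,473,3) numpy array?

Second Edit: Looking at the code posted by Lak Lakshmanan, I have tried a few different variations to read an image url and parse the file that way, all without success. This is what I tried: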

inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
filename = tf.squeeze(inputs['imageurl']) 
image = read_and_preprocess(filename)#custom preprocessing function
image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
features = {'image' : image}
inputs.update(features)
signature = predict_signature_def(inputs= inputs,
                                outputs={'output': model.output})


with K.get_session() as session:
    """Convert the Keras HDF5 model into TensorFlow SavedModel."""
    builder = saved_model_builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess=session,
        tags=[tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        }
    )
    builder.save()

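I believe the problem is in the mapping from the imageurl placeholder to the features that get built. Any thoughts on what I am doing wrong?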

Answer 1

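What I typically do is have the JSON refer to a file in Google Cloud Storage. Users first upload their file to GCS and then invoke prediction. This approach has another advantage: the storage utilities allow parallel and multithreaded uploads.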
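As a rough illustration of how small such a request becomes, here is a minimal sketch that writes one JSON instance per line, each carrying only a Cloud Storage path. The helper name `build_instances`, the default key `imageurl`, and the bucket and object names are assumptions made for this example; the key has to match whatever input name the exported serving signature declares, and the sketch assumes the serving function reads the path with TensorFlow file I/O, which understands `gs://` paths.

```python
import json


def build_instances(gcs_uris, input_key="imageurl"):
    """Return newline-delimited JSON, one instance per Cloud Storage path.

    input_key is an assumption: it must match the input name of the
    exported serving signature.
    """
    lines = []
    for uri in gcs_uris:
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
        lines.append(json.dumps({input_key: uri}))
    return "\n".join(lines)


# Hypothetical bucket and object names, for illustration only.
request_body = build_instances(["gs://my-bucket/images/photo_0001.jpg"])
print(request_body)
print(len(request_body.encode("utf-8")), "bytes in the request body")
```

Saved to a file, this text could be passed to `gcloud ml-engine predict` through `--json-instances`. The request stays tiny no matter how large the image is, because the image bytes are read on the serving side.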

Keras/TensorFlow 2.0

In TensorFlow 2.0, this is what the serving function would look like:

@tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])
def predict_bytes(img_bytes):
    input_images = tf.map_fn(
        preprocess,
        img_bytes,
        fn_output_signature=tf.float32
    )
    batch_pred = model(input_images) # same as model.predict()
    top_prob = tf.math.reduce_max(batch_pred, axis=[1])
    pred_label_index = tf.math.argmax(batch_pred, axis=1)
    pred_label = tf.gather(tf.convert_to_tensor(CLASS_NAMES), pred_label_index)
    return {
        'probability': top_prob,
        'flower_type_int': pred_label_index,
        'flower_type_str': pred_label
    }

@tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])
def predict_filename(filenames):
    img_bytes = tf.map_fn(
        tf.io.read_file,
        filenames
    )
    result = predict_bytes(img_bytes)
    result['filename'] = filenames
    return result

shutil.rmtree('export', ignore_errors=True)
os.mkdir('export')
model.save('export/flowers_model3',
          signatures={
              'serving_default': predict_filename,
              'from_bytes': predict_bytes
          })

The full code is here: https://nbviewer.jupyter.org/github/GoogleCloudPlatform/practical-ml-vision-book/blob/master/09_deploying/09d_bytes.ipynb

TensorFlow 1.0

In TensorFlow 1.0, the code would look like this:

def serving_input_fn():
    # Note: only handles one image at a time ... 
    inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
    filename = tf.squeeze(inputs['imageurl']) # make it a scalar
    image = read_and_preprocess(filename)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])

    features = {'image' : image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)

The full code is here: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/61ab2e175a629a968024a5d09e9f4666126f4894/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/model.py#L119

Answer 2

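I came across the same error when trying to run predictions on AI Platform with large images. I got around the payload limit by encoding the images as PNG before sending them to AI Platform.

My Keras model doesn't take PNG-encoded images as input, so I needed to convert the Keras model to a TensorFlow Estimator and define its serving input function so that it contains the code to decode the PNG-encoded images back into the format my model expects.

Example code for a model that takes two different grayscale images as input: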

import tensorflow as tf
from tensorflow.keras.estimator import model_to_estimator
from tensorflow.estimator.export import ServingInputReceiver

IMG_PNG_1 = "encoded_png_image_1"
IMG_PNG_2 = "encoded_png_image_2"


def create_serving_fn(image_height, image_width):
    def serving_input_fn():
        def preprocess_png(png_encoded_img):
            img = tf.reshape(png_encoded_img, shape=())
            img = tf.io.decode_png(img, channels=1)
            img = img / 255
            img = tf.expand_dims(img, axis=0)
            return img

        # receiver_tensors worked only when the shape parameter wasn't defined
        receiver_tensors = {
            IMG_PNG_1: tf.compat.v1.placeholder(tf.string),
            IMG_PNG_2: tf.compat.v1.placeholder(tf.string)
        }

        img_1 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_1])
        img_2 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_2])

        input_img_1 = tf.compat.v1.placeholder_with_default(img_1, shape=[None, image_height, image_width, 1])
        input_img_2 = tf.compat.v1.placeholder_with_default(img_2, shape=[None, image_height, image_width, 1])

        features = {
            "model_input_1": input_img_1,
            "model_input_2": input_img_2,
        }

        return ServingInputReceiver(features=features, receiver_tensors=receiver_tensors)

    return serving_input_fn

# Convert trained Keras model to Estimator
estimator = model_to_estimator(keras_model=model)
save_path = "location_of_the_SavedModel"
export_path = estimator.export_saved_model(
    export_dir_base=save_path,
    serving_input_receiver_fn=create_serving_fn(1000, 1000)
)
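As a back-of-the-envelope check on why a compact binary encoding matters, the sketch below compares base64 payload sizes against the limit quoted in the error message. It assumes 8-bit pixels, ignores the small overhead of JSON keys, and does not model PNG compression at all; the labels and the helper `b64_size` are made up for this example.

```python
import base64

LIMIT_BYTES = 1572864  # limit quoted in the error message in the question


def b64_size(num_bytes):
    """Length of the base64 text for a payload of num_bytes raw bytes."""
    return len(base64.b64encode(bytes(num_bytes)))


pixels_question = 1 * 473 * 473 * 3  # input shape from the question
pixels_answer = 1000 * 1000 * 1      # one grayscale image, as in the export call

sizes = {
    "473x473x3 as float32, base64": b64_size(pixels_question * 4),
    "473x473x3 as uint8, base64": b64_size(pixels_question),
    "1000x1000x1 as uint8, base64": b64_size(pixels_answer),
    "two 1000x1000x1 as uint8, base64": 2 * b64_size(pixels_answer),
}

for label, size in sizes.items():
    verdict = "fits" if size <= LIMIT_BYTES else "too large"
    print(f"{label}: {size} bytes ({verdict})")
```

Under these assumptions, raw 8-bit pixels for a single image fit under the limit once base64-encoded, while float32 values or two uncompressed 1000x1000 images do not. Whether a two-image request fits therefore depends on how well PNG compresses the images, which this sketch does not measure.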
