gcloud ml-engine returns error on large files

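I have a trained model that takes a fairly large input. I normally pass it as a numpy array of shape (1,473,473,3). When I serialize that to JSON, I end up with a file of about 9.2MB. Even if I convert it to a base64 encoding for the JSON file, the input is still rather large.

ml-engine predict rejects my request when I send the JSON file, with the following error: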

(gcloud.ml-engine.predict) HTTP request failed. Response: {
"error": {
    "code": 400,
    "message": "Request payload size exceeds the limit: 1572864 bytes.",
    "status": "INVALID_ARGUMENT"
  }
}

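It looks like I can't send anything larger than 1.5MB to ML-engine. Is this really a hard limit? How do others get around it when doing online prediction on large inputs? Do I have to spin up a Compute Engine instance, or will I run into the same issue there?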
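
To see why the request gets rejected, the sizes can be estimated with a few lines of Python. This is only a rough sketch: it assumes float32 values, uses random numbers in place of a real image, and serializes a flat list rather than the nested (1,473,473,3) structure, so the JSON figure will not match the 9.2MB file exactly. The only number taken from the post is the 1572864-byte limit in the error message.

```python
import base64
import json
import random
import struct

LIMIT_BYTES = 1572864  # limit quoted in the error message (1.5 MiB)
SHAPE = (1, 473, 473, 3)

# Number of scalar values in one input tensor.
num_values = 1
for dim in SHAPE:
    num_values *= dim

# Stand-in data (assumption); a real request would use the actual image values.
random.seed(0)
values = [random.random() for _ in range(num_values)]

# Option 1: plain JSON list of numbers (every float written out as text).
json_size = len(json.dumps({"input": values}).encode("utf-8"))

# Option 2: raw float32 bytes, base64-encoded so they can travel inside JSON.
raw = struct.pack(f"<{num_values}f", *values)
b64_size = len(base64.b64encode(raw))

print(f"values:      {num_values}")
print(f"JSON text:   {json_size} bytes")
print(f"raw float32: {len(raw)} bytes")
print(f"base64:      {b64_size} bytes")
print(f"limit:       {LIMIT_BYTES} bytes")
```

Under these assumptions both encodings come out well above the limit, which matches the observation that base64 alone does not help.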

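Edit:

I started with a Keras model and am trying to export it for TensorFlow Serving. I load my Keras model into a variable named 'model' and define a directory "export_path". I build the TensorFlow Serving model like this: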

signature = predict_signature_def(inputs={'input': model.input},
                                outputs={'output': model.output})
builder = saved_model_builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
    }
)
builder.save()

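What would the input for this signature_def look like? Would the JSON be something like {'input': 'https://storage.googleapis.com/projectid/bucket/filename'}, where the file is the (1,473,473,3) numpy array?

Second edit: Looking at the code posted by Lak Lakshmanan, I tried a few different variations of reading an image URL and parsing the file that way, all without success. This is what I tried: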

inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
filename = tf.squeeze(inputs['imageurl']) 
image = read_and_preprocess(filename)#custom preprocessing function
image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
features = {'image' : image}
inputs.update(features)
signature = predict_signature_def(inputs= inputs,
                                outputs={'output': model.output})


with K.get_session() as session:
    """Convert the Keras HDF5 model into TensorFlow SavedModel."""
    builder = saved_model_builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess=session,
        tags=[tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        }
    )
    builder.save()

I think the problem is in the mapping from the imageurl placeholder to the features. Any thoughts on what I'm doing wrong?

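What I typically do is have the JSON refer to a file in Google Cloud Storage. Users first upload their file to GCS and then invoke prediction. This approach has other advantages too, since the storage utilities allow parallel and multithreaded uploads.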
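
As a minimal sketch of what such a request file might look like: each line is one JSON instance that carries only a GCS path, so the payload stays tiny no matter how large the image is. The bucket and object names are hypothetical, and `write_instances` is a helper made up for this illustration.

```python
import json
import os
import tempfile

LIMIT_BYTES = 1572864  # payload limit from the error message in the question

def write_instances(gcs_paths, out_path, key="imageurl"):
    """Write one JSON instance per line, each pointing at a file in GCS."""
    with open(out_path, "w") as f:
        for path in gcs_paths:
            f.write(json.dumps({key: path}) + "\n")
    return os.path.getsize(out_path)

# Hypothetical bucket and object names, for illustration only.
paths = [
    "gs://my-bucket/images/img_0001.jpg",
    "gs://my-bucket/images/img_0002.jpg",
]

out_path = os.path.join(tempfile.mkdtemp(), "request.json")
size = write_instances(paths, out_path)
print(f"{size} bytes written, limit is {LIMIT_BYTES}")
```

A file like this could then be passed to `gcloud ml-engine predict` through `--json-instances`. The key has to match the input name exposed by the serving signature (`imageurl` in the code in this answer).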

Keras/TensorFlow 2.0

In TensorFlow 2.0, the serving functions look like this:

import os
import shutil
import tensorflow as tf

# `model`, `preprocess` and CLASS_NAMES are defined elsewhere (see the linked notebook).

@tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])
def predict_bytes(img_bytes):
    # Decode and preprocess every encoded image in the batch.
    input_images = tf.map_fn(
        preprocess,
        img_bytes,
        fn_output_signature=tf.float32
    )
    batch_pred = model(input_images) # same as model.predict()
    top_prob = tf.math.reduce_max(batch_pred, axis=[1])
    pred_label_index = tf.math.argmax(batch_pred, axis=1)
    pred_label = tf.gather(tf.convert_to_tensor(CLASS_NAMES), pred_label_index)
    return {
        'probability': top_prob,
        'flower_type_int': pred_label_index,
        'flower_type_str': pred_label
    }

@tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])
def predict_filename(imageurl):
    # Read each file into a string tensor holding its raw bytes.
    img_bytes = tf.map_fn(
        tf.io.read_file,
        imageurl
    )
    result = predict_bytes(img_bytes)
    result['filename'] = imageurl
    return result

shutil.rmtree('export', ignore_errors=True)
os.mkdir('export')
# Export both entry points; the file-based one is the default signature.
model.save('export/flowers_model3',
          signatures={
              'serving_default': predict_filename,
              'from_bytes': predict_bytes
          })

The full code is here: https://nbviewer.jupyter.org/github/GoogleCloudPlatform/practical-ml-vision-book/blob/master/09_deploying/09d_bytes.ipynb

TensorFlow 1.0

In TensorFlow 1.0, the code looks like this:

def serving_input_fn():
    # Note: only handles one image at a time ...
    inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
    filename = tf.squeeze(inputs['imageurl']) # make it a scalar
    image = read_and_preprocess(filename)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])

    features = {'image' : image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)

The full code is here: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/61ab2e175a629a968024a5d09e9f4666126f4894/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/model.py#L119

I ran into the same error when trying to run predictions on AI Platform with large images. I got around the payload limit by encoding the images as PNG before sending them to AI Platform.
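
To illustrate the size difference, here is a small sketch that compares a JSON list of pixel values with a PNG encoding of the same image. It is not the code used for the actual model: it uses a hand-rolled stdlib PNG writer (`encode_gray_png`, hypothetical) so that it runs without extra packages, and a synthetic gradient in place of a real image. In practice an image library would do the encoding, and real photos compress less well than this gradient.

```python
import json
import struct
import zlib

LIMIT_BYTES = 1572864  # payload limit from the error message in the question

def png_chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

def encode_gray_png(rows):
    """Encode rows of 0-255 ints as an 8-bit grayscale PNG."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # default compression, default filter method, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    scanlines = b"".join(b"\x00" + bytes(row) for row in rows)
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(scanlines, 9))
            + png_chunk(b"IEND", b""))

# Synthetic 1000x1000 gradient standing in for a real grayscale image.
image = [[(x + y) % 256 for x in range(1000)] for y in range(1000)]

json_size = len(json.dumps(image).encode("utf-8"))
png = encode_gray_png(image)
png_size = len(png)
print(f"JSON list: {json_size} bytes, PNG: {png_size} bytes, limit: {LIMIT_BYTES}")
```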

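My Keras model doesn't take PNG-encoded images as input, so I needed to convert the Keras model to a TensorFlow Estimator and define a serving input function that decodes the PNG-encoded images back into the format my model expects.

Example code for a model that expects two different grayscale images as input: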

import tensorflow as tf
from tensorflow.keras.estimator import model_to_estimator
from tensorflow.estimator.export import ServingInputReceiver

IMG_PNG_1 = "encoded_png_image_1"
IMG_PNG_2 = "encoded_png_image_2"


def create_serving_fn(image_height, image_width):
    def serving_input_fn():
        def preprocess_png(png_encoded_img):
            img = tf.reshape(png_encoded_img, shape=())
            img = tf.io.decode_png(img, channels=1)
            img = img / 255
            img = tf.expand_dims(img, axis=0)
            return img

        # receiver_tensors worked only when the shape parameter wasn't defined
        receiver_tensors = {
            IMG_PNG_1: tf.compat.v1.placeholder(tf.string),
            IMG_PNG_2: tf.compat.v1.placeholder(tf.string)
        }

        img_1 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_1])
        img_2 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_2])

        input_img_1 = tf.compat.v1.placeholder_with_default(img_1, shape=[None, image_height, image_width, 1])
        input_img_2 = tf.compat.v1.placeholder_with_default(img_2, shape=[None, image_height, image_width, 1])

        features = {
            "model_input_1": input_img_1,
            "model_input_2": input_img_2,
        }

        return ServingInputReceiver(features=features, receiver_tensors=receiver_tensors)

    return serving_input_fn

# Convert trained Keras model to Estimator
estimator = model_to_estimator(keras_model=model)
save_path = "location_of_the_SavedModel"
export_path = estimator.export_saved_model(
    export_dir_base=save_path,
    serving_input_receiver_fn=create_serving_fn(1000, 1000)
)
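
On the client side, the request then carries the PNG bytes instead of raw pixel values. Below is a minimal sketch of building such a request body, assuming the `{"b64": ...}` wrapper that TensorFlow Serving's REST API uses for binary values. The placeholder bytes and the `make_instance` helper are made up for illustration. AI Platform's documentation also describes a convention where binary input names end in `_bytes`, so it is worth checking whether the names used here need adjusting for your deployment.

```python
import base64
import json

# Input names copied from the serving input function above.
IMG_PNG_1 = "encoded_png_image_1"
IMG_PNG_2 = "encoded_png_image_2"

def make_instance(png_bytes_1, png_bytes_2):
    """Wrap two PNG byte strings in the {"b64": ...} form for binary JSON inputs."""
    def wrap(data):
        return {"b64": base64.b64encode(data).decode("ascii")}
    return {IMG_PNG_1: wrap(png_bytes_1), IMG_PNG_2: wrap(png_bytes_2)}

# Placeholder bytes so the sketch runs on its own (assumption); real code
# would read actual PNG files, e.g. open("image_1.png", "rb").read().
fake_png_1 = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
fake_png_2 = b"\x89PNG\r\n\x1a\n" + b"\x01" * 16

body = json.dumps({"instances": [make_instance(fake_png_1, fake_png_2)]})
print(len(body.encode("utf-8")), "bytes in request body")
```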