What does google cloud ml-engine do when a JSON request contains "_bytes" or "b64"?

Question

The Google Cloud documentation (see Binary data in prediction input) states:

Your encoded string must be formatted as a JSON object with a single key named b64. The following Python example encodes a buffer of raw JPEG data using the base64 library to make an instance:
{"image_bytes":{"b64": base64.b64encode(jpeg_data)}}
In your TensorFlow model code, you must name the aliases for your input and output tensors so that they end with '_bytes'.

I would like to understand in more detail how this process works on the Google Cloud side.

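Does ml-engine automatically decode whatever string follows "b64" into byte data?
When the request has this nested structure, does it pass only the "b64" part to the serving input function and drop the "image_bytes" key?
Is each request passed to the serving input function individually, or are requests batched?
Do we define the input and output aliases in the ServingInputReceiver returned by the serving input function?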

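I have found no way to create a serving input function that uses this nested structure to define the feature placeholders. I only use "b64" in mine, and I am not sure what gcloud ml-engine does when it receives the request.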

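In addition, when predicting locally with gcloud ml-engine local predict, sending the request with the nested structure fails (unexpected key image_bytes, because it is not defined in the serving input function). But when predicting with gcloud ml-engine predict, sending requests with the nested structure works, even though the serving input function contains no reference to "image_bytes". The gcloud predict also works when "image_bytes" is left out and only "b64" is passed in.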

Example serving input function

def serving_input_fn():
    feature_placeholders = {'b64': tf.placeholder(dtype=tf.string,
                                                  shape=[None],
                                                  name='source')}
    single_image = tf.decode_raw(feature_placeholders['b64'], tf.float32)
    inputs = {'image': single_image}
    return tf.estimator.export.ServingInputReceiver(inputs, feature_placeholders)

I gave the example using images, but I assume the same applies to all types of data sent as bytes and base64 encoded.

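There are many Stack Overflow questions that mention the need to include "_bytes", each with a snippet of information, but I would find it useful if someone could explain in more detail what is going on, so that I would not be so hit-and-miss when formatting requests.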

Stack Overflow questions on this topic

how make correct predictions of jpeg image in cloud-ml
How convert a jpeg image into json file in Google machine learning
Base64 images with Keras and Google Cloud ML

Answer 1

To help clarify some of your questions, allow me to start with the basic anatomy of a prediction request:

{"instances": [<instance>, <instance>, ...]}

Here each instance is a JSON object (dict/map; I'll use the Python term "dict" from here on) whose attributes/keys are the names of the inputs and whose values contain the data for that input.

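What the cloud service does (and gcloud ml-engine local predict uses the same underlying libraries as the service) is take the list of dicts (which can be thought of as rows of data) and convert it to a dict of lists (which can be thought of as columnar data containing batches of instances) with the same keys as in the original data. For example,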

{"instances": [{"x": 1, "y": "a"}, {"x": 3, "y": "b"}, {"x": 5, "y": "c"}]}

becomes (internally)

{"x": [1, 3, 5], "y": ["a", "b", "c"]}

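The keys in this dict (and hence in the instances of the original request) must correspond to the keys of the dict passed to the ServingInputReceiver. It should be apparent from this example that the service "batches" all of the data: all of the instances are fed into the graph as a single batch. That is why the outer dimension of the input shapes must be None. It is the batch dimension, and it is not known before a request is made (since each request may have a different number of instances). When exporting a graph to accept the requests above, you might define a function like this: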

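import tensorflow as tf  # TensorFlow 1.x API

def serving_input_fn():
  # The outer dimension is None because the batch size is unknown until request time.
  inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None]),
            'y': tf.placeholder(dtype=tf.string, shape=[None])}
  return tf.estimator.export.ServingInputReceiver(inputs, inputs)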
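
The row-to-column conversion described above can be sketched in plain Python. This is only an illustration of the idea, not the service's actual code; the function name rows_to_columns is hypothetical, and the sketch assumes every instance carries the same keys.

```python
def rows_to_columns(instances):
    """Turn a list of per-instance dicts into one dict of per-input lists."""
    columns = {}
    for instance in instances:
        for name, value in instance.items():
            # Each input name collects one value per instance, in request order.
            columns.setdefault(name, []).append(value)
    return columns


request = {"instances": [{"x": 1, "y": "a"}, {"x": 3, "y": "b"}, {"x": 5, "y": "c"}]}
batch = rows_to_columns(request["instances"])
print(batch)
```

Each list in the result has one entry per instance, which is why the placeholders need an unknown (None) outer dimension.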

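Since JSON does not (directly) support binary data, and since TensorFlow has no way of distinguishing "strings" from "bytes", binary data needs special treatment. First, the names of such inputs must end in "_bytes" to help tell a text string from a byte string. Using the example above, suppose y contained binary data instead of text. We would declare the following: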

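import tensorflow as tf  # TensorFlow 1.x API

def serving_input_fn():
  # Same as before, except the binary input is named with a '_bytes' suffix.
  inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None]),
            'y_bytes': tf.placeholder(dtype=tf.string, shape=[None])}
  return tf.estimator.export.ServingInputReceiver(inputs, inputs)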

Notice that the only thing that changed is the use of y_bytes instead of y as the name of the input.

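Next, we need to actually base64 encode the data; anywhere a string would be acceptable, we can instead use an object like this: {"b64": "<base64-encoded string>"}. Adapting the running example, a request might look like: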

{
  "instances": [
    {"x": 1, "y_bytes": {"b64": "YQ=="}},
    {"x": 3, "y_bytes": {"b64": "Yg=="}},
    {"x": 5, "y_bytes": {"b64": "Yw=="}}
  ]
}

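In this case the service does exactly what it did before, with one added step: it automatically base64 decodes the string (and "replaces" the {"b64": ...} object with the bytes) before sending it to TensorFlow. So TensorFlow actually ends up with a dict just like before: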

{"x": [1, 3, 5], "y_bytes": ["a", "b", "c"]}

(Note that the name of the input has not changed.)

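Of course, base64 encoding textual data is somewhat pointless; you would usually do this for, e.g., image data that cannot be sent over JSON any other way, but I hope the example above is enough to illustrate the point anyway.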
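
As a rough sketch of the decoding step described above, the following uses only the Python standard library. It assumes that any dict whose only key is "b64" stands for binary data; the helper name decode_b64 is hypothetical and this is not the service's real implementation.

```python
import base64


def decode_b64(value):
    """Recursively replace {"b64": "<text>"} objects with the decoded bytes."""
    if isinstance(value, dict):
        if set(value) == {"b64"}:
            return base64.b64decode(value["b64"])
        return {key: decode_b64(item) for key, item in value.items()}
    if isinstance(value, list):
        return [decode_b64(item) for item in value]
    return value  # numbers and plain strings pass through unchanged


instances = [
    {"x": 1, "y_bytes": {"b64": "YQ=="}},
    {"x": 3, "y_bytes": {"b64": "Yg=="}},
    {"x": 5, "y_bytes": {"b64": "Yw=="}},
]
decoded = decode_b64(instances)
print(decoded)
```

After this step the input names are unchanged and only the values differ: each {"b64": ...} object has become a bytes value.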

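There is another important point: the service supports a kind of shorthand. When your TensorFlow model has exactly one input, there is no need to keep repeating the name of that input in every object in your list of instances. To illustrate, imagine exporting a model with only x: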

def serving_input_fn():
  inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None])}
  return tf.estimator.export.ServingInputReceiver(inputs, inputs)

The "long form" request would look like this:

{"instances": [{"x": 1}, {"x": 3}, {"x": 5}]}

Instead, you can send the request in shorthand, like so:

{"instances": [1, 3, 5]}

Note that this applies even to base64 encoded data. So, for instance, if instead of exporting only x we had exported only y_bytes, we could simplify the request from:

{
  "instances": [
    {"y_bytes": {"b64": "YQ=="}},
    {"y_bytes": {"b64": "Yg=="}},
    {"y_bytes": {"b64": "Yw=="}}
  ]
}

To:

{
  "instances": [
    {"b64": "YQ=="},
    {"b64": "Yg=="},
    {"b64": "Yw=="}
  ]
}

In many cases this is only a small win, but it definitely aids readability, e.g., when the inputs contain CSV data.
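
One way to picture the shorthand is as an expansion back to the long form before any further processing. The sketch below is an assumption about the behavior, not the service's code: expand_shorthand is a hypothetical helper, and it treats a dict whose only key is "b64" as a bare binary value rather than as a long-form instance.

```python
def expand_shorthand(instances, input_name):
    """Wrap bare values in {input_name: value}; leave long-form dicts alone."""
    expanded = []
    for instance in instances:
        is_b64_value = isinstance(instance, dict) and set(instance) == {"b64"}
        if isinstance(instance, dict) and not is_b64_value:
            expanded.append(instance)  # already in long form
        else:
            expanded.append({input_name: instance})
    return expanded


numbers = expand_shorthand([1, 3, 5], "x")
binary = expand_shorthand([{"b64": "YQ=="}, {"b64": "Yg=="}], "y_bytes")
print(numbers)
print(binary)
```

Under this reading, a model whose single input is literally named b64 would be ambiguous, which is one more reason to choose a name ending in _bytes.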

Putting it all together for your specific scenario, here is what your serving function should look like:

import tensorflow as tf  # TensorFlow 1.x API

def serving_input_fn():
  feature_placeholders = {
    'image_bytes': tf.placeholder(dtype=tf.string, shape=[None], name='source')}
  single_image = tf.decode_raw(feature_placeholders['image_bytes'], tf.float32)
  return tf.estimator.export.ServingInputReceiver(feature_placeholders, feature_placeholders)

Notable differences from your current code:

The name of the input is not b64 but image_bytes (it could be anything that ends in _bytes)
feature_placeholders is used as both arguments to ServingInputReceiver

An example request might look like this:

{
  "instances": [
    {"image_bytes": {"b64": "YQ=="}},
    {"image_bytes": {"b64": "Yg=="}},
    {"image_bytes": {"b64": "Yw=="}}
  ]
}

Or, optionally, in shorthand:

{
  "instances": [
    {"b64": "YQ=="},
    {"b64": "Yg=="},
    {"b64": "Yw=="}
  ]
}
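
On the client side, a request body of this shape could be built roughly as follows. This is a minimal sketch using only the standard library: make_instance is a hypothetical helper, and the one-byte payloads are stand-ins for real image bytes. Note that in Python 3 base64.b64encode returns bytes, so the result is decoded to text before it is placed in the JSON.

```python
import base64
import json


def make_instance(raw_bytes, input_name="image_bytes"):
    """Build one long-form instance holding base64-encoded binary data."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")  # bytes -> JSON-safe str
    return {input_name: {"b64": encoded}}


payloads = [b"a", b"b", b"c"]  # stand-ins for real image bytes
body = json.dumps({"instances": [make_instance(p) for p in payloads]})
print(body)
```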

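One final note. gcloud ml-engine local predict and gcloud ml-engine predict construct the request from the contents of the file passed in. It is very important to note that the file content is currently not a full, valid request; rather, each line of the --json-instances file becomes one entry in the list of instances. In your case specifically, the file will look like this (newlines are meaningful here):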

{"image_bytes": {"b64": "YQ=="}}
{"image_bytes": {"b64": "Yg=="}}
{"image_bytes": {"b64": "Yw=="}}

or the equivalent shorthand. gcloud will take each line and construct the actual request shown above.
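
The way each line of the file becomes one instance can be sketched like this. It is an illustration of the behavior described above, not gcloud's source code; request_from_json_lines is a hypothetical name, and skipping blank lines is an assumption of the sketch.

```python
import json


def request_from_json_lines(text):
    """Parse one JSON value per non-empty line and wrap them as instances."""
    instances = [json.loads(line) for line in text.splitlines() if line.strip()]
    return {"instances": instances}


file_text = (
    '{"image_bytes": {"b64": "YQ=="}}\n'
    '{"image_bytes": {"b64": "Yg=="}}\n'
    '{"image_bytes": {"b64": "Yw=="}}\n'
)
request = request_from_json_lines(file_text)
print(json.dumps(request, indent=2))
```

A file that already contains the outer {"instances": [...]} wrapper would therefore be treated as a single instance with an unexpected key, rather than as a complete request.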

gcloud

tensorflow-serving

google-cloud-ml

tensorflow-estimator