Exception during running the graph: Unable to get element from the feed as bytes
I am using a Beam pipeline to preprocess my text into an integer bag-of-words, similar to this example: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/reddit_tft/reddit.py
words = tft.map(tf.string_split, inputs[name])
result[name + '_bow'] = tft.string_to_int(
    words, frequency_threshold=frequency_threshold)
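Conceptually, this transform builds a vocabulary of tokens that occur at least frequency_threshold times, then maps each token to its integer id. A plain-Python sketch of that idea (not the actual tft implementation; build_vocab and bag_of_words are made-up helper names):

```python
from collections import Counter

def build_vocab(docs, frequency_threshold=2):
    # Count whitespace tokens across all documents, then keep only those
    # meeting the frequency threshold; each kept token gets an integer id.
    counts = Counter(tok for doc in docs for tok in doc.split())
    kept = sorted((tok for tok, c in counts.items() if c >= frequency_threshold),
                  key=lambda tok: (-counts[tok], tok))
    return {tok: i for i, tok in enumerate(kept)}

def bag_of_words(doc, vocab):
    # Map each in-vocabulary token to its id; rare tokens are dropped.
    return [vocab[tok] for tok in doc.split() if tok in vocab]
```

Tokens below the threshold simply never enter the vocabulary, which is why the serving-time input must go through the same transform graph as training.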
Preprocessing and training appear to work fine. I trained a simple linear model, pointed it at the transform function, and ran an Experiment.
saved_model.pbtxt seems to contain the vocabulary. My goal is to deploy this model on Google Cloud ML for prediction and query it with raw text as input:
{"inputs" : { "title": "E. D. Abbott Ltd", "text" : "Abbott of Farnham E D Abbott Limited was a British coachbuilding business" }}
When running
gcloud ml-engine local predict \
--model-dir=$MODEL_DIR \
--json-instances="$DATA_DIR/test.json" \
I get the following error and can't tell what I'm doing wrong.
Source code / logs
WARNING:root:MetaGraph has multiple signatures 2. Support for multiple signatures
is limited. By default we select named signatures.
ERROR:root:Exception during running the graph: Unable to get element from the feed a
s bytes.
Traceback (most recent call last):
  File "lib/googlecloudsdk/command_lib/ml_engine/local_predict.py", line 136, in <module>
    main()
  File "lib/googlecloudsdk/command_lib/ml_engine/local_predict.py", line 131, in main
    instances=instances)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 656, in local_predict
    _, predictions = model.predict(instances)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 553, in predict
    outputs = self._client.predict(columns, stats)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 382, in predict
    "Exception during running the graph: " + str(e))
prediction_lib.PredictionError: (4, 'Exception during running the graph: Unable to get element from the feed as bytes.')
def feature_columns(vocab_size=100000):
  result = []
  for key in TEXT_COLUMNS:
    column = tf.contrib.layers.sparse_column_with_integerized_feature(
        key, vocab_size, combiner='sum')
    result.append(column)
  return result

model_fn = tf.contrib.learn.LinearClassifier(
    feature_columns=feature_columns(),
    n_classes=15,
    model_dir=output_dir
)
def get_transformed_reader_input_fn(transformed_metadata,
                                    transformed_data_paths,
                                    batch_size,
                                    mode):
  """Wrap the get input features function to provide the runtime arguments."""
  return input_fn_maker.build_training_input_fn(
      metadata=transformed_metadata,
      file_pattern=(
          transformed_data_paths[0] if len(transformed_data_paths) == 1
          else transformed_data_paths),
      training_batch_size=batch_size,
      label_keys=[LABEL_COLUMN],
      reader=gzip_reader_fn,
      key_feature_name='key',
      reader_num_threads=4,
      queue_capacity=batch_size * 2,
      randomize_input=(mode != tf.contrib.learn.ModeKeys.EVAL),
      num_epochs=(1 if mode == tf.contrib.learn.ModeKeys.EVAL else None))

transformed_metadata = metadata_io.read_metadata(
    args.transformed_metadata_path)
raw_metadata = metadata_io.read_metadata(args.raw_metadata_path)
train_input_fn = get_transformed_reader_input_fn(
    transformed_metadata, args.train_data_paths, args.batch_size,
    tf.contrib.learn.ModeKeys.TRAIN)
eval_input_fn = get_transformed_reader_input_fn(
    transformed_metadata, args.eval_data_paths, args.batch_size,
    tf.contrib.learn.ModeKeys.EVAL)
serving_input_fn = input_fn_maker.build_parsing_transforming_serving_input_fn(
    raw_metadata,
    args.transform_savedmodel,
    raw_label_keys=[],
    raw_feature_keys=model.TEXT_COLUMNS)
export_strategy = tf.contrib.learn.utils.make_export_strategy(
    serving_input_fn,
    default_output_alternative_key=None,
    exports_to_keep=5,
    as_text=True)
return Experiment(
    estimator=model_fn,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    export_strategies=export_strategy,
    eval_metrics=model.get_eval_metrics(),
    train_monitors=[],
    train_steps=args.train_steps,
    eval_steps=args.eval_steps,
    min_eval_frequency=1
)
The documentation for build_parsing_transforming_serving_input_fn() says it creates an input function that applies the transforms to raw data encoded as tf.Examples in a serialized string. To complicate matters, that string must additionally be base64-encoded before it can be sent to the prediction service (see the Data encoding section).
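As an illustration of that encoding step (a minimal sketch assuming serialized_example already holds the bytes produced by example.SerializeToString(); make_b64_instance is a hypothetical helper name): binary inputs are wrapped in a {"b64": ...} JSON object, one instance per line:

```python
import base64
import json

def make_b64_instance(serialized_example):
    # Binary payloads sent to the prediction service must be wrapped in a
    # {"b64": "<base64 string>"} object rather than passed as raw bytes.
    encoded = base64.b64encode(serialized_example).decode("ascii")
    return json.dumps({"b64": encoded})
```

The service base64-decodes the value before feeding it to the graph, so the graph still sees the original serialized tf.Example bytes.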
I suggest using build_default_transforming_serving_input_fn() for JSON input. Then your JSON file should contain just:
{ "title": "E. D. Abbott Ltd", "text" : "Abbott of Farnham E D Abbott Limited was a British coachbuilding business" }
{ "title": "another title", "text" : "more text" }
...
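To produce such a newline-delimited instances file for gcloud ml-engine local predict, something like the following should work (the filename test.json matches the command above; the instance values are just the examples from this post):

```python
import json

instances = [
    {"title": "E. D. Abbott Ltd",
     "text": "Abbott of Farnham E D Abbott Limited was a British coachbuilding business"},
    {"title": "another title", "text": "more text"},
]

# --json-instances expects one JSON object per line, not a JSON array.
with open("test.json", "w") as f:
    for inst in instances:
        f.write(json.dumps(inst) + "\n")
```

Each line is parsed as one prediction instance, so a trailing comma-separated array would be rejected.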