Prediction failed: Error processing input: Expected string, got dict
I have completed the TensorFlow getting-started tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners) and made some small changes to the code to adapt it to my application. The feature columns for my case are the following:
transaction_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Transaction', vocabulary_list=["buy", "rent"])
localization_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Localization', vocabulary_list=["barcelona", "girona"])
dimensions_feature_column = tf.feature_column.numeric_column("Dimensions")
buy_price_feature_column = tf.feature_column.numeric_column("BuyPrice")
rent_price_feature_column = tf.feature_column.numeric_column("RentPrice")
my_feature_columns = [
    tf.feature_column.indicator_column(transaction_column),
    tf.feature_column.indicator_column(localization_column),
    tf.feature_column.bucketized_column(source_column=dimensions_feature_column,
                                        boundaries=[50, 75, 100]),
    tf.feature_column.numeric_column(key='Rooms'),
    tf.feature_column.numeric_column(key='Toilets'),
    tf.feature_column.bucketized_column(source_column=buy_price_feature_column,
                                        boundaries=[1, 180000, 200000, 225000, 250000, 275000, 300000]),
    tf.feature_column.bucketized_column(source_column=rent_price_feature_column,
                                        boundaries=[1, 700, 1000, 1300])
]
After that, I saved the model so that I could use it for prediction in Cloud ML Engine. To export the model, I added the following code (after evaluating the model):
feature_spec = tf.feature_column.make_parse_example_spec(my_feature_columns)
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
servable_model_dir = "modeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir, export_input_fn)
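For reference, the serving signature of an exported SavedModel can be inspected with TensorFlow's saved_model_cli (export_savedmodel writes the model into a timestamped subdirectory; the path below is a placeholder):
saved_model_cli show --dir modeloutput/<timestamp> --tag_set serve --signature_def serving_default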
After running the code, I got the correct model files in the "modeloutput" directory and created the model in the cloud (as described in https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-training-prediction#deploy_a_model_to_support_prediction).
After creating the model version, I simply tried to launch an online prediction with this model in the cloud, using the following command:
gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../prediction.json
where $MODEL_NAME is the name of my model and prediction.json is a JSON file with the following content:
{"inputs":[
{
"Transaction":"rent",
"Localization":"girona",
"Dimensions":90,
"Rooms":4,
"Toilets":2,
"BuyPrice":0,
"RentPrice":1100
}
]
}
However, the prediction fails and I receive the following error message:
"error": "Prediction failed: Error processing input: Expected string, got {u'BuyPrice': 0, u'Transaction': u'rent', u'Rooms': 4, u'Localization': u'girona', u'Toilets': 2, u'RentPrice': 1100, u'Dimensions': 90} of type 'dict' instead."
The error is clear: it expects a string, not a dict. If I check my SavedModel SignatureDef, I get the following:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 12)
name: dnn/head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/classify
Obviously, the expected data type of the input is a string (DT_STRING), but I don't know how to format my input data so that the prediction succeeds. I have tried writing the input JSON in many different ways, but I keep getting errors.
If I look at how the prediction is done in the tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners), it seems clear that the prediction input is passed as a dict (predict_x in the tutorial code).
So, where am I going wrong? How can I make a prediction with this input data?
Thank you for your time.
EDIT based on the answer ------
Following @Lak's second suggestion, I updated the code for exporting the model, which now looks like this:
export_input_fn = serving_input_fn
servable_model_dir = "savedmodeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir,
                                                   export_input_fn)
...
def serving_input_fn():
    feature_placeholders = {
        'Transaction': tf.placeholder(tf.string, [None]),
        'Localization': tf.placeholder(tf.string, [None]),
        'Dimensions': tf.placeholder(tf.float32, [None]),
        'Rooms': tf.placeholder(tf.int32, [None]),
        'Toilets': tf.placeholder(tf.int32, [None]),
        'BuyPrice': tf.placeholder(tf.float32, [None]),
        'RentPrice': tf.placeholder(tf.float32, [None])
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)
After that, I created a new model and fed it the following JSON to get a prediction:
{
  "Transaction": "rent",
  "Localization": "girona",
  "Dimensions": 90.0,
  "Rooms": 4,
  "Toilets": 2,
  "BuyPrice": 0.0,
  "RentPrice": 1100.0
}
Note that when making the prediction I got the error "Unexpected tensor name: inputs", so I removed "inputs" from the JSON structure. However, now I get a new, uglier error:
"error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'T' not in Op index:int64>; NodeDef: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = WhereT=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).\n\t [[Node: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = WhereT=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]]\")"
I checked the SignatureDef again and got the following:
The given SavedModel SignatureDef contains the following input(s):
inputs['Toilets'] tensor_info:
dtype: DT_INT32
shape: (-1)
name: Placeholder_4:0
inputs['Rooms'] tensor_info:
dtype: DT_INT32
shape: (-1)
name: Placeholder_3:0
inputs['Localization'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: Placeholder_1:0
inputs['RentPrice'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_6:0
inputs['BuyPrice'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_5:0
inputs['Dimensions'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_2:0
inputs['Transaction'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
outputs['class_ids'] tensor_info:
dtype: DT_INT64
shape: (-1, 1)
name: dnn/head/predictions/ExpandDims:0
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 1)
name: dnn/head/predictions/str_classes:0
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/logits/BiasAdd:0
outputs['probabilities'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/predict
Did I go wrong in some step? Thanks!
NEW UPDATE
I have run a local prediction and it executed successfully, returning the expected prediction results. The command used:
gcloud ml-engine local predict --model-dir $MODEL_DIR --json-instances=../prediction.json
where MODEL_DIR is the directory containing the files generated by training the model.
So the problem seems to be in the export of the model: somehow, the model that is exported and later used for prediction is not correct. I have read a bit about TensorFlow versions possibly being the source of the problem, but I don't understand it. Isn't my whole code executed with the same TF version?
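For what it's worth, the TensorFlow version used locally for training and export can be checked and compared against the --runtime-version of the deployed model version, e.g.:
python -c "import tensorflow as tf; print(tf.__version__)"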
Any ideas on this?
Thanks!
The problem is in your serving input function. You are using build_parsing_serving_input_receiver_fn, which is the function you should use if you are sending in tf.Example strings.
There are two ways to fix this (a sketch of the corresponding request payloads follows the two options):
- Send a tf.Example:
example = tf.train.Example(features=tf.train.Features(feature={
    'transaction': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'rent'])),
    'rentPrice': tf.train.Feature(float_list=tf.train.FloatList(value=[1000.0]))
}))
string_to_send = example.SerializeToString()
- Change your serving input function so that you can send in JSON:
def serving_input_fn():
    feature_placeholders = {
        'transaction': tf.placeholder(tf.string, [None]),
        ...
        'rentPrice': tf.placeholder(tf.float32, [None]),
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

export_input_fn = serving_input_fn
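Not part of the original answer, but as a rough sketch of what the prediction.json passed to --json-instances could look like for each option: gcloud expects one JSON object per line, the keys must match the names used in the serving input function, and binary payloads (such as a serialized tf.Example) are represented with the {"b64": ...} wrapper described in the Cloud ML Engine online prediction docs.
import base64
import json

# Option 1: model exported with build_parsing_serving_input_receiver_fn.
# string_to_send is the serialized tf.Example from the snippet above; it is
# base64-encoded and wrapped in a {"b64": ...} object under the signature's
# input alias ("inputs" in the SignatureDef shown earlier in the question).
with open("prediction.json", "w") as f:
    payload = {"inputs": {"b64": base64.b64encode(string_to_send).decode("utf-8")}}
    f.write(json.dumps(payload) + "\n")  # one instance per line

# Option 2: model exported with the JSON-friendly serving_input_fn above.
# Each instance is a flat JSON object keyed by the placeholder names.
with open("prediction.json", "w") as f:
    instance = {"transaction": "rent", "rentPrice": 1000.0}  # plus the remaining features
    f.write(json.dumps(instance) + "\n")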
Problem solved :)
After several experiments, I finally found that I had to create the model version using the latest runtime version (1.8):
gcloud ml-engine versions create v2 --model $MODEL_NAME --origin $MODEL_BINARIES --runtime-version 1.8
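For reference, the runtime version of a deployed model version can be checked afterwards with gcloud; it should line up with the TensorFlow version that was used to train and export the model:
gcloud ml-engine versions describe v2 --model $MODEL_NAME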