How to return all labels and scores in SageMaker Inference?

Question

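I am using HuggingFacePredictor from sagemaker.huggingface to run inference on some text, and I would like to get the scores for all labels.

Is there a way to get a response like this from the endpoint: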

{
    "labels": ["help", "Greeting", "Farewell"],
    "score": [0.81, 0.1, 0.09]
}

(or similar)

instead of:

{
    "label": "help",
    "score": 0.81
}

Here is some example code:

import boto3

from sagemaker.huggingface import HuggingFacePredictor
from sagemaker.session import Session

sagemaker_session = Session(boto_session=boto3.session.Session())

# `project` holds the endpoint name and `text` the input string (both defined elsewhere)
predictor = HuggingFacePredictor(
    endpoint_name=project, sagemaker_session=sagemaker_session
)
prediction = predictor.predict({"inputs": text})[0]

Answer 1

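From your current code sample, it is not entirely clear which specific task you are performing, but for the sake of this answer I will assume you are doing text classification.

Most importantly, though, we can read the following in Hugging Face's SageMaker reference document (emphasis mine):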

The Inference Toolkit accepts inputs in the inputs key, and supports additional pipelines parameters in the parameters key. You can provide any of the supported kwargs from pipelines as parameters.

If we look at the arguments accepted by the TextClassificationPipeline, we can see that there is indeed one that returns the scores for all labels:

return_all_scores (bool, optional, defaults to False) — Whether to return scores for all labels.

Unfortunately I do not have access to SageMaker inference, but I can run an example to illustrate the output of a local pipeline:

from transformers import pipeline
# uses 2-way sentiment classification model per default
pipe = pipeline("text-classification") 

pipe("I am really angry right now >:(", return_all_scores=True)
# Output: [[{'label': 'NEGATIVE', 'score': 0.9989138841629028},
#           {'label': 'POSITIVE', 'score': 0.0010860705515369773}]]
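If the parallel-list layout from the question is what you are after, the nested list of label/score dictionaries shown above can be reshaped on the client side. The following is only a minimal sketch: the helper name `to_labels_and_scores` and the `"scores"` key are my own choices, not part of transformers or SageMaker. As far as I know, newer transformers releases deprecate `return_all_scores` in favor of `top_k=None`, which may return a flat list for a single input, so the helper accepts both shapes.

```python
def to_labels_and_scores(results):
    """Turn [{'label': ..., 'score': ...}, ...] into two parallel lists.

    Hypothetical helper for illustration; not a transformers or SageMaker API.
    """
    # With return_all_scores=True the pipeline wraps the per-label dicts in
    # one extra list per input, so unwrap a single-input batch if present.
    if results and isinstance(results[0], list):
        results = results[0]
    # Sort by descending score so the most likely label comes first.
    ranked = sorted(results, key=lambda item: item["score"], reverse=True)
    return {
        "labels": [item["label"] for item in ranked],
        "scores": [item["score"] for item in ranked],
    }


# Sample data copied from the local pipeline output above.
sample = [[
    {"label": "NEGATIVE", "score": 0.9989138841629028},
    {"label": "POSITIVE", "score": 0.0010860705515369773},
]]
reshaped = to_labels_and_scores(sample)
print(reshaped)
```

Doing the reshaping on the client keeps the endpoint untouched; the alternative would be a custom inference script on the endpoint side, which is beyond what this answer covers.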

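Based on the slightly different input format that SageMaker expects, together with the example given in this notebook, I would assume that the corrected input in your own example code should look like this: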

{
    "inputs": text,
    "parameters": {"return_all_scores": True}
}
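Put together with the predictor from the question, the call could be wrapped roughly as follows. This is an untested sketch, since I cannot reach a real endpoint: `build_payload`, `predict_all_scores`, and `FakePredictor` are hypothetical names introduced here, and the nested response shape is an assumption carried over from the local pipeline output. The stub only stands in for a `HuggingFacePredictor` so the snippet runs without AWS access.

```python
def build_payload(text):
    """Build the request body with the extra pipeline parameter."""
    return {"inputs": text, "parameters": {"return_all_scores": True}}


def predict_all_scores(predictor, text):
    """Call predictor.predict and return the list of label/score dicts.

    `predictor` can be any object with a predict(data) method, for example
    the HuggingFacePredictor instance from the question.
    """
    response = predictor.predict(build_payload(text))
    # Assumption: the endpoint mirrors the local pipeline and nests the
    # per-label dicts one level per input; unwrap that level if present.
    if response and isinstance(response[0], list):
        response = response[0]
    return response


class FakePredictor:
    """Hypothetical stand-in for a real endpoint; not part of sagemaker."""

    def predict(self, data):
        # Echo a canned response using the labels from the question.
        return [[
            {"label": "help", "score": 0.81},
            {"label": "Greeting", "score": 0.1},
            {"label": "Farewell", "score": 0.09},
        ]]


all_scores = predict_all_scores(FakePredictor(), "I need some assistance")
print(all_scores)
```

With a real endpoint you would pass the `predictor` object from the question instead of the stub and drop the `[0]` index used there, since the unwrapping happens inside the helper.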
