pandas DataFrame does not get updated based on a condition

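I have a DataFrame and I need to update a column based on a condition. (I am trying to label text with the Microsoft Azure API and then save the labels back to the original DataFrame, so that I can compute accuracy later.) Strangely, the DataFrame does not get updated.

Here is the sample code: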

import pandas as pd
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient


key = "key"
endpoint = "https://endpoint"

text_analytics_client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

df = pd.DataFrame({'id':[1,2,3], 'text': ['im ok', 'you arent ok', 'its fine'],
                   'Sentiment':['positive', 'negative', 'neutral']})
n = 10

for i in range(0, df.shape[0], n):
    result = text_analytics_client.analyze_sentiment(df.iloc[i:i + n].to_dict('records'))
###### In case you do not have Azure credentials to run this code, the result looks like this:
######[AnalyzeSentimentResult(id=2, sentiment=negative, warnings= [], statistics=None, confidence_scores=SentimentConfidenceScores(positive=0.01, neutral=0.16, negative=0.83), sentences=[SentenceSentiment(text=you arent ok, sentiment=negative, confidence_scores=SentimentConfidenceScores(positive=0.01, neutral=0.16, negative=0.83), length=12, offset=0, mined_opinions=[])], is_error=False), AnalyzeSentimentResult(id=3, sentiment=positive, warnings=[], statistics=None, confidence_scores=SentimentConfidenceScores(positive=0.98, neutral=0.01, negative=0.01), sentences=[SentenceSentiment(text=its fine, sentiment=positive, confidence_scores=SentimentConfidenceScores(positive=0.98, neutral=0.01, negative=0.01), length=8, offset=0, mined_opinions=[])], is_error=False)]

    for idx, doc in enumerate(result):
        print(doc.sentiment)  # this prints a value
        id_res = result[idx]['id']
        # print(id_res)  # this prints the correct id
        df.loc[df.id == id_res, 'label'] = doc.sentiment
        print(df)  # but when the DataFrame is printed here, the label column is NaN

I searched and found several links. In all three examples, they do the same thing I am doing, but my DataFrame is not updated. This is the result I get:

   id          text   Sentiment label
0   1         im ok  positive   NaN
1   2  you arent ok  negative   NaN
2   3      its fine   neutral   NaN

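Details

I have added some details in the hope that they help. As I noted in the code comments, id_res holds the correct id. When I replace df.loc[df.id == id_res, 'label'] with df.loc[df.id == 1, 'label'], the matching rows are updated successfully; otherwise nothing is updated.

Any input on how to solve this is appreciated.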

The problem is in this line:

df.loc[df.id == id_res, 'label'] = doc.sentiment

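df.id is of type int, while id_res is a string. Comparing an integer column with a string matches no rows, so the mask is all False and nothing is assigned. If you convert id_res to int, the comparison becomes valid and you get the output you are looking for: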

df.loc[df.id == int(id_res), 'label'] = doc.sentiment

Output:

   id          text Sentiment     label
0   1         im ok  positive   neutral
1   2  you arent ok  negative  negative
2   3      its fine   neutral  positive
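
The type mismatch described above can be reproduced without Azure credentials. The following is a minimal sketch, not the actual API call: fake_results is a hypothetical stand-in for the service response, and it assumes the ids come back as strings, as in the answer.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3],
                   "text": ["im ok", "you arent ok", "its fine"]})

# Hypothetical stand-in for the API results (assumption: ids are strings).
fake_results = [
    {"id": "1", "sentiment": "neutral"},
    {"id": "2", "sentiment": "negative"},
    {"id": "3", "sentiment": "positive"},
]

# Comparing an integer column with a str raises no error; it simply
# produces an all-False mask, so no row would be selected.
str_matches = int((df["id"] == fake_results[0]["id"]).sum())
print(df["id"].dtype, type(fake_results[0]["id"]).__name__, str_matches)

# Casting the id to int before comparing lets the mask select the row.
for doc in fake_results:
    df.loc[df["id"] == int(doc["id"]), "label"] = doc["sentiment"]

labels = df["label"].tolist()
print(df)
```

Checking df["id"].dtype against type(id_res) before building the mask is a quick way to catch this kind of silent mismatch.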