Convert a pandas data frame to JSON with comma-separated strings split into lists

I have a pandas.DataFrame named 'df' with the following format:

group_name  Positive_Sentiment      Negative_Sentiment
group1      helpful, great support  slow customer service, weak interface, bad management

I want to convert this data frame to a JSON file with the following format:

[{
"Group Name": "group1",
"Positive Sentiment": [
"helpful",
"great support"
],
"Negative Sentiment": [
"slow customer service",
"weak interface",
"bad management"
]
}
]

So far I have used this:

    import json
    b = []
    for i in range(len(df)):
        x={}
        x['Group Name']=df.iloc[i]['group_name']
        x['Positive Sentiment']= [df.iloc[i]['Positive_Sentiment']]
        x['Negative Sentiment']= [df.iloc[i]['Negative_Sentiment']]
        b.append(x)
    
    ##Export
    with open('AnalysisResults.json', 'w') as f:
        json.dump(b, f, indent = 2)

This results in:

[{
"Group Name": "group1",
"Positive Sentiment": [
"helpful, great support"
],
"Negative Sentiment": [
"slow customer service, weak interface, bad management"
]
}
]

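As you can see, this is very close. The key difference is that the double quotes wrap the entire contents of each row (e.g. "helpful, great support") rather than each comma-separated string within the row (e.g. "helpful", "great support"). I want double quotes around each individual string.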
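The difference can be reproduced without pandas. The following is a minimal sketch, where `raw` is a hypothetical variable standing in for one cell of the data frame: wrapping the string in brackets yields a one-element list, while splitting on commas yields one element per item.

```python
import json

# Hypothetical stand-in for a single cell of the Positive_Sentiment column.
raw = "helpful, great support"

# Wrapping the string in a list keeps it as one element.
wrapped = [raw]

# Splitting on commas and stripping whitespace gives one element per item.
split_items = [item.strip() for item in raw.split(",")]

print(json.dumps(wrapped))      # ["helpful, great support"]
print(json.dumps(split_items))  # ["helpful", "great support"]
```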

You can apply split(",") to your columns:


from io import StringIO
import pandas as pd
import json

# Sample data; columns are separated by two or more spaces.
inp = StringIO("""group_name    Positive_Sentiment  Negative_Sentiment
group1  helpful, great support  slow customer service, weak interface, bad management
group2  great, good support     interface meeeh, bad management""")

# A regex separator requires the Python parsing engine.
df = pd.read_csv(inp, sep=r"\s{2,}", engine="python")

def split_and_strip(sentiment):
    # Split on commas and trim the whitespace around each item.
    return [x.strip() for x in sentiment.split(",")]

df["Positive_Sentiment"] = df["Positive_Sentiment"].apply(split_and_strip)
df["Negative_Sentiment"] = df["Negative_Sentiment"].apply(split_and_strip)

# orient="records" produces one dict per row.
print(json.dumps(df.to_dict(orient="records"), indent=4))

# to save directly to a file:
with open("your_file.json", "w") as f:
    json.dump(df.to_dict(orient="records"), f, indent=4)

Output:

[
    {
        "group_name": "group1",
        "Positive_Sentiment": [
            "helpful",
            "great support"
        ],
        "Negative_Sentiment": [
            "slow customer service",
            "weak interface",
            "bad management"
        ]
    },
    {
        "group_name": "group2",
        "Positive_Sentiment": [
            "great",
            "good support"
        ],
        "Negative_Sentiment": [
            "interface meeeh",
            "bad management"
        ]
    }
]
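The keys above are the original column names, whereas the question asks for "Group Name", "Positive Sentiment", and "Negative Sentiment". One possible way to get those keys is to rename the columns before exporting. This is only a sketch: the helper name `to_records` and the `KEY_NAMES` mapping are made up for illustration, and the sample frame is rebuilt inline so the block runs on its own.

```python
import json
import pandas as pd

# Sample frame matching the question's single row.
df = pd.DataFrame({
    "group_name": ["group1"],
    "Positive_Sentiment": ["helpful, great support"],
    "Negative_Sentiment": ["slow customer service, weak interface, bad management"],
})

# Hypothetical mapping from column names to the keys requested in the question.
KEY_NAMES = {
    "group_name": "Group Name",
    "Positive_Sentiment": "Positive Sentiment",
    "Negative_Sentiment": "Negative Sentiment",
}

def to_records(frame):
    """Split the sentiment columns into lists and rename the keys."""
    out = frame.copy()
    for col in ("Positive_Sentiment", "Negative_Sentiment"):
        out[col] = out[col].apply(
            lambda s: [part.strip() for part in s.split(",")]
        )
    return out.rename(columns=KEY_NAMES).to_dict(orient="records")

records = to_records(df)
print(json.dumps(records, indent=2))
```

Working on a copy leaves the original `df` untouched, which may be convenient if the unsplit strings are still needed elsewhere.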