Load json file with string into pandas dataframe in python

I have a sample JSON dataset:

['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']

I want to create a pandas DataFrame from this JSON data, but when I use the json_normalize method I get AttributeError: 'str' object has no attribute 'values'.

The expected output should look like this:

id   product  date        sales  created_at
123  dell     2019-01-01  5      2019-01-26 15:00:00
124  apple    2019-01-02  7      2019-01-27 15:00:00

That's when I get this output, which is a list of strings.

If the result of json.load... looks like

a = ['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']

then decode it once more with json.loads:

>>> b = json.loads(a[0])
>>> b
[{'id': '123', 'product': 'dell', 'date': '2019-01-01', 'sales': 5, 'created_at': '2019-01-26 15:00:00'}, {'id': '124', 'product': 'apple', 'date': '2019-01-02', 'sales': 7, 'created_at': '2019-01-27 15:00:00'}]
>>> pd.DataFrame(b)
    id product        date  sales           created_at
0  123    dell  2019-01-01      5  2019-01-26 15:00:00
1  124   apple  2019-01-02      7  2019-01-27 15:00:00
>>>

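Unfortunately, you can't know ahead of time that you need to do this. You have to inspect the data first, unless you are lucky enough to have a specification for whatever produced this JSON.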
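
One way to do that inspection is to look at the types at each level before building the frame. This is only a minimal sketch on a shortened copy of the sample data; describe is a hypothetical helper written for this example, not part of json or pandas.

```python
import json


def describe(obj):
    """Summarize obj's type and, for a list, its element types (hypothetical helper)."""
    if isinstance(obj, list):
        inner = sorted({type(item).__name__ for item in obj})
        return f"list of {inner}"
    return type(obj).__name__


# Shortened copy of the sample data: a list holding one JSON-encoded string.
a = ['[{"id":"123","product":"dell"}, {"id":"124","product":"apple"}]']

print(describe(a))  # list of ['str']: the elements still need json.loads

b = json.loads(a[0])
print(describe(b))  # list of ['dict']: ready for pd.DataFrame
```

If the first call reports str elements, the payload was encoded twice and needs the extra json.loads step shown above.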

import json
import pandas as pd
x = ['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']
# Take 0th element of list
x = x[0]
info = json.loads(x)
df = pd.json_normalize(info)
df
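
Taking x[0] only covers the first element. If the outer list could hold several JSON strings, something along these lines might work: decode each string, then combine the records. The two-string input below is made up for illustration and is an assumption about how such data could look.

```python
import json
from itertools import chain

import pandas as pd

# Hypothetical input: two JSON-encoded strings in the outer list.
x = [
    '[{"id":"123","product":"dell","sales":5}]',
    '[{"id":"124","product":"apple","sales":7}]',
]

# Decode every string, then flatten the resulting lists of records into one list.
records = list(chain.from_iterable(json.loads(s) for s in x))

df = pd.json_normalize(records)
print(df)
```

This assumes every string decodes to a list of records; a string that decodes to a single dict would need to be wrapped in a list first.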