Load a JSON file containing a string into a pandas DataFrame in Python
I have a sample JSON dataset:
['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']
I want to create a pandas DataFrame from this JSON data, but when I use the json_normalize method I get AttributeError: 'str' object has no attribute 'values'.
The expected output should look like this:
id   product  date        sales  created_at
123  dell     2019-01-01  5      2019-01-26 15:00:00
124  apple    2019-01-02  7      2019-01-27 15:00:00
That is the output I get: a list containing a single string.
If the result of json.load... looks like this:
a = ['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']
then decode it again, this time on the string inside the list:
>>> b = json.loads(a[0])
>>> b
[{'id': '123', 'product': 'dell', 'date': '2019-01-01', 'sales': 5, 'created_at': '2019-01-26 15:00:00'}, {'id': '124', 'product': 'apple', 'date': '2019-01-02', 'sales': 7, 'created_at': '2019-01-27 15:00:00'}]
>>> pd.DataFrame(b)
    id product        date  sales           created_at
0  123    dell  2019-01-01      5  2019-01-26 15:00:00
1  124   apple  2019-01-02      7  2019-01-27 15:00:00
>>>
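Unfortunately, you cannot know ahead of time that this is necessary. You have to inspect the data first, unless you are lucky enough to have a specification for whatever produced this JSON.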
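One way to do that inspection in code is to check the type of each value and keep decoding while it is still a string. The following is only a minimal sketch: the helper name `decode_until_parsed` and the `max_depth` limit are made up for illustration, and it assumes every string it meets is valid JSON (a plain, non-JSON string would raise `json.JSONDecodeError`).

```python
import json


def decode_until_parsed(value, max_depth=5):
    """Call json.loads repeatedly while the value is still a str.

    Hypothetical helper: max_depth is an arbitrary safety limit.
    """
    depth = 0
    while isinstance(value, str) and depth < max_depth:
        value = json.loads(value)
        depth += 1
    return value


a = ['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']

# The element is a str, which tells us it still needs decoding.
print(type(a[0]))

b = decode_until_parsed(a[0])

# Now it is a list of dicts.
print(type(b), type(b[0]))
```

The resulting `b` is the same list of dicts as in the session above, so it can be handed to `pd.DataFrame(b)` in the same way.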
import json
import pandas as pd
x = ['[{"id":"123","product":"dell","date":"2019-01-01","sales":5,"created_at":"2019-01-26 15:00:00"}, {"id":"124","product":"apple","date":"2019-01-02","sales":7,"created_at":"2019-01-27 15:00:00"}]']
# Take 0th element of list
x = x[0]
info = json.loads(x)
df = pd.json_normalize(info)
df
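If the outer list ever holds more than one JSON string, taking only the 0th element would silently drop the rest. The sketch below shows one possible way to decode every element and merge the records. The two-element input is hypothetical (the question's data has a single element), and it assumes each string decodes to a list of dicts.

```python
import json
from itertools import chain

# Hypothetical input: the same kind of records as above, split across two strings.
x = [
    '[{"id":"123","product":"dell","date":"2019-01-01","sales":5}]',
    '[{"id":"124","product":"apple","date":"2019-01-02","sales":7}]',
]

# Decode each string into a list of dicts, then flatten into one list.
records = list(chain.from_iterable(json.loads(s) for s in x))

print(records)
```

`records` is a flat list of dicts, so it can be passed to `pd.json_normalize(records)` or `pd.DataFrame(records)` exactly like `info` above.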