How to extract lowest level of a nested JSON into a DataFrame in Python?
I want to extract only the lowest-level objects of a JSON response in Python. For example, the first few records from the API look like this:
{
  "season": 2021,
  "charts": {
    "ARI": {
      "TE": [
        {
          "team": "ARI",
          "position": "TE",
          "depth": "1",
          "playerId": "1443",
          "name": "Dan Arnold"
        },
        {
          "team": "ARI",
          "position": "TE",
          "depth": "2",
          "playerId": "599",
          "name": "Maxx Williams"
        }
      ],
      "K": [
        {
          "team": "ARI",
          "position": "K",
          "depth": "1",
          "playerId": "1121",
          "name": "Zane Gonzalez"
        },
        {
Ultimately, I want to put all of these results into a DataFrame with the following structure:
| team | position | depth | playerId | name |
|:---- |:-------- |:----- |:-------- |:---- |
I have tried variations of the following code without success:
import requests as rq
import pandas as pd
# Retrieve Depth Charts
json_depthCharts = rq.get(f"https://api.fantasynerds.com/v1/nfl/depth?apikey={API_KEY}").json()
df_depthCharts = pd.json_normalize(json_depthCharts, 'charts', ['charts', 'team'])
print(df_depthCharts)
Any insight would be appreciated!
Try json_normalize() + melt() + explode() + DataFrame():
df = pd.DataFrame(pd.json_normalize(json_depthCharts).melt('season')['value'].explode().tolist())
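To make the chain easier to follow, here is a minimal sketch that runs each step separately. The `sample` dict is a hand-built stand-in for the API response, using only the records quoted in the question; it assumes the real payload has the same season → charts → team → position → list-of-players shape. The intermediate names (`wide`, `long_form`, `players`) are just illustrative.

```python
import pandas as pd

# Hand-built stand-in for the API response (assumption: the real payload
# follows the same season -> charts -> team -> position -> [players] shape).
sample = {
    "season": 2021,
    "charts": {
        "ARI": {
            "TE": [
                {"team": "ARI", "position": "TE", "depth": "1",
                 "playerId": "1443", "name": "Dan Arnold"},
                {"team": "ARI", "position": "TE", "depth": "2",
                 "playerId": "599", "name": "Maxx Williams"},
            ],
            "K": [
                {"team": "ARI", "position": "K", "depth": "1",
                 "playerId": "1121", "name": "Zane Gonzalez"},
            ],
        }
    },
}

# Step 1: flatten the nested dicts into one wide row. Lists are not
# flattened, so each "charts.<team>.<position>" cell holds a list of dicts.
wide = pd.json_normalize(sample)

# Step 2: unpivot every non-"season" column into its own row
# (columns become: season, variable, value).
long_form = wide.melt("season")

# Step 3: give each player dict in the lists its own row.
players = long_form["value"].explode()

# Step 4: build the final frame from the resulting list of dicts.
df = pd.DataFrame(players.tolist())
print(df)
```

With this sample, `wide` has one row and three columns, and the final `df` has one row per player with the five columns from the question.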
Or, as another way, use a drop() + stack() combination in place of melt(), with everything else staying the same:
df = pd.DataFrame(pd.json_normalize(json_depthCharts).drop(columns='season').stack().explode().tolist())
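As a quick check that the two variants agree, the sketch below runs both on a small hand-built `sample` (again an assumed stand-in for the real API response, built from the records quoted in the question) and compares the results. Note that some pandas 2.x releases may emit a `FutureWarning` about the `stack()` implementation; that does not change the result here.

```python
import pandas as pd

# Hand-built stand-in for the API response (assumption: same shape as the
# real payload; the records are the ones quoted in the question).
sample = {
    "season": 2021,
    "charts": {
        "ARI": {
            "TE": [
                {"team": "ARI", "position": "TE", "depth": "1",
                 "playerId": "1443", "name": "Dan Arnold"},
                {"team": "ARI", "position": "TE", "depth": "2",
                 "playerId": "599", "name": "Maxx Williams"},
            ],
            "K": [
                {"team": "ARI", "position": "K", "depth": "1",
                 "playerId": "1121", "name": "Zane Gonzalez"},
            ],
        }
    },
}

wide = pd.json_normalize(sample)

# Variant A: melt() keeps "season" as an id column; only "value" is used.
df_melt = pd.DataFrame(wide.melt("season")["value"].explode().tolist())

# Variant B: drop "season" first, then stack() the remaining columns into
# a single Series of lists before exploding.
stacked = wide.drop(columns="season").stack()
df_stack = pd.DataFrame(stacked.explode().tolist())

# Both variants should yield the same rows in the same order.
print(df_stack.equals(df_melt))
```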
Output of df:
     team  position  depth  playerId  name
0    ARI   TE        1      1443      Dan Arnold
1    ARI   TE        2      599       Maxx Williams
2    ARI   K         1      1121      Zane Gonzalez
3    ARI   K         2      1454      Brett Maher
4    ARI   LWR       1      338       DeAndre Hopkins
...  ...   ...       ...    ...       ...
932  WAS   LDE       2      179       Ryan Kerrigan
933  WAS   RCB       1      647       Ronald Darby
934  WAS   RB        1      1957      Antonio Gibson
935  WAS   RB        2      1542      Bryce Love
936  WAS   NB        1      1733      Jimmy Moreland
937 rows × 5 columns
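If the pandas chain feels opaque, a plain-Python alternative (not one of the two variants above, just a different technique for the same result) is to walk the nesting with a list comprehension and hand the collected dicts to DataFrame(). This is a sketch on a hand-built `sample` that assumes the charts → team → position → list-of-players shape shown in the question.

```python
import pandas as pd

# Hand-built stand-in for the API response (assumption: same shape as the
# real payload; records are taken from the question and the output above).
sample = {
    "season": 2021,
    "charts": {
        "ARI": {
            "TE": [
                {"team": "ARI", "position": "TE", "depth": "1",
                 "playerId": "1443", "name": "Dan Arnold"},
                {"team": "ARI", "position": "TE", "depth": "2",
                 "playerId": "599", "name": "Maxx Williams"},
            ],
        },
        "WAS": {
            "RB": [
                {"team": "WAS", "position": "RB", "depth": "1",
                 "playerId": "1957", "name": "Antonio Gibson"},
                {"team": "WAS", "position": "RB", "depth": "2",
                 "playerId": "1542", "name": "Bryce Love"},
            ],
        },
    },
}

rows = [
    player
    for positions in sample["charts"].values()  # one dict per team
    for players in positions.values()           # one list per position
    for player in players                       # one dict per player
]

df = pd.DataFrame(rows)
print(df)
```

This relies on every leaf being a list of flat dicts; if the real response mixes in other shapes, the pandas variants or an explicit type check would be safer.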