Transforming a list of dicts to a pandas DataFrame with dynamic header columns
Suppose I have a list of dicts:
my_lists = [
    {'rank': 2, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL'},
    {'rank': 18, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ'},
    {'rank': 21, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858'},
    {'rank': 126, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL'},
    {'rank': 128, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ'},
    {'rank': 136, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858'},
    {'rank': 19, 'keyword_name': 'leather wallets', 'volume': 23, 'asin': 'B0735C9RDZ'},
    {'rank': 10, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B07FPVR858'},
    {'rank': 16, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B0735C9RDZ'},
]
I want to group by asin and keyword_name, since both occur more than once in the list of dicts. My goal is a dataframe that looks like this:
**keyword_name volume B01MG0ORBL B0735C9RDZ B07FPVR858** // column headers
mens wallet 456677 2 126 18 128 19 16 21 10
leather wallets 23
wallets for men 566 16 10
So initially I was thinking of:
d = [{d['asin']:d['rank'] for d in l} for l in my_lists]
pd.dataframe(d)
# save as xlsx file
writer = pd.ExcelWriter(f"{path}/sheet.xlsx", engine="xlsxwriter")
d.to_excel(
writer, sheet_name="Organic", startrow=0, header=True, index=False
)
But that is not possible, because it fails with TypeError: string indices must be integers.
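For context on that error, here is a minimal sketch (the names record, keys and message are illustrative, not from the post). my_lists is already a flat list of dicts, so in the nested comprehension l is a single dict, and iterating over a dict yields its keys, which are strings. Indexing one of those strings with 'asin' is what raises the TypeError:

```python
record = {'rank': 2, 'keyword_name': 'mens wallet', 'asin': 'B01MG0ORBL'}

# Iterating over a dict yields its keys (strings), not nested dicts.
keys = [d for d in record]

try:
    keys[0]['asin']  # equivalent to 'rank'['asin']
except TypeError as exc:
    message = str(exc)

print(keys)
print(message)
```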
You can create a DataFrame and then pivot it, using list as the aggregation function:
import pandas as pd

df = pd.DataFrame(my_lists)
df = df.pivot_table(index=['keyword_name','volume'],
columns='asin',
values='rank',
aggfunc=list)
print (df)
asin                   B01MG0ORBL B0735C9RDZ B07FPVR858
keyword_name    volume
leather wallets 23            NaN       [19]        NaN
mens wallet     456677   [2, 126]  [18, 128]  [21, 136]
wallets for men 566           NaN       [16]       [10]
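The same reshaping can probably also be written with groupby and unstack, which makes the two steps (collect ranks per group, then spread asin into columns) explicit. This is a sketch of an alternative formulation, not something from the original answer. The two volume values missing in the question's data are assumed here to be 456677 and 23, matching the output shown above; keys, rows and result are illustrative names.

```python
import pandas as pd

keys = ('rank', 'keyword_name', 'volume', 'asin')
rows = [
    (2, 'mens wallet', 456677, 'B01MG0ORBL'),
    (18, 'mens wallet', 456677, 'B0735C9RDZ'),
    (21, 'mens wallet', 456677, 'B07FPVR858'),
    (126, 'mens wallet', 456677, 'B01MG0ORBL'),  # volume assumed
    (128, 'mens wallet', 456677, 'B0735C9RDZ'),
    (136, 'mens wallet', 456677, 'B07FPVR858'),
    (19, 'leather wallets', 23, 'B0735C9RDZ'),   # volume assumed
    (10, 'wallets for men', 566, 'B07FPVR858'),
    (16, 'wallets for men', 566, 'B0735C9RDZ'),
]
my_lists = [dict(zip(keys, r)) for r in rows]

df = pd.DataFrame(my_lists)

# Collect the ranks of every (keyword_name, volume, asin) group into a list,
# then move the asin level of the index into the columns.
result = (df.groupby(['keyword_name', 'volume', 'asin'])['rank']
            .agg(list)
            .unstack('asin'))
print(result)
```

Combinations that never occur in the data (for example 'leather wallets' with B01MG0ORBL) come out as NaN, as in the pivot_table version.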
Or convert the values to strings and join them:
df = pd.DataFrame(my_lists)
df = (df.assign(rank=df['rank'].astype(str))
.pivot_table(index=['keyword_name','volume'],
columns='asin',
values='rank',
aggfunc=' '.join,
fill_value=''))
print (df)
asin                   B01MG0ORBL B0735C9RDZ B07FPVR858
keyword_name    volume
leather wallets 23                        19
mens wallet     456677      2 126     18 128     21 136
wallets for men 566                       16         10
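To get the flat header from the question (keyword_name, volume, then one column per asin), the pivoted frame likely needs its index turned back into columns before it is written out. A minimal sketch, assuming the two missing volume values are 456677 and 23 as in the output above; keys, rows and wide are illustrative names:

```python
import pandas as pd

keys = ('rank', 'keyword_name', 'volume', 'asin')
rows = [
    (2, 'mens wallet', 456677, 'B01MG0ORBL'),
    (18, 'mens wallet', 456677, 'B0735C9RDZ'),
    (21, 'mens wallet', 456677, 'B07FPVR858'),
    (126, 'mens wallet', 456677, 'B01MG0ORBL'),  # volume assumed
    (128, 'mens wallet', 456677, 'B0735C9RDZ'),
    (136, 'mens wallet', 456677, 'B07FPVR858'),
    (19, 'leather wallets', 23, 'B0735C9RDZ'),   # volume assumed
    (10, 'wallets for men', 566, 'B07FPVR858'),
    (16, 'wallets for men', 566, 'B0735C9RDZ'),
]
my_lists = [dict(zip(keys, r)) for r in rows]

df = pd.DataFrame(my_lists)
wide = (df.assign(rank=df['rank'].astype(str))
          .pivot_table(index=['keyword_name', 'volume'],
                       columns='asin',
                       values='rank',
                       aggfunc=' '.join,
                       fill_value='')
          .reset_index()               # keyword_name and volume become columns
          .rename_axis(columns=None))  # drop the leftover 'asin' axis label
print(wide)
```

From there, calling wide.to_excel(...) with index=False, as in the question's snippet, should write keyword_name and volume as ordinary columns next to the asin columns.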