How to reshape a Pandas data frame before passing it to a Plotly function?
I am trying to create a table using the Table() function in Plotly.
My data is as follows:
import pandas as pd
test_df = pd.DataFrame({'Manufacturer':['Mercedes', 'Buick', 'Ford', 'Buick', 'Buick', 'Ford', 'Buick', 'Chrysler', 'Ford', 'Buick', 'Chrysler', 'Ford', 'Buick', 'Ford', 'Ford', 'Chrysler', 'Chrysler', 'Ford', 'Chrysler', 'Chrysler', 'Chrysler', 'Buick'],
'Metric':['MPG', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score'],
'Statistic':['External', 'Min', 'Max', 'Average', 'Median', '90th', '95th', '99th', 'Min', 'Max', 'Average', 'Median', '90th', '95th', '99th','Min', 'Max', 'Average', 'Median', '90th', '95th', '99th'],
'Value':[22, 3.405, 100.29, 4.62, 4.425, 5.34, 5.83, 7.75, 2.6323, 210, 4.193, 3.28, 5.04, 6.36, 11.01, 3.72, 43, 4.98, 4.82, 5.775, 6.18, 7.182],
})
I want to be able to create a table that looks like this:
Manufacturer Min Max Average Median 90th 95th 99th
Buick 3.405 210 4.62 4.425 5.04 5.83 7.182
Chrysler 3.72 43 4.193 4.82 5.775 6.18 7.75
Ford 2.6323 100.29 4.98 3.28 5.34 6.36 11.01
The code to do this looks like the following (when hard-coded):
import plotly.graph_objects as go
go.Figure(go.Table(
header=dict(
values=["Manufacturer", "Min", "Max",
"Average", "Median", "90th",
"95th", "99th"],
font=dict(size=10),
align="left"
),
cells=dict(
values=[['Buick', 'Chrysler', 'Ford'], # Manufacturer column, in the same row order as the value lists below (could change based on the source file)
[3.405, 3.72, 2.6323], # Min values
[210, 43, 100.29], # Max values
[4.62, 4.193, 4.98], # Average values
[4.425, 4.82, 3.28], # Median values
[5.04, 5.775, 5.34], # 90th percentile values
[5.83, 6.18, 6.36], # 95th percentile values
[7.182, 7.75, 11.01] # 99th percentile values
],
align = "left")
))
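According to the documentation at https://plotly.com/python/table/, the cells parameter expects a list of lists and can also take Pandas data frame columns (great!).
Using the example from the docs, the code for passing a Pandas data frame looks like this: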
# THIS IS THE EXAMPLE FROM THE DOCS (SHOWING THE USE OF A DATA FRAME)
fig = go.Figure(data=[go.Table(
header=dict(values=list(df.columns),
fill_color='paleturquoise',
align='left'),
cells=dict(values=[df.Rank, df.State, df.Postal, df.Population],
fill_color='lavender',
align='left'))
])
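For reference, the list-of-lists shape described above can be built from any data frame by taking one list per column. The following is a minimal sketch, not part of the docs example; the small df here is a hypothetical stand-in that reuses three columns of the desired table.

```python
import pandas as pd

# Hypothetical stand-in data frame (three columns of the desired table).
df = pd.DataFrame({
    "Manufacturer": ["Buick", "Chrysler", "Ford"],
    "Min": [3.405, 3.72, 2.6323],
    "Max": [210, 43, 100.29],
})

# Header labels: one entry per column.
header_values = list(df.columns)

# Cell values: one inner list per column, so the outer list runs left to
# right across the table and each inner list runs top to bottom.
cell_values = [df[col].tolist() for col in df.columns]

print(header_values)
print(cell_values)
```

These two lists can then be passed as header=dict(values=header_values) and cells=dict(values=cell_values).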
My best attempts have failed:
Filter to only the 'Score' records:
test_df_subset = test_df[(test_df['Metric'] == 'Score') & (test_df['Manufacturer'].isin(['Buick', 'Ford', 'Chrysler']))]
Create a pivot table:
temp_df = pd.pivot_table(data=test_df_subset,index=['Statistic', 'Manufacturer'])
Unstack the pivot table:
temp_df.unstack(0)
Question: How would I go about reshaping my test_df data frame so that I can pass it to the data and cells parameters of the Plotly function?
Thanks in advance!
You're very close. Here is one way:
import plotly.graph_objects as go
cols_ = ["Manufacturer", "Min", "Max",
"Average", "Median", "90th",
"95th", "99th"]
manufacturers = ['Buick', 'Ford', 'Chrysler']
# this is the reshaping you are looking for
df_ = (test_df[test_df['Manufacturer'].isin(manufacturers)]
.set_index(['Manufacturer', 'Statistic'])
['Value'].unstack()
.reset_index()[cols_]
)
go.Figure(go.Table(
header=dict(
values=cols_,
font=dict(size=10),
align="left"
),
cells=dict(
values=df_.T, # note the T here: transposing makes each row of the result one table column
align = "left")
))
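Compared with your approach, I think df_ (in my notation) is equivalent to temp_df.unstack(0)['Value'].reset_index()[cols_] in your notation, with cols_ used to put the columns in the expected order.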
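The equivalence suggested above can be checked with a short sketch. It rebuilds the question's data in a more compact form (same values, different row order), and it passes values='Value' to pivot_table explicitly, which is an assumption added here so that only the numeric column is aggregated. The make_table helper is hypothetical and only wraps the go.Table call from the answer.

```python
import pandas as pd

stats = ["Min", "Max", "Average", "Median", "90th", "95th", "99th"]
wide = {
    "Buick": [3.405, 210, 4.62, 4.425, 5.04, 5.83, 7.182],
    "Chrysler": [3.72, 43, 4.193, 4.82, 5.775, 6.18, 7.75],
    "Ford": [2.6323, 100.29, 4.98, 3.28, 5.34, 6.36, 11.01],
}
# Long format, as in the question: one row per (manufacturer, statistic).
rows = [("Mercedes", "MPG", "External", 22)]
rows += [(m, "Score", s, v) for m, vals in wide.items() for s, v in zip(stats, vals)]
test_df = pd.DataFrame(rows, columns=["Manufacturer", "Metric", "Statistic", "Value"])

cols_ = ["Manufacturer"] + stats
score_df = test_df[test_df["Metric"] == "Score"]

# Route 1 (the answer): set_index + unstack on the Value column.
via_unstack = (score_df.set_index(["Manufacturer", "Statistic"])["Value"]
               .unstack()
               .reset_index()[cols_])

# Route 2 (the question's attempt, completed): pivot_table + unstack(0).
temp_df = pd.pivot_table(data=score_df, values="Value",
                         index=["Statistic", "Manufacturer"])
via_pivot = temp_df.unstack(0)["Value"].reset_index()[cols_]

# Compare the two results cell by cell.
print(via_unstack.values.tolist() == via_pivot.values.tolist())

# One list per table column, ready for cells=dict(values=...).
cell_values = [via_unstack[c].tolist() for c in cols_]


def make_table(header_values, cell_values):
    """Hypothetical helper: build the table figure (requires plotly)."""
    import plotly.graph_objects as go
    return go.Figure(go.Table(
        header=dict(values=header_values, font=dict(size=10), align="left"),
        cells=dict(values=cell_values, align="left"),
    ))
```

Calling make_table(cols_, cell_values).show() would then display the table. Passing explicit per-column lists is a design choice made for this sketch; it avoids relying on how Plotly converts a transposed data frame.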