How to use Bokeh vbar chart parameters with a groupby object?
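Question
The code below is the grouped vbar chart example from the Bokeh documentation.
There are a few things in this example that I can't understand.
Where does 'cyl_mfr' come from in factor_cmap() and vbar()?
Is 'mpg_mean' the calculated mean of the 'mpg' column? If so, why doesn't 'mpg_sum' work?
I want to make my own vbar chart like this example.
Code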
from bokeh.io import show, output_file
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.palettes import Spectral5
from bokeh.sampledata.autompg import autompg_clean as df
from bokeh.transform import factor_cmap

output_file("bars.html")

df.cyl = df.cyl.astype(str)
df.yr = df.yr.astype(str)

# Group by a list of column names; recent pandas versions treat a tuple as a single key.
group = df.groupby(['cyl', 'mfr'])
source = ColumnDataSource(group)

index_cmap = factor_cmap('cyl_mfr', palette=Spectral5,
                         factors=sorted(df.cyl.unique()), end=1)

# On Bokeh 3.x, use width= and height= instead of plot_width= and plot_height=.
p = figure(plot_width=800, plot_height=300,
           title="Mean MPG by # Cylinders and Manufacturer",
           x_range=group, toolbar_location=None, tools="")

p.vbar(x='cyl_mfr', top='mpg_mean', width=1, source=source,
       line_color="white", fill_color=index_cmap)

p.y_range.start = 0
p.x_range.range_padding = 0.05
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Manufacturer grouped by # Cylinders"
p.xaxis.major_label_orientation = 1.2
p.outline_line_color = None

p.add_tools(HoverTool(tooltips=[("MPG", "@mpg_mean"), ("Cyl, Mfr", "@cyl_mfr")]))

show(p)
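Answer
group = df.groupby(['cyl', 'mfr']) creates a <pandas.core.groupby.DataFrameGroupBy object at 0x0xxx>. If you pass it to ColumnDataSource, Bokeh does a lot of magic and precomputes many statistics: the source gets the columns of group.describe(), each named <column>_<statistic>. Compare the original columns with the columns of the source: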
df.columns
Index(['mpg', 'cyl', 'displ', 'hp', 'weight', 'accel', 'yr', 'origin', 'name', 'mfr'],
source.column_names
['accel_count', 'accel_mean', 'accel_std', 'accel_min',
'accel_25%', 'accel_50%', 'accel_75%', 'accel_max', 'displ_count',
'displ_mean', 'displ_std', 'displ_min', 'displ_25%', 'displ_50%',
'displ_75%', 'displ_max', 'hp_count', 'hp_mean', 'hp_std',
'hp_min', 'hp_25%', 'hp_50%', 'hp_75%', 'hp_max', 'mpg_count',
'mpg_mean', 'mpg_std', 'mpg_min', 'mpg_25%', 'mpg_50%',
'mpg_75%', 'mpg_max', 'weight_count', 'weight_mean', 'weight_std',
'weight_min', 'weight_25%', 'weight_50%', 'weight_75%',
'weight_max', 'yr_count', 'yr_mean', 'yr_std', 'yr_min',
'yr_25%', 'yr_50%', 'yr_75%', 'yr_max', 'cyl_mfr']
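cyl_mfr is the label for the 2 columns you grouped by, joined with an underscore. In source, it has become a column of tuples.
mpg_sum is not computed: sum is not among the statistics listed above. If you want the sum, you need to compute it yourself.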
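One possible way to compute the sum yourself is pandas named aggregation. This is a minimal sketch on made-up stand-in data, not the documentation's method; the `stats` and `data` names are illustrative, and the column names are chosen only to mirror the ones used in the question.

```python
import pandas as pd

# Tiny made-up stand-in for autompg_clean; the values are illustrative only.
df = pd.DataFrame({
    "cyl": ["4", "4", "4", "6", "6"],
    "mfr": ["ford", "ford", "honda", "ford", "ford"],
    "mpg": [30.0, 28.0, 35.0, 20.0, 18.0],
})

# Named aggregation yields one flat column per requested statistic.
stats = df.groupby(["cyl", "mfr"]).agg(
    mpg_mean=("mpg", "mean"),
    mpg_sum=("mpg", "sum"),
)

# Mirror the 'cyl_mfr' column: one (cyl, mfr) tuple per group.
data = {
    "cyl_mfr": list(stats.index),
    "mpg_mean": stats["mpg_mean"].tolist(),
    "mpg_sum": stats["mpg_sum"].tolist(),
}

print(data)
```

The resulting dict should work as ColumnDataSource(data=data), with x_range=FactorRange(*data["cyl_mfr"]) on the figure and top='mpg_sum' in vbar().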