Seaborn workaround for hue barplot
I have the following DataFrame in a Jupyter notebook, which I plot with a seaborn barplot:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = {'day_index': [0, 1, 2, 3, 4, 5, 6],
        'avg_duration': [708.852242, 676.7021900000001, 684.572677, 708.92534, 781.767476, 1626.575057, 1729.155673],
        'trips': [114586, 120936, 118882, 117868, 108036, 43740, 37508]}
df = pd.DataFrame(data)
daysOfWeek = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
plt.figure(figsize=(16, 10))
sns.set_style('ticks')
# hue='trips' is what causes the thin, misaligned bars described below
ax = sns.barplot(data=df,
                 x='day_index',
                 y='avg_duration',
                 hue='trips',
                 palette=sns.color_palette("Reds_d", n_colors=7, desat=1))
ax.set_xlabel("Week Days", fontsize=18, alpha=0.8)
ax.set_ylabel("Duration (seconds)", fontsize=18, alpha=0.8)
ax.set_title("Week's average Trip Duration", fontsize=24)
ax.set_xticklabels(daysOfWeek, fontsize=16)
ax.legend(fontsize=15)
sns.despine()
plt.show()
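Plot A:
As can be seen, the bars do not line up with the x tick labels, and they are very thin.
If I remove the hue='trips' part, this is all fixed; it is a known seaborn issue.
However, showing the number of trips in the visualization is very important, so: is there a way around seaborn (maybe with matplotlib directly) to add a hue attribute?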
I think you don't need to specify the hue parameter in this case:
ax = sns.barplot(data=df,
                 x='day_index',
                 y='avg_duration',
                 palette=sns.color_palette("Reds_d", n_colors=7, desat=1))
You can add the number of trips as annotations:
def autolabel(rects, labels=None, height_factor=1.05):
    # Write a label above each bar; without labels, use the bar height.
    for i, rect in enumerate(rects):
        height = rect.get_height()
        if labels is not None:
            try:
                label = labels[i]
            except (TypeError, KeyError, IndexError):
                label = ' '
        else:
            label = '%d' % int(height)
        ax.text(rect.get_x() + rect.get_width()/2., height_factor*height,
                '{}'.format(label),
                ha='center', va='bottom')

autolabel(ax.patches, labels=df.trips, height_factor=1.02)
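Since the question also asks about using matplotlib directly, here is a minimal sketch of the same annotation idea without seaborn. It assumes matplotlib 3.4 or newer, where Axes.bar_label is available; the abbreviated day names, the bar color and the thousands-separator label format are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt

days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
avg_duration = [708.852242, 676.702190, 684.572677, 708.925340,
                781.767476, 1626.575057, 1729.155673]
trips = [114586, 120936, 118882, 117868, 108036, 43740, 37508]

fig, ax = plt.subplots(figsize=(10, 6))

# ax.bar returns a BarContainer holding one rectangle per day
bars = ax.bar(days, avg_duration, color='firebrick')

# Put the number of trips above each bar (one label per rectangle)
texts = ax.bar_label(bars, labels=[f'{t:,}' for t in trips], padding=3)

ax.set_xlabel('Week Days')
ax.set_ylabel('Duration (seconds)')
ax.set_title("Week's average Trip Duration")
```

In a notebook the figure renders at the end of the cell; in a script, call plt.show() afterwards. The bars sit exactly on the ticks because plain ax.bar has no notion of hue levels.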
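The hue parameter probably only makes sense for introducing a new dimension to the plot, not for showing another quantity on the same dimension.
It is probably best to plot the bars without the hue parameter (calling it hue is actually quite misleading) and simply color the bars according to the values in the "trips" column.
This is also shown in this question: .
The code here would look like this: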
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

di = np.arange(0, 7)
avg = np.array([708.852242, 676.702190, 684.572677, 708.925340, 781.767476,
                1626.575057, 1729.155673])
trips = np.array([114586, 120936, 118882, 117868, 108036, 43740, 37508])
# np.c_ stacks the arrays as columns; every column becomes float
df = pd.DataFrame(np.c_[di, avg, trips], columns=["day_index", "avg_duration", "trips"])
daysOfWeek = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
              'Friday', 'Saturday', 'Sunday']
plt.figure(figsize=(10, 7))
sns.set_style('ticks')
# scale trips to [0, 1] and look up one viridis color per bar
v = df.trips.values
colors = plt.cm.viridis((v - v.min()) / (v.max() - v.min()))
ax = sns.barplot(data=df, x='day_index', y='avg_duration', palette=colors)
# write the number of trips at the top of each bar (cast back to int for display)
for index, row in df.iterrows():
    ax.text(row.day_index, row.avg_duration, str(int(row.trips)), color='black', ha="center")
ax.set_xlabel("Week Days", fontsize=16, alpha=0.8)
ax.set_ylabel("Duration (seconds)", fontsize=16, alpha=0.8)
ax.set_title("Week's average Trip Duration", fontsize=18)
ax.set_xticklabels(daysOfWeek, fontsize=14)
sns.despine()
plt.show()
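The coloring step in the code above rescales the trip counts to the range 0 to 1 before looking them up in the colormap. As a rough illustration of that rescaling on its own, here is a small pure-Python sketch; the helper name min_max_scale is hypothetical, and the guard for identical values is an addition that the one-line version above does not have (it would divide by zero in that case).

```python
trips = [114586, 120936, 118882, 117868, 108036, 43740, 37508]
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero, map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale(trips)

# Each scaled value is the position in the colormap used for that day's bar
for day, count, position in zip(days, trips, scaled):
    print(f'{day:<10} {count:>7} {position:.3f}')
```

Sunday (fewest trips) lands at the low end of the colormap and Tuesday (most trips) at the high end, so the bar color encodes the trip count without needing hue.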
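Build the legend from a colormap
- Remove hue. As already mentioned, the bars are not centered when this parameter is used, because they are positioned according to the number of hue levels, and there are 7 levels in this case.
- Use the palette parameter instead of hue to place the bars directly over the ticks.
- This option requires "manually" associating 'trips' with the colors and creating the legend.
- patches uses Patch to create each item in the legend (i.e. the rectangle associated with a color and a name).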
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.patches import Patch
daysOfWeek = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# specify the colors
colors = sns.color_palette('Reds_d', n_colors=len(df))
# create the plot
plt.figure(figsize=(16,10))
p = sns.barplot(data=df, x='day_index', y='avg_duration', palette=colors)
# plot cosmetics
p.set_xlabel("Week Days", fontsize=18, alpha=0.8)
p.set_ylabel("Average Duration (seconds)", fontsize=18, alpha=0.8)
p.set_title("Week's average Trip Duration", fontsize=24)
p.set_xticklabels(daysOfWeek, fontsize=16)
sns.despine()
# setup the legend
# map names to colors
cmap = dict(zip(df.trips, colors))
# create the rectangles for the legend
patches = [Patch(color=v, label=k) for k, v in cmap.items()]
# add the legend
plt.legend(title='Number of Trips', handles=patches, bbox_to_anchor=(1.04, 0.5), loc='center left', borderaxespad=0, fontsize=15)
Here is the solution (keep hue, but add dodge=False so the bars are not shifted by hue level):
ax = sns.barplot(data=df, \
x='day_index', \
y='avg_duration', \
hue='trips', \
dodge=False, \
palette=sns.color_palette("Reds_d", n_colors=7, desat=1))
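To see why dodge=False helps, here is a rough arithmetic sketch of how dodged bars are laid out. It is an approximation of the general idea (the group width is split evenly between hue levels), not seaborn's actual code; the function name bar_layout, the default group width of 0.8 and the ascending ordering of the hue levels are all assumptions.

```python
def bar_layout(n_hue_levels, group_width=0.8, dodge=True):
    """Return (bar_width, offsets) for the bars that share one x tick.

    Sketch only: with dodge, the group width is divided evenly between
    the hue levels; without dodge, every bar is centered on the tick.
    """
    if not dodge:
        return group_width, [0.0] * n_hue_levels
    bar_width = group_width / n_hue_levels
    # Center of each slot, measured from the tick position
    offsets = [(i + 0.5) * bar_width - group_width / 2
               for i in range(n_hue_levels)]
    return bar_width, offsets


trips = [114586, 120936, 118882, 117868, 108036, 43740, 37508]

# Assumption: numeric hue levels are ordered ascending
hue_levels = sorted(set(trips))
width, offsets = bar_layout(len(hue_levels))

# Every day has a unique trips value, so only one of the 7 slots is filled
for day_index, count in enumerate(trips):
    slot = hue_levels.index(count)
    print(day_index, count, round(day_index + offsets[slot], 3))
```

With 7 hue levels each bar gets roughly one seventh of the group width, and only the day whose trips value falls in the middle slot ends up centered on its tick. With dodge=False the offsets are all zero and each bar keeps the full width, which matches the behavior of the solution above.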