Adjusting a matplotlib horizontal bar chart to accommodate the bars

Question

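I am drawing a horizontal bar chart, but I am struggling to adjust ylim or other parameters so that my labels are clearer and all of them fit on the y-axis. I have played around with ylim, and I can make the text size bigger or smaller, but the bars still don't fit the y-axis. Any idea what the right approach is?

My code: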

import matplotlib.pyplot as plt #we load the library that contains the plotting capabilities
import numpy as np #needed for np.arange below
from operator import itemgetter
D=[]        
for att, befor, after in zip(df_portion['attributes'], df_portion['2005_2011 (%)'], df_portion['2012_2015 (%)']):
    i=(att, befor, after)
    D.append(i)
Dsort = sorted(D, key=itemgetter(1), reverse=False) #sort the list in order of usage
attri = [x[0] for x in Dsort] 
bef  = [x[1] for x in Dsort]  # 2005_2011 values (second entry of each tuple)
aft  = [x[2] for x in Dsort]  # 2012_2015 values (third entry of each tuple)

ind = np.arange(len(attri))
width=3

ax = plt.subplot(111)
ax.barh(ind, aft, width,align='center',alpha=1, color='r', label='from 2012 to 2015') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.barh(ind - width, bef, width, align='center',  alpha=1, color='b', label='from 2005 to 2008') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.set(yticks=ind, yticklabels=attri,ylim=[1, len(attri)/2])
plt.xlabel('Frequency distribution (%)')
plt.title('Frequency distribution (%) of common attributes between 2005_2008 and between 2012_2015')
plt.legend()
plt.show()

[Figure: the plot produced by the code above]

Answer 1

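To make the labels fit, you need either a smaller font size or a larger figure size. Changing ylim only shows a subset of the bars (if ylim is set too narrow) or adds more white space (if ylim is wider).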
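As a rough illustration of the "larger figure" option, the figure height can be estimated from the number of labels and the font size. The helper name suggest_fig_height and its spacing and padding constants are assumptions made up for this sketch, not part of matplotlib; the only fixed fact used is that one typographic point is 1/72 inch.

```python
def suggest_fig_height(n_labels, fontsize_pt=10, line_spacing=1.5, padding_in=1.0):
    """Estimate a figure height in inches that leaves room for n_labels tick labels.

    Hypothetical helper: line_spacing and padding_in are guesses to tune by eye.
    """
    label_height_in = fontsize_pt * line_spacing / 72.0  # 72 points per inch
    return n_labels * label_height_in + padding_in


# Larger fonts (or more labels) call for a taller figure.
for fontsize in (8, 10, 14):
    print(fontsize, round(suggest_fig_height(24, fontsize_pt=fontsize), 2))
```

The result could then be used as the second entry of figsize, for example plt.subplots(figsize=(12, suggest_fig_height(len(labels)))), where labels stands for whatever list of tick labels is being plotted.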

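The biggest problem in the code is that width is far too large. Twice the width has to fit within a distance of 1.0 (the ticks are placed via ind, which is the array 0, 1, 2, ...). Since matplotlib calls the thickness of a horizontal bar its "height", that name is used in the example code below. Using align='edge' lets you position the bars directly (align='center' would shift them by half their "height").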
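The arithmetic behind "twice the width has to fit within 1.0" can be checked without plotting anything. The sketch below models the two alignment modes and the two-bar layout used in this answer (one bar at tick - height, one at tick, both with align='edge'). The function names bar_span, group_span and groups_overlap are hypothetical helpers for illustration only.

```python
def bar_span(y, height, align="center"):
    """Return the (bottom, top) extent of a horizontal bar along the y-axis.

    'center' centers the bar on y; 'edge' starts the bar at y and
    extends it upward by height.
    """
    if align == "center":
        return (y - height / 2, y + height / 2)
    if align == "edge":
        return (y, y + height)
    raise ValueError("align must be 'center' or 'edge'")


def group_span(tick, height):
    """Extent of a two-bar group: bars at tick - height and tick, both align='edge'."""
    lower = bar_span(tick - height, height, align="edge")
    upper = bar_span(tick, height, align="edge")
    return (lower[0], upper[1])


def groups_overlap(height, n_ticks=3):
    """True if any group runs into the group at the next tick (ticks at 0, 1, 2, ...)."""
    spans = [group_span(t, height) for t in range(n_ticks)]
    return any(spans[i][1] > spans[i + 1][0] for i in range(n_ticks - 1))


print(groups_overlap(0.3))  # False: 2 * 0.3 <= 1
print(groups_overlap(3))    # True: 2 * 3 > 1, the width used in the question
```

Each group occupies 2 * height around its tick, so any height up to 0.5 keeps neighbouring groups apart, and smaller values leave a visible gap between them.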

Pandas has simple functions for sorting a dataframe by one or more columns.

Code illustrating the ideas:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# first create some test data
df = pd.DataFrame({'attributes': ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta", "iota",
                                  "kappa", "lambda", "mu", "nu", "xi", "omicron", "pi", "rho", "sigma", "tau",
                                  "upsilon", "phi", "chi", "psi", "omega"]})
totals_2005_2011 = np.random.uniform(100, 10000, len(df))
totals_2012_2015 = totals_2005_2011 * np.random.uniform(0.70, 2, len(df))
df['2005_2011 (%)'] = totals_2005_2011 / totals_2005_2011.sum() * 100
df['2012_2015 (%)'] = totals_2012_2015 / totals_2012_2015.sum() * 100

# sort all rows via the '2005_2011 (%)' column, sort from large to small
df = df.sort_values('2005_2011 (%)', ascending=False)

ind = np.arange(len(df))
height = 0.3  # two times height needs to be at most 1

fig, ax = plt.subplots(figsize=(12, 6))
ax.barh(ind, df['2012_2015 (%)'], height, align='edge', alpha=1, color='crimson', label='from 2012 to 2015')
ax.barh(ind - height, df['2005_2011 (%)'], height, align='edge', alpha=1, color='dodgerblue', label='from 2005 to 2011')
ax.set_yticks(ind)
ax.set_yticklabels(df['attributes'], fontsize=10)
ax.grid(axis='x')

ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.legend()
ax.margins(y=0.01)  # use smaller margins in the y-direction
plt.tight_layout()
plt.show()

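The seaborn library has functions for creating bar plots with multiple bars per attribute, without having to fiddle with the bar positions by hand. Seaborn prefers data in "long form", which can be created with pandas' melt().

Example code: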

import seaborn as sns

df = df.sort_values('2005_2011 (%)', ascending=True)
df_long = df.melt(id_vars='attributes', value_vars=['2005_2011 (%)', '2012_2015 (%)'],
                  var_name='period', value_name='distribution')
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(data=df_long, y='attributes', x='distribution', hue='period', palette='turbo', ax=ax)
ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.grid(axis='x')
ax.tick_params(axis='y', labelsize=12)
sns.despine()
plt.tight_layout()
plt.show()
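For readers unfamiliar with the "long form" that seaborn prefers, the following small sketch shows what melt() does to a wide table. The two-row table and the name df_wide are made up for illustration; the column names mirror the ones used in this answer.

```python
import pandas as pd

# Tiny made-up table in "wide" form: one row per attribute, one column per period.
df_wide = pd.DataFrame({
    "attributes": ["alpha", "beta"],
    "2005_2011 (%)": [60.0, 40.0],
    "2012_2015 (%)": [55.0, 45.0],
})

# melt() stacks the two period columns into (period, distribution) pairs,
# giving one row per combination of attribute and period.
df_long = df_wide.melt(id_vars="attributes",
                       value_vars=["2005_2011 (%)", "2012_2015 (%)"],
                       var_name="period", value_name="distribution")
print(df_long)
```

The long table has one row per bar to draw, which is why seaborn can place the bars itself once it is told which column is the category (y), which is the value (x) and which distinguishes the series (hue).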
