Plotting multi-indexed data with subplots only at one level

Question

Edit, with sample data in code format:

import pandas as pd

data = {
    'Station': ['station 1', 'station 1', 'station 1', 'station 1', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2'],
    'month': ['March', 'April', 'March', 'April', 'March', 'April', 'March', 'April', 'March', 'April'],
    'x1': ['22', '42', '11', '56', '28', '33', '87', '34', '11', '25'],
    'x2': ['52', '47', '31', '52', '38', '35', '47', '54', '10', '45'],
}

df = pd.DataFrame(data)

# the values are stored as strings, so convert them before averaging
df[['x1', 'x2']] = df[['x1', 'x2']].apply(pd.to_numeric)

# mean of x1 and x2 per (Station, month) pair
df = df.groupby(['Station', 'month']).mean()

# this gives one subplot per (Station, month) column, not one per station
df.transpose().plot(subplots=True, kind='bar')

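I can't get a separate subplot for each station, with a cluster of bars in each subplot (e.g., for x1), each cluster representing a month (see the Excel chart for a representative subplot).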

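I have a dataframe organized in the following way (I converted it to Excel for the purpose of asking this question; the screenshot is not included here).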

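I would like to generate subplots at the station level, with clustered bars for each quality (x1, x2, etc.) for each month. I don't think I'm describing this very well, so I created a sample subplot in Excel.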

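Ideally, I would like a subplot that looks like this for each station. I have figured out how to generate one subplot per station, but I can't work out how to incorporate the time element (and the clustered bars).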

Answer 1

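I don't think the pandas convenience wrapper around matplotlib natively supports grouped bar charts within subplots. The pandas documentation says: subplots (bool, default False): Make separate subplots for each column. Even 2x2 MultiIndex columns are interpreted as 4 columns and distributed across 4 subplots.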
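A minimal sketch of that behavior, assuming a small made-up frame with 2x2 MultiIndex columns (the numbers are placeholders, not the question's data) and the non-interactive Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window is opened
import pandas as pd

# 2x2 MultiIndex columns: (quality, month); values are placeholders
columns = pd.MultiIndex.from_product([["x1", "x2"], ["March", "April"]])
wide = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    index=["station 1", "station 2"],
    columns=columns,
)

# subplots=True creates one Axes per column, MultiIndex or not
axes = wide.plot(kind="bar", subplots=True)
n_axes = len(axes)
print(n_axes)
```

Each (quality, month) pair ends up in its own subplot, so the month level cannot serve as the bar grouping inside a station subplot this way.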
But this looks like a perfect case for seaborn, which simplifies this kind of data visualization:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

data = {
    'Station': ['station 1', 'station 1', 'station 1', 'station 1', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2'],
    'month': ['March', 'April', 'March', 'April', 'March', 'April', 'March', 'April', 'March', 'April'],
    'x1': ['22', '42', '11', '56', '28', '33', '87', '34', '11', '25'],
    'x2': ['52', '47', '31', '52', '38', '35', '47', '54', '10', '45'],
}

# data import and conversion of the value columns to numeric
df = pd.DataFrame(data)
df[['x1', 'x2']] = df[['x1', 'x2']].apply(pd.to_numeric)

# transformation of the df into the long form
df_plot = pd.melt(df, id_vars=["Station", "month"], value_vars=["x1", "x2"], var_name="cat")

# generating the category plot: one column of subplots per station, bars grouped by month
# errorbar=None requires seaborn >= 0.12; older versions use ci=None instead
sns.catplot(x="cat", y="value", hue="month", col="Station", data=df_plot, kind="bar", errorbar=None)

plt.show()

Sample output of the seaborn version (image not included here).
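To make the long-form step less abstract, the sketch below rebuilds the melted frame and computes the per-group means, which is what the bar heights correspond to with the default estimator of a seaborn bar plot. The names `df_long` and `bar_heights` are illustrative only and do not appear in the code above.

```python
import pandas as pd

data = {
    "Station": ["station 1"] * 4 + ["station 2"] * 6,
    "month": ["March", "April"] * 5,
    "x1": ["22", "42", "11", "56", "28", "33", "87", "34", "11", "25"],
    "x2": ["52", "47", "31", "52", "38", "35", "47", "54", "10", "45"],
}
df = pd.DataFrame(data)
df[["x1", "x2"]] = df[["x1", "x2"]].apply(pd.to_numeric)

# long form: one row per (Station, month, quality) observation
df_long = pd.melt(df, id_vars=["Station", "month"], value_vars=["x1", "x2"], var_name="cat")
print(df_long.head())

# mean per station, quality and month: one value per bar
bar_heights = df_long.groupby(["Station", "cat", "month"])["value"].mean()
print(bar_heights)
```

The melted frame has one `cat` column naming the quality and one `value` column holding the number, which is the layout that the `x`, `y`, `hue` and `col` arguments of `catplot` expect.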

The pandas/matplotlib equivalent is slightly more verbose:

import matplotlib.pyplot as plt
import pandas as pd

data = {
    'Station': ['station 1', 'station 1', 'station 1', 'station 1', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2', 'station 2'],
    'month': ['March', 'April', 'March', 'April', 'March', 'April', 'March', 'April', 'March', 'April'],
    'x1': ['22', '42', '11', '56', '28', '33', '87', '34', '11', '25'],
    'x2': ['52', '47', '31', '52', '38', '35', '47', '54', '10', '45'],
}

# data import and conversion of the value columns to numeric
df = pd.DataFrame(data)
df[['x1', 'x2']] = df[['x1', 'x2']].apply(pd.to_numeric)

# data aggregation: one row per station, columns indexed by (quality, month)
df_plot = df.groupby(['Station', 'month'], sort=False).mean().unstack()

# one subplot per station; len(df_plot) is the number of stations
fig, axes = plt.subplots(1, len(df_plot), sharex=True, sharey=True, figsize=(10, 5))
for station, curr_ax in zip(df_plot.index, axes):
    # rows become the qualities (x1, x2), columns the months, so bars are grouped by month
    df_plot.loc[station].unstack().plot(kind="bar", ax=curr_ax, title=station)

plt.show()
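If the number of stations varies, the loop above can be wrapped in a small helper. This is only a sketch under assumptions: `plot_station_bars` is a hypothetical name, the demo frame is made up, and `squeeze=False` is used so that the axes array stays iterable even with a single station.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def plot_station_bars(df, value_cols, station_col="Station", month_col="month"):
    """Hypothetical helper: one subplot per station, bars grouped by month."""
    means = df.groupby([station_col, month_col], sort=False)[value_cols].mean()
    stations = means.index.get_level_values(station_col).unique()
    fig, axes = plt.subplots(
        1, len(stations), sharey=True, squeeze=False,
        figsize=(5 * len(stations), 4),
    )
    for station, ax in zip(stations, axes[0]):
        # rows: qualities, columns: months
        per_station = means.xs(station, level=station_col).T
        per_station.plot(kind="bar", ax=ax, title=station, rot=0)
    return fig, axes[0]


# made-up demo data, not the question's values
demo = pd.DataFrame({
    "Station": ["a", "a", "b", "b"],
    "month": ["March", "April", "March", "April"],
    "x1": [1, 2, 3, 4],
    "x2": [5, 6, 7, 8],
})
fig, station_axes = plot_station_bars(demo, ["x1", "x2"])
```

Each returned Axes holds one bar container per month, with one bar per quality in each container.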

python

matplotlib

pandas