Create boxplot from Pandas DataFrame with MultiIndex columns

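I have a DataFrame with MultiIndex columns (alphas, receiver, run).

I have 10 runs, multiple alphas, and 2 receivers.

For each alpha value, I want to create a boxplot with two boxes (one per receiver: rec1, rec2), computed over the 10 runs.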

Something like this:


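import pandas as pd

# vectors_toPlot is my source DataFrame (not shown here)
repetitions = vectors_toPlot.repetition.unique()
alphas = [0.02, 0.03] # this is an example; the final version will have more values
receivers = ["rec1", "rec2"]
# one iterable per column level, in the same order as `names`
iterables = [alphas, receivers, repetitions]
index = pd.MultiIndex.from_product(iterables, names=['alphas', 'receiver', 'run'])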

multiDf = pd.DataFrame(columns=index)
# fill it with values

print(multiDf.head())

alphas          0.02                                                  \
receiver        rec1                                                   
run              0.0         1.0         2.0         3.0         4.0   
0         11744000.0  11744000.0  11744000.0  11744000.0  11744000.0   
1         11744000.0  11744000.0  11744000.0  11744000.0  11744000.0   
2         12331200.0  12331200.0  12331200.0  12331200.0  12331200.0   
3         12624800.0  12624800.0  12624800.0  12624800.0  12624800.0   
4         12331200.0  12331200.0  12331200.0  12331200.0  12331200.0   

I tried various combinations of the by and column arguments of df.boxplot(), but I can't get my head around it.

You may want seaborn's boxplot (sns.boxplot):

import numpy as np
import pandas as pd
import seaborn as sns

# set up the index
alphas = [0.02, 0.03] # this is an example; the final version will have more values
receivers = ["rec1", "rec2"]
runs = np.arange(4)
index = pd.MultiIndex.from_product([alphas, receivers, runs], names=['alphas', 'receiver', 'run'])

# toy data
np.random.seed(1)
df = pd.DataFrame(np.random.uniform(0,1, (10,len(index))), columns=index)

# plot
sns.boxplot(x='alphas', y=0, hue='receiver', data=df.unstack().reset_index())

Output: a grouped boxplot with alphas on the x-axis and one box per receiver for each alpha value.
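
To see why the one-liner above works, it can help to look at the reshaping step on its own. The sketch below rebuilds the same kind of toy frame and converts it from wide to long format. The column name "value" and the variable name long_df are my own choices for illustration and are not part of the original code, and the toy data is seeded differently, so the numbers will not match the answer above.

```python
import numpy as np
import pandas as pd

# Same column layout as in the answer: alphas x receivers x runs
alphas = [0.02, 0.03]
receivers = ["rec1", "rec2"]
runs = np.arange(4)
index = pd.MultiIndex.from_product(
    [alphas, receivers, runs], names=["alphas", "receiver", "run"]
)

# Toy data: 10 rows of uniform random values (illustrative only)
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.uniform(0, 1, (10, len(index))), columns=index)

# unstack() moves every column level into the index, giving a Series
# indexed by (alphas, receiver, run, original row label).
# reset_index(name="value") turns those index levels into ordinary columns
# and names the data column "value" instead of the default 0.
long_df = df.unstack().reset_index(name="value")

print(long_df.head())
print(long_df.shape)
```

With the data in this long form, the plot call becomes sns.boxplot(x='alphas', y='value', hue='receiver', data=long_df), which avoids referring to the value column by the integer label 0. Each box then pools all rows and runs for one (alpha, receiver) pair.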