Create boxplot from Pandas DataFrame with multiindex columns
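I have a DataFrame with MultiIndex columns (alphas, receiver, run).
I have 10 runs, multiple alphas, and 2 receivers.
For each alpha value, I would like a boxplot with two boxes (one per receiver: rec1, rec2), each computed over the 10 runs.
Like so: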
import pandas as pd

repetitions = vectors_toPlot.repetition.unique()
alphas = [0.02, 0.03] # this is an example, the final version will have more values
receivers = ["rec1", "rec2"]
# one list per column level, in the same order as `names` below
iterables = [alphas, receivers, repetitions]
index = pd.MultiIndex.from_product(iterables, names=['alphas', 'receiver', 'run'])
multiDf = pd.DataFrame(columns=index)
# fill it with values
print(multiDf.head())
alphas 0.02 \
receiver rec1
run 0.0 1.0 2.0 3.0 4.0
0 11744000.0 11744000.0 11744000.0 11744000.0 11744000.0
1 11744000.0 11744000.0 11744000.0 11744000.0 11744000.0
2 12331200.0 12331200.0 12331200.0 12331200.0 12331200.0
3 12624800.0 12624800.0 12624800.0 12624800.0 12624800.0
4 12331200.0 12331200.0 12331200.0 12331200.0 12331200.0
I tried various combinations of df.boxplot() with by and columns, but I could not figure it out.
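You probably want seaborn's boxplot:
import numpy as np
import pandas as pd
import seaborn as sns

# set up the index
alphas = [0.02, 0.03] # this is an example, the final version will have more values
receivers = ["rec1", "rec2"]
runs = np.arange(4)
index = pd.MultiIndex.from_product([alphas, receivers, runs], names=['alphas', 'receiver', 'run'])
# toy data
np.random.seed(1)
df = pd.DataFrame(np.random.uniform(0,1, (10,len(index))), columns=index)
# plot: unstack() moves the column levels into the index, reset_index() turns
# them into ordinary columns, and the values end up in a column named 0
sns.boxplot(x='alphas', y=0, hue='receiver', data=df.unstack().reset_index())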
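Output: a grouped boxplot with alphas on the x-axis and one box per receiver (rec1, rec2) for each alpha.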
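To see why this works, it can help to look at the reshaped data on its own. The following is only a sketch on the same toy data: the names `sample`, `value`, `long_df` and `summary` are my own choices for illustration and do not come from the question. It names the row index and the value column explicitly, then checks which observations end up in each box.

```python
import numpy as np
import pandas as pd

# same toy setup as above
alphas = [0.02, 0.03]
receivers = ["rec1", "rec2"]
runs = np.arange(4)
index = pd.MultiIndex.from_product(
    [alphas, receivers, runs], names=["alphas", "receiver", "run"]
)
np.random.seed(1)
df = pd.DataFrame(np.random.uniform(0, 1, (10, len(index))), columns=index)

# Name the row index (assumed name: "sample") so it gets a readable column,
# then move every column level into the index and flatten to long format.
long_df = (
    df.rename_axis("sample")
      .unstack()                    # Series indexed by (alphas, receiver, run, sample)
      .reset_index(name="value")    # one row per observation
)

# Each (alpha, receiver) pair is one box; describe() shows what it summarizes.
summary = long_df.groupby(["alphas", "receiver"])["value"].describe()
print(long_df.head())
print(summary)
```

With the value column named like this, the plotting call would become `sns.boxplot(x='alphas', y='value', hue='receiver', data=long_df)`, which reads a little more clearly than `y=0`. Note that every box pools all runs and all rows for its (alpha, receiver) pair.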