Incorrect arrangement of bars in facet plot
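Situation
I have a table with the most important features from a random forest model. The table looks like this:
I used a screenshot because it is the easiest way to get an overview; an MWE is below.
My goal is to draw a bar plot for each Label (the column named "Label").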
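Problem
If we melt the table and look at the label p24D
...
... we see that p24D in combination with m24U shows a value of 0. But when I create my plot, it looks like this:
In the p24D facet we see a bar greater than 0 that is labeled m24U.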
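The lookup described above can be checked directly on the long-form table. The following is a minimal sketch that uses only two rows and two feature columns copied from the MWE data; the names `small`, `long_df`, and `value` are illustrative and not part of the original code.

```python
import pandas as pd

# Two rows and two feature columns copied from the MWE data (a subset for brevity).
small = pd.DataFrame({
    "Label": ["p24D", "p24U"],
    "m12D": [0.06384800127268568, 0.0],
    "m24U": [0.0, 0.0844568188980733],
})

# Same reshaping idea as in the MWE: one row per (Label, Feature) pair.
long_df = small.melt(id_vars="Label", var_name="Feature", value_name="Importance")

# Select the single cell for label p24D and feature m24U.
mask = (long_df["Label"] == "p24D") & (long_df["Feature"] == "m24U")
value = long_df.loc[mask, "Importance"].iloc[0]
print(value)
```

If the bar drawn above the m24U tick in the p24D facet does not match this value, the tick labels and the bars are out of step.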
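Conclusion
To me it looks as if the correct bar heights and distribution are used for each group, but the x labels are wrong because they are taken from the last facet plotted.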
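One plausible mechanism, stated here as an assumption rather than a confirmed description of seaborn's internals: when no `order` is given, each facet's bars follow the order in which the categories first appear in that facet's rows, and with a shared x axis the tick labels written by the last facet are shown on all facets. The sketch below uses a small hand-made long-form table (values copied from the MWE, but the row order deliberately shuffled for illustration) to show how the order of first appearance can differ between facets.

```python
import pandas as pd

# Hypothetical long-form table: values come from the MWE,
# but the row order is shuffled on purpose for this illustration.
long_df = pd.DataFrame({
    "Label": ["p24D", "p24D", "p24U", "p24U"],
    "Feature": ["m12D", "m24U", "m24U", "m12D"],
    "Importance": [0.06384800127268568, 0.0, 0.0844568188980733, 0.0],
})

# Order of first appearance of each feature within each facet (Label).
per_facet_order = {
    label: group["Feature"].drop_duplicates().tolist()
    for label, group in long_df.groupby("Label", sort=False)
}
print(per_facet_order)
```

If the two lists differ, bars at the same x position belong to different features in different facets, which would be consistent with the mislabeled bar described above.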
Question
I want to keep the same labels for all facets, but the bars should be assigned to the correct labels. How can I do this?
MWE
Data frame and code for the plot
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df_imp = pd.DataFrame({'m12D': {0: 0.25987975843758654, 1: 0.18727707779383243, 2: 0.39100295701375354, 3: 0.06384800127268568,
4: 0.07999849502412754, 5: 0.13640019148970256, 6: 0.1412876367877005, 7: 0.09121120297702168,
8: 0.0, 9: 0.0},
'm01D': {0: 0.0724816118081828, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.06251755368842625, 6: 0.0,
7: 0.0, 8: 0.0, 9: 0.0},
'm06D': {0: 0.06145081213633308, 1: 0.15137716985355018, 2: 0.10217899239161463, 3: 0.0,
4: 0.0, 5: 0.0, 6: 0.06612051972885429, 7: 0.09406588670435026, 8: 0.0, 9: 0.0},
'm12U': {0: 0.047766259908712215, 1: 0.11029620061232079, 2: 0.058189273034798476, 3: 0.0,
4: 0.0, 5: 0.13754236428929292, 6: 0.24967144685607753, 7: 0.3109784881004455,
8: 0.07243867278541272, 9: 0.06597783007344389},
'm05D': {0: 0.04166653999225189, 1: 0.07952487761091377, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'Label': {0: 'p03D', 1: 'p06D', 2: 'p12D', 3: 'p24D', 4: 'p48D', 5: 'p03U', 6: 'p06U',
7: 'p12U', 8: 'p24U', 9: 'p48U'},
'down': {0: 0.0, 1: 0.05780607422803258, 2: 0.0, 3: 0.06127000511754594, 4: 0.05020367686447687,
5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'm06U': {0: 0.0, 1: 0.0, 2: 0.08466651640727402, 3: 0.0, 4: 0.0, 5: 0.05309415831396426,
6: 0.10234265392288792, 7: 0.10916311424468256, 8: 0.039101554776822, 9: 0.04173916824046613},
'p06T': {0: 0.0, 1: 0.0, 2: 0.056553945901284826, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'm24D': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.05209052270599571, 4: 0.06155322163002999, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'p24T': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.05411155410020583, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'wday_6': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.059271604760677284, 4: 0.0, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.035176474643147146, 9: 0.0},
'p48T': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.04623563899446924, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'wday_4': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.04342742013282083, 5: 0.0, 6: 0.0,
7: 0.0, 8: 0.0, 9: 0.0},
'm02D': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.049231996352442,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'up_sum': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.05893778686091927,
7: 0.0, 8: 0.0667566604504024, 9: 0.09670821121474267},
'm05U': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0590711833092717,
8: 0.0, 9: 0.0},
'm24U': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0,
7: 0.0, 8: 0.0844568188980733, 9: 0.11262510249213625},
'wday_5': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0,
9: 0.03522099307752065}})
df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance", order=['m12D', 'm01D', 'm06D', 'm12U', 'm05D', 'down', 'm06U', 'p06T',
'm24D', 'p24T', 'wday_6', 'p48T', 'wday_4', 'm02D', 'up_sum', 'm05U',
'm24U', 'wday_5'])
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))
g.set_xticklabels(rotation=30)
plt.show()
Well, seaborn already gives the reason in this warning:
UserWarning: Using the barplot function without specifying 'order' is likely to produce an incorrect plot.
So how about sorting the data before plotting:
df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
df_imp2 = df_imp2.sort_values(['Label','Feature']) # sort them before plot
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance")
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))
g.set_xticklabels(rotation=30)
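As an alternative to sorting, the category order can be fixed explicitly, which is what the warning asks for. The sketch below is one way to do that, not necessarily what the answer's author had in mind: it passes a single `order` list to `sns.catplot`, seaborn's figure-level interface for categorical plots. The small data frame reuses a few values from the MWE, and `small`, `long_df`, and `feature_order` are illustrative names.

```python
import pandas as pd
import seaborn as sns

# Subset of the MWE data: two labels, three feature columns.
small = pd.DataFrame({
    "Label": ["p24D", "p24U"],
    "m12D": [0.06384800127268568, 0.0],
    "m24U": [0.0, 0.0844568188980733],
    "wday_6": [0.059271604760677284, 0.035176474643147146],
})
long_df = small.melt(id_vars="Label", var_name="Feature", value_name="Importance")

# One fixed order shared by every facet, taken from the wide table's columns.
feature_order = [c for c in small.columns if c != "Label"]

# kind="bar" draws a bar plot per facet; order pins each feature to the same x position.
g = sns.catplot(
    data=long_df, x="Feature", y="Importance",
    col="Label", col_wrap=2, kind="bar",
    order=feature_order, height=1.5, aspect=3,
)
g.set_xticklabels(rotation=30)
```

The same list can also be passed through `FacetGrid.map`, as in `g.map(sns.barplot, "Feature", "Importance", order=feature_order)`. In a script, call `plt.show()` from `matplotlib.pyplot` to display the figure.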