Incorrect arrangement of bars in facet plot

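Situation

I have a table with the most important features from a random forest model. The table looks like this:

I used a screenshot because that is the easiest way to get an overview; an MWE follows below.

My goal is to draw a bar plot for each label (the column named "Label").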


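Problem

If we melt the table and look at the label p24D...

...we see that the combination of p24D and m24U has a value of 0. But if I create my plot, it looks like this:

In the p24D facet we see a bar greater than 0 that is labeled m24U.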
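
The lookup described here can be reproduced in a few lines. This is a minimal sketch on a reduced two-label excerpt of the MWE data (values rounded; the names `df_excerpt` and `long` are introduced only for this illustration):

```python
import pandas as pd

# Reduced excerpt of the MWE table: two labels, two features (values rounded).
df_excerpt = pd.DataFrame({
    "Label": ["p24D", "p24U"],
    "m24D": [0.0521, 0.0],
    "m24U": [0.0, 0.0845],
})

# Wide -> long: one row per (Label, Feature) pair.
long = df_excerpt.melt(id_vars="Label", var_name="Feature", value_name="Importance")

# Look up the single cell that the p24D facet should show for the m24U bar.
mask = (long["Label"] == "p24D") & (long["Feature"] == "m24U")
value = long.loc[mask, "Importance"].iloc[0]
print(value)  # 0.0
```

So the data itself says the m24U bar in the p24D facet should have zero height.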


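Conclusion

To me it looks as if the correct bar heights and distribution are used for each group, but the x labels are wrong because they are taken from the last plotted facet.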
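
One way to probe this suspicion is to compare the order in which features first appear within each label's subset. As far as I understand, seaborn derives the category order from the data when no `order` is given and the column holds plain strings. The following is only a diagnostic sketch on hypothetical toy data (`toy` and `feature_order_per_label` are made-up names, not part of the MWE):

```python
import pandas as pd

# Hypothetical toy data: both labels contain the same features,
# but the rows are listed in a different order for each label.
toy = pd.DataFrame({
    "Label": ["p24D", "p24D", "p24U", "p24U"],
    "Feature": ["m24D", "m24U", "m24U", "m24D"],
    "Importance": [0.05, 0.0, 0.08, 0.0],
})

def feature_order_per_label(df):
    """Return, for each Label, the order in which its Features first appear."""
    return {
        label: list(sub["Feature"].unique())
        for label, sub in df.groupby("Label", sort=False)
    }

orders = feature_order_per_label(toy)
# True only if every label lists its features in the same order.
consistent = len({tuple(o) for o in orders.values()}) == 1
print(orders)
print("same order in every facet:", consistent)
```

If the orders differ between labels, the bars of different facets would presumably sit at different x positions, while the shared tick labels can only reflect one of those orders.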


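Question

I want to keep the same x labels for all facets, but each bar should be assigned to the correct label. How can I do that?


MWE

Data frame and code for plotting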

import numpy as np  # needed for np.arange below
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df_imp = pd.DataFrame({'m12D': {0: 0.25987975843758654, 1: 0.18727707779383243,  2: 0.39100295701375354,  3: 0.06384800127268568,
  4: 0.07999849502412754,  5: 0.13640019148970256,  6: 0.1412876367877005,  7: 0.09121120297702168,
  8: 0.0,  9: 0.0}, 
 'm01D': {0: 0.0724816118081828, 1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.06251755368842625,  6: 0.0,
  7: 0.0,  8: 0.0,  9: 0.0},
 'm06D': {0: 0.06145081213633308,  1: 0.15137716985355018,  2: 0.10217899239161463,  3: 0.0,
  4: 0.0,  5: 0.0,  6: 0.06612051972885429,  7: 0.09406588670435026,  8: 0.0,  9: 0.0},
 'm12U': {0: 0.047766259908712215,  1: 0.11029620061232079,  2: 0.058189273034798476,  3: 0.0,
  4: 0.0,  5: 0.13754236428929292,  6: 0.24967144685607753,  7: 0.3109784881004455,
  8: 0.07243867278541272,  9: 0.06597783007344389},
 'm05D': {0: 0.04166653999225189,  1: 0.07952487761091377,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.0,
  6: 0.0,  7: 0.0,  8: 0.0,  9: 0.0},
 'Label': {0: 'p03D',  1: 'p06D',  2: 'p12D',  3: 'p24D',  4: 'p48D',  5: 'p03U',  6: 'p06U',
  7: 'p12U',  8: 'p24U',  9: 'p48U'}, 
 'down': {0: 0.0,  1: 0.05780607422803258,  2: 0.0,  3: 0.06127000511754594,  4: 0.05020367686447687,
  5: 0.0,  6: 0.0,  7: 0.0,  8: 0.0,  9: 0.0},
 'm06U': {0: 0.0,  1: 0.0,  2: 0.08466651640727402,  3: 0.0,  4: 0.0,  5: 0.05309415831396426,  
  6: 0.10234265392288792,  7: 0.10916311424468256,  8: 0.039101554776822,  9: 0.04173916824046613},
 'p06T': {0: 0.0,  1: 0.0,  2: 0.056553945901284826,  3: 0.0,  4: 0.0,  5: 0.0,  6: 0.0,  7: 0.0,
  8: 0.0,  9: 0.0},
 'm24D': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.05209052270599571,  4: 0.06155322163002999,  5: 0.0,
  6: 0.0,  7: 0.0,  8: 0.0,  9: 0.0},
 'p24T': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.05411155410020583,  4: 0.0,  5: 0.0,  6: 0.0,  7: 0.0,
  8: 0.0,  9: 0.0},
 'wday_6': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.059271604760677284,  4: 0.0,  5: 0.0,
  6: 0.0,  7: 0.0,  8: 0.035176474643147146,  9: 0.0},
 'p48T': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.04623563899446924,  5: 0.0,  6: 0.0,  7: 0.0,
  8: 0.0,  9: 0.0},
 'wday_4': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.04342742013282083,  5: 0.0,  6: 0.0,
  7: 0.0,  8: 0.0,  9: 0.0},
 'm02D': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.049231996352442,
  6: 0.0,  7: 0.0,  8: 0.0,  9: 0.0},
 'up_sum': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.0,  6: 0.05893778686091927,
  7: 0.0,  8: 0.0667566604504024,  9: 0.09670821121474267},
 'm05U': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.0,  6: 0.0,  7: 0.0590711833092717,
  8: 0.0,  9: 0.0},
 'm24U': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.0,  6: 0.0,
  7: 0.0,  8: 0.0844568188980733,  9: 0.11262510249213625},
 'wday_5': {0: 0.0,  1: 0.0,  2: 0.0,  3: 0.0,  4: 0.0,  5: 0.0,  6: 0.0,  7: 0.0,  8: 0.0,
  9: 0.03522099307752065}})

df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance", order=['m12D', 'm01D', 'm06D', 'm12U', 'm05D',  'down', 'm06U', 'p06T',
       'm24D', 'p24T', 'wday_6', 'p48T', 'wday_4', 'm02D', 'up_sum', 'm05U',
       'm24U', 'wday_5'])
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))

g.set_xticklabels(rotation=30)

plt.show()

Well, seaborn already gives the reason in this warning:

UserWarning: Using the barplot function without specifying 'order' is likely to produce an incorrect plot.

So how about sorting the data before plotting it:

df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
df_imp2 = df_imp2.sort_values(['Label','Feature']) # sort them before plot
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance")
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))

g.set_xticklabels(rotation=30)
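
As a possible alternative, the seaborn documentation suggests using a figure-level function such as `sns.catplot` instead of `FacetGrid` directly, because it keeps the categorical mapping in sync across facets. This is a minimal sketch under the same assumption of a reduced, rounded excerpt of the melted table (`df_small` is a name introduced here):

```python
import pandas as pd
import seaborn as sns

# Reduced excerpt of the melted table (values rounded).
df_small = pd.DataFrame({
    "Label":      ["p24D", "p24D", "p24D", "p24U", "p24U", "p24U"],
    "Feature":    ["m12D", "m24D", "m24U", "m12D", "m24D", "m24U"],
    "Importance": [0.0638, 0.0521, 0.0, 0.0, 0.0, 0.0845],
})

# catplot builds the FacetGrid itself and applies one category order to all facets.
g = sns.catplot(
    data=df_small, kind="bar",
    x="Feature", y="Importance",
    col="Label", col_wrap=2,
    order=["m12D", "m24D", "m24U"],  # optional, but makes the order explicit
    height=2, aspect=2,
)
g.set_xticklabels(rotation=30)
```

The returned object is still a `FacetGrid`, so the later calls from the code above (`g.set(...)`, `g.figure.subplots_adjust(...)`) should work the same way.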