seaborn heatmap of a pandas isnull calculation

A series of calculations on the DataFrame gives the NaN count as a percentage of the total number of rows, as shown:

data = df.isnull().sum()/len(df)*100

RecordID          0.000000
ContactID         0.000000
EmailAddress      0.000000
ExternalID      100.000000
Date              0.000000
Name              0.000000
Owner            67.471362
Priority          0.000000
Status            0.000000
Subject           0.000000
Description       0.000000
Type              0.000000
dtype: float64

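What I would like to do is show this as a heatmap in seaborn with sns.heatmap(data), drawing the reader's attention to the columns at 100% and 67%. Unfortunately, I get this error: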

IndexError: Inconsistent shape between the condition and the input (got (12, 1) and (12,))

Full traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-17-05db696a3a9b> in <module>()
----> 1 sns.heatmap(data)

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
    515     plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
    516                           annot_kws, cbar, cbar_kws, xticklabels,
--> 517                           yticklabels, mask)
    518 
    519     # Add the pcolormesh kwargs here

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
    114         mask = _matrix_mask(data, mask)
    115 
--> 116         plot_data = np.ma.masked_where(np.asarray(mask), plot_data)
    117 
    118         # Get good names for the rows and columns

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\ma\core.py in masked_where(condition, a, copy)
   1934     if cshape and cshape != ashape:
   1935         raise IndexError("Inconsistent shape between the condition and the input"
-> 1936                          " (got %s and %s)" % (cshape, ashape))
   1937     if hasattr(a, '_mask'):
   1938         cond = mask_or(cond, a._mask)

IndexError: Inconsistent shape between the condition and the input (got (12, 1) and (12,))

My research turned up many questions about numpy broadcasting rules, or bugs from three years ago, none of which were very helpful.

Thanks, as always.

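Your data variable is an instance of pd.Series, which is inherently one-dimensional. However, sns.heatmap expects two-dimensional input, which is why the traceback reports a (12, 1) mask against (12,) data. A quick fix, for example, is: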

sns.heatmap(data.to_frame())
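
To see the shape difference for yourself, here is a minimal sketch. The small df below is a made-up stand-in for your real data (only a few of your column names, with invented values), and the "% missing" column label, the colormap and the annotation settings are just my choices, not anything the fix requires.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the real DataFrame (invented values)
df = pd.DataFrame({
    "RecordID": [1, 2, 3, 4],
    "ExternalID": [np.nan, np.nan, np.nan, np.nan],
    "Owner": ["a", np.nan, np.nan, np.nan],
    "Status": ["open", "closed", "open", "open"],
})

# Same calculation as in the question: percentage of NaN per column
data = df.isnull().sum() / len(df) * 100
print(type(data).__name__, data.shape)  # a Series with a one-element shape tuple

# to_frame() turns the Series into a single-column DataFrame (two-dimensional)
frame = data.to_frame(name="% missing")
print(type(frame).__name__, frame.shape)

# The heatmap now receives 2-D input; annot/fmt print the percentages in the cells
ax = sns.heatmap(frame, annot=True, fmt=".1f", cmap="Reds", vmin=0, vmax=100)
```

Fixing vmin and vmax at 0 and 100 keeps the colour scale tied to the full percentage range, so the mostly-empty columns stand out regardless of what the other columns contain.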