Generate a heatmap from a dataframe with Python and seaborn

I am new to Python and also new to seaborn.

I have a pandas dataframe named df which looks like:

TIMESTAMP ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F2 ACT_TIME_AERATEUR_1_F3 ACT_TIME_AERATEUR_1_F4 ACT_TIME_AERATEUR_1_F5 ACT_TIME_AERATEUR_1_F6 
2015-08-01 23:00:00 80 0 0 0 10 0
2015-08-01 23:20:00 60 0 20 0 10 10
2015-08-01 23:40:00 80 10 0 0 10 10
2015-08-01 00:00:00 60 10 20 40 10 10


df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38840 entries, 0 to 38839
Data columns (total 7 columns):
TIMESTAMP                 38840 non-null datetime64[ns]
ACT_TIME_AERATEUR_1_F1    38696 non-null float64
ACT_TIME_AERATEUR_1_F3    38697 non-null float64
ACT_TIME_AERATEUR_1_F5    38695 non-null float64
ACT_TIME_AERATEUR_1_F6    38695 non-null float64
ACT_TIME_AERATEUR_1_F7    38693 non-null float64
ACT_TIME_AERATEUR_1_F8    38696 non-null float64
dtypes: datetime64[ns](1), float64(6)
memory usage: 2.1 MB

I tried to make a heatmap with this code:

data = sns.load_dataset("df")

# Draw a heatmap with the numeric values in each cell
sns.heatmap(data, annot=True, fmt="d", linewidths=.5)

But it does not work. Can you help me find the error?

Thanks

Edit: First, I load the dataframe from a CSV file:

df1 = pd.read_csv('C:/Users/Demonstrator/Downloads/Listeequipement.csv',delimiter=';', parse_dates=[0], infer_datetime_format = True)

Then, I select only the rows whose dates fall between '2015-08-01 23:10:00' and '2015-08-02 00:00:00':
    import seaborn as sns
    df1['TIMESTAMP']= pd.to_datetime(df1_no_missing['TIMESTAMP'], '%d-%m-%y %H:%M:%S')
    df1['date'] = df_no_missing['TIMESTAMP'].dt.date
    df1['time'] = df_no_missing['TIMESTAMP'].dt.time
    date_debut = pd.to_datetime('2015-08-01 23:10:00')
    date_fin = pd.to_datetime('2015-08-02 00:00:00')
    df1 = df1[(df1['TIMESTAMP'] >= date_debut) & (df1['TIMESTAMP'] < date_fin)]

Then, I construct the heatmap:
sns.heatmap(df1.iloc[:,2:],annot=True, fmt="d", linewidths=.5)

I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-363-a054889ebec3> in <module>()
      7 df1 = df1[(df1['TIMESTAMP'] >= date_debut) & (df1['TIMESTAMP'] < date_fin)]
      8 
----> 9 sns.heatmap(df1.iloc[:,2:],annot=True, fmt="d", linewidths=.5)

C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, ax, xticklabels, yticklabels, mask, **kwargs)
    483     plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
    484                           annot_kws, cbar, cbar_kws, xticklabels,
--> 485                           yticklabels, mask)
    486 
    487     # Add the pcolormesh kwargs here

C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
    165         # Determine good default values for the colormapping
    166         self._determine_cmap_params(plot_data, vmin, vmax,
--> 167                                     cmap, center, robust)
    168 
    169         # Sort out the annotations

C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in _determine_cmap_params(self, plot_data, vmin, vmax, cmap, center, robust)
    202                                cmap, center, robust):
    203         """Use some heuristics to set good defaults for colorbar and range."""
--> 204         calc_data = plot_data.data[~np.isnan(plot_data.data)]
    205         if vmin is None:
    206             vmin = np.percentile(calc_data, 2) if robust else calc_data.min()

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

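Drop the timestamp variables (i.e. the first two columns) before passing the data to sns.heatmap. There is also no need to load a dataset; just use: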

sns.heatmap(df.iloc[:,2:],annot=True, fmt="d", linewidths=.5)
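One detail worth checking: the df.info() output in the question reports the value columns as float64, and fmt="d" only formats integers, so the annotation step would likely raise a ValueError on float data. The following is a minimal sketch, assuming a small hypothetical sample shaped like the question's frame; it selects the numeric columns by dtype rather than by position and uses fmt=".0f" instead. The sample values and the Agg backend line are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical sample shaped like the frame in the question.
df = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime([
        "2015-08-01 23:00:00", "2015-08-01 23:20:00", "2015-08-01 23:40:00",
    ]),
    "ACT_TIME_AERATEUR_1_F1": [80.0, 60.0, 80.0],
    "ACT_TIME_AERATEUR_1_F3": [0.0, 20.0, 0.0],
    "ACT_TIME_AERATEUR_1_F5": [10.0, 10.0, 10.0],
})

# Keep only the numeric columns, wherever they sit in the frame.
values = df.select_dtypes(include="number")

# ".0f" prints floats without decimals; "d" accepts integers only.
ax = sns.heatmap(values, annot=True, fmt=".0f", linewidths=.5)
```

Selecting by dtype avoids depending on column order, which matters if extra date or time columns are appended to the frame later.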

Edit

OK, here is your dataframe; I only changed the column names to save time:

df
Out[9]: 
           v1        v2  v3  v4  v5  v6  v7  v8
0  2015-08-01  23:00:00  80   0   0   0  10   0
1  2015-08-01  23:20:00  60   0  20   0  10  10
2  2015-08-01  23:40:00  80  10   0   0  10  10
3  2015-08-01  00:00:00  60  10  20  40  10  10

seaborn cannot interpret the timestamp variables as heatmap values, so we drop the first two columns and pass the rest of the dataframe to seaborn:

import seaborn as sns
sns.heatmap(df.iloc[:,2:],annot=True, fmt="d", linewidths=.5)

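This produces the annotated heatmap.

If you do not get a result with this method, please edit your question to include the rest of your code, because then this is not the cause of the problem.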
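To see why a date or time column triggers the TypeError in the traceback, it can help to reproduce the failing call in isolation. The sketch below assumes a tiny hypothetical frame with one column of time objects; in the question's df1 the date and time columns appear to be appended after the value columns, so df1.iloc[:, 2:] would still include them.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame: one column of datetime.time objects
# plus two float columns.
df = pd.DataFrame({
    "time": pd.to_datetime(["2015-08-01 23:20:00", "2015-08-01 23:40:00"]).time,
    "F1": [60.0, 80.0],
    "F2": [0.0, 10.0],
})

mixed = df.to_numpy()                   # object dtype because of the time column
numeric = df[["F1", "F2"]].to_numpy()   # plain float64 array

# np.isnan is only defined for numeric dtypes, so the object array fails.
try:
    np.isnan(mixed)
    isnan_failed = False
except TypeError as exc:
    isnan_failed = True
    print("TypeError:", exc)

print(mixed.dtype, numeric.dtype)
```

Once every column passed to the heatmap is numeric, the same np.isnan call succeeds.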

This happens because you have not set the timestamp as the index. The row names are the index. Do this:

df1.set_index("TIMESTAMP", inplace=True)
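As a minimal sketch of this approach, assuming a small hypothetical frame with float values: once TIMESTAMP is the index, only numeric columns remain as data and seaborn takes the row labels from the index. Formatting the index with strftime is an optional extra step, added here only to keep the labels short.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical sample shaped like the frame in the question.
df1 = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime([
        "2015-08-01 23:00:00", "2015-08-01 23:20:00", "2015-08-01 23:40:00",
    ]),
    "ACT_TIME_AERATEUR_1_F1": [80.0, 60.0, 80.0],
    "ACT_TIME_AERATEUR_1_F3": [0.0, 20.0, 0.0],
})

# Move the timestamp out of the data and into the row labels.
df1 = df1.set_index("TIMESTAMP")

# Optional: shorten the row labels to hour and minute.
df1.index = df1.index.strftime("%H:%M")

ax = sns.heatmap(df1, annot=True, fmt=".0f", linewidths=.5)
```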

Another fix for this problem is:

ax = sns.heatmap(df1.iloc[:, 1:6], annot=True, linewidths=.5)
ax.set_yticklabels([i.strftime("%Y-%m-%d %H:%M:%S") for i in df1.TIMESTAMP], rotation=0)