HeatMap visualization
I have a DataFrame df1:
df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38840 entries, 0 to 38839
Data columns (total 7 columns):
TIMESTAMP 38840 non-null datetime64[ns]
ACT_TIME_AERATEUR_1_F1 38696 non-null float64
ACT_TIME_AERATEUR_1_F3 38697 non-null float64
ACT_TIME_AERATEUR_1_F5 38695 non-null float64
ACT_TIME_AERATEUR_1_F6 38695 non-null float64
ACT_TIME_AERATEUR_1_F7 38693 non-null float64
ACT_TIME_AERATEUR_1_F8 38696 non-null float64
dtypes: datetime64[ns](1), float64(6)
memory usage: 2.1 MB
It looks like this:
TIMESTAMP            ACT_TIME_AERATEUR_1_F1  ACT_TIME_AERATEUR_1_F3  ACT_TIME_AERATEUR_1_F5  ACT_TIME_AERATEUR_1_F6  ACT_TIME_AERATEUR_1_F7  ACT_TIME_AERATEUR_1_F8
2015-08-01 05:10:00  100                     100                     100                     100                     100                     100
2015-08-01 05:20:00  100                     100                     100                     100                     100                     100
2015-08-01 05:30:00  100                     100                     100                     100                     100                     100
2015-08-01 05:40:00  100                     100                     100                     100                     100                     100
I am trying to create a heatmap with seaborn to visualize the data between two dates (for example, here between '2015-08-01 23:10:00' and '2015-08-02 02:00:00').
I do it like this:
df1['TIMESTAMP']= pd.to_datetime(df_no_missing['TIMESTAMP'], '%d-%m-%y %H:%M:%S')
df1['date'] = df_no_missing['TIMESTAMP'].dt.date
df1['time'] = df_no_missing['TIMESTAMP'].dt.time
date_debut = pd.to_datetime('2015-08-01 23:10:00')
date_fin = pd.to_datetime('2015-08-02 02:00:00')
df1 = df1[(df1['TIMESTAMP'] >= date_debut) & (df1['TIMESTAMP'] < date_fin)]
sns.heatmap(df1.iloc[:,1:6:],annot=True, linewidths=.5)
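I get the attached heatmap.
My question now is how to replace the numbers on the left side of the heatmap (145...161) with the corresponding timestamp values (2015-08-01 05:10:00, 2015-08-01 05:20:00, 2015-08-01 05:30:00, ...).
Thank you.
Best regards
I tried the suggested modification: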
df1.set_index("TIMESTAMP", inplace=1)
sns.heatmap(df1.iloc[:, 1:6:], annot=True, linewidths=.5)
ax = plt.gca()
ax.set_yticklabels([i.strftime("%Y-%m-%d %H:%M:%S") for i in df1.TIMESTAMP], rotation=0)
Edit
But I get an error and warnings:
C:\Users\Demonstrator\Anaconda3\lib\site-packages\ipykernel\__main__.py:2:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
from ipykernel import kernelapp as app
C:\Users\Demonstrator\Anaconda3\lib\site-packages\ipykernel\__main__.py:3:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
app.launch_new_instance()
C:\Users\Demonstrator\Anaconda3\lib\site-packages\ipykernel\__main__.py:4:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-129-cec498d88cac> in <module>()
9
10 #sns.heatmap(df1.iloc[:,1:6:],annot=True, linewidths=.5)
---> 11 sns.heatmap(df1.iloc[:, 1:6:], annot=True, linewidths=.5)
12 ax = plt.gca()
13 ax.set_yticklabels([i.strftime("%Y-%m-%d %H:%M:%S") for i in df1.TIMESTAMP], rotation=0)
C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in
heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws,
linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, ax,
xticklabels, yticklabels, mask, **kwargs)
483 plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
484 annot_kws, cbar, cbar_kws, xticklabels,
--> 485 yticklabels, mask)
486
487 # Add the pcolormesh kwargs here
C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in
__init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
165 # Determine good default values for the colormapping
166 self._determine_cmap_params(plot_data, vmin, vmax,
--> 167 cmap, center, robust)
168
169 # Sort out the annotations
C:\Users\Demonstrator\Anaconda3\lib\site-packages\seaborn\matrix.py in
_determine_cmap_params(self, plot_data, vmin, vmax, cmap, center, robust)
204 calc_data = plot_data.data[~np.isnan(plot_data.data)]
205 if vmin is None:
--> 206 vmin = np.percentile(calc_data, 2) if robust else calc_data.min()
207 if vmax is None:
208 vmax = np.percentile(calc_data, 98) if robust else calc_data.max()
C:\Users\Demonstrator\Anaconda3\lib\site-packages\numpy\core\_methods.py
in _amin(a, axis, out, keepdims)
27
28 def _amin(a, axis=None, out=None, keepdims=False):
---> 29 return umr_minimum(a, axis, None, out, keepdims)
30
31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
ValueError: zero-size array to reduction operation minimum which has no identity
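A note on the traceback above: "zero-size array to reduction operation minimum" is what NumPy raises when min() is called on an empty array, and seaborn drops NaN values before taking the minimum. So it may be worth checking whether the frame passed to sns.heatmap still has any non-NaN values; the thread does not confirm that this is the cause here. The sketch below uses a made-up stand-in frame named df (an assumption, not the real data) and takes an explicit copy of the filtered rows, which is the usual way to avoid SettingWithCopyWarning when adding columns afterwards.

```python
import pandas as pd

# Made-up stand-in for the original data (assumption: the real frame is larger).
timestamps = pd.date_range("2015-08-01 22:50:00", periods=6, freq="10min")
df = pd.DataFrame({
    "TIMESTAMP": timestamps,
    "ACT_TIME_AERATEUR_1_F1": [100.0] * 6,
})

date_debut = pd.to_datetime("2015-08-01 23:10:00")
date_fin = pd.to_datetime("2015-08-02 02:00:00")

# Filter with .loc and take an explicit copy, so that later column
# assignments act on an independent frame rather than on a slice.
in_range = (df["TIMESTAMP"] >= date_debut) & (df["TIMESTAMP"] < date_fin)
window = df.loc[in_range].copy()
window["time"] = window["TIMESTAMP"].dt.time

# Select the value columns by name and make sure there is something to plot.
values = window.filter(like="ACT_TIME_AERATEUR")
if values.empty or values.isna().all().all():
    raise ValueError("No non-NaN values left to plot; check the date filter.")
```

If the check raises, the filter (or an earlier cell that already overwrote df1) has left nothing to draw, and sns.heatmap would fail in the same way as in the traceback.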
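@jeanrjc, look at the last image; there is a problem: the image is too small and there are two vertical lines (ticks) on the right. I hope I am clear now.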
This is because TIMESTAMP is not your index. From the sns.heatmap docstring:
yticklabels : list-like, int, or bool, optional
If True, plot the row names of the dataframe. If False, don't plot
the row names. If list-like, plot these alternate labels as the
yticklabels. If an integer, use the index names but plot only every
n label.
The row names are the index.
So you can set the index accordingly:
df1.set_index("TIMESTAMP", inplace=True)
and use your sns command; it will almost work. The problem is that you will get an ugly date representation.
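One way around the ugly date representation, as a minimal sketch under assumptions: the small frame below is a made-up stand-in for the filtered df1, and plot_heatmap is a hypothetical helper name. Formatting the index as strings before plotting gives seaborn ready-made row names. Note that once TIMESTAMP is the index it is no longer column 0, so df1.iloc[:, 1:6] would skip the first value column, and df1.TIMESTAMP is no longer available.

```python
import pandas as pd

# Made-up stand-in for the filtered df1 (assumption, not the real data).
timestamps = pd.date_range("2015-08-01 23:10:00", periods=4, freq="10min")
df1 = pd.DataFrame({
    "TIMESTAMP": timestamps,
    "ACT_TIME_AERATEUR_1_F1": [100.0, 100.0, 90.0, 80.0],
    "ACT_TIME_AERATEUR_1_F3": [100.0, 95.0, 85.0, 75.0],
})

# Move TIMESTAMP into the index, then replace the datetime index
# with preformatted strings so the row names are readable.
indexed = df1.set_index("TIMESTAMP")
indexed.index = indexed.index.strftime("%Y-%m-%d %H:%M:%S")


def plot_heatmap(frame):
    """Draw the heatmap; the row names come from the frame's index."""
    # Imported here so the data preparation above runs without seaborn.
    import seaborn as sns

    ax = sns.heatmap(frame, annot=True, linewidths=.5)
    ax.tick_params(axis="y", labelrotation=0)  # keep the labels horizontal
    return ax
```

Calling plot_heatmap(indexed) should then draw one row per timestamp, labelled with the formatted strings.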
Alternatively, instead of changing the index, you can do this:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
...
...
ax = sns.heatmap(df1.iloc[:, 1:6:], annot=True, linewidths=.5)
ax.set_yticklabels([i.strftime("%Y-%m-%d %H:%M:%S") for i in df1.TIMESTAMP], rotation=0)
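A variant of the snippet above, again only a sketch: the frame is a made-up stand-in for the filtered df1, and plot_heatmap is a hypothetical helper name. It passes the formatted labels through the yticklabels parameter quoted from the docstring, so the number of labels always matches the number of rows drawn, and it selects the value columns by name instead of by position.

```python
import pandas as pd

# Made-up stand-in for the filtered df1 (assumption, not the real data).
timestamps = pd.date_range("2015-08-01 23:10:00", periods=3, freq="10min")
df1 = pd.DataFrame({
    "TIMESTAMP": timestamps,
    "ACT_TIME_AERATEUR_1_F1": [100.0, 90.0, 80.0],
    "ACT_TIME_AERATEUR_1_F3": [100.0, 95.0, 85.0],
})

# One formatted label per row, taken from the TIMESTAMP column.
labels = df1["TIMESTAMP"].dt.strftime("%Y-%m-%d %H:%M:%S").tolist()

# Keep only the measurement columns, whatever their position.
values = df1.filter(like="ACT_TIME_AERATEUR")


def plot_heatmap(values, labels):
    """Draw the heatmap with the given list of row labels."""
    # Imported here so the data preparation above runs without seaborn.
    import seaborn as sns

    ax = sns.heatmap(values, annot=True, linewidths=.5, yticklabels=labels)
    ax.tick_params(axis="y", labelrotation=0)  # keep the labels horizontal
    return ax
```

Calling plot_heatmap(values, labels) leaves df1 and its index untouched, which keeps later filtering on TIMESTAMP possible.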
HTH