Heatmap of values grouped by time - seaborn

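I am plotting counts of a variable, grouped by time, as a heatmap. However, when both the hour and the minute are used as grouping keys, the counts are very low, so the resulting heatmap gives no real insight. Is it possible to group the counts into larger blocks of time? I'd like to test a few different block sizes (5 minutes, 10 minutes).

I would also like to plot time on the x-axis, similar to the attached output.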

import seaborn as sns
import pandas as pd
import numpy as np
from datetime import datetime
from datetime import timedelta

start = datetime(1900,1,1,10,0,0)
end = datetime(1900,1,1,13,0,0)

seconds = (end - start).total_seconds()

step = timedelta(minutes = 1)

array = []
for i in range(0, int(seconds), int(step.total_seconds())):
    array.append(start + timedelta(seconds=i))

array = [i.strftime('%Y-%m-%d %H:%M:%S') for i in array]

df2 = pd.DataFrame(array).rename(columns = {0:'Time'})
df2['Count'] = np.random.uniform(0.0, 0.5, size = len(df2))
df2['Count'] = df2['Count'].round(1)

df2['Time'] = pd.to_datetime(df2['Time'])
df2['Hour'] = df2['Time'].dt.hour
df2['Min'] = df2['Time'].dt.minute

g = df2.groupby(['Hour','Min','Count'])

count_df = g['Count'].nunique().unstack()

count_df.fillna(0, inplace = True)

sns.heatmap(count_df)

One way to do this is to create a column of numbers in which each number is repeated once for every minute in the block. For example:

minutes = 3
x = [0,1,2]
np.repeat(x, repeats=minutes, axis=0)
# array([0, 0, 0, 1, 1, 1, 2, 2, 2])

Then group the data by this column.

So your code would look like this:

...
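minutes = 5
# One label per block of `minutes` rows: 0,0,0,0,0,1,1,1,1,1,...
x = [i for i in range(int(df2.shape[0]/minutes))]
df2['group'] = np.repeat(x, repeats=minutes, axis=0)

# Group by the block label rather than by the minute
g = df2.groupby(['group', 'Count'])

# size() counts the rows for each (group, Count) pair;
# nunique() on a grouping key would always return 1
count_df = g.size().unstack()

count_df.fillna(0, inplace = True)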
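The np.repeat labels only line up when the number of rows is an exact multiple of the block size. As a minimal sketch of the same idea without that restriction, the block label can be derived from the row position by integer division. The helper name block_counts and the synthetic df2 below are illustrative assumptions, not part of the original code, and the approach still assumes one row per minute in time order.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df2: one row per minute, 10:00 to 12:59 (180 rows).
rng = np.random.default_rng(0)
df2 = pd.DataFrame({
    'Time': pd.date_range('1900-01-01 10:00:00', periods=180, freq='min'),
    'Count': rng.uniform(0.0, 0.5, size=180).round(1),
})

def block_counts(df, minutes):
    """Count how often each Count value occurs in each block of `minutes` rows."""
    # Row position // block size yields 0,0,...,1,1,... and does not need
    # len(df) to be an exact multiple of the block size.
    group = np.arange(len(df)) // minutes
    return df.groupby([group, 'Count']).size().unstack(fill_value=0)

# Try several block sizes, including one (7) that does not divide 180 evenly.
tables = {minutes: block_counts(df2, minutes) for minutes in (5, 7, 10)}
for minutes, table in tables.items():
    print(minutes, table.shape)
```

Each table has one row per block and one column per distinct Count value, so it can be passed to sns.heatmap in the same way as count_df.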

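To handle this, I think the easiest option is to downsample the data with resample. Changing the interval (the '10min' below) is also easy. The axis labels in the output plot still need some adjustment, but this is the approach I would recommend.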

import seaborn as sns
import pandas as pd
import numpy as np
from datetime import datetime
from datetime import timedelta

start = datetime(1900,1,1,10,0,0)
end = datetime(1900,1,1,13,0,0)

seconds = (end - start).total_seconds()

step = timedelta(minutes = 1)

array = []
for i in range(0, int(seconds), int(step.total_seconds())):
    array.append(start + timedelta(seconds=i))

array = [i.strftime('%Y-%m-%d %H:%M:%S') for i in array]

df2 = pd.DataFrame(array).rename(columns = {0:'Time'})
df2['Count'] = np.random.uniform(0.0, 0.5, size = len(df2))
df2['Count'] = df2['Count'].round(1)

df2['Time'] = pd.to_datetime(df2['Time'])
df2['Hour'] = df2['Time'].dt.hour
df2['Min'] = df2['Time'].dt.minute

df2.set_index('Time', inplace=True)
count_df = df2.resample('10min')['Count'].value_counts().unstack()
count_df.fillna(0, inplace = True)

sns.heatmap(count_df.T)
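The axis labels mentioned above can be tidied by turning the bin timestamps into short HH:MM strings before plotting. The following is only a sketch under stated assumptions: heatmap_table and plot_heatmap are hypothetical helper names, the data is a synthetic stand-in for df2, and the binning uses groupby with pd.Grouper in place of the resample call, which should give the same kind of table.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df2 from the answer: per-minute rows indexed by Time.
rng = np.random.default_rng(0)
idx = pd.date_range('1900-01-01 10:00:00', periods=180, freq='min', name='Time')
df2 = pd.DataFrame({'Count': rng.uniform(0.0, 0.5, size=len(idx)).round(1)}, index=idx)

def heatmap_table(df, freq='10min'):
    """Rows: Count values. Columns: bin start times as 'HH:MM' strings."""
    # Bin the DatetimeIndex into `freq`-wide intervals and count rows per Count value.
    table = df.groupby([pd.Grouper(freq=freq), 'Count']).size().unstack(fill_value=0)
    # Replace the full timestamps with short labels for the x-axis.
    table.index = table.index.strftime('%H:%M')
    return table.T

def plot_heatmap(table):
    # Imported here so the table can be built without the plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns
    ax = sns.heatmap(table, cbar_kws={'label': 'rows per bin'})
    ax.set_xlabel('Time (bin start)')
    ax.set_ylabel('Count value')
    plt.setp(ax.get_xticklabels(), rotation=45, ha='right')
    return ax

table = heatmap_table(df2, '10min')
```

Calling plot_heatmap(table) draws the heatmap with time on the x-axis; passing '5min' to heatmap_table instead gives the finer binning.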