How to calculate the maximum 15-min sum from a Pandas Series or Dataframe
Pandas newbie here. I have a dataset of traffic counts with timestamps. I want to find out which 15-minute interval has the highest total count, and what that total is.
The data could look like this:
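import random
import pandas as pd
# list(...) because random.sample on Python 3 expects a sequence such as a list
ts = pd.Series(range(1000), index=random.sample(list(pd.date_range('2015-02-01 06:00:00', periods=3000, freq='1min')), 1000)).sort_index()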
2015-02-01 06:06:00 314
2015-02-01 06:08:00 154
2015-02-01 06:09:00 914
2015-02-01 06:13:00 84
2015-02-01 06:18:00 880
2015-02-01 06:22:00 912
2015-02-01 06:28:00 410
2015-02-01 06:32:00 391
2015-02-01 06:34:00 270
2015-02-01 06:35:00 984
2015-02-01 06:36:00 271
2015-02-01 06:37:00 722
2015-02-01 06:38:00 748
2015-02-01 06:40:00 313
2015-02-01 06:42:00 277
2015-02-01 06:43:00 604
2015-02-01 06:49:00 888
2015-02-01 06:50:00 943
2015-02-01 06:51:00 124
2015-02-01 06:52:00 806
Is there a way to do this in Pandas?
A simple solution that does not use pandas' native functions:
from datetime import timedelta

start = ts.index[0]
end = ts.index[-1]
dur = timedelta(minutes=15)
max_val = 0
max_seg = None
# Step through consecutive 15-minute windows; label slicing includes both endpoints
while start < end:
    cum_sum = ts[start : start+dur].sum()
    if cum_sum > max_val:
        max_val = cum_sum
        max_seg = (start, start+dur)
    start = start+dur
print(max_val)
print(max_seg)
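The loop above walks fixed, non-overlapping 15-minute windows. A minimal pandas-native sketch of roughly the same idea uses `resample`, assuming `ts` has a sorted `DatetimeIndex`. Two differences from the loop are worth noting: the bins are aligned to clock quarter-hours rather than to the first timestamp, and each bin is closed on the left only. The helper name `peak_fixed_window` and the small `sample` series (the first rows of the data shown in the question) are illustrative, not part of the original answer.

```python
import pandas as pd

def peak_fixed_window(ts, freq="15min"):
    """Return (bin_start, total) for the fixed-width bin with the largest sum."""
    sums = ts.resample(freq).sum()   # one sum per clock-aligned bin
    start = sums.idxmax()            # label (left edge) of the busiest bin
    return start, sums.loc[start]

# First few rows of the sample data from the question
idx = pd.to_datetime([
    "2015-02-01 06:06", "2015-02-01 06:08", "2015-02-01 06:09",
    "2015-02-01 06:13", "2015-02-01 06:18", "2015-02-01 06:22",
    "2015-02-01 06:28",
])
sample = pd.Series([314, 154, 914, 84, 880, 912, 410], index=idx)

start, total = peak_fixed_window(sample)
print(start, total)  # bin starting 06:15 with a total of 2202
```

Because the bins never overlap, this can miss a busier window that straddles two bins.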
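This is what I came up with:
def find_peak_15_minutes(data_frame, column):
    # Appears to assume data_frame[column] holds numeric times in minutes, one row per event
    max_sum = 0
    start_of_max15 = 0
    for start in data_frame[column].values:
        # Count the rows whose time falls in [start, start + 15]
        series_sum = data_frame[column][data_frame[column].between(start, start + 15)].count()
        if series_sum > max_sum:
            max_sum = series_sum
            start_of_max15 = start
    return (start_of_max15, max_sum)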
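For the timestamped series in the question, a sliding window can also be sketched with pandas' time-based `rolling`, assuming a sorted `DatetimeIndex`. With an offset window such as `'15min'`, each value is the sum over the trailing interval (t - 15min, t], so only windows that end at an observation are considered. The helper name `peak_rolling_window` and the `sample` series are illustrative assumptions.

```python
import pandas as pd

def peak_rolling_window(ts, window="15min"):
    """Return (window_start, window_end, total) for the busiest trailing window."""
    sums = ts.rolling(window).sum()        # trailing sum ending at each timestamp
    end = sums.idxmax()                    # timestamp where the trailing sum peaks
    return end - pd.Timedelta(window), end, sums.loc[end]

# First few rows of the sample data from the question
idx = pd.to_datetime([
    "2015-02-01 06:06", "2015-02-01 06:08", "2015-02-01 06:09",
    "2015-02-01 06:13", "2015-02-01 06:18", "2015-02-01 06:22",
    "2015-02-01 06:28",
])
sample = pd.Series([314, 154, 914, 84, 880, 912, 410], index=idx)

win_start, win_end, total = peak_rolling_window(sample)
print(win_start, win_end, total)  # window ending 06:22 with a total of 2944
```

The returned start is exclusive and the end inclusive, which matches the default `closed='right'` behaviour of offset-based rolling windows.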