Applying lambda function to datetime
I am using the following code to find clusters in a list whose consecutive values differ by at most 1:
from itertools import groupby
from operator import itemgetter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
for k, g in groupby(enumerate(data), lambda (i, x): (i-x)):
    print map(itemgetter(1), g)
However, if I change data to an array of datetimes in order to find clusters of datetimes that are only 1 hour apart, it fails.
I am trying the following:
>>> data
array([datetime.datetime(2016, 10, 1, 8, 0),
       datetime.datetime(2016, 10, 1, 9, 0),
       datetime.datetime(2016, 10, 1, 10, 0), ...,
       datetime.datetime(2019, 1, 3, 9, 0),
       datetime.datetime(2019, 1, 3, 10, 0),
       datetime.datetime(2019, 1, 3, 11, 0)], dtype=object)
from itertools import groupby
from operator import itemgetter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
for k, g in groupby(enumerate(data), lambda (i, x): (i-x).total_seconds()/3600):
    print map(itemgetter(1), g)
The error is:
for k, g in groupby(enumerate(data), lambda (i, x): int((i-x).total_seconds()/3600)):
TypeError: unsupported operand type(s) for -: 'int' and 'datetime.datetime'
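There are plenty of solutions online, but I would like to apply this particular approach in order to learn.
If you want to get all subsequences of items such that each item is an hour later than the previous one (as opposed to clusters of items that are all within an hour of one another), you need to iterate over the pairs (data[i-1], data[i]). Currently, you are only iterating over (i, data[i]), which raises a TypeError when you try to subtract data[i] from i. A working example might look like this: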
from itertools import izip  # Python 2; on Python 3 use the built-in zip

def find_subsequences(data):
    if len(data) <= 1:
        return []
    current_group = [data[0]]
    delta = 3600  # maximum allowed gap between neighbours, in seconds
    results = []
    for current, nxt in izip(data, data[1:]):
        if abs((nxt - current).total_seconds()) > delta:
            # Here, `current` is the last item of the previous subsequence
            # and `nxt` is the first item of the next subsequence.
            if len(current_group) >= 2:
                results.append(current_group)
            current_group = [nxt]
            continue
        current_group.append(nxt)
    # Flush the final subsequence, which the loop never closes on its own.
    if len(current_group) >= 2:
        results.append(current_group)
    return results
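Since the question asks about the groupby approach specifically, here is a minimal sketch of how the index-minus-value trick might carry over to datetimes. It assumes Python 3, input sorted in ascending order, and that a cluster means each item is exactly one hour after the previous one. The helper name hourly_runs and the sample data are made up for illustration; they are not from the question.

```python
from datetime import datetime, timedelta
from itertools import groupby


def hourly_runs(stamps, step=timedelta(hours=1)):
    """Split sorted datetimes into runs whose neighbours are exactly `step` apart."""
    # Subtracting i * step from the i-th item gives the same datetime for every
    # member of a run, so that difference can serve as the groupby key.
    keyed = groupby(enumerate(stamps), key=lambda pair: pair[1] - pair[0] * step)
    return [[stamp for _, stamp in group] for _, group in keyed]


# Hypothetical sample data: a three-hour run, a lone value, and a two-hour run.
data = [
    datetime(2016, 10, 1, 8), datetime(2016, 10, 1, 9), datetime(2016, 10, 1, 10),
    datetime(2016, 10, 1, 13),
    datetime(2016, 10, 2, 9), datetime(2016, 10, 2, 10),
]

for run in hourly_runs(data):
    print(run)
```

The original lambda failed because it subtracted a datetime from an integer index. Scaling the index by a timedelta first keeps both operands compatible. Note that this key only matches gaps of exactly one hour; for "within one hour" the pairwise loop above is the better fit.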