Applying lambda function to datetime

Question

I am using the following code to find clusters in a list where the difference is <= 1:

from itertools import groupby
from operator import itemgetter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
for k, g in groupby(enumerate(data), lambda (i, x): (i-x)):
    print map(itemgetter(1), g)

However, if I change data to an array of datetimes in order to find clusters of datetimes that are only 1 hour apart, it fails.

I am trying the following:

>>> data
array([datetime.datetime(2016, 10, 1, 8, 0),
       datetime.datetime(2016, 10, 1, 9, 0),
       datetime.datetime(2016, 10, 1, 10, 0), ...,
       datetime.datetime(2019, 1, 3, 9, 0),
       datetime.datetime(2019, 1, 3, 10, 0),
       datetime.datetime(2019, 1, 3, 11, 0)], dtype=object)

    from itertools import groupby
    from operator import itemgetter
    # data is the datetime array shown above
    for k, g in groupby(enumerate(data), lambda (i, x): (i-x).total_seconds()/3600):
        print map(itemgetter(1), g)

The error is:

    for k, g in groupby(enumerate(data), lambda (i, x): int((i-x).total_seconds()/3600)):
TypeError: unsupported operand type(s) for -: 'int' and 'datetime.datetime'

There are plenty of solutions online, but I want to apply this particular approach in order to learn.

Answer 1

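If you want all subsequences of items such that each item is an hour later than the previous one (rather than clusters of items that are all within an hour of one another), you need to iterate over the pairs (data[i-1], data[i]). Currently, you are only iterating over (i, data[i]), which raises a TypeError when you try to subtract data[i] from i. A working example might look like this: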

from itertools import izip  # Python 2; on Python 3 use the built-in zip

def find_subsequences(data):
    if len(data) <= 1:
        return []

    current_group = [data[0]]
    delta = 3600  # maximum gap between neighbours, in seconds
    results = []

    for current, following in izip(data, data[1:]):
        if abs((following - current).total_seconds()) > delta:
            # Here, `current` is the last item of the previous subsequence
            # and `following` is the first item of the next subsequence.
            if len(current_group) >= 2:
                results.append(current_group)

            current_group = [following]
            continue

        current_group.append(following)

    # Flush the final group, which the loop itself never appends.
    if len(current_group) >= 2:
        results.append(current_group)

    return results
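
Since the question asks specifically about the groupby approach, here is a minimal sketch of how the index-based key could be adapted to datetimes. Instead of subtracting the datetime from the index, it subtracts i hours from the i-th datetime, so every member of an hourly run maps to the same key. The names hourly_runs and step are made up for illustration, the sample data is invented, and the code avoids tuple-unpacking lambdas so that it should run on both Python 2.7 and Python 3.

```python
from datetime import datetime, timedelta
from itertools import groupby


def hourly_runs(data, step=timedelta(hours=1)):
    """Group consecutive datetimes that are exactly `step` apart.

    `hourly_runs` and `step` are hypothetical names, not part of any library.
    """
    def key(pair):
        i, x = pair
        # Within a run spaced exactly `step` apart, x - i * step is the
        # same datetime for every member, so groupby sees a constant key.
        return x - i * step

    return [[x for _, x in group]
            for _, group in groupby(enumerate(data), key)]


# Invented sample data: one run of three, a lone value, one run of two.
data = [
    datetime(2016, 10, 1, 8, 0),
    datetime(2016, 10, 1, 9, 0),
    datetime(2016, 10, 1, 10, 0),
    datetime(2016, 10, 1, 13, 0),
    datetime(2019, 1, 3, 9, 0),
    datetime(2019, 1, 3, 10, 0),
]

for run in hourly_runs(data):
    print([d.isoformat() for d in run])
```

Unlike find_subsequences, this sketch only groups items that are exactly one hour apart, and it keeps single-item groups; filter on len(run) >= 2 if those are unwanted.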


Tags: python, lambda