Get unions and intersections of lists of datetime ranges in Python
I have two lists of datetime ranges.
For example:
l1 = [(datetime.datetime(2018, 8, 29, 1, 0, 0), datetime.datetime(2018, 8, 29, 3, 0, 0)), (datetime.datetime(2018, 8, 29, 6, 0, 0), datetime.datetime(2018, 8, 29, 9, 0, 0))]
l2 = [(datetime.datetime(2018, 8, 29, 2, 0, 0), datetime.datetime(2018, 8, 29, 4, 0, 0)), (datetime.datetime(2018, 8, 29, 5, 0, 0), datetime.datetime(2018, 8, 29, 7, 0, 0))]
I want to get the union and the intersection of l1 and l2.
The desired output is:
union = [(datetime.datetime(2018, 8, 29, 1, 0, 0), datetime.datetime(2018, 8, 29, 4, 0, 0)), (datetime.datetime(2018, 8, 29, 5, 0, 0), datetime.datetime(2018, 8, 29, 9, 0, 0))]
intersection = [(datetime.datetime(2018, 8, 29, 2, 0, 0), datetime.datetime(2018, 8, 29, 3, 0, 0)), (datetime.datetime(2018, 8, 29, 6, 0, 0), datetime.datetime(2018, 8, 29, 7, 0, 0))]
The actual data may not be so perfectly aligned.
Your definitions of union and intersection for date ranges can be described simply as follows.
Union:
In []:
from itertools import product
[(min(s1, s2), max(e1, e2)) for (s1, e1), (s2, e2) in product(l1, l2) if s1 <= e2 and e1 >= s2]
Out[]:
[(datetime.datetime(2018, 8, 29, 1, 0), datetime.datetime(2018, 8, 29, 4, 0)),
(datetime.datetime(2018, 8, 29, 5, 0), datetime.datetime(2018, 8, 29, 9, 0))]
Intersection:
In []:
[(max(s1, s2), min(e1, e2)) for (s1, e1), (s2, e2) in product(l1, l2) if s1 <= e2 and e1 >= s2]
Out[]:
[(datetime.datetime(2018, 8, 29, 2, 0), datetime.datetime(2018, 8, 29, 3, 0)),
(datetime.datetime(2018, 8, 29, 6, 0), datetime.datetime(2018, 8, 29, 7, 0))]
You can replace <= and >= with < and > if the ranges must strictly overlap rather than merely touch.
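To see how these pairwise comprehensions behave on less tidy input, here is a minimal sketch. The function names, the `strict` flag, and the integer sample data are assumptions made for illustration (integers stand in for datetimes); the logic is the same as in the two comprehensions above.

```python
from itertools import product

def pairwise_union(l1, l2):
    # One merged range per overlapping (l1, l2) pair, as in the comprehension above
    return [(min(s1, s2), max(e1, e2))
            for (s1, e1), (s2, e2) in product(l1, l2)
            if s1 <= e2 and e1 >= s2]

def pairwise_intersection(l1, l2, strict=False):
    # strict is a hypothetical flag: True swaps <= / >= for < / >
    result = []
    for (s1, e1), (s2, e2) in product(l1, l2):
        keep = (s1 < e2 and e1 > s2) if strict else (s1 <= e2 and e1 >= s2)
        if keep:
            result.append((max(s1, s2), min(e1, e2)))
    return result

# Hypothetical, less tidy data
a = [(1, 5), (20, 25)]
b = [(2, 3), (4, 8)]
print(pairwise_union(a, b))  # [(1, 5), (1, 8)]

# Ranges that only touch at 3
print(pairwise_intersection([(1, 3)], [(3, 5)]))               # [(3, 3)]
print(pairwise_intersection([(1, 3)], [(3, 5)], strict=True))  # []
```

The first result contains two overlapping ranges and no trace of (20, 25), since that range has no partner in the other list. A merging step is therefore worth adding when the data is not neatly aligned.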
The answer here is very useful for the problem you pose, since it can compact an array of overlapping ranges:
from operator import itemgetter

def consolidate(intervals):
    sorted_intervals = sorted(intervals, key=itemgetter(0))
    if not sorted_intervals:  # no intervals to merge
        return
    # low and high represent the bounds of the current run of merges
    low, high = sorted_intervals[0]
    for iv in sorted_intervals[1:]:
        if iv[0] <= high:  # new interval overlaps current run
            high = max(high, iv[1])  # merge with the current run
        else:  # current run is over
            yield low, high  # yield accumulated interval
            low, high = iv  # start new run
    yield low, high  # end the final run
The union of l1 and l2 is simply the consolidation of all ranges in both l1 and l2:
def union(l1, l2):
    return consolidate([*l1, *l2])
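The intersection of l1 and l2 is handled adequately by AChampion's code (if any range in l1 overlaps any range in l2, that overlap belongs in the result), but it can leave the result fragmented; we can use the same function to join adjacent or overlapping ranges: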
from itertools import product

def intersection(l1, l2):
    result = ((max(s1, s2), min(e1, e2)) for (s1, e1), (s2, e2) in product(l1, l2) if s1 < e2 and e1 > s2)
    return consolidate(result)
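As an aside, product compares every range in l1 with every range in l2. If both lists are already sorted and free of internal overlaps (for example after passing each through consolidate), a two-pointer sweep is one possible alternative. This is a sketch of a different technique, not the code from the answer, and the name intersection_sorted is made up for illustration:

```python
def intersection_sorted(a, b):
    """Intersect two lists of (start, end) ranges.

    Assumes each list is sorted by start and has no overlapping ranges.
    """
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # strict overlap, like the < / > test above
            out.append((start, end))
        # Advance whichever range finishes first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

# Hypothetical pre-consolidated input
a = [(1, 8), (10, 15), (20, 30), (50, 60)]
b = [(3, 6), (8, 11), (15, 20)]
print(intersection_sorted(a, b))  # [(3, 6), (10, 11)]
```

Each step advances one of the two indices, so the sweep looks at each range at most once instead of at every pair.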
An example using the union and intersection functions above:
l1 = [(1, 7), (4, 8), (10, 15), (20, 30), (50, 60)]
l2 = [(3, 6), (8, 11), (15, 20)]
print(list(union(l1, l2))) # [(1, 30), (50, 60)]
print(list(intersection(l1, l2))) # [(3, 6), (10, 11)]
(For clarity the example uses integers, but it works with any comparable type. Specifically, for the OP's l1 and l2, the code yields the OP's desired datetime results.)
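For completeness, the same functions can be run on the datetime data from the question. The sketch below repeats the definitions so that it runs on its own; the hour helper is a hypothetical shorthand introduced here to keep the sample data short.

```python
from datetime import datetime
from itertools import product
from operator import itemgetter

def consolidate(intervals):
    """Merge overlapping or touching (start, end) ranges, yielding them in order."""
    sorted_intervals = sorted(intervals, key=itemgetter(0))
    if not sorted_intervals:
        return
    low, high = sorted_intervals[0]
    for start, end in sorted_intervals[1:]:
        if start <= high:
            high = max(high, end)
        else:
            yield low, high
            low, high = start, end
    yield low, high

def union(l1, l2):
    return consolidate([*l1, *l2])

def intersection(l1, l2):
    overlaps = ((max(s1, s2), min(e1, e2))
                for (s1, e1), (s2, e2) in product(l1, l2)
                if s1 < e2 and e1 > s2)
    return consolidate(overlaps)

def hour(h):
    """Hypothetical helper: h o'clock on 2018-08-29."""
    return datetime(2018, 8, 29, h)

l1 = [(hour(1), hour(3)), (hour(6), hour(9))]
l2 = [(hour(2), hour(4)), (hour(5), hour(7))]

union_result = list(union(l1, l2))
intersection_result = list(intersection(l1, l2))

print(union_result)         # 01:00 to 04:00 and 05:00 to 09:00
print(intersection_result)  # 02:00 to 03:00 and 06:00 to 07:00
```

Both functions return generators, so wrap the call in list() when a list is needed, as done here.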