Python nested list - Time intervals - intersection and difference

I have a problem with a nested list whose elements are times:

time = [("2017-01-01T00:00:00.000000Z", "2017-01-01T00:00:39.820000Z"),
("2017-01-01T00:00:38.840000Z", "2017-01-01T01:36:33.260000Z"),
("2017-01-01T01:36:45.960000Z", "2017-01-01T03:06:15.340000Z"),
("2017-01-01T03:06:24.320000Z", "2017-01-01T03:31:00.420000Z"),
("2017-01-01T03:31:22.880000Z", "2017-01-01T03:48:43.500000Z"),
("2017-01-01T03:48:53.280000Z", "2017-01-01T04:14:53.660000Z"),
("2017-01-01T04:15:15.160000Z", "2017-01-01T04:34:44.060000Z"),
("2017-01-01T04:34:57.440000Z", "2017-01-01T04:46:31.100000Z"),
("2017-01-01T04:46:53.320000Z", "2017-01-01T05:22:20.340000Z"),
("2017-01-01T05:22:24.920000Z", "2017-01-01T06:17:30.900000Z"),
("2017-01-01T06:18:02.280000Z", "2017-01-01T07:01:45.740000Z"),
("2017-01-01T07:02:04.640000Z", "2017-01-01T07:39:48.780000Z"),
("2017-01-01T07:40:12.400000Z", "2017-01-01T08:19:46.140000Z"),
("2017-01-01T08:20:13.520000Z", "2017-01-01T10:17:45.380000Z"),
("2017-01-01T10:17:59.880000Z", "2017-01-01T15:01:29.100000Z"),
("2017-01-01T15:01:55.840000Z", "2017-01-01T15:08:45.460000Z"),
("2017-01-01T15:09:04.000000Z", "2017-01-01T15:42:13.180000Z"),
("2017-01-01T15:42:30.360000Z", "2017-01-01T16:14:07.340000Z"),
("2017-01-01T16:14:24.560000Z", "2017-01-01T17:11:28.420000Z"),
("2017-01-01T17:11:32.960000Z", "2017-01-01T17:46:07.660000Z"),
("2017-01-01T17:46:30.280000Z", "2017-01-01T18:02:17.860000Z"),
("2017-01-01T18:02:35.240000Z", "2017-01-01T18:16:17.740000Z"),
("2017-01-01T18:16:26.720000Z", "2017-01-01T18:39:10.540000Z"),
("2017-01-01T18:39:19.360000Z", "2017-01-01T19:45:25.860000Z"),
("2017-01-01T19:45:34.720000Z", "2017-01-01T20:41:00.220000Z"),
("2017-01-01T20:41:21.520000Z", "2017-01-01T21:13:51.660000Z"),
("2017-01-01T21:14:13.360000Z", "2017-01-01T21:41:16.220000Z"),
("2017-01-01T21:41:28.640000Z", "2017-01-01T22:03:03.820000Z"),
("2017-01-01T22:03:29.400000Z", "2017-01-01T23:14:13.500000Z"),
("2017-01-01T23:14:35.200000Z", "2017-01-01T23:59:59.980000Z")]

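As you can see, all elements belong to the same day, 2017-01-01. What I want is the difference (in seconds or milliseconds) between the whole day (86400 s) and all the time intervals in the list. Some of the intervals overlap, though, so I think I first have to do some kind of "intersection check" and, once all the overlaps are resolved, subtract the total of the elements from 86400. But how can I do that intersection check? Any suggestions would be appreciated, thanks in advance!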

Desired output: 86400 (day) - 85000 (possible number of seconds covered by the list after resolving overlaps) = 1400

After converting the strings to numbers, you can use the top answer from "Python find continuous interesctions of intervals".
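A minimal sketch of that conversion step, assuming the timestamps are stored as strings in the format shown in the question. The helper name `to_seconds` and the constants are made up for illustration; the helper returns the number of seconds elapsed since midnight of 2017-01-01.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"     # format of the timestamps in the question
DAY_START = datetime(2017, 1, 1)  # assumption: every interval lies on this day

def to_seconds(stamp):
    """Hypothetical helper: seconds since midnight for one timestamp string."""
    return (datetime.strptime(stamp, FMT) - DAY_START).total_seconds()

# First two intervals from the question, written as strings
raw = [("2017-01-01T00:00:00.000000Z", "2017-01-01T00:00:39.820000Z"),
       ("2017-01-01T00:00:38.840000Z", "2017-01-01T01:36:33.260000Z")]

# Each interval becomes a (start, end) pair of floats
numeric = [(to_seconds(a), to_seconds(b)) for a, b in raw]
print(numeric)
```

With numeric pairs, comparing and subtracting interval endpoints is plain arithmetic.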

You can sort the list and then merge any overlaps:

time.sort()
noOverlapList = []
start = time[0][0]  # start of the current merged interval
end = time[0][1]  # end of the current merged interval
for interval in time[1:]:
    if interval[0] <= end:
        # interval overlaps (or touches) the current one: extend it if needed
        end = max(end, interval[1])
    else:
        noOverlapList.append((start, end))  # merged non-overlapping interval
        start = interval[0]
        end = interval[1]
noOverlapList.append((start, end))  # append the last merged interval

Then add up the lengths of the intervals in noOverlapList and subtract the total from 86400.
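As a rough sketch of that last step, assuming `noOverlapList` already holds non-overlapping `(start, end)` pairs expressed in seconds since midnight (the two pairs below correspond to the first three intervals of the question after merging; the names `covered` and `uncovered` are only illustrative):

```python
# Assumption: the intervals are already merged and converted to seconds
# since midnight, e.g. 01:36:33.26 -> 5793.26
noOverlapList = [(0.0, 5793.26), (5805.96, 11175.34)]

# Total time covered by the merged intervals
covered = sum(end - start for start, end in noOverlapList)

# Part of the day (86400 s) not covered by any interval
uncovered = 86400 - covered
print(round(uncovered, 2))
```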

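The problem is twofold:

  • replace overlapping intervals with their union;
  • sum the lengths of the resulting non-overlapping intervals.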

The first part can be done like this:

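time.sort()  # sort by start time (these fixed-format timestamps sort correctly as strings)
new_time = [list(time[0])]  # lists, so the end of the last interval can be updated
for t in time[1:]:
    if t[0] <= new_time[-1][1]:
        # t overlaps the last merged interval: extend it if t ends later
        if t[1] > new_time[-1][1]:
            new_time[-1][1] = t[1]
    else:
        # no overlap: start a new merged interval
        new_time.append(list(t))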

The second part is best done with the datetime module:

import datetime

total = sum([ ( datetime.datetime.strptime(t[1], '%Y-%m-%dT%H:%M:%S.%fZ') -
                datetime.datetime.strptime(t[0], '%Y-%m-%dT%H:%M:%S.%fZ') ).total_seconds()
              for t in new_time ])

print(86400 - total)
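For reference, the two parts can be combined into one self-contained function. This is only a sketch under a few assumptions: the function name `uncovered_seconds` is made up, the input is a non-empty list of string pairs in the question's format, and the timestamps are parsed before merging rather than compared as strings.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"  # timestamp format used in the question

def uncovered_seconds(intervals, day_seconds=86400):
    """Hypothetical wrapper: seconds of the day not covered by any interval."""
    # Parse every endpoint, then sort the intervals by start time
    parsed = sorted((datetime.strptime(a, FMT), datetime.strptime(b, FMT))
                    for a, b in intervals)
    # Part 1: replace overlapping intervals with their union
    merged = [list(parsed[0])]
    for start, end in parsed[1:]:
        if start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Part 2: sum the merged intervals and subtract from the day length
    covered = sum((end - start).total_seconds() for start, end in merged)
    return day_seconds - covered

# First three intervals from the question; the first two overlap
sample = [("2017-01-01T00:00:00.000000Z", "2017-01-01T00:00:39.820000Z"),
          ("2017-01-01T00:00:38.840000Z", "2017-01-01T01:36:33.260000Z"),
          ("2017-01-01T01:36:45.960000Z", "2017-01-01T03:06:15.340000Z")]
print(round(uncovered_seconds(sample), 2))
```

Passing the full list from the question instead of `sample` gives the total length of the gaps for the whole day.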