How to filter your list based on datetime?

Question

This is my list:

matched_rows_2 = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

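I'm trying to do the following:

For each vehicle number (index 3 of each row), I want to get the row or rows whose datetime is closest to chosen_datetime.

I've tried many things, but nothing has worked so far. Below is my code: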

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")
dts = [datetime.strptime(sub[1] + ' ' + sub[2], "%d-%m-%Y  %H:%M:%S").replace(tzinfo=timezone.utc) for sub in matched_rows_2]

for x in matched_rows_2:
    closest_to_chosen_datetime = min(dts, key=lambda d: max( d, chosen_datetime) - min(d, chosen_datetime))
    if closest_to_chosen_datetime:
        print(x)

This is the output I want:

['1', '07-09-2020', '8:43:02', '100', 'TTF'],
['2', '07-09-2020', '8:43:02', '100', 'GGY'],
['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
['5', '07-09-2020', '8:30:00', '160', 'RRT'],

This is my current output:

['1', '07-09-2020', '8:43:02', '100', 'TTF'],
['2', '07-09-2020', '8:43:02', '100', 'GGY'],
['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
['4', '07-09-2020', '7:54:01', '160', 'JJH'],
['5', '07-09-2020', '8:30:00', '160', 'RRT'],
['6', '07-09-2020', '10:10:10', '160', 'PPO'],
['7', '07-09-2020', '11:12:11', '100', 'KKG'],
['8', '07-09-2020', '11:31:55', '160', 'PPO']]

I really don't know what's happening or what's going wrong.

Answer 1

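The first problem is that you have a print statement inside your loop. You don't know which rows have the datetime closest to chosen_datetime until after you have iterated over all the items, so printing at that point is premature and is a major cause of the wrong output.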
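To make this concrete, here is a minimal sketch of what the `if` in the original loop actually tests. The two timestamps are hypothetical sample values chosen for illustration, not the full data set. `min()` runs over the whole list, so its result does not depend on the loop variable, and a `datetime` object is always truthy, so the condition never filters anything.

```python
from datetime import datetime, timezone

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")

# Two illustrative timestamps (hypothetical sample values).
dts = [
    datetime(2020, 7, 9, 8, 43, 2, tzinfo=timezone.utc),
    datetime(2020, 7, 9, 11, 12, 11, tzinfo=timezone.utc),
]

# min() looks at the whole list, so the result is identical on every
# iteration of the original loop; it says nothing about the current row.
closest = min(dts, key=lambda d: abs(d - chosen_datetime))

# A datetime object is always truthy, so `if closest:` is always True.
print(closest)
print(bool(closest))
```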

Second, because you are looking for the closest datetime for each vehicle number, you need some logic to group the rows by vehicle number.

One option is a solution using itertools.groupby; the other, which I've implemented here, stores the results in dictionaries keyed by vehicle number.

There are some comments in the code below; let me know if you'd like any additional detail.

from collections import defaultdict
from datetime import datetime, timezone


matched_rows_2 = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")
dts = [
    datetime.strptime(f'{row[1]} {row[2]}', '%m-%d-%Y %H:%M:%S').replace(tzinfo=timezone.utc)
    for row in matched_rows_2
]

mindelta = defaultdict(lambda: None)
minrows = defaultdict(lambda: None)

# use zip() to combine the timestamps in dts with the
# original data
for ts, row in zip(dts, matched_rows_2):
    # get the absolute difference from chosen_datetime
    delta = abs((ts - chosen_datetime).total_seconds())
    vid = row[3]

    # if it's the closest value for this vid (or if we haven't
    # processed the vid yet), update mindelta[vid] with the current
    # delta and set minrows[vid] to the current row.
    if mindelta[vid] is None or delta < mindelta[vid]:
        mindelta[vid] = delta
        minrows[vid] = [row]

    # if the current delta is equal to the existing closest delta,
    # just append the current row.
    elif delta == mindelta[vid]:
        minrows[vid].append(row)

for vid, rows in minrows.items():
    for row in rows:
        print(row)

Running the above program produces the following output:

['1', '07-09-2020', '8:43:02', '100', 'TTF']
['2', '07-09-2020', '8:43:02', '100', 'GGY']
['3', '07-09-2020', '7:53:08', '120', 'HHJ']
['5', '07-09-2020', '8:30:00', '160', 'RRT']
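
For comparison, the itertools.groupby option mentioned earlier could be sketched roughly as follows. This is only one possible way to write it, not a definitive implementation; the helper names `delta_seconds`, `vehicle`, and `closest_rows` are made up for this example. Note that `groupby()` only groups adjacent items, so the rows are sorted by vehicle number first.

```python
from datetime import datetime, timezone
from itertools import groupby

matched_rows_2 = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")


def delta_seconds(row):
    """Absolute distance in seconds between a row's timestamp and chosen_datetime."""
    ts = datetime.strptime(f'{row[1]} {row[2]}', '%m-%d-%Y %H:%M:%S')
    ts = ts.replace(tzinfo=timezone.utc)
    return abs((ts - chosen_datetime).total_seconds())


def vehicle(row):
    """Vehicle number, stored at index 3 of each row."""
    return row[3]


closest_rows = []
# sorted() is stable, so rows keep their original order within each vehicle.
for vid, group in groupby(sorted(matched_rows_2, key=vehicle), key=vehicle):
    rows = list(group)
    best = min(delta_seconds(r) for r in rows)
    # keep every row that ties for the smallest delta
    closest_rows.extend(r for r in rows if delta_seconds(r) == best)

for row in closest_rows:
    print(row)
```

With the sample data this selects the same four rows (1, 2, 3 and 5) as the dictionary version. The dictionary approach makes a single pass and needs no sorting; the groupby version is shorter but sorts the list and computes each delta twice.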
