Calculating delta time from a log file using Python

I have been trying to find the delta between the first and last timestamps in a log file.

Here is part of the log file:

[2020-07-31 15:49:22,015][SRC.Env][I]:Reading 
[2020-07-31 15:49:22,015][SRC.Env][I]:Finished Initilization 
[2020-07-31 15:49:22,052][SRC][I]:Creating link
[2020-07-31 15:49:22,053][SRC][I]:Starting
.
.
.
[2020-08-03 09:17:29,351][SRC.Upload][I]:Finished

Here is what I have done so far:

import re
from dateutil import parser

with open('run.log') as run_log:
  times = [re.findall(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}',
      line) for line in run_log.readlines() if 'SRC' in line]
print(times)

time_delta = parser.parse(times[-1]) - parser.parse(times[0])
print(time_delta)

When I print times, it shows what I expect: [['2020-07-31 15:49:22,011'], ['2020-07-31 15:49:22,015'],...['2020-08-03 09:17:29,351']]

However, when I try to subtract the first timestamp from the last, I get the following error:

    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\dateutil\parser\_parser.py", line 646, in parse
    res, skipped_tokens = self._parse(timestr, **kwargs)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\dateutil\parser\_parser.py", line 725, in _parse
    l = _timelex.split(timestr)         # Splits the timestr into tokens
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\dateutil\parser\_parser.py", line 207, in split
    return list(cls(s))
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\dateutil\parser\_parser.py", line 76, in __init__
    '{itype}'.format(itype=instream.__class__.__name__))
TypeError: Parser must be a string or character stream, not list

I decided to learn how to code about two months ago, so any help would be great for my progress. Thanks :)

The problem is that re.findall() returns a list.
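
A small illustration of this, using one line from the log excerpt in the question (the variable names line and matches are only for illustration):

```python
import re

# One line taken from the log excerpt in the question
line = '[2020-07-31 15:49:22,015][SRC.Env][I]:Reading'

# findall() collects every match in the string into a list,
# even when there is only one match
matches = re.findall(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}', line)

print(type(matches).__name__)  # list
print(matches)                 # ['2020-07-31 15:49:22,015']
```

So times in the question is a list of one-element lists, and parser.parse() is handed a list instead of a string.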

You can use re.findall(pattern, s)[0] to access the single element:
import re
from dateutil import parser

with open('run.log') as run_log:
    # re.findall() returns a list of matches, so [0] takes the first one as a string
    times = [re.findall(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}',
                        line)[0] for line in run_log if 'SRC' in line]
print(times)

# Each entry is now a plain string, which parser.parse() accepts
time_delta = parser.parse(times[-1]) - parser.parse(times[0])
print(time_delta)

Output:

['2020-07-31 15:49:22,015', '2020-07-31 15:49:22,015', '2020-07-31 15:49:22,052', '2020-07-31 15:49:22,053', '2020-08-03 09:17:29,351']
2 days, 17:28:07.336000
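
Note that [0] raises an IndexError if a line contains 'SRC' but no timestamp. As a possible variation, not the approach used above, here is a minimal sketch that uses re.search() and the standard library's datetime.strptime() instead of dateutil and skips lines without a timestamp. The names log_time_span, TIMESTAMP_RE, TIMESTAMP_FORMAT and sample are hypothetical, and the sample only contains the five lines visible in the question's log excerpt:

```python
import re
from datetime import datetime

# Same timestamp pattern as in the question, compiled once
TIMESTAMP_RE = re.compile(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}')
# %f reads the digits after the comma as a fraction of a second
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S,%f'


def log_time_span(lines):
    """Return the last timestamp minus the first, over lines containing 'SRC'."""
    times = []
    for line in lines:
        if 'SRC' not in line:
            continue
        match = TIMESTAMP_RE.search(line)  # a match object, or None
        if match is None:
            continue  # skip lines without a timestamp instead of failing
        times.append(datetime.strptime(match.group(0), TIMESTAMP_FORMAT))
    if not times:
        raise ValueError('no timestamps found')
    return times[-1] - times[0]


# The lines visible in the question's excerpt (the elided middle is omitted)
sample = [
    '[2020-07-31 15:49:22,015][SRC.Env][I]:Reading',
    '[2020-07-31 15:49:22,015][SRC.Env][I]:Finished Initilization',
    '[2020-07-31 15:49:22,052][SRC][I]:Creating link',
    '[2020-07-31 15:49:22,053][SRC][I]:Starting',
    '[2020-08-03 09:17:29,351][SRC.Upload][I]:Finished',
]

print(log_time_span(sample))  # 2 days, 17:28:07.336000
```

Since an open file yields its lines when iterated, the same function should also work as log_time_span(run_log) inside the with open('run.log') block.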