Calculate the average of the times printed in a log file with Linux commands (or Python)?

I have a log file that looks like this:

[info] Estimate the time: 2.7s
[info] Estimate some other time: 7.9s 
[info] Estimate the time: 5.6s
[debug] variable x uninitialized

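I want to compute the average of the times that follow "Estimate the time:". In this example that is (2.7+5.6)/2=4.15.

How can I get this number quickly with Linux commands or Python? Thanks.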

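total = 0.0
cnt = 0
with open('your_log') as logs:
    for log in logs:
        if "Estimate the time:" in log:
            # The value is the last field of the line, e.g. "2.7s"
            total += float(log.split()[-1].rstrip("s"))
            cnt += 1
print(total / cnt if cnt else "no data")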

Here is a Python script that uses a regular expression:

import re

# Read the whole log file into a string; "with" closes the file afterwards
with open('your_log', 'r') as f:
    text = f.read()

# Capture the number after "Estimate the time: " (integer or decimal)
matches = re.findall(r'Estimate the time: (\d+(?:\.\d+)?)s', text)
if matches:
    times = [float(t) for t in matches]  # Convert the matched strings to floats
    mean = sum(times) / len(times)       # Mean via the built-in sum() and len()
    print(mean)
else:
    print("no data")
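
For large logs, a line-by-line variant of the same regex idea may be preferable, since it avoids reading the whole file into memory. The following is only a sketch: the function name `average_estimate_time` and the choice to return `None` when nothing matches are assumptions made for illustration, not part of the script above.

```python
import re
from typing import Iterable, Optional

# Matches "Estimate the time: 2.7s" and also integer values such as "3s".
PATTERN = re.compile(r"Estimate the time: (\d+(?:\.\d+)?)s")


def average_estimate_time(lines: Iterable[str]) -> Optional[float]:
    """Return the mean of the matched times, or None if no line matched."""
    total = 0.0
    count = 0
    for line in lines:
        match = PATTERN.search(line)
        if match:
            total += float(match.group(1))
            count += 1
    return total / count if count else None


# The sample log from the question, used here as in-memory lines.
sample = [
    "[info] Estimate the time: 2.7s",
    "[info] Estimate some other time: 7.9s",
    "[info] Estimate the time: 5.6s",
    "[debug] variable x uninitialized",
]
print(average_estimate_time(sample))
```

An open file object is also an iterable of lines, so `average_estimate_time(open('your_log'))` would work the same way on a real file.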

Using awk:

awk '/\[info\] Estimate the time:/ { map[cnt++]=+$5 } END { for (i in map) { cnt1++;tot=tot+map[i] } print tot/cnt1 }' logfile

Explanation:

awk '/\[info\] Estimate the time:/ {                # Process lines that contain "[info] Estimate the time:"
                 map[cnt++]=+$5                     # Store the 5th space-delimited field (e.g. "2.7s") in an array called map under an incrementing index; the unary + converts it to a number
               } 
           END {                                    # Process at the end of the file
                 for (i in map) { 
                    cnt1++;                         # Loop through the array and increment a counter with each iteration
                    tot=tot+map[i]                  # Create a running total variable
                 } 
                 print tot/cnt1                     # Print the running total divided by the count.
                }' logfile
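
For readers less familiar with awk, the same field-based logic can be mirrored in Python. This is a rough equivalent rather than a translation of the exact command: the function name `average_fifth_field` is hypothetical, and the trailing "s" has to be stripped explicitly because Python's `float()` does not accept a numeric prefix the way awk's numeric conversion does.

```python
from typing import Iterable, Optional


def average_fifth_field(lines: Iterable[str]) -> Optional[float]:
    """Average the 5th whitespace-separated field of the matching lines."""
    values = []
    for line in lines:
        if "[info] Estimate the time:" not in line:
            continue
        fields = line.split()  # similar to awk's default field splitting
        if len(fields) >= 5:
            # fields[4] corresponds to awk's $5, e.g. "2.7s"
            values.append(float(fields[4].rstrip("s")))
    # Unlike the awk command, return None instead of dividing by zero
    return sum(values) / len(values) if values else None


log_lines = [
    "[info] Estimate the time: 2.7s",
    "[info] Estimate some other time: 7.9s ",
    "[info] Estimate the time: 5.6s",
    "[debug] variable x uninitialized",
]
print(average_fifth_field(log_lines))
```

Returning `None` for an empty result is a design choice of this sketch; the awk command as written would attempt a division by zero when no line matches.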