The loop does not iterate over the log file in bash

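Hello everyone,

I'm stuck and have to ask for help.

I am trying to write a simple script that shows how long procedures take, based on a log file. However, my loop does not work properly.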

fun.sh

STARTPROCEDURES="START Search"
ENDPROCEDURES="END Search"

start() {
start="$(grep "$STARTPROCEDURES" s.log | cut -d ' ' -f2)"

hours="$(date -d $start '+%H')"
minutes="$(date -d $start '+%M')"
seconds="$(date -d $start '+%S')"
milliseconds=$(($(date -d $start +%N  | sed 's/^0*//')/1000000))

starttime=$(((hours * 360000000 + minutes * 60000 + seconds * 1000 + milliseconds)))

echo "Begins: " $starttime
}

end() {
end="$(grep "$ENDPROCEDURES" s.log | cut -d ' ' -f2)"

hours="$(date -d $end '+%H')"
minutes="$(date -d $end '+%M')"
seconds="$(date -d $end '+%S')"
milliseconds=$(($(date -d $end +%N  | sed 's/^0*//')/1000000))

endtime=$(((hours * 360000000 + minutes * 60000 + seconds * 1000 + milliseconds)))

echo "Ends: " $endtime
}

difference() {
echo "DIFFERENCE -----"

difference=$((endtime  - starttime))
echo "The difference is" $difference "milliseconds"
}

calculate() {
start
end
difference
}

while IFS= read -r line || [[ -n $line ]]; do
calculate
echo "-------"
done < s.log

s.log:

2019-02-22 06:27:06,857 INFO [ProcedureUtil] - (user1,14)  START Search
2019-02-22 06:27:06,939 INFO [ProcedureUtil] - (user1,14)  END Search
2019-02-22 07:28:16,088 INFO [ProcedureUtil] - (user1,67)  START Search
2019-02-22 07:28:16,121 INFO [ProcedureUtil] - (user1,67)  END Search

bash fun.sh

Houston, we have a problem.

Console output:

date: extra operand ‘+%H’
Try 'date --help' for more information.
date: extra operand ‘+%M’
Try 'date --help' for more information.
date: extra operand ‘+%S’
Try 'date --help' for more information.
date: extra operand ‘+%N’
Try 'date --help' for more information.
fun.sh: line 21: /1000000: syntax error: operand expected (error token is "/1000000")

It should look like this:

Begins:  2161626857
Ends:  2161626939
DIFFERENCE -----
The difference is 82 milliseconds
------
Begins:  216162343
Ends:  216162355
DIFFERENCE -----
The difference is 162 milliseconds

If s.log is:

2019-02-22 06:27:06,857 INFO [ProcedureUtil] - (user1,14)  START Search
2019-02-22 06:27:06,939 INFO [ProcedureUtil] - (user1,14)  END Search

Console output:

Begins:  2161626857
Ends:  2161626939
DIFFERENCE -----
The difference is 82 milliseconds
-------
Begins:  2161626857
Ends:  2161626939
DIFFERENCE -----
The difference is 82 milliseconds
-------

The same result is printed twice; it should be printed once.

I would be grateful for any help.

I am also considering a case where s.log is:

2019-02-22 06:27:06,857 INFO [ProcedureUtil] - (user1,14)  START Search
2019-02-22 06:27:06,939 INFO [ProcedureUtil] - (user1,14)  END Search
2019-02-22 07:28:16,088 INFO [ProcedureUtil] - (user1,67)  START Split
2019-02-22 07:28:16,121 INFO [ProcedureUtil] - (user1,67)  END Split
2019-02-22 07:28:16,088 INFO [ProcedureUtil] - (user1,67)  START Search
2019-02-22 07:28:16,121 INFO [ProcedureUtil] - (user1,67)  END Search
2019-02-25 20:59:59,999 INFO [ProcedureUtil] - (user1,17)  START Search
2019-02-25 02:59:59,999 INFO [ProcedureUtil] - (user1,18)  START Search

Incorrect solution:

COUNTER=0
while IFS= read -r line || [[ -n $line ]]; do
if [[ $line == *"START Search"* ]]; then
    start=$(time2millis "$line")
    echo "Begins: $start"

elif [[ $line == *"END Search"* ]]; then 
    end=$(time2millis "$line")
    echo "Ends: $end"

    # Assume every END has a preceding START
    difference "$start" "$end"
else
     COUNTER=$((COUNTER+1))
     echo "$COUNTER"
fi     
done < s.log

Console output:

Begins: 23226857
Ends: 23226939
DIFFERENCE -----
The difference is 82 milliseconds
------
Begins: 26896088
Ends: 26896121
DIFFERENCE -----
The difference is 33 milliseconds
------
Begins: 75599999
Begins: 10799999

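Concerns:

  • You are reading the file line by line, but you never use $line anywhere: you grep the whole file on every iteration
  • Because the grep call returns multiple matches, your $start variable contains a newline
  • Because you do not quote the variable in the date call, the command ends up looking like this

    date -d 06:27:06,857 07:28:16,088 '+%H'
    

    which is clearly too many arguments for date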
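
To make the quoting problem concrete, here is a small sketch. The `count_args` helper is hypothetical and exists only for illustration, and the `start` value is hard-coded to mimic what grep and cut return when two lines match:

```bash
#!/usr/bin/env bash
# Simulate the two-line value that grep | cut produces for the four-line s.log.
start=$'06:27:06,857\n07:28:16,088'

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits $start on the newline, so "date" would see an extra operand.
unquoted=$(count_args -d $start '+%H')

# Quoted: the value stays a single argument (still not a valid date, but no extra operand).
quoted=$(count_args -d "$start" '+%H')

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

Quoting alone does not fix the script, since the value would still hold two timestamps; it only explains the "extra operand" message.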

Most of the errors can be eliminated by simplifying the code. Note that your start and end functions are functionally identical.

difference() {
    # $1 = start, $2 = end (both in milliseconds)
    local diff=$(( $2 - $1 ))
    printf 'DIFFERENCE -----\nThe difference is %d milliseconds\n------\n' "$diff"
}

time2millis() {
    # $1 = a full log line; field 2 is the time of day, e.g. 06:27:06,857
    local time=$(echo "$1" | cut -d ' ' -f 2)
    IFS=:, read -r hh mm ss nnn <<<"$time"
    # be aware of invalid octal numbers 08 and 09: 
    # each component of the time must be handled as a decimal number
    echo "$(( (((10#$hh) * 60 + 10#$mm) * 60 + 10#$ss) * 1000 + 10#$nnn ))"
}

while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line == *"START Search"* ]]; then
        start=$(time2millis "$line")
        echo "Begins: $start"

    elif [[ $line == *"END Search"* ]]; then 
        end=$(time2millis "$line")
        echo "Ends: $end"

        # Assume every END has a preceding START
        difference "$start" "$end"
    fi
done < s.log
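
The octal remark in time2millis is easy to overlook, so here is a minimal sketch of the `10#` prefix on its own. The sample values are made up for illustration:

```bash
#!/usr/bin/env bash
# Components with leading zeros, as they come out of a timestamp like 08:09:07,008.
hh=08 mm=09 ss=07 nnn=008

# Without the 10# prefix, bash would treat 08 and 09 as invalid octal numbers.
# The prefix forces each component to be read as decimal.
millis=$(( ((10#$hh * 60 + 10#$mm) * 60 + 10#$ss) * 1000 + 10#$nnn ))

echo "$millis"   # 29347008
```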

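Further concerns:

  1. This does not account for the case where END occurs on the following day

    2019-02-28 23:59:59,999 blah blah START Search
    2019-03-01 00:00:00,001 blah blah END Search
    

    You will see a difference of -86399998 instead of 2 milliseconds.

  2. You do not account for daylight saving time (unless your log records timestamps in UTC).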
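
The arithmetic behind the midnight case can be checked by hand. This sketch uses only the time-of-day formula from the simplified time2millis; the variable names are chosen for illustration:

```bash
#!/usr/bin/env bash
# Milliseconds since midnight for the two sample timestamps.
start_ms=$(( ((23 * 60 + 59) * 60 + 59) * 1000 + 999 ))   # 23:59:59,999
end_ms=$((   ((0 * 60 + 0) * 60 + 0) * 1000 + 1 ))        # 00:00:00,001

diff=$(( end_ms - start_ms ))
echo "naive difference: $diff"        # -86399998

# Adding one day (86400000 ms) recovers the real gap, assuming the
# END is less than 24 hours after the START.
wrapped=$(( diff + 86400000 ))
echo "wrapped difference: $wrapped"   # 2
```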

I think these problems can be solved by using date to parse both the date and the time:

time2millis() {
    # $1 = a full log line; fields 1 and 2 are the date and the time of day
    local epoch=$( date -d "$(echo "$1" | cut -d ' ' -f 1,2)" '+%s' )
    local millis=0
    [[ $1 =~ ,([0-9]+) ]] && millis=${BASH_REMATCH[1]}
    echo "$(( $epoch * 1000 + 10#$millis ))"
}
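
As a quick check of the date-based approach against the midnight example, here is a sketch that assumes GNU date (for `-d`) and pins `TZ=UTC` only to keep the run reproducible. The function body is a lightly restated copy of the one above:

```bash
#!/usr/bin/env bash
export TZ=UTC   # assumption: fixed zone so the result does not depend on the machine

# Returns milliseconds since the epoch for the timestamp at the start of a log line.
time2millis() {
    local stamp epoch millis=0
    stamp=$(cut -d ' ' -f 1,2 <<<"$1")          # e.g. "2019-02-28 23:59:59,999"
    epoch=$(date -d "$stamp" '+%s')             # whole seconds (GNU date)
    [[ $1 =~ ,([0-9]+) ]] && millis=${BASH_REMATCH[1]}
    echo "$(( epoch * 1000 + 10#$millis ))"
}

start=$(time2millis '2019-02-28 23:59:59,999 blah blah START Search')
end=$(time2millis '2019-03-01 00:00:00,001 blah blah END Search')
delta=$(( end - start ))
echo "difference: $delta milliseconds"
```

Because both values are now absolute times, the subtraction no longer goes negative when the END falls on the next day.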

The complete solution:

difference() {
    # $1 = key such as (user1,14), $2 = start, $3 = end (both in milliseconds)
    local diff=$(( $3 - $2 ))
    while (( diff < 0 )); do (( diff += 86400000 )); done
    printf "DIFFERENCE %s -----\nThe difference is %d milliseconds\n------\n" "$1" "$diff"
}

# returns milliseconds since 1970-01-01 00:00:00 UTC
time2millis() {
    local epoch=$( date -d "$(echo "$1" | cut -d ' ' -f 1,2)" '+%s' )
    local millis=0
    [[ $1 =~ ,([0-9]+) ]] && millis=${BASH_REMATCH[1]}
    echo "$(( $epoch * 1000 + 10#$millis ))"
}

declare -A startTime startLine
while IFS= read -r line || [[ -n $line ]]; do
    read -a words <<<"$line"
    key=${words[5]}

    if [[ "${words[6]} ${words[7]}" == "START Search" ]]; then
        startLine[$key]=$line
        startTime[$key]=$(time2millis "$line")

    elif [[ "${words[6]} ${words[7]}" == "END Search" ]]; then
        if [[ -z ${startTime[$key]} ]]; then
            echo "END seen with no START: $line"
        else
            end=$(time2millis "$line")
            difference "$key" "${startTime[$key]}" "$end"
            # quoted so the brackets are not treated as a glob pattern
            unset "startLine[$key]" "startTime[$key]"
        fi
    fi
done < s.log

echo "Searches STARTed but not ENDed:"
printf "%s\n" "${startLine[@]}"

Given your input data, this outputs:

DIFFERENCE (user1,14) -----
The difference is 82 milliseconds
------
DIFFERENCE (user1,67) -----
The difference is 33 milliseconds
------
Searches STARTed but not ENDed:
2019-02-25 02:59:59,999 INFO [ProcedureUtil] - (user1,18)  START Search
2019-02-25 20:59:59,999 INFO [ProcedureUtil] - (user1,17)  START Search
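
The pairing in the complete solution relies on how `read -a` splits each line into words. A minimal sketch of that step on one sample line from s.log follows; the `action` variable is introduced here only for illustration:

```bash
#!/usr/bin/env bash
line='2019-02-22 06:27:06,857 INFO [ProcedureUtil] - (user1,14)  START Search'

# Split on whitespace; runs of spaces count as a single separator.
read -r -a words <<<"$line"

key=${words[5]}                       # the (user,id) token identifies the search
action="${words[6]} ${words[7]}"      # "START Search" or "END Search"

echo "key=$key action=$action"        # key=(user1,14) action=START Search
```

Note that bash does not guarantee any particular order when expanding an associative array, so the unmatched START lines may be listed in a different order from the log.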