How can I add a timestamp to every line in a piped log output?

Question

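I have to run some fairly verbose legacy PHP scripts from cron. I want to capture the pid and the output (both stdout and stderr) and log them to a file, but I also want every line of that output to be timestamped. The scripts are very fragile, so instrumenting them properly with logging is not an option.

I came up with this bash + gawk script to do it: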

# run php in a subshell, capturing the PID and redirecting the output.
(php $@ 2>&1 & echo $! >&3) 3> $PID_FILE | gawk \
    -v php_pid=$(while ! test -s $PID_FILE; do sleep .1; done; cat $PID_FILE) \
    '{ print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' \
    >> $LOG_FILE

This works, but it is a rather complicated and obscure solution to what ought to be a simple problem.

Is there a simpler, better solution or tool for this task?

Answer 1

A practical option -

Let awk handle the wait.

mkfifo pipe_name             # a FIFO as a tempfile for un-timestamped data
php "$@" > pipe_name 2>&1 &  # your program in background to the FIFO
echo $! > "$PID_FILE"        # the pid into a holding file
# wait until the pid file yields a line; close() forces a fresh read on retry
gawk -v pid_file="$PID_FILE" '
  BEGIN { while ( (getline php_pid < pid_file) <= 0 ) { close(pid_file); system("sleep .1") } }
  { print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' pipe_name >> "$LOG_FILE"
rm -f "$PID_FILE" pipe_name  # clean up

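I don't love that this spawns a whole subshell for a tenth-of-a-second sleep, but it avoids loading another extension, and there is no need to worry about whether this local system is above or below version 5.1... (c.f. the manual entry for details)

Just a rewrite

On closer inspection, my original sed solution (at the bottom of this answer) is full of problems, and I don't recommend that you use it.

First, it fails if an input line contains more than one word.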

$: for x in one two three; do sleep 1; echo test $x; done | sed 's/^/date +"%F %T - "/e'
date: extra operand ‘one’
Try 'date --help' for more information.

date: extra operand ‘two’
Try 'date --help' for more information.

date: extra operand ‘three’
Try 'date --help' for more information.
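
To see why this happens, it can help to run the same substitution without the `e` flag, so that sed prints the command line it would otherwise execute. The sketch below does that; `show_built_command` is a hypothetical helper name introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command line that sed would hand to the
# shell if the `e` flag were present (same substitution, minus the `e`).
show_built_command() {
    printf '%s\n' "$1" | sed 's/^/date +"%F %T - "/'
}

show_built_command 'one'        # date +"%F %T - "one
show_built_command 'test one'   # date +"%F %T - "test one
```

In the first case the shell glues `one` onto the quoted format string, so date receives a single argument. In the second case `one` is a separate word after `test`, which date reports as an extra operand.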

I can fix that by pulling the whole line into the date format:

$: for x in one two three; do sleep 1; echo test $x; done | sed -E 's/^(.*)/date +"%F %T - \1"/e'
2020-12-09 09:50:15 - test one
2020-12-09 09:50:16 - test two
2020-12-09 09:50:18 - test three

But that mangles any data that date takes to be a valid format directive.

$: for x in one two three; do sleep 1; echo "test %c $x"; done | sed -E 's/^(.*)/date +"%F %T - \1"/e'
2020-12-09 09:51:56 - test Wed, Dec 09, 2020  9:51:56 AM one
2020-12-09 09:51:57 - test Wed, Dec 09, 2020  9:51:57 AM two
2020-12-09 09:51:58 - test Wed, Dec 09, 2020  9:51:58 AM three

This approach really is much clumsier than what you already have, since it spawns a date process for every line...
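
One way to avoid both the per-line date process and the format-directive problem is to let bash's own printf produce the timestamp. This is a minimal sketch and not something from the question or the answer above: it assumes bash 4.2 or later (for the `%(...)T` format), and `stamp_lines` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Sketch: prefix each stdin line with a timestamp using only bash builtins.
# Assumes bash >= 4.2 for printf's %(...)T format; the argument -1 means "now".
stamp_lines() {
    local line
    # IFS= and -r keep leading whitespace and backslashes intact; the
    # || test also handles a final line that has no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # The data is passed as an argument to %s, so a '%c' inside it is
        # printed literally instead of being expanded as a date directive.
        printf '%(%F %T)T - %s\n' -1 "$line"
    done
}

# Usage, mirroring the test loop above (without the sleep):
for x in one two three; do echo "test %c $x"; done | stamp_lines
```

This does not add the PID, and a shell read loop is slower than awk on heavy output, so it is only a fit for modest log volumes.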

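Adding the PID is the only reason it gets problematic, and your solution, painful as it looks, is actually not bad at all.

Looking again... does the PID file really take long enough to write that gawk reads an empty value? This works fine for me: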

$: (./tst 2>&1 & echo $! >pid ) | awk -v php_pid=$(<pid) '{
     print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }'

If you just want a slightly less cluttered, more maintainable way to write basically the same logic -

{ ./tst 2>&1 & echo $! >pid; } | { declare -i val=0; until (( val )); do val=$(<pid); sleep .1; done; awk -v php_pid=$(<pid) '{print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }'; }

Switched to curly-brace grouping; works fine.

Or, using your variable names,

{ php $@ 2>&1 & echo $! > $PID_FILE; } |                 # run the job in bg, save PID
{ declare -i val=0;                                      # initialize a gatekeeper
  until (( val )); do val=$(<$PID_FILE); sleep .1; done; # wait till pid is loaded
  # now process the data
  gawk -v php_pid=$(<$PID_FILE) '{ print "[" php_pid strftime(" %Y-%m-%d %H:%M:%S] "), $0 }' >> $LOG_FILE
}
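
If the PID file exists only to get the PID into the log prefix, the polling can be dropped altogether. The following is a sketch of a different technique, not one used in the question or in this answer: the subshell prints its own PID as the first line of the pipe and then uses `exec` to become the command, which therefore keeps that PID. `run_stamped` is a hypothetical name, and the sketch assumes bash (for `$BASHPID`), gawk (for `strftime`), and that the command is an external program rather than a shell function.

```shell
#!/usr/bin/env bash
# Sketch: the subshell prints its own PID, then exec replaces the subshell
# with the command, so the command runs under that same PID.
# No PID file and no polling are needed.
run_stamped() {
    ( echo "$BASHPID"; exec "$@" 2>&1 ) |
    gawk 'NR == 1 { pid = $0; next }   # first line is the PID, not data
          { print "[" pid strftime(" %Y-%m-%d %H:%M:%S] ") $0 }'
}

# Usage (LOG_FILE as in the question):
#   run_stamped php script.php >> "$LOG_FILE"
```

The trade-off is that the PID is no longer stored in a separate file; if other tooling needs it there, the PID file approach above is still required.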

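Original

Pipe it through sed with an expression replacement (the `e` flag of GNU sed executes the result of the substitution as a command).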

$: for x in one two three; do sleep 1; echo $x; done | sed 's/^/date +"%F %T - "/e'
2020-12-04 10:26:25 - one
2020-12-04 10:26:26 - two
2020-12-04 10:26:27 - three
