sed

Question

I have a very large log file (over 2 GB) and want to delete all debug log entries that contain 'EntityFramework'.

info: Microsoft.EntityFrameworkCore.Migrations[20405]
      No migrations were applied. The database is already up to date.
dbug: Microsoft.EntityFrameworkCore.Infrastructure[10407]
      'IDCDbContext' disposed.
warn: Microsoft.AspNetCore.DataProtection.Repositories.FileSystemXmlRepository[60]
      Storing keys in a directory '/root/.aspnet/DataProtection-Keys' that may not be persisted outside of the container. Protected data will be unavailable when container is destroyed.
info: Hangfire.PostgreSql.PostgreSqlStorage[0]
      Start installing Hangfire SQL objects...

Here I want to remove the following entry and keep everything else:

dbug: Microsoft.EntityFrameworkCore.Infrastructure[10407]
      'IDCDbContext' disposed.

What I have tried so far:

sed -i '/^dbug/{:b,N;/^[[:lower:]]/!bb};/.*EntityFramework.*/d' logs

But the result is: sed: can't find label for jump to `b'

Any ideas?

Answer 1

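The current error is caused by the comma after the b label; it must be a semicolon. With the comma, sed reads the label name as "b,N", so the later bb finds no label named b. Also, you should move /.*EntityFramework.*/d (or better, /EntityFramework/d) into the command block so that it is only executed there: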

sed -i '/^dbug/{:b;N;/^[[:lower:]]/!bb;/EntityFramework/d}' logs
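Before running this in place on a 2 GB file, it may be worth trying the script on a small sample first. The following is a minimal sketch, assuming GNU sed; the temporary file and the variable names sample and filtered are only for illustration. It runs the same script without -i, so the result goes to standard output and nothing is modified.

```shell
# Build a small sample file from the log lines in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
info: Microsoft.EntityFrameworkCore.Migrations[20405]
      No migrations were applied. The database is already up to date.
dbug: Microsoft.EntityFrameworkCore.Infrastructure[10407]
      'IDCDbContext' disposed.
info: Hangfire.PostgreSql.PostgreSqlStorage[0]
      Start installing Hangfire SQL objects...
EOF

# Same sed script as above, but without -i: output goes to stdout
# and the sample file itself is left untouched.
filtered=$(sed '/^dbug/{:b;N;/^[[:lower:]]/!bb;/EntityFramework/d}' "$sample")
printf '%s\n' "$filtered"

# Clean up the temporary file.
rm -f "$sample"
```

Only the dbug entry and its continuation line should disappear; the info entry that mentions EntityFrameworkCore stays, because the script only acts on lines starting with dbug. When switching back to -i, GNU sed also accepts a suffix (for example -i.bak) to keep a backup copy of the original file.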


Answer 2

This job is better suited to awk:

awk '!p || !/^[[:blank:]]/ {p = /^dbug:/} !p' file

info: Microsoft.EntityFrameworkCore.Migrations[20405]
      No migrations were applied. The database is already up to date.
warn: Microsoft.AspNetCore.DataProtection.Repositories.FileSystemXmlRepository[60]
      Storing keys in a directory '/root/.aspnet/DataProtection-Keys' that may not be persisted outside of the container. Protected data will be unavailable when container is destroyed.
info: Hangfire.PostgreSql.PostgreSqlStorage[0]
      Start installing Hangfire SQL objects...

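We keep a flag p that controls whether a line is printed. When a line starts with dbug:, p is set to 1, and it stays set for the lines after dbug: that start with whitespace. Lines are printed only while p is 0. Note that this drops every dbug entry, not only those mentioning EntityFramework.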
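If only the dbug entries that mention EntityFramework should go, the same flag idea can be narrowed. This is a sketch rather than the answerer's program: it assumes that 'EntityFramework' appears on the dbug: header line itself (as in the question's sample), the flag name skip is arbitrary, and the Example.Other.Component entry is a made-up log entry added only to show a dbug entry being kept.

```shell
# Sample log: one EntityFramework dbug entry and one hypothetical
# dbug entry from another component that should survive.
sample=$(mktemp)
cat > "$sample" <<'EOF'
info: Microsoft.EntityFrameworkCore.Migrations[20405]
      No migrations were applied. The database is already up to date.
dbug: Microsoft.EntityFrameworkCore.Infrastructure[10407]
      'IDCDbContext' disposed.
dbug: Example.Other.Component[1]
      Hypothetical debug entry that should be kept.
info: Hangfire.PostgreSql.PostgreSqlStorage[0]
      Start installing Hangfire SQL objects...
EOF

kept=$(awk '
  # A header line is any line that does not start with a space or tab.
  # On each header, decide whether the whole entry is to be skipped.
  !/^[ \t]/ { skip = /^dbug:.*EntityFramework/ }
  # Print the line unless it belongs to a skipped entry.
  !skip
' "$sample")
printf '%s\n' "$kept"

# Clean up the temporary file.
rm -f "$sample"
```

The flag is recomputed on every header line and simply carried over the indented continuation lines, so entries with more than one continuation line are handled the same way. Like the original one-liner, this reads the file line by line, which suits a multi-gigabyte log.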

Answer 3

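First solution: With your shown samples, please try the following awk program, written for GNU awk. Simple explanation: it uses awk's match function to match the regex \ndbug:[^\n]*\n[^\n]* and prints only the lines the OP wants in the output (that is, only the non-matching lines).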

awk -v RS= 'match($0,/\ndbug:[^\n]*\n[^\n]*/){print substr($0,1,RSTART-1) substr($0,RSTART+RLENGTH)}' Input_file
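To make the role of RSTART and RLENGTH in the match-based program more concrete, here is a small sketch on a shortened copy of the question's sample. The temporary file and the variable names sample and spliced are illustrative only, and the awk in use is assumed to accept \n inside a bracket expression, as GNU awk does.

```shell
# Shortened sample with a single dbug entry and no blank lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
info: Microsoft.EntityFrameworkCore.Migrations[20405]
      No migrations were applied. The database is already up to date.
dbug: Microsoft.EntityFrameworkCore.Infrastructure[10407]
      'IDCDbContext' disposed.
info: Hangfire.PostgreSql.PostgreSqlStorage[0]
      Start installing Hangfire SQL objects...
EOF

spliced=$(awk -v RS= '
  match($0, /\ndbug:[^\n]*\n[^\n]*/) {
    # RSTART is where the match begins and RLENGTH is its length;
    # print what comes before the match followed by what comes after.
    print substr($0, 1, RSTART - 1) substr($0, RSTART + RLENGTH)
  }
' "$sample")
printf '%s\n' "$spliced"

# Clean up the temporary file.
rm -f "$sample"
```

Two properties of this approach are worth keeping in mind. With RS set to the empty string awk reads blank-line-separated paragraphs, so a log without blank lines is held in memory as one record, which may be a problem at 2 GB. Also, match reports only the leftmost match, so a record containing several dbug entries would lose only the first one.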

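Second solution: use awk's record separator (RS) so that each dbug entry itself acts as the separator, then print the remaining records, which are the lines the OP wants. A multi-character regex RS is not specified by POSIX; GNU awk supports it.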

awk -v RS='\ndbug:[^\n]*\n[^\n]*\n' '{gsub(/\n+$/,"")}1' Input_file


sed - remove multiple lines if contains ends in 'can't find label'

regex