Transpose rows to columns when a pattern is found
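Here is a sample of the file I want to process. I would like to use the keyword "Description" as some sort of RS (record separator), but I don't know how to do that, and the keyword does not appear consistently.
Background: I am processing a log file in which the first line of each entry contains a date/time stamp (APR12) and the second line holds a description of the entry. The description is present for some entries and missing for others.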
001 APR12 aaa bbb
Description: This is a test file.
002 APR12 aaa bbb
Description: This is another test file.
003 APR12 aaa XXX
004 APR12 aaa bbb
Description: This is another,after skipping one.
Desired output:
001 APR12 aaa bbb Description: This is a test file.
002 APR12 aaa bbb Description: This is another test file.
003 APR12 aaa XXX
004 APR12 aaa bbb Description: This is another,after skipping one.
You can add a newline every time the current line does not start with "Description":
awk 'NR>1 && !/^Description/{print ""}{printf "%s ", $0}' file
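NR>1 prevents a newline from being added at the start of the output.
If any lines were processed, you may also want to add an END block that adds a newline at the end of the output: END{if(NR)print ""}.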
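As a minimal sketch of the two pieces above combined into one command, the following builds a temporary copy of the sample data from the question and runs the awk program with the END block included. The variable names sample and joined are only illustrative.

```shell
# Build a small sample log (same data as in the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
001 APR12 aaa bbb
Description: This is a test file.
002 APR12 aaa bbb
Description: This is another test file.
003 APR12 aaa XXX
004 APR12 aaa bbb
Description: This is another,after skipping one.
EOF

# Start a new output line before every non-Description line (except the
# first), print each input line followed by a space, and finish with a
# single newline if any input was read.
joined=$(awk 'NR>1 && !/^Description/{print ""}
              {printf "%s ", $0}
              END{if(NR)print ""}' "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

Because every input line is printed with "%s ", each output line of this variant ends with a trailing space.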
$ awk '{printf "%s%s", (/^[0-9]/?rs:FS), $0; rs=RS} END{print ""}' file
001 APR12 aaa bbb Description: This is a test file.
002 APR12 aaa bbb Description: This is another test file.
003 APR12 aaa XXX
004 APR12 aaa bbb Description: This is another,after skipping one.
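The one-liner above is terse, so here is the same program spread out with comments, as a sketch wrapped in a shell function. The function name join_descriptions is hypothetical and not part of the original answer. It assumes, as in the sample, that every new log entry starts with a digit.

```shell
join_descriptions() {
  awk '{
    # A line starting with a digit begins a new entry: prefix it with rs,
    # which is empty for the very first line and RS (a newline) afterwards.
    # Any other line is a continuation: prefix it with FS (a space).
    printf "%s%s", (/^[0-9]/ ? rs : FS), $0
    rs = RS
  }
  END { print "" }' "$@"
}
```

It reads the named files, or standard input when called without arguments, for example: join_descriptions file. Unlike the printf "%s " approach, the separator is written before each line, so no trailing space is left on the output lines.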
This may be overly complicated, but here is a solution using sed. Run the script with sed -n, since it prints explicitly:
# Does the line contain "Description"?
# Yes ...
/Description/{
    # Exchange hold and pattern space
    x
    # Append hold space to pattern space,
    # separated by a newline
    G
    # Replace that newline with a space and print
    s/\n\+/ /gp
}
# No ...
/Description/! {
    # Exchange hold and pattern space
    x
    # The hold space contained the previous line
    /Description/! {
        # It had no description, so print it on its own
        p
    }
    # Exchange hold and pattern space again
    x
    # Store the current line in the hold space
    h
}
This might work for you (GNU sed):
sed 'N;s/\n\(Description\)/ \1/;P;D' file
Read the file two lines at a time and, if the second line of the pair begins with Description, replace the newline between them with a space.
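To see why P and D are needed rather than a plain print, here is a small sketch run on the last three lines of the sample, where an entry without a description is followed by one that has a description. The variable name out is only illustrative, and the result was reasoned through for GNU sed.

```shell
# N  appends the next input line to the pattern space.
# s  joins the pair when the second line starts with Description.
# P  prints the pattern space up to the first newline.
# D  deletes that printed part and restarts with whatever is left,
#    so an unjoined second line becomes the first line of the next pair.
out=$(printf '%s\n' \
  '003 APR12 aaa XXX' \
  '004 APR12 aaa bbb' \
  'Description: This is another,after skipping one.' |
  sed 'N;s/\n\(Description\)/ \1/;P;D')
printf '%s\n' "$out"
```

Here 003 is printed alone, and 004 is carried over by D so that it can still be paired with its description.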
sed ':a
N;$!ba
s/\n\([^0-9]\)/ \1/g' YourFile
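- Fine as long as the file is not huge (the whole file is loaded into memory).
- Joins every line that does not start with a digit to the line before it.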
If you have a GNU sed version that supports the -z option (4.2.2 or later), the loop is not needed. Thanks to @JJoao for the optimization.
sed -z 's/\n\([^0-9]\)/ \1/g' YourFile
perl -p0e 's!\n(?=Des)! !g'
(Untested; this also loads the whole file into memory.)