How can I search a file for a character in position 12 and, if found, delete that and the next 5 characters from all lines?
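I am trying to parse an RSS feed and condense the information in the lines so that I still have the date and time of each entry, but without the seconds or wasted space, because I am feeding the file to xscreensaver, whose text display is limited in screen width. If it would make things easier, I could change my code to not add the 2 heading lines before formatting the text. Thanks for any ideas...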
The input file at this point looks like this:
ABC World News Feed
RSS Data retrieved from https:--abcnews.go.com-abcnews-headlines
05-24 18:48:16    Truckers' strike leads to fuel shortages in Brazil
05-24 18:48:16    The marathon atop the world's deepest lake
           ^^^^^^
Remove character positions 12 through 17 from each title line that has a colon in position 12, but not from the heading lines.
So the result should look like:
ABC World News Feed
RSS Data retrieved from https:--abcnews.go.com-abcnews-headlines
05-24 18:48 Truckers' strike leads to fuel shortages in Brazil
05-24 18:48 The marathon atop the world's deepest lake
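My approach would be to replace a colon followed by two digits and at least one space with a single space. Note the two spaces before the *: the first is a literal space and the second is the one being repeated, so together they mean "at least one space". With only one space there, the pattern would also match the ":48" in the middle of the time.
$ sed 's/:[[:digit:]][[:digit:]]  */ /' file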
ABC World News Feed
RSS Data retrieved from https:--abcnews.go.com-abcnews-headlines
05-24 18:48 Truckers' strike leads to fuel shortages in Brazil
05-24 18:48 The marathon atop the world's deepest lake
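If you want to be really specific about the position, you can use ^ to anchor the search to the beginning of the line, and use parentheses with the back-reference \1 to keep the part they capture. Here the dot . matches any character:
$ sed 's/^\(..-.. ..:..\):[[:digit:]][[:digit:]]  */\1 /' file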
ABC World News Feed
RSS Data retrieved from https:--abcnews.go.com-abcnews-headlines
05-24 18:48 Truckers' strike leads to fuel shortages in Brazil
05-24 18:48 The marathon atop the world's deepest lake
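For a strictly positional reading of the question (a colon in column 12, then delete it and the next five characters), a minimal sketch could look like the following. The function name trim_seconds is hypothetical, and the sample data assumes four spaces after the seconds field, which is what the "positions 12 to 17" description implies.

```shell
# Hypothetical helper: drop columns 12-17, but only on lines whose
# 12th character is a colon. Heading lines fail the match and pass through.
trim_seconds() {
    # \(.\{11\}\) captures columns 1-11, ":" must sit in column 12,
    # and .\{5\} consumes columns 13-17. Only the captured prefix is kept.
    sed 's/^\(.\{11\}\):.\{5\}/\1/' "$@"
}

# Sample data (assumed spacing: four spaces after the seconds field).
printf '%s\n' \
    'ABC World News Feed' \
    'RSS Data retrieved from https:--abcnews.go.com-abcnews-headlines' \
    "05-24 18:48:16    Truckers' strike leads to fuel shortages in Brazil" \
    | trim_seconds
```

Unlike the pattern-based commands above, this one does not care what columns 13-17 contain, so it only behaves well if the feed always uses the same fixed-width layout.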
The following awk may help you.
awk '$2 ~ /[0-9]+:[0-9]+:[0-9]+/{sub(/:[0-9]+ +/,OFS)} 1' Input_file
If you want to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the above command.
Explanation of the command above:
awk '
$2 ~ /[0-9]+:[0-9]+:[0-9]+/{   ##If the 2nd field matches the digits:digits:digits pattern, do the following.
  sub(/:[0-9]+ +/,OFS)         ##Substitute a colon, digit(s), and the following space(s) in the current line with OFS, whose default value is a single space.
}
1                              ##awk works on condition-then-action; 1 is an always-true condition with no action, so the default action (print the line) runs.
' Input_file                   ##The input file name.
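If the columns really are fixed, awk can also do the deletion by position with substr rather than by pattern. This is only a sketch: the function name trim_seconds_awk is hypothetical, and the sample line assumes four spaces after the seconds field.

```shell
# Hypothetical helper: positional variant using awk's substr().
trim_seconds_awk() {
    # If column 12 is a colon, rebuild the line from columns 1-11
    # plus everything from column 18 onward; the trailing 1 prints every line.
    awk 'substr($0, 12, 1) == ":" { $0 = substr($0, 1, 11) substr($0, 18) } 1' "$@"
}

# Sample data (assumed spacing: four spaces after the seconds field).
printf '%s\n' \
    'ABC World News Feed' \
    "05-24 18:48:16    The marathon atop the world's deepest lake" \
    | trim_seconds_awk
```

One caveat of this variant: a line with a colon in column 12 but fewer than 17 characters would simply be cut off after column 11, so it suits data that is known to be well-formed.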