How to delete a newline if the line doesn't end with "
Sample data:
"data","123"
"data2","qwer"
"false","234
And i'm the culprit"
"data5","234567"
The output should be:
"data","123"
"data2","qwer"
"false","234And i'm the culprit"
"data5","234567"
Essentially, I want to repair my CSV file (which is very large).
I'm using sed, so an answer in sed would be a great help :)
sed is always the wrong choice for any problem that involves multiple lines. Just use awk:
$ awk '{printf "%s%s", (prev~/"$/?RS:""), $0; prev=$0} END{print ""}' file
"data","123"
"data2","qwer"
"false","234And i'm the culprit"
"data5","234567"
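The above simply checks whether the previous line ended with ". If it did, it prints the record separator RS before the current record (a newline by default; you could use ORS or a hard-coded "\n" instead if you prefer). If it did not, it prints nothing there. It then prints the current record without a trailing newline. Once all input has been read, it prints one final newline.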
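To try this logic without touching a real file, the awk program can be wrapped in a small shell function and fed the sample data from the question. This is only a sketch: the function name join_broken_lines is made up for illustration, and the awk program is the one-liner above, spread over several lines with comments.

```shell
#!/bin/sh
# join_broken_lines is a hypothetical helper name, not part of any tool.
# It reads the named files (or standard input) and joins every line that
# does not end with a double quote to the line that follows it.
join_broken_lines() {
    awk '
        {
            # Start a new output line only if the previous record was complete,
            # that is, if it ended with a double quote.
            printf "%s%s", (prev ~ /"$/ ? RS : ""), $0
            prev = $0
        }
        # Terminate the last output line.
        END { print "" }
    ' "$@"
}

# Sample data from the question, one argument per input line.
printf '%s\n' \
    '"data","123"' \
    '"data2","qwer"' \
    '"false","234' \
    "And i'm the culprit\"" \
    '"data5","234567"' | join_broken_lines
```

Run on the sample data, this should print the four repaired lines shown in the question. Because awk handles one record at a time and keeps only the previous one in prev, memory use should stay flat even for a very large file.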
For completeness, this is how it can be done with sed:
sed '/"\s*$/! { :loop; N; //! { $! b loop }; s/\n//g }'
It works as follows:
/"\s*$/! {        # if a line does not end with a double quote (possibly
                  # followed by whitespace)
  :loop           # jump label "loop"
  N               # fetch the next line
  //! {           # unless the content of the pattern space matches the
                  # previously attempted pattern (that is: unless it ends with
                  # a double quote, which is the case iff the last fetched
                  # line does)
    $! b loop     # and unless we reached the end of the input ($!),
                  # go back to "loop"
  }
  s/\n//g         # remove all newlines from the accumulated lines in the
                  # pattern space
}
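So this accumulates consecutive lines that do not end with a double quote in the pattern space, then pastes them together into a single line before printing it.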
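The same loop can also be written with one sed command per line, which some sed implementations handle more reliably than the brace-and-semicolon one-liner. The sketch below makes two assumptions that are not in the answer above: the function name join_with_sed is hypothetical, and \s (a GNU extension) has been swapped for the POSIX class [[:space:]]. It has only been reasoned through against GNU sed behaviour, so treat it as a starting point rather than a tested portable script.

```shell
#!/bin/sh
# join_with_sed is a hypothetical helper name.
# The sed script is the loop described above, one command per line:
#   - if the line does not end with a double quote (plus optional whitespace),
#     keep appending the next line (N) until the pattern space does end with
#     one, or until the last input line is reached;
#   - then delete every embedded newline.
# [[:space:]] replaces the GNU-only \s (an assumption made for portability).
join_with_sed() {
    sed '
/"[[:space:]]*$/!{
:loop
N
//!{
$!b loop
}
s/\n//g
}
' "$@"
}

# Sample data from the question, one argument per input line.
printf '%s\n' \
    '"data","123"' \
    '"data2","qwer"' \
    '"false","234' \
    "And i'm the culprit\"" \
    '"data5","234567"' | join_with_sed
```

Unlike the awk version, this one also treats a line as complete when the closing quote is followed by trailing whitespace.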
sed ':cycle
$ b
/"$/ !N;s/\n//;t cycle' YourFile
This is a sed version, though sed is not the best tool for this kind of operation.