How to remove line breaks in lines with 10 columns separated by a comma delimiter

I need to remove the line breaks inside each record, where the 10 columns are separated by a comma delimiter. This is the input:

EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [
FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502
],2,N/A,N/A,N/A,-1,-1,N/A

This should be the output:

EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502],2,N/A,N/A,N/A,-1,-1,N/A

So far I have tried this awk command:

awk -F"," 'NF=10{printf("%s",$0);getline;print;next}1'

Output:

EXP_TRANSF_DE_PARA N/A Input 1 1 1 04/30/2020 19:52:20 1588287140 11131 Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA N/A Input 1 2 1 04/30/2020 19:52:20 1588287140 11131 Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS N/A Input 1 3 1 04/30/2020 19:52:20 1588287140 8361 Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column *N is not allowed.  SQLSTATE=23502
] 2 N/A N/A N/A -1 -1 N/A  ] 2 N/A N/A N/A -1 -1 N/A

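I don't know why the command removes the comma delimiters from the lines. I know that line 6 doesn't have 10 columns, which is why its line break is not removed... Any suggestions?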

Here is a Bash script that solves your problem:

#!/bin/bash

set -o errexit
set -o nounset

fieldCount=20

# Filter out newlines that are not record separators
fieldNum=1
while read -N1 -r ch; do
    if [ "$ch" = "," ]; then
        fieldNum="$((fieldNum + 1))"
    elif [ "$ch" = $'\n' ] && [ "$fieldNum" = "$fieldCount" ]; then
        fieldNum=1
    fi
    if [ "$ch" != $'\n' ] || [ "$fieldNum" = 1 ]; then
        # Use a fixed format so that % or \ in the data is printed literally
        printf '%s' "$ch"
    fi
done
printf '\n'

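The -N1 option reads one character at a time (instead of one line at a time), and the -r option treats backslashes as ordinary characters. fieldCount is 20 rather than 10 because the script counts every comma, and in the first two sample records the error-message column itself contains commas.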

The problem can also be solved with a simple C program of comparable size:

#include <stdio.h>

int main(void)
{
    const int fieldCount = 20;
    int fieldNum, ch;

    /* Filter out newlines that are not record separators */
    fieldNum = 1;
    ch = getchar();
    while (ch != EOF) {
        if (ch == ',') {
            fieldNum++;
        } else if ((ch == '\n') && (fieldNum == fieldCount)) {
            fieldNum = 1;
        }
        if ((ch != '\n') || (fieldNum == 1)) {
            putchar(ch);
        }
        ch = getchar();
    }
    putchar('\n');
    return 0;
}
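
The same field-counting idea can also be sketched in awk. This is only an illustration, not part of the scripts above: the function name `join_records` and its argument are made up for the example, and it assumes that every complete record has exactly the given number of comma-separated fields, counting commas inside the message text too.

```shell
# join_records N: hypothetical helper that joins physical lines from stdin
# until a record holds N comma-separated fields, then prints the record.
join_records() {
    awk -F, -v want="$1" '
        {
            buf = buf $0                  # append the line without its newline
            if (NF > 0) commas += NF - 1  # NF - 1 commas on a non-empty line
            if (commas + 1 == want) {     # record is complete
                print buf
                buf = ""
                commas = 0
            }
        }
        END { if (buf != "") print buf }  # flush an incomplete last record
    '
}

# Example: two records of 3 fields, the first one broken across two lines
printf 'a,b\n,c\nd,e,f\n' | join_records 3
# prints:
# a,b,c
# d,e,f
```

With the sample file this would be called as `join_records 20 < file.txt`. The last sample record contains fewer commas than the first two, so it never reaches 20 fields and is emitted by the `END` block.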

Try this:

awk -F"," '{OFS=","; if ($3 != "Input") {printf "%s", $0} else {printf "\n%s", $0}}' | sed '1d' | sed -e '$a\'

Demo:

$ cat file.txt
EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [
FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502
],2,N/A,N/A,N/A,-1,-1,N/A
$ awk -F"," '{OFS=","; if ($3 != "Input") {printf "%s", $0} else {printf "\n%s", $0}}' file.txt | sed '1d' | sed -e '$a\'
EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502],2,N/A,N/A,N/A,-1,-1,N/A
$

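Explanation:

awk -F"," <-- sets the field separator to ,

'{OFS=","; <-- sets the output field separator to , (it has no visible effect here, because $0 is printed unmodified with printf)

if ($3 != "Input") {printf "%s", $0} <-- if the 3rd column of the current line is not "Input", print the line. Note that no newline is added, so the record is not terminated.

else {printf "\n%s", $0}}' <-- otherwise (column 3 is "Input", so the line starts a new record) print a newline \n before the line

sed '1d' <-- deletes the first line. It is an empty line, because the very first record also has "Input" in column 3 and so gets a leading newline.

sed -e '$a\' <-- adds the missing newline at the end of the output.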
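
As for why the original attempt lost its commas: in awk, `NF=10` is an assignment, not a comparison. Assigning to NF makes awk rebuild the record using the output field separator, which is a single space by default, and the assigned value 10 is always true, so the block ran for every line. A small sketch of the difference, using a made-up three-field line rather than the real data:

```shell
# NF=3 assigns to NF: the record is rebuilt with OFS (a space by default)
printf 'a,b,c\n' | awk -F, 'NF=3'
# prints: a b c

# NF==3 only compares: the record is printed unchanged
printf 'a,b,c\n' | awk -F, 'NF==3'
# prints: a,b,c
```

Writing `NF==10` keeps the record intact, although a test on the field count of a single physical line would still have to be adapted to this data.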