Splitting columns into rows from a text file using Unix shell script - Dynamically changing source file structure

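I have a tab-delimited source file with the structure below. Only the first 9 columns, from ID through Line Item/Property, are fixed; the remaining columns vary in both number and layout.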

ID  Date/Time (UTC) User    Description Security Change Previous Value  New Value   Module/List Line Item/Property  Scenarios   Region EM2  Plan Item PB6   Market EM4  Plants - Master Plan Brand PB4  T/DI    GRS 6   GRS 7   Target User Import  Object  Target Role Export  Dashboard   Action  Time

Here is a sample record from that file:

2572561 3/24/2020 14:01 chiara.bettini@gmail.com            FALSE   TRUE    FILTER:  Brand P&L Report - Market  Plan Brands                     Polly Pocket                chiara.bettini@gmail.com    

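Using a Unix shell script, I need to convert it into a CSV file with the headers and data format shown below. I want to keep the permanent columns (ID through Line Item/Property) and put all the other, dynamically varying columns into Attribute Name and Attribute Value columns: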

ID,Date/Time (UTC),User,Description,Security Change,Previous Value,New Value,Module/List,Line Item/Property,Attribute Name,Attribute Value
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Scenarios,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Region EM2,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plan Item PB6,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Market EM4,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plants - Master,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plan Brand PB4,Polly Pocket
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,T/DI,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,GRS 6,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,GRS 7,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Target User,chiara.bettini@gmail.com
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Import,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Object,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Target Role,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Export,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Dashboard,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Action,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Time,



Note: the following will not work correctly if any field contains a comma (,).

Try this bash script (named process for the terminal session that follows):

#!/bin/bash

tr '\t' ',' | {
    IFS=',' # separator for all array reads and printfs

    # read and output heading
    read -r -a heading
    printf "%s\n" "${heading[*]:0:9},Attribute Name,Attribute Value"    

    # process one line of data
    while read -r -a data ; do
        for (( i=9; i<${#heading[*]}; ++i )) ; do
            printf "%s\n" "${data[*]:0:9},${heading[i]},${data[i]}"
        done
    done
}

Terminal session:

$ cat data.in | tr '\t' ','
ID,Date/Time (UTC),User,Description,Security Change,Previous Value,New Value,Module/List,Line Item/Property,Scenarios,Region EM2,Plan Item PB6,Market EM4,Plants - Master,Plan Brand PB4,T/DI,GRS 6,GRS 7,Target User,Import,Object,Target Role,Export,Dashboard,Action,Time
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,,,,,,Polly Pocket,,,,chiara.bettini@gmail.com
$ ./process < data.in 
ID,Date/Time (UTC),User,Description,Security Change,Previous Value,New Value,Module/List,Line Item/Property,Attribute Name,Attribute Value
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Scenarios,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Region EM2,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plan Item PB6,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Market EM4,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plants - Master,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Plan Brand PB4,Polly Pocket
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,T/DI,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,GRS 6,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,GRS 7,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Target User,chiara.bettini@gmail.com
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Import,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Object,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Target Role,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Export,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Dashboard,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Action,
2572561,3/24/2020 14:01,chiara.bettini@gmail.com,,,FALSE,TRUE,FILTER:  Brand P&L Report - Market,Plan Brands,Time,
$
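
If some fields may contain commas, one possible variation is to avoid turning tabs into commas at all: split on a delimiter that should never occur in the data, and quote fields only when the CSV output is written. The sketch below is not part of the script above. The function names unpivot and csv_join are made up for illustration, the ASCII unit separator (octal 037) is assumed to be absent from the input, and the number of fixed columns defaults to 9 as in the question. Tabs are not used as IFS directly because bash treats them as whitespace and would collapse runs of empty fields.

```bash
#!/bin/bash
# Sketch only: unpivot tab-separated input into CSV, quoting fields
# that contain a comma or a double quote.
# Assumption: the input never contains the unit separator (octal 037).

# csv_join (hypothetical helper): print its arguments as one CSV record.
csv_join() {
    local q='"' out='' sep='' f
    for f in "$@"; do
        if [[ $f == *,* || $f == *"$q"* ]]; then
            f="$q${f//$q/$q$q}$q"   # double embedded quotes, then wrap
        fi
        out+="$sep$f"
        sep=,
    done
    printf '%s\n' "$out"
}

# unpivot [fixed_count] (hypothetical): read TSV on stdin, write CSV on stdout.
unpivot() {
    local fixed=${1:-9}
    local us=$'\037'                # non-whitespace IFS keeps empty fields
    local -a heading data
    local i
    tr '\t' '\037' | {
        # read the heading and print the new header row
        IFS=$us read -r -a heading
        csv_join "${heading[@]:0:fixed}" "Attribute Name" "Attribute Value"

        # one output row per dynamic column of each data line;
        # the || test also handles a last line without a trailing newline
        while IFS=$us read -r -a data || (( ${#data[@]} )); do
            for (( i = fixed; i < ${#heading[@]}; ++i )); do
                csv_join "${data[@]:0:fixed}" "${heading[i]}" "${data[i]}"
            done
        done
    }
}
```

With the file sourced into a bash session, it could be called as unpivot < data.in, or as unpivot 9 < data.in to state the fixed column count explicitly. Like the original script, it assumes every data line has at least that many fields.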