Extract the date from a datetime, change "." to ",", and print per-date sums of several fields

aNumber bNumber startDate   cost    balanceAfter    trafficCase Operator    unknown3    MainAmount  BALANCEBEFORE
22676239633 433 2014-07-02 10:16:48.000 0,00    0.20    0   Short Code  397224944   0.0000  0.2000
22677277255 76919167    2014-07-02 10:16:51.000 1,00    92.60   0   Airtel  126268625   0.0000  92.6000
22676777508 76701575    2014-07-02 10:16:55.000 1,00    217.00  0   Airtel  4132186103  0.0000  217.0000
22665706841 433 2014-07-02 10:16:57.000 0,00    69.50   0   Short Code  4133821554  0.0000  69.5000
22665799922 70110055    2014-07-03 10:16:45.000 20,00   0.50    0   Telmob  126260244   20.0000 0.5000
22676239633 433 2014-07-03 10:16:48.000 0,00    0.20    0   Short Code  397224944   0.0000  0.2000
22677277255 76919167    2014-07-04 10:16:51.000 1,00    92.60   0   Airtel  126268625   0.0000  92.6000
22676777508 76701575    2014-07-04 10:16:55.000 1,00    217.00  0   Airtel  4132186103  0.0000  217.0000
22665706841 433 2014-07-05 10:16:57.000 0,00    69.50   0   Short Code  4133821554  0.0000  69.5000

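This is a sample of the data I have. I want to sum cost, balanceAfter, MainAmount and BALANCEBEFORE each time the date changes. My problem is that the date is combined with the time, and my decimal separator is a dot instead of a comma, so my awk script cannot do the job. Could I have an awk script that first extracts only the date, so that in the end I get output that looks like this: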

Date        Cost    balanceAfter    MainAmount  BALANCEBEFORE
02/07/2014  2,00    379,3                0          379,3
03/07/2014  20,00   0,7                 20            0,7
04/07/2014  2,00    309,6                0          309,6
05/07/2014  0,00    69,5                 0           69,5

This is my awk script:

awk -F 'NR==1 {header=$0; next} {a[]+= a[]+= a[]+= a[]+=} END {for (i in a) {printf "%d\t%d\n", i, a[i]}; tot+=a[i]};' out.txt>output.doc

Edit: Following Etan Reisner's suggestion, this avoids the preprocessing step and uses $NF to cope with the varying number of tokens in the Operator column.
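To see why counting from the end helps, here is a small sketch using two rows from the sample data. The helper name `show_trailing_fields` is made up for illustration. "Short Code" splits into two whitespace-separated tokens while "Airtel" is one, so NF differs between rows, but $(NF - 1) and $NF still point at MainAmount and BALANCEBEFORE:

```shell
# Hypothetical helper: print the field count and the last two fields of each row.
show_trailing_fields() {
    awk '{ print NF, $(NF - 1), $NF }'
}

# Two rows from the question: "Short Code" is two tokens, "Airtel" is one.
printf '%s\n' \
    '22676239633 433 2014-07-02 10:16:48.000 0,00 0.20 0 Short Code 397224944 0.0000 0.2000' \
    '22677277255 76919167 2014-07-02 10:16:51.000 1,00 92.60 0 Airtel 126268625 0.0000 92.6000' |
    show_trailing_fields
```

The first row has 12 fields and the second has 11, yet the last two fields are the same columns in both.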

$ cat data.txt
aNumber bNumber startDate   cost    balanceAfter    trafficCase Operator    unknown3    MainAmount  BALANCEBEFORE
22676239633 433 2014-07-02 10:16:48.000 0,00    0.20    0   Short Code  397224944   0.0000  0.2000
22677277255 76919167    2014-07-02 10:16:51.000 1,00    92.60   0   Airtel  126268625   0.0000  92.6000
22676777508 76701575    2014-07-02 10:16:55.000 1,00    217.00  0   Airtel  4132186103  0.0000  217.0000
22665706841 433 2014-07-02 10:16:57.000 0,00    69.50   0   Short Code  4133821554  0.0000  69.5000
22665799922 70110055    2014-07-03 10:16:45.000 20,00   0.50    0   Telmob  126260244   20.0000 0.5000
22676239633 433 2014-07-03 10:16:48.000 0,00    0.20    0   Short Code  397224944   0.0000  0.2000
22677277255 76919167    2014-07-04 10:16:51.000 1,00    92.60   0   Airtel  126268625   0.0000  92.6000
22676777508 76701575    2014-07-04 10:16:55.000 1,00    217.00  0   Airtel  4132186103  0.0000  217.0000
22665706841 433 2014-07-05 10:16:57.000 0,00    69.50   0   Short Code  4133821554  0.0000  69.5000


$ cat so2.awk
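NR > 1 {
    # $3 is the date and $4 the time, so cost is $5 and balanceAfter is $6.
    cost = $5;
    balanceAfter = $6;
    # Operator can be one or two tokens, so take the last two columns from the end.
    mainAmount = $(NF - 1);
    balanceBefore = $NF;

    # Turn a decimal comma into a decimal point so awk reads the value as a number.
    sub(",", ".", cost);
    sub(",", ".", balanceAfter);
    sub(",", ".", mainAmount);
    sub(",", ".", balanceBefore);

    # Accumulate each column per date ($3).
    dateCost[$3] += cost;
    dateBalanceAfter[$3] += balanceAfter;
    dateMainAmount[$3] += mainAmount;
    dateBalanceBefore[$3] += balanceBefore;
}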

END {
    printf("%s\t%s\t%s\t%s\t%s\n", "Date", "Cost", "BalanceAfter", "MainAmount", "BalanceBefore");
    for (i in dateCost) {
        printf("%s\t%f\t%f\t%f\t%f\n", i, dateCost[i], dateBalanceAfter[i], dateMainAmount[i], dateBalanceBefore[i]);
    }
}


$ awk -f so2.awk data.txt
Date    Cost    BalanceAfter    MainAmount  BalanceBefore
2014-07-02  2.000000    379.300000  0.000000    379.300000
2014-07-03  20.000000   0.700000    20.000000   0.700000
2014-07-04  2.000000    309.600000  0.000000    309.600000
2014-07-05  0.000000    69.500000   0.000000    69.500000
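One caveat: awk does not specify the order in which `for (i in dateCost)` visits the keys, so the rows may not come out in date order on every implementation. A minimal sketch of one workaround, assuming the header is always the first output line (`sort_body` is a hypothetical helper name, not part of the answer above):

```shell
# Hypothetical helper: pass the first line (the header) through unchanged
# and sort the remaining lines. ISO dates (YYYY-MM-DD) sort correctly as text.
sort_body() {
    IFS= read -r header
    printf '%s\n' "$header"
    LC_ALL=C sort
}
```

It would be used as `awk -f so2.awk data.txt | sort_body`.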

This does not require preprocessing the file:

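awk '
    BEGIN {print "Date Cost BalanceAfter MainAmount BalanceBefore"}
    NR == 1 {next}    # skip the header row
    # Print the totals accumulated for the current date.
    function showday() {
        printf "%s\t%.2f\t%.1f\t%d\t%.1f\n", date, cost, bAfter, main, bBefore
    }
    # The date ($3) changed: flush the previous day and reset the totals.
    # This assumes the rows are already grouped by date.
    date != $3 {
        if (date) showday()
        date = $3
        cost = bAfter = main = bBefore = 0
    }
    {
        sub(/,/, ".", $5)    # decimal comma -> decimal point in cost
        cost += $5
        bAfter += $6
        main += $(NF-1)
        bBefore += $NF
    }
    END {showday()}
' file | column -t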
Date        Cost   BalanceAfter  MainAmount  BalanceBefore
2014-07-02  2.00   379.3         0           379.3
2014-07-03  20.00  0.7           20          0.7
2014-07-04  2.00   309.6         0           309.6
2014-07-05  0.00   69.5          0           69.5
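
Neither script above prints exactly the layout asked for in the question (DD/MM/YYYY dates and a decimal comma). Below is a rough sketch of one way to get there, not a definitive version: `summarize_by_date` and the array names are made up for illustration, and it assumes the column positions shown in the sample data. It also records dates in first-seen order rather than relying on `for (i in array)`.

```shell
# Hypothetical helper: sum the four columns per date and print them
# with DD/MM/YYYY dates and decimal commas.
summarize_by_date() {
    # LC_ALL=C keeps the decimal point predictable while awk does the arithmetic.
    LC_ALL=C awk '
        NR == 1 { next }                      # skip the header row
        {
            cost = $5
            sub(/,/, ".", cost)               # "1,00" -> "1.00" so it can be added
            if (!($3 in seen)) {              # remember dates in first-seen order
                seen[$3] = 1
                order[++n] = $3
            }
            sumCost[$3]   += cost
            sumAfter[$3]  += $6
            sumMain[$3]   += $(NF - 1)
            sumBefore[$3] += $NF
        }
        END {
            print "Date\tCost\tbalanceAfter\tMainAmount\tBALANCEBEFORE"
            for (k = 1; k <= n; k++) {
                d = order[k]
                split(d, p, "-")              # p[1]=YYYY, p[2]=MM, p[3]=DD
                line = sprintf("%s/%s/%s\t%.2f\t%.1f\t%d\t%.1f",
                               p[3], p[2], p[1],
                               sumCost[d], sumAfter[d], sumMain[d], sumBefore[d])
                gsub(/\./, ",", line)         # decimal point -> decimal comma
                print line
            }
        }
    ' "$@"
}
```

It would be called as `summarize_by_date data.txt | column -t`. Converting the decimal point only at print time keeps the arithmetic independent of the input and output notation.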