How can I sort output based on a created column?
I created an awk file to sort some data in a csv file. Here is a snippet of the data
|Timestamp |Email |Name |Year|Make |Model |Car_ID|Judge_ID|Judge_Name|Racer_Turbo|Racer_Supercharged|Racer_Performance|Racer_Horsepower|Car_Overall|Engine_Modifications|Engine_Performance|Engine_Chrome|Engine_Detailing|Engine_Cleanliness|Body_Frame_Undercarriage|Body_Frame_Suspension|Body_Frame_Chrome|Body_Frame_Detailing|Body_Frame_Cleanliness|Mods_Paint|Mods_Body|Mods_Wrap|Mods_Rims|Mods_Interior|Mods_Other|Mods_ICE|Mods_Aftermarket|Mods_WIP|Mods_Overall|
|--------------|-------------------------|----------|----|--------|---------|------|--------|----------|-----------|------------------|-----------------|----------------|-----------|--------------------|------------------|-------------|----------------|------------------|------------------------|---------------------|-----------------|--------------------|----------------------|----------|---------|---------|---------|-------------|----------|--------|----------------|--------|------------|
|8/5/2018 14:10|honoland13@japanpost.jp |Hernando |2015|Acura |TLX |48 |J04 |Bob |0 |0 |2 |2 |4 |4 |0 |2 |4 |4 |2 |4 |2 |2 |2 |2 |2 |0 |4 |4 |4 |6 |2 |0 |4 |
|8/5/2018 15:11|nlighterness2q@umn.edu |Noel |2015|Jeep |Wrangler |124 |J02 |Carl |0 |6 |4 |2 |4 |6 |6 |4 |4 |4 |6 |6 |6 |6 |6 |4 |6 |6 |6 |6 |6 |4 |6 |4 |6 |
|8/5/2018 17:10|eguest47@microsoft.com |Edan |2015|Lexus |Is250 |222 |J05 |Adrian |0 |0 |0 |0 |0 |0 |0 |0 |6 |6 |6 |0 |0 |6 |6 |6 |0 |0 |0 |0 |0 |0 |0 |0 |4 |
|8/5/2018 17:34|hchilley40@fema.gov |Hieronymus|1993|Honda |Civic eG |207 |J06 |Aaron |0 |0 |2 |2 |2 |2 |2 |2 |0 |4 |2 |2 |2 |2 |2 |2 |4 |2 |2 |0 |0 |0 |2 |2 |0 |
|8/5/2018 14:30|nnowick3d@tuttocitta.it |Nickolas |2016|Ford |Mystang |167 |J02 |Carl |0 |0 |2 |2 |0 |2 |2 |0 |0 |0 |0 |2 |0 |2 |2 |2 |0 |0 |2 |0 |0 |0 |0 |0 |2 |
|8/5/2018 16:12|mdearl39@amazon.co.uk |Martin |2013|Hyundai |Gen coupe|159 |J04 |Bob |0 |0 |2 |0 |0 |0 |2 |0 |0 |0 |0 |2 |0 |2 |2 |0 |2 |0 |2 |0 |0 |0 |0 |0 |0 |
|8/5/2018 17:00|alynamg@blogtalkradio.com|Aldridge |2009|Infiniti|G37 |20 |J06 |Aaron |2 |0 |2 |2 |0 |0 |2 |0 |0 |2 |2 |2 |2 |2 |2 |2 |2 |2 |4 |2 |2 |0 |2 |0 |2 |
|8/5/2018 16:11|abowton3k@spiegel.de |Ambros |2009|Honda |Oddesy |178 |J06 |Aaron |2 |0 |2 |2 |2 |2 |2 |0 |4 |4 |2 |2 |2 |4 |4 |4 |2 |2 | |6 |4 |4 |6 |4 |6 |
The output I was able to generate looks like this
Ranking Car_ID Year Make Model Total
1 48 2015 Acura TLX 62
2 124 2015 Jeep Wrangler 124
3 222 2015 Lexus Is250 40
...
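I want to sort the output above in descending order based on the Total column, but I can't figure out how to do that in awk. The Total and Ranking columns are not part of the original csv data; they are only produced at output time. Here is my code so far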
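BEGIN {
    FS = ",";
    OFS = "\t\t";
}
NR==1 {
    $35 = "Ranking";
    $36 = "Total";
}
NR>1 {
    $35 = 1;
    for(i = 1; i < NR - 1; i++) { $35 += 1 }
    $36 = $10 + $11 + $12 + $13 + $14 + $15 + $16 + $17 + $18 + $19 + $20 + $21 + $22 + $23 + $24 + $25 + $26 + $27 + $28 + $29 + $30 + $31 + $32 + $33 + $34;
}
{
    print $35, $7, $4, $5, $6, $36;
}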
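When I add "|sort -nk36|" to the end of the command and run it, it doesn't seem to change the output or sort it in any way. Maybe I'm confused about the command.
The expected output should look like this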
Ranking Car_ID Year Make Model Total
1 48 2015 Jeep Wrangler 124
2 124 2015 Acura TLX 62
3 222 2015 Lexus Is250 40
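Assumptions:
- input fields are comma-delimited (OP's sample input is displayed as fixed-width with pipe boundaries, but OP's awk code specifies FS=",", and since OP states that the awk code runs and generates output, we'll stick with FS=",")
- the second line of OP's sample input (the solid line of hyphens) does not actually exist in OP's file (OP's awk code has no special handling for NR==2)
- output will be tab-delimited (OP's awk code mentions OFS="\t\t", while the sample output appears to be ... fixed-width?)
- Ranking is assigned based on the sorted results (i.e., not based on the input order, as it is in OP's awk code)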
Setup:
$ cat raw.dat
Timestamp,Email,Name,Year,Make,Model,Car_ID,Judge_ID,Judge_Name,Racer_Turbo,Racer_Supercharged,Racer_Performance,Racer_Horsepower,Car_Overall,Engine_Modifications,Engine_Performance,Engine_Chrome,Engine_Detailing,Engine_Cleanliness,Body_Frame_Undercarriage,Body_Frame_Suspension,Body_Frame_Chrome,Body_Frame_Detailing,Body_Frame_Cleanliness,Mods_Paint,Mods_Body,Mods_Wrap,Mods_Rims,Mods_Interior,Mods_Other,Mods_ICE,Mods_Aftermarket,Mods_WIP,Mods_Overall
8/5/2018 14:10,honoland13@japanpost.jp,Hernando,2015,Acura,TLX,48,J04,Bob,0,0,2,2,4,4,0,2,4,4,2,4,2,2,2,2,2,0,4,4,4,6,2,0,4
8/5/2018 15:11,nlighterness2q@umn.edu,Noel,2015,Jeep,Wrangler,124,J02,Carl,0,6,4,2,4,6,6,4,4,4,6,6,6,6,6,4,6,6,6,6,6,4,6,4,6
8/5/2018 17:10,eguest47@microsoft.com,Edan,2015,Lexus,Is250,222,J05,Adrian,0,0,0,0,0,0,0,0,6,6,6,0,0,6,6,6,0,0,0,0,0,0,0,0,4
8/5/2018 17:34,hchilley40@fema.gov,Hieronymus,1993,Honda,CiviceG,207,J06,Aaron,0,0,2,2,2,2,2,2,0,4,2,2,2,2,2,2,4,2,2,0,0,0,2,2,0
8/5/2018 14:30,nnowick3d@tuttocitta.it,Nickolas,2016,Ford,Mystang,167,J02,Carl,0,0,2,2,0,2,2,0,0,0,0,2,0,2,2,2,0,0,2,0,0,0,0,0,2
8/5/2018 16:12,mdearl39@amazon.co.uk,Martin,2013,Hyundai,Gencoupe,159,J04,Bob,0,0,2,0,0,0,2,0,0,0,0,2,0,2,2,0,2,0,2,0,0,0,0,0,0
8/5/2018 17:00,alynamg@blogtalkradio.com,Aldridge,2009,Infiniti,G37,20,J06,Aaron,2,0,2,2,0,0,2,0,0,2,2,2,2,2,2,2,2,2,4,2,2,0,2,0,2
8/5/2018 16:11,abowton3k@spiegel.de,Ambros,2009,Honda,Oddesy,178,J06,Aaron,2,0,2,2,2,2,2,0,4,4,2,2,2,4,4,4,2,2,,6,4,4,6,4,6
One GNU awk idea (requires support for PROCINFO["sorted_in"]):
awk '
BEGIN  { FS=","; OFS="\t" }
FNR==1 { print "Ranking",$7,$4,$5,$6,"Total"; next }
       { totals[FNR]=0
         for (i=10;i<=34;i++)
             totals[FNR]+= $i
         lines[FNR]=$7 OFS $4 OFS $5 OFS $6
       }
END    { PROCINFO["sorted_in"]="@val_num_desc"     # sort totals[] array by numeric value (descending order)
         ranking=0
         for (i in totals)                         # loop through indices of the totals[] array
             print ++ranking,lines[i],totals[i]
       }
' raw.dat
This generates:
Ranking Car_ID Year Make Model Total
1 124 2015 Jeep Wrangler 124
2 178 2009 Honda Oddesy 72
3 48 2015 Acura TLX 62
4 207 1993 Honda CiviceG 40
5 222 2015 Lexus Is250 40
6 20 2009 Infiniti G37 38
7 167 2016 Ford Mystang 20
8 159 2013 Hyundai Gencoupe 14
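If OP needs the output pretty-printed with all columns aligned, that can be done with more code, or we can pipe the results through column (assuming there are no embedded spaces in the Make or Model columns), e.g.: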
$ awk 'BEGIN ... ' raw.dat | column -t
Ranking Car_ID Year Make Model Total
1 124 2015 Jeep Wrangler 124
2 178 2009 Honda Oddesy 72
3 48 2015 Acura TLX 62
4 207 1993 Honda CiviceG 40
5 222 2015 Lexus Is250 40
6 20 2009 Infiniti G37 38
7 167 2016 Ford Mystang 20
8 159 2013 Hyundai Gencoupe 14
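As for the `sort -nk36` attempt in the question: sort only sees the six columns that awk prints, so there is no field 36 to use as a key, and `-n` without `-r` sorts ascending in any case. A minimal sketch of re-sorting the already-printed output instead, assuming Total is the 6th whitespace-separated column and that no column contains embedded spaces (`resort_by_total` is a hypothetical helper name, not part of OP's script):

```shell
resort_by_total() {
    # Pass the header line through untouched.
    IFS= read -r header
    printf '%s\n' "$header"
    # Sort the remaining lines by column 6 (numeric, descending), then
    # overwrite column 1 (Ranking) with each line's new position.
    # Assigning to $1 makes awk rebuild the line with single-tab separators.
    sort -k6,6nr | awk -v OFS='\t' '{ $1 = NR; print }'
}
```

It would be used by piping OP's existing awk command into `resort_by_total`. The Ranking column has to be renumbered after sorting because OP's script assigns it in input order.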
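Notes:
- no additional sort requirement was given for duplicate values in the Total column, so tied rows are printed in whatever order awk processes them in the for (i in totals) loop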
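If a fixed order for tied totals is wanted, or GNU awk is not available, one option is to let sort do the ordering. This is a different technique from the PROCINFO["sorted_in"] approach: a sketch that assumes the same comma-delimited layout as raw.dat (scores in fields 10-34) and picks Car_ID ascending as the tie-breaker, which is an arbitrary choice and not something OP asked for. `rank_cars` is a hypothetical helper name.

```shell
rank_cars() {
    # Stage 1: compute each row's total and print it first so sort can key on it.
    #          Empty score fields count as 0.
    # Stage 2: sort by total (numeric, descending), then Car_ID (numeric, ascending).
    # Stage 3: add the header, number the rows, and move Total back to the end.
    awk -F',' -v OFS='\t' '
        FNR > 1 {
            total = 0
            for (i = 10; i <= 34; i++) total += $i
            print total, $7, $4, $5, $6
        }' "$1" |
    sort -t "$(printf '\t')" -k1,1nr -k2,2n |
    awk -F'\t' -v OFS='\t' '
        BEGIN { print "Ranking", "Car_ID", "Year", "Make", "Model", "Total" }
              { print NR, $2, $3, $4, $5, $1 }'
}
```

Called as `rank_cars raw.dat`, this should list the same eight rows as the GNU awk version, with the two 40-point cars ordered by Car_ID (207 before 222).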