Ordering several tables in the same file using awk
In my workflow, files are created that contain simple tables with a two-line header (see the end of the post). I want to order these tables by number:
(head -n 2 && tail -n +3 | sort -n -r) > ordered.txt
That works fine, but I don't know how to split the file so that I can order each table and print them all into one file. My approach is:
awk '/^TARGET/ {(head -n 2 && tail -n +3 | sort -n -r) >> ordered.txt}' output.txt
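However, this results in an error message. I want to avoid any intermediate output files. What is missing in my awk command?
The input file looks like this: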
TARGET 1
Sample1 Sample2 Sample3 Pattern
3 3 3 z..........................Z........................................z.........Z...z
147 171 49 Z..........................Z........................................Z.........Z...Z
27 28 13 z..........................Z........................................z.........z...z
75 64 32 Z..........................Z........................................Z.........z...Z
TARGET 2
Sample1 Sample2 Sample3 Pattern
2 0 1 z..........................z........................................z.........Z...Z
21 21 7 z..........................Z........................................Z.........Z...Z
1 0 0 ...........................Z........................................Z.............Z
4 8 6 Z..........................Z........................................z.........Z...z
2 0 1 Z..........................Z........................................Z.........Z....
1 0 0 z..........................Z........................................Z.............Z
1 0 0 z...................................................................Z.........Z...Z
TARGET 3
Sample1 Sample2 Sample3 Pattern
1 0 0 z..........................Z........................................z.............z
1 3 0 z..........................z........................................Z.........Z...Z
1 1 0 Z..........................Z........................................Z.............z
1 0 0 Z..........................Z........................................Z.............Z
0 1 2 ...........................Z........................................Z.........Z...Z
0 0 1 z..........................z........................................z..............
My output should look like this, without any line dropped:
TARGET 1
Sample1 Sample2 Sample3 Pattern
147 171 49 Z..........................Z........................................Z.........Z...Z
75 64 32 Z..........................Z........................................Z.........z...Z
27 28 13 z..........................Z........................................z.........z...z
3 3 3 z..........................Z........................................z.........Z...z
TARGET 2
Sample1 Sample2 Sample3 Pattern
21 21 7 z..........................Z........................................Z.........Z...Z
4 8 6 Z..........................Z........................................z.........Z...z
2 0 1 z..........................z........................................z.........Z...Z
2 0 1 z..........................z........................................z.........Z...Z
1 0 0 ...........................Z........................................Z.............Z
1 0 0 ...........................Z........................................Z.............Z
1 0 0 ...........................Z........................................Z.............Z
TARGET 3
Sample1 Sample2 Sample3 Pattern
1 0 0 z..........................Z........................................z.............z
1 0 0 z..........................Z........................................z.............z
1 0 0 z..........................Z........................................z.............z
1 0 0 z..........................Z........................................z.............z
0 1 2 ...........................Z........................................Z.........Z...Z
0 0 1 z..........................z........................................z..............
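This requires GNU awk for sorted array traversal:
gawk '
# Traverse arrays by numeric value, ascending; use "@val_num_desc" for descending order.
BEGIN {PROCINFO["sorted_in"] = "@val_num_asc"}
function output_table() {
    for (key in table) print table[key]
    delete table
    i=0
}
/TARGET/ {print; getline; print; next}   # pass the two header lines through unchanged
/^$/ {output_table(); print; next}       # a blank line ends the current table
{table[++i] = $0}                        # buffer each data row
END {output_table()}
' file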
Output:
TARGET 1
Sample1 Sample2 Sample3 Pattern
3 3 3 z..........................Z........................................z.........Z...z
27 28 13 z..........................Z........................................z.........z...z
75 64 32 Z..........................Z........................................Z.........z...Z
147 171 49 Z..........................Z........................................Z.........Z...Z
TARGET 2
Sample1 Sample2 Sample3 Pattern
1 0 0 ...........................Z........................................Z.............Z
1 0 0 z...................................................................Z.........Z...Z
1 0 0 z..........................Z........................................Z.............Z
2 0 1 Z..........................Z........................................Z.........Z....
2 0 1 z..........................z........................................z.........Z...Z
4 8 6 Z..........................Z........................................z.........Z...z
21 21 7 z..........................Z........................................Z.........Z...Z
TARGET 3
Sample1 Sample2 Sample3 Pattern
0 0 1 z..........................z........................................z..............
0 1 2 ...........................Z........................................Z.........Z...Z
1 0 0 Z..........................Z........................................Z.............Z
1 0 0 z..........................Z........................................z.............z
1 1 0 Z..........................Z........................................Z.............z
1 3 0 z..........................z........................................Z.........Z...Z
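This is a bit messy, but assuming you don't want to lose records when sorting, it should work:
awk 'function sortit(){
    x=asort(a)                                   # asort() is a GNU awk extension; sorts the values of a
    for(i=1;i<=x;i++)print b[a[i]" "d[a[i]]++]   # d[] counts repeats so rows sharing a first field all print
    delete a;delete b;delete c;delete d
}
/^[0-9]/{a[$0]=$1;b[$1" "c[$1]++]=$0}            # key each row by its first field plus an occurrence counter
/TARGET/{print;getline;print}
!NF{sortit();print}                              # a blank line ends the current table
END{sortit()}' file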
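As for what is missing in the original attempt: the braces of an awk rule hold awk statements, not shell commands, so `head`, `tail` and `sort` cannot be called there directly. awk can, however, pipe rows into an external command. The following is a minimal sketch of that idea using only POSIX awk and `sort`. It assumes every table starts with a line beginning with `TARGET`, followed by one more header line. The file name `tables.txt` and the shortened rows are made up for illustration.

```shell
# Hypothetical sample input: two small tables, each with a two-line header.
cat > tables.txt <<'EOF'
TARGET 1
Sample1 Sample2 Sample3 Pattern
3 3 3 z..Z
147 171 49 Z..Z
27 28 13 z..z

TARGET 2
Sample1 Sample2 Sample3 Pattern
2 0 1 z..z
21 21 7 z..Z
4 8 6 Z..z
EOF

# Header lines are printed by awk itself; data rows are piped to one sort per table.
sorted=$(awk -v cmd='sort -n -r' '
/^TARGET/ {
    close(cmd)                 # end the previous table, if one is still open
    print                      # the "TARGET n" line
    if ((getline) > 0) print   # the column-name line
    fflush()                   # headers must reach stdout before sort writes its rows
    next
}
!NF { close(cmd); print; next }   # blank separator: emit the sorted rows, keep the blank line
{ print | cmd }                   # data rows go to sort
END { close(cmd) }
' tables.txt)

printf '%s\n' "$sorted"
```

Each `close(cmd)` makes awk wait for the running `sort` to finish, so the sorted rows land directly after their header. Redirecting the awk command's output (for example `> ordered.txt`) collects all tables in one file without any intermediate files.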