awk command to count two fields using for loops
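I have the following file, and I need an awk command that counts how many times each value of field $15 (error code) occurs for each value of field $2 (event key).
For example:
Video_load_time, error code: 111, occurrences: 2
Video_load_time, error code: 3805, occurrences: 3
app_launch_time, error code: 111, occurrences: 1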
time|eventKey|Version|Model|SVersion|signal|net|State|atitude|long|subd|bupUsername|tvAccount|assetId|errorCode|errorDescription|duration
14201|video_load_time|5.7|i3|8.3|0.0|FI|GT|52.1147619|9829672936714|777|ghouso|444|6789|111|4464|7149
4399784|playback_error|8.0|W8|33.2|0.0|FI|TED|468|071|078410X_.ca||2314831||3805|152rorDescript|0
762|playback_error|70|ALFiee|4.2.2|0.0|IFI|AUED|4325|795|||81321761|3805|05|1529|
634|app_launch_time|5.0.0|SGHI317M|4.1.2|0.0|CELAR|AUTO_ATED|4588|64|180||||3805|yes|0
1418|video_load_time|5.0.0.37|iP1|7.0.6|0.0|IN_HE_WIFI|AUTHEATED|45.47099941453838|477109099|||8455700500884828|111|N/A|77|9398
1420|playback_error|5.0.0.37|iPa1|8.1.1|0.0|WIFI|BUP|9863786|6799798072||ta99|841759656|Be000|1601|Video|22
594|app_launch_time|5.0.0|S7M|4.3|0.0|CLAR|AUTO_ATED|5010226|-110.673567|6167612959947-023Xca|||111|N/A||11
421|video_load_time|5.0.0.37|iP5|8.1.2|0.0|WIFI|BUP|528950658168|06.613394189||cpcpb15|84551050401|1601|N/A||182
So I wrote the following code:
awk -F \| '{ duration[] = $NF; ++counter[]; duration2[] = $NF; } END {for(d in duration); for(e in duration2) {print e,d, counter[d]} }' errors.out
But the numbers it returns don't look right :(
Does anyone know how to fix this?
Thanks, everyone!
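If you want to count the distinct ($2, $15) pairs, which is how I read your problem description, you can do it like this (although your sample output doesn't quite seem to match the data).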
$ cat vid.txt
14201|video_load_time|5.7|i3|8.3|0.0|FI|GT|52.1147619|9829672936714|777|ghouso|444|6789|111|4464|7149
4399784|playback_error|8.0|W8|33.2|0.0|FI|TED|468|071|078410X_.ca||2314831||3805|152rorDescript|0
762|playback_error|70|ALFiee|4.2.2|0.0|IFI|AUED|4325|795|||81321761|3805|05|1529|
634|app_launch_time|5.0.0|SGHI317M|4.1.2|0.0|CELAR|AUTO_ATED|4588|64|180||||3805|yes|0
1418|video_load_time|5.0.0.37|iP1|7.0.6|0.0|IN_HE_WIFI|AUTHEATED|45.47099941453838|477109099|||8455700500884828|111|N/A|77|9398
1420|playback_error|5.0.0.37|iPa1|8.1.1|0.0|WIFI|BUP|9863786|6799798072||ta99|841759656|Be000|1601|Video|22
594|app_launch_time|5.0.0|S7M|4.3|0.0|CLAR|AUTO_ATED|5010226|-110.673567|6167612959947-023Xca|||111|N/A||11
421|video_load_time|5.0.0.37|iP5|8.1.2|0.0|WIFI|BUP|528950658168|06.613394189||cpcpb15|84551050401|1601|N/A||182
$ cat vid.awk
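# Count each distinct (eventKey, errorCode) pair.
{ ++count[$2,$15] }
END { for (i in count) {
          split(i, parts, SUBSEP);   # undo the SUBSEP join to recover both fields
          print parts[1] ", errcode: " parts[2] ", number of occurrences: " count[i]
      }
}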
$ awk -F\| -f vid.awk vid.txt
app_launch_time, errcode: N/A, number of occurrences: 1
playback_error, errcode: 1601, number of occurrences: 1
playback_error, errcode: 3805, number of occurrences: 1
video_load_time, errcode: 111, number of occurrences: 1
video_load_time, errcode: N/A, number of occurrences: 2
app_launch_time, errcode: 3805, number of occurrences: 1
playback_error, errcode: 05, number of occurrences: 1
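Two details are worth a sketch here. The order in which `for (i in count)` visits keys is unspecified in awk, which is why the output above is not grouped by event key, and the file in the question starts with a header row that vid.txt omits. The variant below skips the header and sorts the result. It is only a sketch under assumptions: `sample.txt` and the helper name `count_pairs` are placeholders that do not appear in the original answer, and the sample holds just the header plus four rows copied from the question.

```shell
# Header plus four rows copied from the question; sample.txt is a placeholder name.
cat > sample.txt <<'EOF'
time|eventKey|Version|Model|SVersion|signal|net|State|atitude|long|subd|bupUsername|tvAccount|assetId|errorCode|errorDescription|duration
14201|video_load_time|5.7|i3|8.3|0.0|FI|GT|52.1147619|9829672936714|777|ghouso|444|6789|111|4464|7149
4399784|playback_error|8.0|W8|33.2|0.0|FI|TED|468|071|078410X_.ca||2314831||3805|152rorDescript|0
1418|video_load_time|5.0.0.37|iP1|7.0.6|0.0|IN_HE_WIFI|AUTHEATED|45.47099941453838|477109099|||8455700500884828|111|N/A|77|9398
421|video_load_time|5.0.0.37|iP5|8.1.2|0.0|WIFI|BUP|528950658168|06.613394189||cpcpb15|84551050401|1601|N/A||182
EOF

# count_pairs is a hypothetical helper name, not part of the original answer.
count_pairs() {
    awk -F'|' '
        FNR > 1 { ++count[$2, $15] }   # FNR > 1 skips the header row of each file
        END {
            for (key in count) {
                split(key, parts, SUBSEP)
                print parts[1] ", errcode: " parts[2] ", number of occurrences: " count[key]
            }
        }' "$@" | LC_ALL=C sort        # sorting makes the output order deterministic
}

count_pairs sample.txt
```

With these four rows the helper should print one line for playback_error/3805 (count 1), one for video_load_time/111 (count 1), and one for video_load_time/N/A (count 2), in that order.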
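Note that although awk only supports one-dimensional arrays, when you use
several comma-separated values as an array index, awk turns them into a
single index value by concatenating them, with the built-in variable SUBSEP
as the separator. This lets you mimic multidimensional arrays to some extent.
When you iterate over the indices in a for loop, however, it is up to you to
pull the values apart again using split.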
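The SUBSEP behaviour described above can be checked in isolation. The following is a small sketch rather than part of the original answer: the function name `subsep_demo` and the stored value are made up for illustration.

```shell
# subsep_demo is a hypothetical name; the key and value are illustrative only.
subsep_demo() {
    awk 'BEGIN {
        count["video_load_time", "111"] = 2    # stored under a single string key

        # The same element is reachable through the explicitly joined key.
        joined = "video_load_time" SUBSEP "111"
        print count[joined]

        # A membership test on a multi-part subscript needs parentheses.
        if (("video_load_time", "111") in count)
            print "found"

        # split() recovers the parts and returns how many there were.
        for (key in count) {
            n = split(key, parts, SUBSEP)
            print n, parts[1], parts[2]
        }
    }'
}

subsep_demo
```

This should print `2`, then `found`, then `2 video_load_time 111`, showing that the comma form and the manually joined key address the same array element.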