Calculate variance in bash
I want to calculate the variance of an input txt file like this one:
1, 5
2, 5
3, 5
4, 10
I would like the output to look like this:
1, 0
2, 0
3, 0
4, 4.6875
I tried this one-liner:
awk '{c[NR]=$2; s=s+c[NR]; avg= s / NR; var=var+(($2 - avg)^2 / (NR )); print var }' inputfile > outputfile
The formula for the standard deviation (and hence the variance) is described at http://www.mathsisfun.com/data/standard-deviation.html
So basically you need to say:
for item in items
    sum += (item - average)^2 / #items
Applying this to your sample input, row by row:
5   av=5/1=5      var=(5-5)^2/1=0
5   av=10/2=5     var=((5-5)^2+(5-5)^2)/2=0
5   av=15/3=5     var=3*(5-5)^2/3=0
10  av=25/4=6.25  var=(3*(5-6.25)^2+(10-6.25)^2)/4=4.6875
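For reference, the running (population) variance computed in the steps above can be written out as follows. The symbols x_i (the value on row i), n (the number of rows read so far) and \bar{x}_n (the average so far) are notation introduced here, not taken from the question.

```latex
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}_n\right)^2

% Row 4 of the sample input (values 5, 5, 5, 10):
\bar{x}_4 = \frac{25}{4} = 6.25,
\qquad
\sigma_4^2 = \frac{3\,(5 - 6.25)^2 + (10 - 6.25)^2}{4}
           = \frac{4.6875 + 14.0625}{4} = 4.6875
```

Because the average changes with every new row, all earlier deviations have to be recomputed against the new average. Accumulating `var` incrementally against the old averages, as in the one-liner from the question, does not give this result.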
So in awk we can say:
$ awk 'BEGIN {FS=OFS=","} # set comma as field input/output separator
{a[NR]=$2 # store data in an array
sum+=a[NR] # keep track of the sum
av=sum/NR # calculate average so far
v=0 # reset the accumulator for the variance
for (i=1;i<=NR;i++) # loop through all the values seen so far
v+=(a[i]-av)^2 # add up the squared deviations from the average
print $1, v/NR} # print the 1st field + result
' file
Test
$ awk 'BEGIN {FS=OFS=","} {a[NR]=$2; sum+=a[NR]; av=sum/NR; v=0; for (i=1;i<=NR;i++) v+=(a[i]-av)^2; print $1, v/NR}' a
1,0
2,0
3,0
4,4.6875
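The loop above rescans every stored value for each input row. As a possible alternative, here is a minimal single-pass sketch that keeps only a running sum and a running sum of squares, relying on the identity Var = E[x^2] - (E[x])^2. The function name `running_variance` and the variables `sum`, `sumsq` and `mean` are hypothetical, chosen for illustration. This is not the method from the answer above, and it can lose precision through cancellation when the values are large relative to their spread.

```shell
#!/bin/sh
# Running population variance in a single pass (sketch).
# Assumes comma-separated input with the value in the 2nd field.
running_variance() {
    awk 'BEGIN { FS = OFS = "," }
    {
        sum   += $2            # running sum of the values
        sumsq += $2 * $2       # running sum of the squared values
        mean   = sum / NR      # average of the rows read so far
        # population variance so far: E[x^2] - (E[x])^2
        print $1, sumsq / NR - mean * mean
    }' "$@"
}

# Usage with the sample data from the question:
printf '1, 5\n2, 5\n3, 5\n4, 10\n' | running_variance
```

On the sample data this should give the same four lines as the test above, without storing the values in an array.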