Plotting COVID-19 data in Gnuplot

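I am trying to plot (GNUPlot) some covid-19 data contained in a CSV file, which uses the first row for the dates and holds the corresponding case counts in each column. I would like to make one plot per state (per row), but I am not having much luck. Any help? This is my plot script so far. I use plot for [col=5:30:1]... in the script because the first 4 columns are the state name and geographic location. For now I just want to focus on the data points and work out later how to show the state names on the plot. I have extracted the US data from the main CSV data to create "us.dat":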

set key autotitle columnhead
set term png size 1024, 768
set key outside
set datafile separator ","
set title 'mygraph'
set ylabel 'count'
set xlabel 'time'
set grid
set term png
set output "/tmp/covid19.png"    
plot for [col=5:30:1] "us.dat" using col

And a snippet of the "us.dat" file:

Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,2/22/20,2/23/20,2/24/20,2/25/20,2/26/20,2/27/20,2/28/20,2/29/20,3/1/20,3/2/20,3/3/20,3/4/20,3/5/20,3/6/20,3/7/20,3/8/20,3/9/20,3/10/20,3/11/20
Washington,US,47.4009,-121.4905,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,267,366
New York,US,42.1657,-74.9481,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,173,220
California,US,36.1162,-119.6816,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,144,177
Massachusetts,US,42.2302,-71.5301,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,92,95

But the resulting plot image is not quite right:

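One possible solution is to use awk. With it you can transpose the file and then use gnuplot as usual (thanks also to this great answer: An efficient way to transpose a file in Bash). You can even run it inline from gnuplot.

Washington can be plotted as follows.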

set xdata time
set timefmt "%m/%d/%y"
pl "<awk -F, '{ for (i=5; i<=NF; i++)  { a[NR,i] = $i} } NF>p { p = NF } END { for(j=5; j<=p; j++) {str=a[1,j];for(i=2; i<=NR; i++){str=str\" \"a[i,j];}print str}}' us.dat" using 1:2 w l title "Washington"

Column 3 is then New York, column 4 California, and column 5 Massachusetts.

Here is a pure gnuplot version
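To plot all states rather than one, it may be easier to transpose once into a file and keep the state names as a header row. The following is only a sketch under stated assumptions: sample.csv is a made-up, shortened stand-in for us.dat (three dates, two states, values taken from the snippet above), and the function name transpose and the output name us_t.dat are hypothetical. The output stays comma-separated so that a name containing a space, such as New York, remains a single field.

```shell
#!/bin/sh
# Hypothetical shortened stand-in for us.dat (assumption: same column layout).
cat > sample.csv <<'EOF'
Province/State,Country/Region,Lat,Long,3/9/20,3/10/20,3/11/20
Washington,US,47.4009,-121.4905,0,267,366
New York,US,42.1657,-74.9481,0,173,220
EOF

# Transpose the CSV: keep column 1 (state name) and columns 5..NF (dates),
# dropping Country/Region, Lat and Long.
transpose() {
    awk -F, '
    {
        cell[NR, 1] = $1
        for (i = 5; i <= NF; i++) cell[NR, i] = $i
        if (NF > maxnf) maxnf = NF
    }
    END {
        # First output line: header made of the state names.
        line = cell[1, 1]
        for (r = 2; r <= NR; r++) line = line "," cell[r, 1]
        print line
        # One output line per date: the date followed by each state count.
        for (c = 5; c <= maxnf; c++) {
            line = cell[1, c]
            for (r = 2; r <= NR; r++) line = line "," cell[r, c]
            print line
        }
    }' "$1"
}

transpose sample.csv > us_t.dat
cat us_t.dat
```

With set datafile separator "," and set key autotitle columnhead in effect, the transposed file could then presumably be plotted with a loop such as plot for [c=2:3] "us_t.dat" using 1:c with lines, so that each curve takes its title from the header row.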

$data <<EOD
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,2/22/20,2/23/20,2/24/20,2/25/20,2/26/20,2/27/20,2/28/20,2/29/20,3/1/20,3/2/20,3/3/20,3/4/20,3/5/20,3/6/20,3/7/20,3/8/20,3/9/20,3/10/20,3/11/20
Washington,US,47.4009,-121.4905,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,267,366
New York,US,42.1657,-74.9481,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,173,220
California,US,36.1162,-119.6816,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,144,177
Massachusetts,US,42.2302,-71.5301,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,92,95
EOD

N = 50
array X[N]
array Y[N]

set datafile separator ","

# a dummy plot to extract the row into an array
pl $data us ($0==0? sum[i=1:N](X[i]=strcol(i+4), 0) :\
             (strcol(1) eq "Washington")? sum[i=1:N](Y[i]=column(i+4)) : $0, $0) : 0

set xdata time
set timefmt "%m/%d/%y"

plot X us (X[$1]):(Y[$1]) w lp pt 7

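Explanation:

First, there is a dummy plot. When the first line is read ($0==0), all columns are iterated over to store the dates into the array X. Similarly, when the Washington row is read, all of its columns are stored into the array Y. The number of columns and their offset must be known in advance.

The sum function is only (mis)used as a loop. Because the date row contains strings, the extra ", 0" is supplied, since strings cannot be summed.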
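As a cross-check of what the dummy plot is meant to leave in X and Y, the same row extraction can be sketched outside gnuplot. This awk sketch is not part of the gnuplot answer; sample_row.csv is a made-up, shortened stand-in for the data, and the function name extract_row is hypothetical. The offset of 4 plays the same role as the i+4 above: it skips the state name, country, latitude and longitude.

```shell
#!/bin/sh
# Hypothetical shortened stand-in for the data (assumption: same column layout).
cat > sample_row.csv <<'EOF'
Province/State,Country/Region,Lat,Long,3/9/20,3/10/20,3/11/20
Washington,US,47.4009,-121.4905,0,267,366
New York,US,42.1657,-74.9481,0,173,220
EOF

# Print "date count" pairs for one state.
# x[] holds the dates from the header row (the role of array X);
# the matching row supplies the counts (the role of array Y).
extract_row() {
    awk -F, -v state="$1" -v off=4 '
    NR == 1 {
        n = NF - off
        for (i = 1; i <= n; i++) x[i] = $(i + off)
        next
    }
    $1 == state {
        for (i = 1; i <= n; i++) print x[i], $(i + off)
    }' "$2"
}

extract_row "Washington" sample_row.csv
```

Unlike the gnuplot version, the column count does not need to be fixed in advance here, because awk reads it from the header row via NF.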