使用具有多个变量的控制文件读取 Python 中的二进制文件
Reading a binary file in Python using a control file with multiple variables
我正在尝试使用 Fortran 代码作为示例读取 Python 中的二进制文件。
二进制文件名为 data.grads
,我有一个名为 data.ctl
的控制文件,应该可以让我了解如何读取二进制文件。正如我所说,我有一个 Fortran 代码,我的顾问写了它来解释读取二进制文件和构造对应于控制文件中不同变量(速度、温度、压力等)的数组的过程。
我阅读了多篇关于此事的帖子,但我在理解如何使用二进制文件中的数据时遇到了一些困难。
以下是我检查过的一些帖子:
- Reading a direct access fortran unformatted file in Python
- Reading records from a Fortran binary file in Python
- Reading Fortran binary file in Python
- Reading fortran direct access data and writing formatted data - faster with python than with fortran?
二进制文件存储科学家用来绘制的不同大气特性的模拟结果,例如温度和压力部分,如下所示:
有人告诉我,控制文件是理解如何从二进制文件中获取任何内容的关键。
我可以读取文件,但我不知道如何访问特定变量的结果。这是我从之前引用的一些帖子中得出的一段代码:
filename = "/path/data.grads"
nlat = 32
nlon = 67
f = open(filename, 'rb')
recl = np.fromfile(f, dtype='int32', count=4*nlat*nlon)
f.seek(4)
field = np.fromfile(f, dtype='float32',count=-1)
print('Record length=',recl)
print(field, len(field))
它returns以下:
Record length= [-134855229 -134855229 -134855229 ... -134855229 -134855229 -134855229]
[-9.99e+33 -9.99e+33 -9.99e+33 ... -9.99e+33 -9.99e+33 -9.99e+33] 10319462399
有这方面经验的人可以帮我弄清楚如何使用控制文件访问不同的变量吗?
如果您需要更多解释,请告诉我,我会编辑我的帖子并添加尽可能多的信息。
不幸的是,我无法共享二进制文件,因为它在服务器上并且重约 40 GB…
我分享:
- 控制文件的内容,
- 和 Fortran 代码(它缺少一些代码,因为它是作为解释编写的)。
控制文件(data.ctl)
DSET ^data.grads
UNDEF -9.99e33
XDEF 64 LINEAR 0.0 5.6250
YDEF 32 LEVELS
-85.761 -80.269 -74.745 -69.213 -63.679 -58.143 -52.607 -47.070 -41.532
-35.995 -30.458 -24.920 -19.382 -13.844 -8.307 -2.769 2.769 8.307
13.844 19.382 24.920 30.458 35.995 41.532 47.070 52.607 58.143
63.679 69.213 74.745 80.269 85.761
ZDEF 67 LEVELS
.000696 .08558 .1705 .2554 .3402 .4544 .6274 .8599 1.152 1.505 1.918 2.392
2.928 3.495 4.063 4.781 5.816 7.181 8.889 10.95 13.39 16.07 18.81 21.61
24.47 27.39 30.37 33.40 36.50 39.64 42.77 45.90 49.04 52.17 55.31 58.44
61.57 64.71 67.84 70.98 74.11 77.24 80.38 83.51 86.65 89.78 92.92 96.05
99.18 102.3 105.5 108.6 111.7 114.9 118.0 121.1 124.3 127.4 130.5 133.7
136.8 139.9 143.1 146.2 149.3 152.5 160.0
TDEF 3120 LINEAR 01JAN2000 1HR
VARS 31
u 67 99 u (m/s)
v 67 99 v (m/s)
w 67 99 w (m/s)
T 67 99 T (K)
dia 67 99 diagnostics (see table)
ps 0 99 ps
Ts 0 99 Ts (K)
h2og 67 99 Water vapor [kg/m^2]
h2oim1 67 99 Water ice mass for dust 0.30E-07 [kg/m^2]
h2oim2 67 99 Water ice mass for dust 0.10E-06 [kg/m^2]
h2oim3 67 99 Water ice mass for dust 0.30E-06 [kg/m^2]
h2oim4 67 99 Water ice mass for dust 0.10E-05 [kg/m^2]
h2oin1 67 99 Water ice number for dust 0.30E-07 [number/m^2]
h2oin2 67 99 Water ice number for dust 0.10E-06 [number/m^2]
h2oin3 67 99 Water ice number for dust 0.30E-06 [number/m^2]
h2oin4 67 99 Water ice number for dust 0.10E-05 [number/m^2]
h2ois 0 99 surface h2o ice [kg/m^2]
p 67 99 Pressure [Pa]
h 67 99 Height above the surface [m]
dipre 67 99 Delta pressure [Pa]
surf 67 99 space of cell factor [sin*cos]
dm 67 99 CO2 ice cloud mass concentration [kg/m^3]
cap 0 99 Surface CO2 ice mass [kg]
hcap 0 99 Surface CO2 ice [kg/m^2]
gdq_op 0 99 dust from rad.mod [opacity]
gdq_mix 67 99 dust from rad.mod [mix.r]
dust_op 0 99 dust from tracers [opacity]
dust_n1 67 99 Dust num dens from tracers for R=0.30E-07 [m^-3]
dust_n2 67 99 Dust num dens from tracers for R=0.10E-06 [m^-3]
dust_n3 67 99 Dust num dens from tracers for R=0.30E-06 [m^-3]
dust_n4 67 99 Dust num dens from tracers for R=0.10E-05 [m^-3]
ENDVARS
Fortran 文件
open(12,file='data.grads',status='unknown',
& form='unformatted',access='direct',
& recl = 4*nlat*nlon )
krec=0
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(u3d(1:nlon,1:nlat,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(v3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(w3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(T3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(D3(:,:,l))
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(ps3d(:,:))
krec = krec+1
write(12,rec=krec,ERR=900)real(Ts3d(:,:))
do n = 1,4
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(trace4D(:,:,l,n))
enddo
enddo
do n = 1,4
krec = krec+1
write(12,rec=krec,ERR=900)real(trace2D(:,:,n))
enddo
do l = nlat,1,-1
krec = krec+1
write(12,rec=krec,ERR=900)real(pre3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(alth3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(dipre3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(surf3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
! Convert DM to DM/grid volume
write(12,rec=krec,ERR=900) real(DM4(:,latmask(:),l)
& *GRAV*pre3d(:,:,nlat-l+1)
& /(T3d(:,:,nlat-l+1)*dipre3d(:,:,nlat-l+1)*RGAS)
& /surf3d(:,:,nlat-l+1) )
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(CAP4(:,latmask(:)))
krec = krec+1
write(12,rec=krec,ERR=900)real(HCAP4(:,latmask(:)))
krec = krec+1
write(12,rec=krec,ERR=900)real(DDUSTA3d(:,:))
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(GDQ3d(:,:,l))
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(tau_dust3d(:,:))
do n = 1,4
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(dust_n3d(:,:,l,n))
enddo
enddo
编辑: 二进制文件的第一行 (xxd -l 100 data.grads
)
0000000: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000010: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000020: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000030: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000040: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000050: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000060: c345 f6f7 .E..
那个 Fortran 的 numpy
等价物(或者更确切地说是对应物,因为你向作者展示了)开始
import numpy
nlat=32 # from data.ctl
nlon=64
nz=64 # or 67?
dt=numpy.dtype((numpy.float32,(nlat,nlon)))
def read(n=nz): return numpy.fromfile(f,dt,n)
def read1(): return read(1)[0]
with open("data.grads",'rb') as f:
u3d=read()
v3d=read()
# …
ps3d=read1()
# …
trace4D=numpy.fromfile(f,numpy.dtype((dt,nlat)),4)
trace2D=read(4)
pre3d=read()[::-1]
# …
nlat
和 nz
变量之间有些混淆;既然你说代码被改变了,data.ctl
可能是更好的参考。
我最近写了一个python包xgrads
to parse and read the ctl/binary files commonly used by GrADS
. It can load the whole data set using numpy
and dask
, and return it as a xarray.Dataset
,在地球科学中很受欢迎。
以下代码显示了如何加载 ctl 数据集并访问您感兴趣的不同变量:
from xgrads.core import open_CtlDataset
dset = open_CtlDataset('test.ctl')
# print all the info in ctl file
print(dset)
# for your ctl content, you can plot any variables (e.g.,
# first time and level of T) immediately as
dset['T'][0,0,...].plot()
虽然这是第一个版本,没有广泛测试,但我已经成功解析了你的ctl内容。我希望这个包也可以 return 你正确的数据集。如果您发现任何错误,我也很乐意修复它。
我正在尝试使用 Fortran 代码作为示例读取 Python 中的二进制文件。
二进制文件名为 data.grads
,我有一个名为 data.ctl
的控制文件,应该可以让我了解如何读取二进制文件。正如我所说,我有一个 Fortran 代码,我的顾问写了它来解释读取二进制文件和构造对应于控制文件中不同变量(速度、温度、压力等)的数组的过程。
我阅读了多篇关于此事的帖子,但我在理解如何使用二进制文件中的数据时遇到了一些困难。
以下是我检查过的一些帖子:
- Reading a direct access fortran unformatted file in Python
- Reading records from a Fortran binary file in Python
- Reading Fortran binary file in Python
- Reading fortran direct access data and writing formatted data - faster with python than with fortran?
二进制文件存储科学家用来绘制的不同大气特性的模拟结果,例如温度和压力部分,如下所示:
有人告诉我,控制文件是理解如何从二进制文件中获取任何内容的关键。
我可以读取文件,但我不知道如何访问特定变量的结果。这是我从之前引用的一些帖子中得出的一段代码:
filename = "/path/data.grads"
nlat = 32
nlon = 67
f = open(filename, 'rb')
recl = np.fromfile(f, dtype='int32', count=4*nlat*nlon)
f.seek(4)
field = np.fromfile(f, dtype='float32',count=-1)
print('Record length=',recl)
print(field, len(field))
它returns以下:
Record length= [-134855229 -134855229 -134855229 ... -134855229 -134855229 -134855229]
[-9.99e+33 -9.99e+33 -9.99e+33 ... -9.99e+33 -9.99e+33 -9.99e+33] 10319462399
有这方面经验的人可以帮我弄清楚如何使用控制文件访问不同的变量吗?
如果您需要更多解释,请告诉我,我会编辑我的帖子并添加尽可能多的信息。
不幸的是,我无法共享二进制文件,因为它在服务器上并且重约 40 GB…
我分享:
- 控制文件的内容,
- 和 Fortran 代码(它缺少一些代码,因为它是作为解释编写的)。
控制文件(data.ctl)
DSET ^data.grads
UNDEF -9.99e33
XDEF 64 LINEAR 0.0 5.6250
YDEF 32 LEVELS
-85.761 -80.269 -74.745 -69.213 -63.679 -58.143 -52.607 -47.070 -41.532
-35.995 -30.458 -24.920 -19.382 -13.844 -8.307 -2.769 2.769 8.307
13.844 19.382 24.920 30.458 35.995 41.532 47.070 52.607 58.143
63.679 69.213 74.745 80.269 85.761
ZDEF 67 LEVELS
.000696 .08558 .1705 .2554 .3402 .4544 .6274 .8599 1.152 1.505 1.918 2.392
2.928 3.495 4.063 4.781 5.816 7.181 8.889 10.95 13.39 16.07 18.81 21.61
24.47 27.39 30.37 33.40 36.50 39.64 42.77 45.90 49.04 52.17 55.31 58.44
61.57 64.71 67.84 70.98 74.11 77.24 80.38 83.51 86.65 89.78 92.92 96.05
99.18 102.3 105.5 108.6 111.7 114.9 118.0 121.1 124.3 127.4 130.5 133.7
136.8 139.9 143.1 146.2 149.3 152.5 160.0
TDEF 3120 LINEAR 01JAN2000 1HR
VARS 31
u 67 99 u (m/s)
v 67 99 v (m/s)
w 67 99 w (m/s)
T 67 99 T (K)
dia 67 99 diagnostics (see table)
ps 0 99 ps
Ts 0 99 Ts (K)
h2og 67 99 Water vapor [kg/m^2]
h2oim1 67 99 Water ice mass for dust 0.30E-07 [kg/m^2]
h2oim2 67 99 Water ice mass for dust 0.10E-06 [kg/m^2]
h2oim3 67 99 Water ice mass for dust 0.30E-06 [kg/m^2]
h2oim4 67 99 Water ice mass for dust 0.10E-05 [kg/m^2]
h2oin1 67 99 Water ice number for dust 0.30E-07 [number/m^2]
h2oin2 67 99 Water ice number for dust 0.10E-06 [number/m^2]
h2oin3 67 99 Water ice number for dust 0.30E-06 [number/m^2]
h2oin4 67 99 Water ice number for dust 0.10E-05 [number/m^2]
h2ois 0 99 surface h2o ice [kg/m^2]
p 67 99 Pressure [Pa]
h 67 99 Height above the surface [m]
dipre 67 99 Delta pressure [Pa]
surf 67 99 space of cell factor [sin*cos]
dm 67 99 CO2 ice cloud mass concentration [kg/m^3]
cap 0 99 Surface CO2 ice mass [kg]
hcap 0 99 Surface CO2 ice [kg/m^2]
gdq_op 0 99 dust from rad.mod [opacity]
gdq_mix 67 99 dust from rad.mod [mix.r]
dust_op 0 99 dust from tracers [opacity]
dust_n1 67 99 Dust num dens from tracers for R=0.30E-07 [m^-3]
dust_n2 67 99 Dust num dens from tracers for R=0.10E-06 [m^-3]
dust_n3 67 99 Dust num dens from tracers for R=0.30E-06 [m^-3]
dust_n4 67 99 Dust num dens from tracers for R=0.10E-05 [m^-3]
ENDVARS
Fortran 文件
open(12,file='data.grads',status='unknown',
& form='unformatted',access='direct',
& recl = 4*nlat*nlon )
krec=0
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(u3d(1:nlon,1:nlat,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(v3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(w3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(T3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(D3(:,:,l))
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(ps3d(:,:))
krec = krec+1
write(12,rec=krec,ERR=900)real(Ts3d(:,:))
do n = 1,4
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(trace4D(:,:,l,n))
enddo
enddo
do n = 1,4
krec = krec+1
write(12,rec=krec,ERR=900)real(trace2D(:,:,n))
enddo
do l = nlat,1,-1
krec = krec+1
write(12,rec=krec,ERR=900)real(pre3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(alth3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(dipre3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(surf3d(:,:,l))
enddo
do l = 1,nlat
krec = krec+1
! Convert DM to DM/grid volume
write(12,rec=krec,ERR=900) real(DM4(:,latmask(:),l)
& *GRAV*pre3d(:,:,nlat-l+1)
& /(T3d(:,:,nlat-l+1)*dipre3d(:,:,nlat-l+1)*RGAS)
& /surf3d(:,:,nlat-l+1) )
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(CAP4(:,latmask(:)))
krec = krec+1
write(12,rec=krec,ERR=900)real(HCAP4(:,latmask(:)))
krec = krec+1
write(12,rec=krec,ERR=900)real(DDUSTA3d(:,:))
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(GDQ3d(:,:,l))
enddo
krec = krec+1
write(12,rec=krec,ERR=900)real(tau_dust3d(:,:))
do n = 1,4
do l = 1,nlat
krec = krec+1
write(12,rec=krec,ERR=900)real(dust_n3d(:,:,l,n))
enddo
enddo
编辑: 二进制文件的第一行 (xxd -l 100 data.grads
)
0000000: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000010: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000020: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000030: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000040: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000050: c345 f6f7 c345 f6f7 c345 f6f7 c345 f6f7 .E...E...E...E..
0000060: c345 f6f7 .E..
那个 Fortran 的 numpy
等价物(或者更确切地说是对应物,因为你向作者展示了)开始
import numpy
nlat=32 # from data.ctl
nlon=64
nz=64 # or 67?
dt=numpy.dtype((numpy.float32,(nlat,nlon)))
def read(n=nz): return numpy.fromfile(f,dt,n)
def read1(): return read(1)[0]
with open("data.grads",'rb') as f:
u3d=read()
v3d=read()
# …
ps3d=read1()
# …
trace4D=numpy.fromfile(f,numpy.dtype((dt,nlat)),4)
trace2D=read(4)
pre3d=read()[::-1]
# …
nlat
和 nz
变量之间有些混淆;既然你说代码被改变了,data.ctl
可能是更好的参考。
我最近写了一个python包xgrads
to parse and read the ctl/binary files commonly used by GrADS
. It can load the whole data set using numpy
and dask
, and return it as a xarray.Dataset
,在地球科学中很受欢迎。
以下代码显示了如何加载 ctl 数据集并访问您感兴趣的不同变量:
from xgrads.core import open_CtlDataset
dset = open_CtlDataset('test.ctl')
# print all the info in ctl file
print(dset)
# for your ctl content, you can plot any variables (e.g.,
# first time and level of T) immediately as
dset['T'][0,0,...].plot()
虽然这是第一个版本,没有广泛测试,但我已经成功解析了你的ctl内容。我希望这个包也可以 return 你正确的数据集。如果您发现任何错误,我也很乐意修复它。