Is there a faster way to read, re-sort and convert a binary file?

Question

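I have a binary file containing a certain number of samples, each four bytes long. The data was acquired with a 14-bit ADC, and the bit assignment is: b31-b29 = not used, b28 = digital input, b27-b14 = chB (signed), b13-b0 = chA (signed). In the end I want to perform an FFT on chA and chB. To get there, I use the following Python code: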
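
To make the bit assignment concrete, here is a minimal sketch that unpacks a single 32-bit word with masks and shifts. The function names `sign_extend_14` and `decode_word` and the hand-built test word are hypothetical, chosen only for illustration.

```python
def sign_extend_14(value):
    """Interpret a 14-bit field as a two's-complement signed integer."""
    return value - 0x4000 if value & 0x2000 else value


def decode_word(word):
    """Split one 32-bit sample into (digital_in, chB, chA)."""
    ch_a = sign_extend_14(word & 0x3FFF)          # b13-b0
    ch_b = sign_extend_14((word >> 14) & 0x3FFF)  # b27-b14
    digital_in = (word >> 28) & 0x1               # b28
    return digital_in, ch_b, ch_a


# Hand-built word: digital input = 1, chB = -2, chA = 5
word = (1 << 28) | ((-2 & 0x3FFF) << 14) | (5 & 0x3FFF)
print(decode_word(word))  # (1, -2, 5)
```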

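1. The binary data file is opened as a bitstream, and the samples, i.e. 516x1024x32 samples of 32 bits each, are read and appended to a BitArray. This is done by reading one sample (4 bytes or 32 bits) at a time, reversing its byte order, and appending the result to the BitArray. This is repeated for all samples: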

from bitstring import BitArray, ConstBitStream

swap = BitArray()

f = ConstBitStream(filename='data.kbin')
f.pos = 0
samples = 516*1024*32
sample_rate = 30517.578125

for i in range(samples):
    data = BitArray()
    g = f.read('bits:32')
    data.append(g)
    data.byteswap(4)
    swap.append(data)
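
Reversing the four bytes of each sample and then reading the result most-significant byte first gives the same number as reading the original bytes as a little-endian integer. The short check below illustrates this with the standard `struct` module; the four example bytes are made up for the demonstration.

```python
import struct

raw = bytes([0x05, 0x80, 0xFF, 0x1F])  # hypothetical sample as stored on disk

swapped = raw[::-1]                             # explicit byte swap
as_big_endian = struct.unpack('>I', swapped)[0]
as_little_endian = struct.unpack('<I', raw)[0]  # no swap needed

print(hex(as_big_endian), hex(as_little_endian))
```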

2. The re-sorted array is opened as a bitstream again:

data2 = ConstBitStream(swap)

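3. The bitstream is now read so that the bit assignment shown above is applied and each field is converted to a signed integer. The integers belonging to chA and chB are appended to their respective lists: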

chA = []
chB = []

data2.pos = 0
for i in range(samples):
    a = data2.read('int:3')
    b = data2.read('int:1')
    c = data2.read('int:14')
    d = data2.read('int:14')
    chA.append(d)
    chB.append(c)

4. Compute the FFT:

from scipy import fftpack

dt = 1 / sample_rate

yf_A = fftpack.rfft(chA)
yf_B = fftpack.rfft(chB)
xf = fftpack.rfftfreq(samples, dt)
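
Note that `scipy.fftpack.rfft` returns the real and imaginary parts interleaved in a single real-valued array. As a sanity check on the frequency axis, the sketch below uses NumPy's `numpy.fft.rfft` instead (a substitution, not the code from the question), which returns complex bins. The synthetic test tone and the record length of 1024 are assumptions made for illustration.

```python
import numpy as np

sample_rate = 30517.578125   # Hz, as in the question
n = 1024                     # short synthetic record, for illustration only
t = np.arange(n) / sample_rate

tone_bin = 100               # hypothetical test tone placed exactly on bin 100
tone_freq = tone_bin * sample_rate / n
signal = np.sin(2 * np.pi * tone_freq * t)

spectrum = np.fft.rfft(signal)                 # complex, n // 2 + 1 bins
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)  # matching frequency axis in Hz

peak = int(np.argmax(np.abs(spectrum)))
print(peak, freqs[peak])
```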

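This code works and I get the result I want, but it takes far too long. Step 1 takes about 10 minutes and step 3 about 3 minutes. I am new to Python, so my knowledge is limited. How can I speed this up? Thanks.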

Answer 1

I found a faster way:

# input the path to the data file
file_kbin = input("Path to data file: ")

# initialize two lists where the converted data for chA and chB will be stored

CHA = []
CHB = []

# get the number of samples in the data file

size = 516 * 1024 * 32

# here the binary file is opened and converted following the bit assignment given above
with open(file_kbin, 'rb') as data:

    data.seek(0)                                          # make sure to start at the beginning of the file

    for i in range(size):                                 # loop over the complete file

        sample = data.read(4)                             # read one sample (byte0 byte1 byte2 byte3) 
        tmp = int.from_bytes(sample, byteorder='little')  # store the data as integer following the byteorder little (byte3 byte2 byte1 byte0)          

        chA = tmp & 0x00003FFF                            # set chA to byte1 byte0 with b15 and b14 = 0
        if (chA & 0x2000):                                # check if chA negative
            chA = chA - 0x4000                            # get the correct negative value

        chB = (tmp & 0x0FFFc000) >> 14                    # set all bits to 0 except for b27-b14 and shift right by 14 bits
        if (chB & 0x2000):                                # check if chB negative
            chB = chB - 0x4000                            # get the correct negative value

        CHA.append(chA)                                   # store the values in the corresponding list
        CHB.append(chB)

(Equivalent code in C++ would be considerably faster still.)
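
If NumPy is available, the same masking and shifting can be applied to all samples at once instead of in a Python loop. The following is only a sketch under that assumption; the function name `decode_samples` and the two test words are hypothetical. For a real file, `np.fromfile(path, dtype='<u4')` could supply the words directly.

```python
import struct

import numpy as np


def decode_samples(raw):
    """Decode little-endian 32-bit samples into signed chA and chB arrays."""
    # Widen to int64 so the subtraction below cannot overflow or wrap.
    words = np.frombuffer(raw, dtype='<u4').astype(np.int64)

    ch_a = words & 0x3FFF           # b13-b0
    ch_b = (words >> 14) & 0x3FFF   # b27-b14

    # Two's-complement sign extension of the 14-bit fields.
    ch_a = np.where((ch_a & 0x2000) != 0, ch_a - 0x4000, ch_a)
    ch_b = np.where((ch_b & 0x2000) != 0, ch_b - 0x4000, ch_b)
    return ch_a, ch_b


# Two made-up samples: (chB = -2, chA = 5) and (chB = 0, chA = -1)
raw = struct.pack('<2I', 0x1FFF8005, 0x00003FFF)
ch_a, ch_b = decode_samples(raw)
print(ch_a.tolist(), ch_b.tolist())
```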

Answer 2

    from functools import partial

    CHA = []
    CHB = []

    def chk_neg(my_ch):
        if my_ch & 0x2000:
            my_ch -= 0x4000
        return my_ch


    def do(file_kbin):
        with open(file_kbin, 'rb') as data:

            # read 4 bytes at a time until read() returns b"" at end of file
            for sample in iter(partial(data.read, 4), b""):
                tmp = int.from_bytes(sample, byteorder='little')

                chA = tmp & 0x00003FFF
                CHA.append(chk_neg(chA))

                chB = (tmp & 0x0FFFC000) >> 14
                CHB.append(chk_neg(chB))


    if __name__ == '__main__':
        file_kbin = input("Path to data file: ")
        do(file_kbin)
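
The two-argument form of `iter` used above calls `data.read(4)` repeatedly until it returns the sentinel `b""`. The small demonstration below runs the same idiom on an in-memory `io.BytesIO` stream (an assumption made so the example needs no data file). It also shows that a file whose length is not a multiple of 4 yields a short final chunk, which may be worth checking for.

```python
import io
from functools import partial

# 10 bytes: two complete 4-byte samples followed by a 2-byte tail
stream = io.BytesIO(bytes(range(10)))

chunks = list(iter(partial(stream.read, 4), b""))
print(chunks)

# Keep only complete samples before decoding
complete = [c for c in chunks if len(c) == 4]
values = [int.from_bytes(c, byteorder='little') for c in complete]
print(values)
```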

sorting

binary

performance

fft

python-3.x