Operations with columns from different files

I have many .txt files of this type:

name1.fits 0 0 4088.9 0. 1. 0. -0.909983 0.01386 0.91 0.01386 -0.286976 0.00379 2.979 0.03971 0. 0.
name2.fits 0 0 4088.9 0. 1. 0. -0.84702 0.01239 0.847 0.01239 -0.250671 0.00261 3.174 0.04749 0. 0.
#name3.fits 0 0 4088.9 0. 1. 0. -0.494718 0.01168 0.4947 0.01168 -0.185677 0.0042 2.503 0.04365 0. 0.
#name4.fits 0 1 4088.9 0. 1. 0. -0.751382 0.01342 0.7514 0.01342 -0.202141 0.00267 3.492 0.07224 0. 0.
name4.fits 0 1 4088.961 0.01147 1.000169 0. -0.813628 0.01035 0.8135 0.01035 -0.217434 0.00196 3.515 0.04045 0. 0.

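I want to divide the values in one of these columns by the values in the same column of another file of the same type. This is what I have so far: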

with open('4026.txt','r') as out1, open('4089.txt', 'r') as out2, \
     open('4116.txt', 'r') as out3, open('4121.txt', 'r') as out4, \
     open('4542.txt', 'r') as out5, open('4553.txt', 'r') as out6:

    for data1 in out1.readlines():
        col1 = data1.strip().split()
        x = col1[9]

    for data2 in out2.readlines():
        col2 = data2.strip().split()
        y = col2[9]

    f = float(y) / float(x)
    print f

But I keep getting the same value for x. For example, if the first set of data is 4089.txt, and the second (4026.txt) is:

name1.fits 0 0 4026.2 0. 1. 0. -0.617924 0.01749 0.6179 0.01749 -0.19384 0.00383 2.995 0.09205 0. 0.
name2.fits 0 0 4026.2 0. 1. 0. -0.644496 0.01218 0.6445 0.01218 -0.183373 0.00291 3.302 0.05261 0. 0.
#name3.fits 0 0 4026.2 0. 1. 0. -0.507311 0.01557 0.5073 0.01557 -0.176148 0.00472 2.706 0.07341 0. 0.
#name4.fits 0 1 4026.2 0. 1. 0. -0.523856 0.01086 0.5239 0.01086 -0.173477 0.00279 2.837 0.05016 0. 0.
name4.fits 0 1 4026.229 0.0144 1.014936 0. -0.619708 0.00868 0.6106 0.00855 -0.185527 0.00189 3.138 0.04441 0. 0.

I want to divide column 9 of each file. Taking only the first element of each column, I should get 0.91/0.6179 = 1.47, but I get 0.958241758242.

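What is happening is that each for loop overwrites its variable on every iteration, so by the time the division runs, x and y hold only the values from the last line of each file. To get one result per row, do the division for each row instead of once after the loops have finished.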
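To make the difference concrete, here is a minimal sketch. It assumes the column-9 values of the three uncommented rows of each file have already been extracted into two lists; the names x_values, y_values, last_only and ratios are only illustrative.

```python
# Column 9 of the three uncommented rows, copied from the question
x_values = ["0.6179", "0.6445", "0.6106"]   # from 4026.txt
y_values = ["0.91", "0.847", "0.8135"]      # from 4089.txt

# Overwriting a variable inside a loop leaves only the final value behind
for value in x_values:
    x = value
for value in y_values:
    y = value
last_only = float(y) / float(x)   # uses the last row of each file only

# Dividing inside a single loop gives one ratio per row
ratios = []
for x, y in zip(x_values, y_values):
    ratios.append(float(y) / float(x))

print(last_only)
print(ratios)
```

The first entry of ratios is the 0.91/0.6179 value the question expects, while last_only reproduces the kind of single number the original code prints.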

A simpler approach is to put all the values in lists, for example x = [0.0149, 0.01218, ...] and y = [...].

Then divide the two lists element by element using numpy (or a for loop over the lists). Keep in mind that they must be the same size for this to work.
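A rough sketch of the numpy variant, assuming numpy is installed and the two lists already hold the column-9 values as floats (here the uncommented rows from the question). Converting both lists to arrays makes the division element-wise, and the explicit shape check is just one way to enforce the same-size requirement.

```python
import numpy as np

# Column 9 of the uncommented rows from the question
x = [0.6179, 0.6445, 0.6106]   # 4026.txt
y = [0.91, 0.847, 0.8135]      # 4089.txt

x_arr = np.asarray(x, dtype=float)
y_arr = np.asarray(y, dtype=float)

# Refuse to divide lists of different lengths
if x_arr.shape != y_arr.shape:
    raise ValueError("lists must have the same length")

# Element-wise division: ratios[i] == y[i] / x[i]
ratios = y_arr / x_arr
print(ratios)
```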

Example code:

with open('4026.txt', 'r') as out1, open('4089.txt', 'r') as out2, \
     open('4116.txt', 'r') as out3, open('4121.txt', 'r') as out4, \
     open('4542.txt', 'r') as out5, open('4553.txt', 'r') as out6:

    # Build two lists of floats from column 9 of each file
    x = []
    y = []

    for data1 in out1.readlines():
        col1 = data1.strip().split()
        x.append(float(col1[9]))

    for data2 in out2.readlines():
        col2 = data2.strip().split()
        y.append(float(col2[9]))

    # Divide row by row: 4089.txt value (y) over 4026.txt value (x)
    for i in range(min(len(x), len(y))):

        # Make sure the denominator is not zero
        if x[i] != 0:
            print(y[i] / x[i])
        else:
            print("Not possible")

You can do it like this:

with open('4026.txt','r') as out1, open('4089.txt', 'r') as out2:
    x_col9 = [data1.strip().split()[9] for data1 in out1.readlines()]
    y_col9 = [data2.strip().split()[9] for data2 in out2.readlines()]

    if len(x_col9) != len(y_col9):
        print('Error: files do not have same number of rows')
    else:
        f = [(float(y) / float(x)) for x, y in zip(x_col9, y_col9)]
        print(f)

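It may be better to iterate over the file objects directly, as shown below. This avoids readlines(), which first loads the whole file into a list, and instead reads each file one line at a time: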

    x_col9 = [data1.strip().split()[9] for data1 in out1]
    y_col9 = [data2.strip().split()[9] for data2 in out2]
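Taking the line-at-a-time idea one step further, the two files can be read in lockstep so that no intermediate list is needed at all. The following is only a sketch: column_ratios is a hypothetical helper, returning None for a zero denominator is an arbitrary choice, and the temporary files stand in for the real 4026.txt and 4089.txt using the first two rows from the question.

```python
import os
import tempfile

def column_ratios(path_x, path_y, col=9):
    """Yield float(y[col]) / float(x[col]) for each pair of rows (hypothetical helper)."""
    with open(path_x) as fx, open(path_y) as fy:
        # zip pulls one line from each file per iteration
        for line_x, line_y in zip(fx, fy):
            x = float(line_x.split()[col])
            y = float(line_y.split()[col])
            yield y / x if x != 0 else None

# First two rows of each file, copied from the question
rows_4026 = [
    "name1.fits 0 0 4026.2 0. 1. 0. -0.617924 0.01749 0.6179 0.01749 -0.19384 0.00383 2.995 0.09205 0. 0.",
    "name2.fits 0 0 4026.2 0. 1. 0. -0.644496 0.01218 0.6445 0.01218 -0.183373 0.00291 3.302 0.05261 0. 0.",
]
rows_4089 = [
    "name1.fits 0 0 4088.9 0. 1. 0. -0.909983 0.01386 0.91 0.01386 -0.286976 0.00379 2.979 0.03971 0. 0.",
    "name2.fits 0 0 4088.9 0. 1. 0. -0.84702 0.01239 0.847 0.01239 -0.250671 0.00261 3.174 0.04749 0. 0.",
]

# Write the sample rows to temporary files so the sketch runs anywhere
tmpdir = tempfile.mkdtemp()
path_x = os.path.join(tmpdir, "4026.txt")
path_y = os.path.join(tmpdir, "4089.txt")
with open(path_x, "w") as f:
    f.write("\n".join(rows_4026) + "\n")
with open(path_y, "w") as f:
    f.write("\n".join(rows_4089) + "\n")

ratios = list(column_ratios(path_x, path_y))
print(ratios)
```

Note that zip stops silently at the end of the shorter file, so this version drops the explicit row-count check; keep the list-based version if a mismatch should be reported.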